Faculty of Engineering & Information Technology ANU Crest
The Australian National University
Department of Computer Science, FEIT  
 

Previous Honours Topics

These projects topics were offered in 1999. While these - or similar - topics may not offered this year, they do serve as a guide to the diverse range of topics that will be available.

If you wish you could work on a topic like one of these, then why not write to the person who offered it. They might be persuaded to let you pursue a similar project under their supervision this year.

1999 Proposed Project Topics


Biometrics

Supervisor: Roger Clarke
E-mail: Roger.Clarke@anu.edu.au
Phone: 6288 1472

Analysis and prototyping of tools that enable organisations to identify individuals on the basis of some anatomical or physiological feature. Further sources of information relevant to the topic are available on the web.


Technological Means for Intellectual Property Protection

Supervisor: Roger Clarke
E-mail: Roger.Clarke@anu.edu.au
Phone: 6288 1472

Analysis and prototyping of tools that enable copyright owners to address various risks that their content is subject to, such as interception in transit, unauthorised or unpaid use by the intended recipient, unauthorised provision by an intended recipient to another party, and unauthorised or unpaid use of the content as a basis for producing new works.

It would be useful to have some prior understanding of copyright law and/or multimedia content and/or water-marking/content-fingerprinting and/or public key cryptography. But I can provide access to materials and people who can provide such information. Some sources of relevance to the topic are provided on the web.


Optimisation of Java Programs by Byte-code Transformation

Supervisor: Alonso Marquez
E-mail: Alonso.Marquez@anu.edu.au
Phone: 6279 8193

Java's high level byte-code makes very easy to implement interesting program transformations like specialisation, partial evaluation, in-lining, cloning, etc. This byte-code modification allows to implement many sort of optimisation and semantic transformation in a transparent way. Our group have been applying this technique as a way to add orthogonal persistence and object versioning to Java programs. We also have developed a first prototype of Object In-lining in Java. In this project we would like to explore several byte-code transformations for optimisation of Java code. In particular, we want to complete the Object in-lining prototype and evaluate different folding strategies.


Keyword-Based Approaches To Text Comparison

Supervisor: Peter Strazdins
E-mail: Peter.Strazdins@anu.edu.au
Phone: 6279 5041

Text documents, which include computer programs, reports and essays, can often be characterised by the types and distribution of the (key) words that they contain. In the case of computer programs, these words can include identifiers, which have no predefined meaning. Key word characteristics can thus be used to define some metric of similarity between documents. Practical uses of this include determining whether the two documents have the same main subject, and/or whether they have a common origin.

Most existing means of computing the similarity between such documents rely purely in the spelling of the words, without utilising any predefined meaning that a word may have. Another method (so far only partially developed) which is also reasonably successful uses the frequency counts of key words, which have some predefined meaning. In this case, the concepts of `synonyms' of the key words may be useful in certain contexts.

However, much more information can be extracted from the words in a document. One is applying the frequency counts idea to words without predefined meaning, ie program identifiers, which introduces some new challenges. Others include the occurrences of small groups of words (with some reasonable definition of locality), and the distribution of particular words throughout the document.

This project seeks to investigate simple but useful methods to analyse and compare documents based on their composition of (key) words, and to prove their usefulness over collections of documents from a variety of different areas. Its relation to text retrieval methods, which are also key word based, and have concepts of locality, frequency and synonyms, should also be investigated. As such it has a strong research component, and the diversity of its application means that there will be a strong interest in the academic community in any significant results.


Practical Approaches to Multi-cast Algorithms on the Fujitsu AP1000

Supervisor: Peter Strazdins
E-mail: Peter.Strazdins@anu.edu.au
Phone: 6279 5041

Multi-cast operations arise in many parallel computing applications, including backtracking problems, and parallel Fortran. Much research has been done on general multi-cast implementations, which can work well on random collections of nodes. However, the `optimal' algorithms are complex and hence there is considerable software overhead in both deriving the optimal routing for the multi-cast, and in the multi-cast itself. This detracts from the practical significance of this work. Also, the current research mainly focuses on start-to-end latencies to measure the performance of a multi-cast. However, there are some applications in which a more useful performance measure is the maximum time any participating nodes has to spend in the multi-cast.

However, in many applications that require multi-cast, there are regularities in the collections of participating nodes. In the case of parallel Fortran arrays, the multi-cast will involve a portion of a regular data structure with a regular distribution over a regular (logical) processor grid. Hence the participating nodes must also be regular. Possibly, a simple multi-cast algorithm with very low overheads can suffice for this purpose, and will yield far better performance than general multi-cast approaches.

Furthermore, the Fujitsu AP1000, whose nodes are arranged in a physical network with a torus configuration, has several hardware broadcast mechanisms which are very efficient. For example, the `C-net' broadcast could be modified to perform very efficient multi-cast when the collection of number in the multi-casts is more than half the total of nodes (this would faster than sending an ordinary message!) . It also has a row or column broadcast column using the net, and this could be fairly easily extended to perform a multi-cast over any rectangular sub-array of the processor grid (at the cost of sending an ordinary message). Also, whether the pipelining of large messages can be effectively implemented on the AP1000, even though its hardware lacks the capability to simultaneously send and receive messages, which is believed to be necessary for this purpose.

This project will firstly investigate the applications that require multi-cast operations, in order to determine the regularity of the multi-cast patterns and what is the most appropriate performance measure for that application. Then it will investigate what (semi-) specialised algorithms are most appropriate for the application with regular multi-cast patterns, with some versions utilising the AP1000's special hardware capabilities. The performance of these will then be compared with the general multi-cast implementations, both simple and sophisticated (some these have already been implemented on the AP1000), and an evaluation of these approaches will then be made.


Assembly Language Level Analysis of Programs

Supervisor: Peter Strazdins
E-mail: Peter.Strazdins@anu.edu.au
Phone: 6279 5041

While assembly language code appears to be monolithic and unstructured, much useful information can be extracted from the code. This includes:

  • recovering the algorithmic `control structure' (a la high level languages) of the code
  • highlighting (non-obvious) dependencies between instructions
  • identifying, in some formal way, the `function' of an instruction
  • identifying, in practice, redundant instructions and code sections

Such information is tedious to generate by hand, and is useful in the reading of assembler programs and also in measuring the similarity between assembler programs.

This project seeks to investigate simple and reliable methods to analyse assembler programs to isolate their `essential' algorithmic properties, over a space of input data of interest. The assembler code can either be source code or decoded object codes.

Methods include using static instruction analysis to recover the control structures, and then using dynamic instruction analysis to determine how instructions have been used and what instructional dependencies exist. Also, information on the `function' of an instruction can be deduced from which sets of input data the instruction was used. From this extra information, redundant condition evaluations and code sections can be identified, and the control structures deduced from the static analysis can be revised accordingly.

This project involves both substantial research and software engineering components. A program analysis system is first to be produced to implement these methods. This system must be flexible so that the assembly language format and (essential) semantics can be table-driven. Example assembly language code can be taken from the PeANUt and SPARC assembly languages. Its output should be (ultimately) an augmented version of the assembler code, in which it can present its analysis in a readable way side-by-side of the corresponding instructions. Such a system will have immediate use to assist tutors in the rather tedious task of marking assembler programs.

The second stage of this project is to connect the output of the program analysis system to a program comparator. Here, the concept of `instruction' equivalence (a 1-1 or 1-many mapping), must be added; from this, and information on the control structure nesting and `function' of each instruction, enables the comparator to determine what instructions from the two programs can be matched against each other (and how strongly). Again, information on `instruction' equivalence and matching `function' and control structure nesting should be table-driven. The output of the comparator should not only be an overall rating of similarity, but to also be able to `explain' its results, in other words, to give the mapping of matchable instructions from one program to the other.


Dynamic Web Searching

Supervisor: Paul Thistlewaite David Hawking
E-mail: Paul.Thistlewaite@anu.edu.au David.Hawking@cmis.csiro.au
Phone: 6279 3003 6216 7060

A dynamic web search system takes as input a site's home page and a query. It then enters the site by following its link structure (this is also known as spidering/crawling) and employs some dynamic search algorithm in order to find pages relevant to the query as quickly as possible. When searching within a known site, dynamic search systems produce superior results to those from search engines in terms of freshness (no index to become out of date) and coverage (search engine coverage may not be that good for any given site [Bharat, Lawrence]).

The quality of results from such a system depends on its guidance algorithm, and this may be measured according to how many documents are found within a certain time or bandwidth limit. The ACSys WAR (Web and Associated Research) project already has a TREC style [TREC] web data testbed suitable for adaption to the dynamic search case. Good research outcomes for an honours project would be:

  • To set up a new TREC style testbed for dynamic search algorithms,
  • To implement existing algorithms, in particular fish-search and shark-search from IBM Labs [Hersovici],
  • To experiment with new algorithms, with different heuristics and strategies, possibly incorporating new WAR project relevance scoring techniques, and
  • To evaluate all the algorithms using the testbed.

References

Krishna Bharat and Andrei Broder.
A technique for measuring the relative size and overlap of public Web search engines. WWW7.
Steve Lawrence and C Lee Giles.
Searching the World Wide Web. Science Magazine, Apr 1998, Vol 280, Number 5360
TREC. Text REtrieval Conference.
Michael Hersovici, Michal Jacovi, Yoelle S. Maarek, Dan Pelleg, Menachem Shtalhaim, and Sigalit Ur.
The shark-search algorithm. WWW7.

Getting Rid of Dynamic Dispatch

Supervisor: Richard Walker
E-mail: Richard.Walker@anu.edu.au
Phone: 6249 3785

Object-oriented (O-O) languages provide a `dotting' syntax for calling methods: e.g. in Java:

X ob;

ob = new Y();

ob.f(arg1,arg2);

Here, ob is defined to be of type X. But by the time the method call is reached, ob actually refers to an object of type Y (assumed to be a subclass of X). So it is Y's implementation of f that will be invoked, not X's. The run-time mechanism that chooses the particular implementation is called `dynamic dispatch', and it's one of the major run-time overheads of an object-oriented system. In this example, if the compiler could somehow `figure out' that ob will always be a Y, it could instead generate a call to Y's f, and perhaps in-line it, making further optimisations possible.

A number of algorithms have been proposed to analyse O-O programs to eliminate dynamic dispatch where possible. Some are fast, but extremely conservative. Others take a long time to run but are supposedly more successful. But are they worth the trouble?

In this project, you'll create a framework for implementing, testing, and evaluating these algorithms.

This is a `red hot topic'; results will be suitable for submission to one of the major compiler conferences.

The main prerequisite is a good understanding of how a compiler works.


User Interfaces for Display Logic Proofs

(two separate projects)

Supervisor: Rajeev Goré Jeremy Dawson
E-mail: Rajeev.Gore@anu.edu.au Jeremy.Dawson@anu.edu.au
Phone: 6279 8603 6249 5138

Formal (rigorous) logical proofs are needed in the proof of the correctness of hardware design (hardware verification) or of software programs (software verification). These proofs are themselves so complex that they have to be done with computer-based assistance. For any particular problem area (e.g., software verification), the right choice of an appropriate logical system helps in making a proof practicable. Thus the logical systems available, and the methods for computer based proofs in them, are an important area of computer science.

Display Logic is a paradigm for setting out certain logical systems, and for obtaining proofs in those systems. In most cases it cannot give an effective decision procedure, so a interactive computer-based proof system is required.

Some Display Logic systems have been implemented in Isabelle, an interactive theorem prover based on Standard ML. There is much further work that could be done. Below we have proposed two projects but please come and speak with either of us if you wish to work in this area.

The aim of the project is to design and implement a graphical user interface so that non-logicians can also use these proof systems. There are two alternatives, one theoretically minded, one application minded, and we would be happy to have two students working on one each.

Project 1
This may be done using XIsabelle, an existing GUI-based front-end for Isabelle, developed by DSTO. It is written using Expect/Tcl/Tk. In this case the project would involve tailoring and enhancing XIsabelle to make it suitable for the proof techniques used in Display Logic.
Project 2
An alternative project would be to produce a GUI front-end specifically for the Display Logic system for relation algebras. In this case it should be possible to provide a representation of statements in the relational logic as graphs (networks), and to relate steps in the proof to incremental changes in the graph.

The student will need to become acquainted with the design and implementation of GUIs.

For project 1: also Standard ML, Isabelle, basic proof theory and some propositional non-classical logics; project (1) will also involve Expect/Tcl/Tk.

Project 2 will require working out the details of how to portray relational assertions as graphs, and will involve programming the display of graphs on the screen. This project will also need basic proof theory and some propositional non-classical logics but at a lower level than the first.


Implementing a Term Rewriting System for Relation Algebras

Supervisor: Rajeev Goré Jeremy Dawson
E-mail: Rajeev.Gore@anu.edu.au Jeremy.Dawson@anu.edu.au
Phone: 6279 8603 6249 5138

A colleague from Germany has recently invented a term rewriting system called NRA for relation algebras using insights gained from the display logic calculus for relation algebras. The project involves implementing this system using existing theorem provers based on term rewriting e.g Larch. Rewrite Rule Laboratory, Isabelle etc.

The student would need to become acquainted with logic, term rewriting, and the chosen theorem proving system. Expertise in both relation algebras and the term rewriting system NRA is available locally. The system NRA is ideally suited to implementation using the abovementioned theorem provers, but the project is non-trivial for two reasons. First, some of the NRA rules have side-conditions associated with their applicability, and the student would need to find a way to express these constraints in the chosen theorem prover. And second because the decision problem in relation algebras is known to be undecidable, so no implementation is going to be perfect.


The BSP Programming Paradigm

Supervisor: Brian Molinari
E-mail: Brian.Molinari@anu.edu.au
Phone: 6249 3850

BSP (bulk synchronous parallel) is a platform independent parallel computing model, along with PVM and MPI. Recently a library version of the model, called BSPLib, has been defined and has been implemented on a number of platforms. A number of demonstrator programs have been published.

The project will involve the porting of BSPLib to the AP1000+ and the AP3000 systems. This can be first done using the MPI library available on those systems. After benchmarking measurements appropriate optimisations (more direct calls of the AP-series communication primitives) can be considered.

The public domain demonstrator programs can then be run. It should then be possible to write BSP versions of some local application programs developed for the AP-series machines.

This project would suit a student who is interested in systems programming and applications programming in novel situations (here parallel programming). The outcome (thesis and system documentation) would be very useful when applying for a software engineering position. If a more formal approach was taken it would suit a student aiming towards graduate studies.


Parallel Extensions to the FISh Programming Language

Supervisor: Brian Molinari
E-mail: Brian.Molinari@anu.edu.au
Phone: 6249 3850

The FISh programming language has been designed and implemented by Dr Barry Jay of UTS. It can be thought of as a language with functional and imperative aspects where data structures are restricted to where their shape can be determined statically. It can thus be translated into very efficient programs. More information is available from the FISh home page.

A considerable motivation for FISh is that the static knowledge of the shape of the data structures can potentially allow the partitioning of the computation across several processes. In other words, it can possibly allow compiler support for parallel programming in a way that is not possible with standard parallel programming languages.

This project will first involve extending the FISh implementation by appropriate parallel primitives. The FISh implementation is written in an object-oriented version of ML (with lex-yacc support) so the extension should not be problematic. It is not clear a priori what the extensions should look like. Options are

BSP
bulk synchronous parallel
MPI
message passing interface
PVM
parallel virtual machine

It will then involve assessing the extended FISh dialect as a parallel programming language, by writing a range of parallel programs for `canonical' problems and interpreting the situation.

This experiment focuses on explicit program construction by the programmer. Barry Jay has hypothesised that the detailed data structure knowledge in FISh might allow some automatic decomposition into parallel program fragments.

The design of FISh is strongly driven from modern type theory, and a sensitivity to these issues would be an advantage. Although some implementation work is involved, this is not primarily a constructive project. It would best suit a student looking towards graduate studies.


Experiments in Problem Solving Environments

Supervisor: Brian Molinari
E-mail: Brian.Molinari@anu.edu.au
Phone: 6249 3850

High performance computing has been driven by developments in architectures and programming languages. Essentially users have to interact with a HPC system on the system's terms. Recently it has been argued that this interaction should be on the user's terms. The user should be presented with a problem solving environment (PSE) for the problem involved, rather than a general programming environment. Modern communication technologies (Internet, world-wide web) can allow the physical decoupling of the user and the computational engine itself.

A research group at Imperial College has developed a PSE in the context of the AP-series machines (in the problem domain of data mining) that can be regarded as representative of this work. More information is available on the web.

This project will seek to build a demonstrator PSE in the context of the department's HPC resources (AP-series systems, computational science and engineering laboratory). The problem domain has not yet been selected. The Imperial College demonstrator should be available to the project and this will serve as a starting point for a local experiment.

Although the project will be `grounded' in a demonstrator system, the intellectual focus will be on the architecture of such PSE systems. The thesis will be constructed as a technology review, and will try to identify architectural issues that are common to such systems. A new PSE for a different problem domain should not require the wheel to be re-invented.

This project would suit a student who is interested in design and architecture issues in software engineering. The thesis would have to wrestle with the problem of describing such issues and perhaps the concepts of design patterns might need to be used. The outcome (thesis and system documentation) would be very useful when applying for a software engineering position. It would also suit a student intending to pursue graduate work in some aspect of experimental computer science.


Development of a Collaborative Virtual Environment

Supervisor: Brian Corrie Henry Gardner
E-mail: Brian.Corrie@anu.edu.au Henry.Gardner@anu.edu.au
Phone: 6249 5694 6249 8181

The enhanced feeling of `presence' that is provided by virtual environments motivates their incorporation into future communications technology. In fact, prototype collaborative virtual environments have already begun to be developed and have been demonstrated across continents (as was the case with an international hook-up involving the ANU in November 1998). This project will seek to design and implement a simple collaborative environment that will link the Barco Baron facility of the ACSys VE program with the WEDGE two screen system at the Research School of Physical Sciences and Engineering and ANU Supercomputer Facility. The issues which need to be addressed include the different operating environments of the two systems, the various lags in the communication links across campus and the demands of the target applications themselves. It is likely that the first target application will be a simple interactive game.


Design by Contract for Scientific Computing

Supervisor: Henry Gardner
E-mail: Henry.Gardner@anu.edu.au
Phone: 6249 8181

Design by Contract (DBC) is a major feature of the Eiffel programming language and is meant to facilitate the construction of correct object oriented code. It was going to be a feature of Java before it was left out to the `major regret' of the author. Phil Maker of the University of the Northern Territory has designed a system to implement a form of DBC for C/C++ which you can read about at the GNU Nana Home Page. This project will seek to evaluate the usefulness of such a system in scientific computation and will start by translating Maker's ideas (with his help!) to FORTRAN90. It will then proceed with a re-draft of a `living' scientific code using these methods and its usefulness will be evaluated in a reasonably scientific way.


Java Program Verification

Supervisor: Malcolm Newey
E-mail: Malcolm.Newey@cs.anu.edu.au
Phone: 6249 4506

Nothing beats formal methods as a means for ensuring trust in any engineered artifact, whether it is a bridge, an amplifier or avionic software. If a program is formally specified and appropriate proofs are constructed then properties of termination, safety, timing, space usage and correctness then the program can be guaranteed to have these properties with mathematical certainty.

If Java is to become the dominant language for engineering software and if it is to be used for trusted systems then the ability to construct proofs of properties must be there. Ultimately there must be good tools but these must be based on solid foundations, which is where we are right now.

Some work has been done on formalising the JVM (the Java Virtual Machine) in HOL (Higher Order Logic) but there is more to do and then there must be decision procedures written to make the burden of theorem proving as automatic as possible.

This project is about building on these foundations with the aim of proving some real properties of Java applets.


PMML Models for the Data Miner's Arcade

Supervisor: Graham Williams Rajehndra Nagappan
E-mail: Graham.Williams@cmis.csiro.au Raj.Nagappan@cs.anu.edu.au
Phone: 6216 7042 6279 8179

As we work towards the development of standards for the description of various types of models we build in data mining, a toolkit is required to support the visualisation and execution of those models. The Extensible Markup Language (XML) is proposed as a standard for this, leading to the development of PMML, a Predictive Modelling Markup Language. Parsers for XML have been developed in Java. The ACSys Data Miner's Arcade includes a number of tools that need to generate PMML, including Gamming, BMars, and CBos. This project will develop models for each of these types of predictive modelling tools, leading to specifications and Java implementations of XML models.


Evolutionary Hot Spots

Supervisor: Graham Williams
E-mail: Graham.Williams@cmis.csiro.au
Phone: 6216 7042

The ACSys Data Miner's Arcade supports the process of identifying Hot Spots in very large datasets. A new approach is under development, using ideas from evolutionary computation. An initial prototype has being developed and will form the basis of this project which will research the concept of interestingness and work towards capturing the concept in a formal language. Expressions of interestingness will then be evolved to explore for measures which match the requirements of the Data Mining exercise.


Virtual Environments for Data Mining

Supervisor: Graham Williams
E-mail: Graham.Williams@cmis.csiro.au
Phone: 6216 7042

Data Mining is inherently interactive, relying on the domain expertise to drive the data mining process, which in turn provides information to guide the domain expert's exploration. Visualisation is a very effective mechanism for interacting with domain experts but is limited to what can be displayed on the screen. Virtual Environments provide a lot more real estate for the presentation of information. This project explores the use of virtual environments for data mining, in general, and for the support of the ACSys Hot Spots methodology in particular.


Internet ACE

Supervisor: David Wolfram
E-mail: David.Wolfram@anu.edu.au
Phone: 6249 5690

ACE (Abstract Clause Engine) enables rapid prototyping and experiments with a wide variety of logical languages. This approach has a rigorous basis. It has been tested very successfully via a smaller-scale forerunner that was written in Standard ML and used internationally by industrial and academic research groups for several languages.

The core code of ACE has recently been re-implemented in Java, and the CUP Parser Generator has been used for the front-end. This project entails adding to the core to produce an ACE applet.

In particular, you will be designing and implementing a Java graphical user interface, implementing search procedures, and experimenting with the system.

References

Copies of the following papers are available on request.

  1. D. A. Wolfram.
    Semantics for Abstract Clauses. In Henk Barendregt and Tobias Nipkow, editors, Types for Proofs and Programs: International Workshop TYPES'93, Selected Papers, Lecture Notes in Computer Science 806 (Springer, Berlin, 1994) 366-383.
  2. D. A. Wolfram.
    ACE: the abstract clause engine. In Mark E. Stickel, editor, Proceedings of the Tenth International Conference on Automated Deduction (CADE-10), Lecture Notes in Artificial Intelligence 449 (Springer, Berlin, 1990) 679-680.

User Interface Design for Virtual Environments

Supervisor: Rajehndra Nagappan Brian Corrie
E-mail: Raj.Nagappan@cs.anu.edu.au Brian.Corrie@anu.edu.au
Phone: 6279 8179 6249 5694

For many years the desktop computer has enjoyed the concept of a windowing system to partition the display into multiple independent task spaces that the user can operate in parallel. To date virtual environments have offered no such generic task space partitioning system, relying on a single monolithic scene with a few widgets around the periphery.

This project will focus on the design of a user interface for virtual environments that is similar in spirit to a windowing system on a desktop. It is based on the idea of controlling a complex task by using independent task spaces to control simpler tasks. Issues such as navigation of the interface, task space occlusion, and VE 'desktop' design will be addressed. It is likely that a projection table based terrain visualisation will be used as the testbed for this work, but other modes of VE operation could also be explored.


Modelling Agent Behaviour Using Senses

Supervisor: Matthias Fuchs
E-mail: Matthias.Fuchs@anu.edu.au
Phone: 6279 8632

Agents are small, independent and automonous programs that attempt to intelligently work towards a goal often on behalf of a human. This project will attempt to model and direct the behaviour of a number of agents in an economics simulation. The agents should have a number of senses that provide information about changes in the world. The aim of the project will be to make the agents reason and direct their behaviour based on the input from their senses in a local way. That is, the agents will not require global knowledge of the system to act on some small part of it.


[DCS Home] [News & Events] [What is CS & IT] [Education] [People] [Mailing Lists] [Research] [StReaMS] [Contact Details] [FEIT Home] [ANU Home]


Authorised by the Dean, FEIT
© The Australian National University
ANU CRICOS Provider Code - 00120C
Last modified Feb 01, 2001
General enquiries: 6125 4043
For more information or to make comments please email webmaster@cs.anu.edu.au