The Australian National University

Honours topics

Half of your time as an honours student is spent working on a project. But first you have to find a project topic.

The "official" reference for projects proposed by potential supervisors is the CECS projects database.

It lists projects for all levels of research, including short projects, summer scholarship projects, Honours projects, Masters projects and PhD projects. All potential research students, at any level, are urged to browse it.

If you see a project that looks interesting, email the potential supervisor about it. Don't be afraid to discuss possible variations to the listed project: what appears on the web site is generally more of a suggestion than a rigid specification.

You don't have to be constrained by what you see on the project website. If you have something that you would like to work on as a project, feel free to discuss it with the honours convener to see if it could form the basis of an honours project and to identify a possible supervisor. Look at the web pages of Computer Science staff to find out their research interests. Remember that projects may also be supervised by people outside the College, or even outside the University: from CSIRO or NICTA, for instance.

Former Project Topics

For your interest, here is an archive of Honours project proposals from previous years. Some of them are quite ancient now, of course, but they may help to give you ideas of the kind of thing which is suitable.
  • 2009 Proposed Project Topics
  • 2008 Proposed Project Topics
  • 2007 Proposed Project Topics
  • 2006 Proposed Project Topics
  • 2005 Proposed Project Topics
  • 2004 Proposed Project Topics
  • 2003 Proposed Project Topics
  • 2002 Proposed Project Topics
  • 2001 Proposed Project Topics


    2001 Honours project proposals



    Cluster Computing

    • implementing a distributed, redundant file system for bunyip using the existing disk storage devices on each node
    • implementing a streamlined TCP stack, optimised for speed, for the special-case network found in bunyip
    • utilising the parallel ports of the bunyip nodes to provide a low-latency MPI barrier operation (requires h/w skills)
    • investigating the use of low-cost FireWire (IEEE 1394) PCI cards to implement a fast, low-latency interconnect for a COTS cluster
    (More details to appear here soon.)

    Parallel Algorithms for Predictive Modelling

    Contact: Peter Christen, Markus Hegland

    Data mining applications have to deal with increasingly large and complex data sets. Only algorithms that scale linearly with data size are feasible for successful data mining.

    In our group we are developing algorithms for predictive modelling of high-dimensional and very large data sets that scale both with the number of data records and, when implemented in parallel, with the number of processors. These algorithms are based on techniques like finite elements, thin plate splines, wavelets, additive models and clustering. Prototype implementations have been developed in Matlab, Python, C and MPI, and we have used them successfully in several real-world data mining projects.

    This research project will involve the further development of parallel data mining algorithms based on the available mathematical algorithms and prototypes. Besides scalability, other important aspects are data distribution, load balancing and integration into a larger data mining framework currently being developed in our group.
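    As a minimal illustration of the data-distribution pattern such algorithms rely on (a hedged sketch, not the group's actual code; mpi4py and all names and constants here are assumptions), the following Python/MPI fragment fits a penalised least-squares model by accumulating local normal-equation contributions with an all-reduce, so only small matrices are communicated:

        # Hypothetical sketch: data-parallel penalised least-squares with mpi4py.
        # Each process holds its own block of records; only the small normal-equation
        # matrices are exchanged, so the cost scales with records and with processors.
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()

        rng = np.random.default_rng(rank)
        X_local = rng.normal(size=(100_000, 10))                  # this rank's design-matrix block
        y_local = X_local @ np.arange(10.0) + rng.normal(size=100_000)

        # Local contributions to X'X and X'y.
        xtx_local = X_local.T @ X_local
        xty_local = X_local.T @ y_local

        # Sum the contributions across all processes.
        xtx = np.empty_like(xtx_local)
        xty = np.empty_like(xty_local)
        comm.Allreduce(xtx_local, xtx, op=MPI.SUM)
        comm.Allreduce(xty_local, xty, op=MPI.SUM)

        # Solve the (ridge-penalised) system redundantly on every rank.
        coeff = np.linalg.solve(xtx + 1e-3 * np.eye(10), xty)
        if rank == 0:
            print("fitted coefficients:", coeff)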

    At our disposal we have a 196-processor Beowulf Linux cluster, a 12-processor Sun Enterprise 4500 shared-memory multiprocessor at CSIRO CMIS and a 13-processor Fujitsu VPP 300 vector processor at ANUSF. Data mining is one of the core projects of APAC at the ANU.


    Parallel Scalable Clustering Algorithms

    This topic was taken by a student.

    Contact:
    Peter Christen, Markus Hegland

    Clustering large and complex data sets is a core algorithm used in data mining and machine learning. To be able to handle very large data sets parallel high-performance computing is needed. In the past few years several parallel clustering algorithms for different architectures have been developed and published.

    In a first phase, this research project aims to compare various parallel clustering algorithms. Two topics of interest are scalability (I/O, data distribution and communication) and load balancing (adapting the program dynamically to changing load situations). In a second phase, a scalable parallel clustering algorithm should be developed and evaluated.
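    As an illustration of the communication structure involved (a hedged sketch only; mpi4py and the constants are assumptions, not part of the project), one iteration of a data-parallel k-means looks like this: points never leave their process, and only k centroid sums and counts are exchanged.

        # Hypothetical sketch: iteration structure of data-parallel k-means with mpi4py.
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()

        k, dim = 8, 4
        rng = np.random.default_rng(rank)
        points = rng.normal(size=(50_000, dim))        # this rank's share of the data

        # Every rank starts from the same centroids (rank 0 broadcasts its guess).
        centroids = comm.bcast(rng.normal(size=(k, dim)) if rank == 0 else None, root=0)

        for _ in range(20):
            # Assign each local point to its nearest centroid.
            dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
            labels = dists.argmin(axis=1)

            # Per-cluster sums and counts over the local points only.
            sums_local = np.zeros((k, dim))
            counts_local = np.zeros(k)
            for c in range(k):
                members = points[labels == c]
                sums_local[c] = members.sum(axis=0)
                counts_local[c] = len(members)

            # Combine across ranks and recompute centroids everywhere.
            sums = np.empty_like(sums_local)
            counts = np.empty_like(counts_local)
            comm.Allreduce(sums_local, sums, op=MPI.SUM)
            comm.Allreduce(counts_local, counts, op=MPI.SUM)
            centroids = sums / np.maximum(counts, 1)[:, None]

        if rank == 0:
            print(centroids)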

    At our disposal we have a 196-processor Beowulf Linux cluster, a 12-processor Sun Enterprise 4500 shared-memory multiprocessor at CSIRO CMIS and a 13-processor Fujitsu VPP 300 vector processor at ANUSF. Data mining is one of the core projects of APAC at the ANU.


    Developing E-Business Applications

    This topic was taken by a student.

    The Internet has forever changed the requirements for enterprise software systems. The very nature of the Internet brings to bear new pressures on applications that are not commonly experienced by traditional networked information systems. The impact of well-known non-functional requirements such as manageability, scalability, security, reliability and transactions is dramatically increased when applications are opened up to potentially limitless numbers of concurrent users.

    These users require efficient service every second of every day of the year. In Internet and e-business environments, the cost of failure is magnified, as each minute of downtime is a minute when customers cannot use the business's services. On the Internet, a competitor's site is just a mouse-click or two away.

    In response to the demands of the increased scope of enterprise systems across the Internet, a new genre of software product has evolved, known as Application Servers. These products provide a ready-made distributed system infrastructure for building high-performance, enterprise-scale information systems.

    The aims of the proposed project are:

    1. To find out how EJB and Web technology can be combined and used to build a typical on-line transaction example application;
    2. To come up with an architecture that is best suited to the proposed application;
    3. To run performance tests using the example application against a middleware product, to collect and analyse the performance data, and to find out how well the middleware product copes with this kind of application under heavy load.

    Similar work has been carried out using CORBA technology within our group (Software Architecture and Component Technology) at CSIRO. The reports can be found at http://www.cutter.com. The published work only concentrated on the CORBA middleware technology and did not include the web technology.

    Some relevant information can be found in the following sites:
    http://java.sun.com/products/ejb
    http://www.javasoft.com/j2ee.

    Contact: Shuping Ran, CSIRO   Shuping.Ran@CMIS.CSIRO.AU


    Real-time Multi-channel Low-latency High-performance Audio Signal Processing

    This project is offered for Honours CS students and would be supervised by Bob Williamson in the Department of Engineering. All the necessary hardware referred to below is available in the Department.

    The aim of this project is to implement a multi-channel, very high performance, low-latency, real-time audio signal processing solution on an Intel Pentium III based PC running Linux. It would suit a student interested in high-performance signal processing coding techniques, Linux hacking, and digital audio. The aim would be to release the final code under the GPL.

    To the best of my knowledge this sort of thing has not been done before. It involves some fancy kernel programming (under RTLinux), plus some tricky programming for the SSE core.

    You will need to be able to patch and rebuild linux kernels to do this project.

    Background

    Multichannel signal processing can be used for beamforming microphone arrays, ambisonics and other surround-sound systems. The computational and data communication demands are quite high. Traditionally, research into multichannel signal processing has made use of specialised hardware, such as the Creamware Scope system. We have used one of these systems in the Department of Engineering. It has the considerable disadvantage that it needs to be programmed in a (rather arcane) assembly language.

    Modern commodity CPUs such as an Intel Pentium III have additional SIMD instructions specifically designed for multimedia processing. These instructions are the basis for the fast code used on bunyip.

    As well as being able to compute the desired functions quickly, one needs to get the data in and out of the machine. One can now buy, for under $1000, an RME Project Hammerfall DIGI9652, which allows for 24 simultaneous inputs and outputs at 96kHz/24 bits in real time. (You do need outboard AD/DA converters, but these are now widely available.) An ALSA driver exists for this card.

    The final ingredient is low-latency real-time scheduling. There exists a clever add-on to the Linux kernel, called Real-Time Linux, which allows for very low latencies and guaranteed real-time scheduling.

    Project Specifics

    There are a few steps in the project. In essence it is to combine the above ingredients to demonstrate a high-performance, low-latency, glitch-free multi-channel digital audio processing solution built entirely on GPLable components.

    Build Native ALSA applications for the RME Hammerfall DIGI9652

    Although there is a device driver that would now appear to work for the Hammerfall card (version 0.6pre1 of the ALSA drivers), the existing native ALSA applications (such as AlsaPlayer) do not work yet. The aim would be to get a simple multichannel recording and playback system working using the Hammerfall card.

    Implement Multichannel filtering exploiting the SSE instruction set

    The next step would be to implement some actual signal processing on the multiple data streams (for example FIR/IIR filtering) using the SSE extensions. This Application Note might be a good starting point.
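    For orientation, here is a plain NumPy reference implementation of block-based multichannel FIR filtering (a hedged sketch under assumed channel counts and tap lengths; the SSE version would perform the same multiply-accumulate, four single-precision samples at a time):

        # Hypothetical reference implementation (NumPy, not SSE) of block FIR filtering.
        import numpy as np

        def fir_block(block, taps, history):
            """Filter one block of shape (channels, frames) with per-channel FIR taps.

            history carries the last (ntaps - 1) input frames of the previous block
            so the filter output is continuous across block boundaries."""
            ntaps = taps.shape[1]
            padded = np.concatenate([history, block], axis=1)
            out = np.zeros_like(block)
            for n in range(block.shape[1]):
                window = padded[:, n:n + ntaps]                # most recent ntaps samples
                out[:, n] = np.sum(window * taps[:, ::-1], axis=1)
            return out, padded[:, -(ntaps - 1):]

        # Example: 24 channels, 64-tap filter, 256-frame blocks (all assumed figures).
        taps = np.tile(np.hanning(64) / np.hanning(64).sum(), (24, 1)).astype(np.float32)
        history = np.zeros((24, 63), dtype=np.float32)
        block = np.random.randn(24, 256).astype(np.float32)
        filtered, history = fir_block(block, taps, history)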

    Ensure Low-Latency By Using a Real-Time Linux Kernel

    The next step would be to do all this under a real-time kernel. This would involve patching the kernel, following these instructions and making the signal processing tasks run as real-time priority.

    Fancy Interface

    In case this was not enough work, one can easily envisage all sorts of fancy interfaces and applications built on the above code base. For instance, one could glue it all together as plugins for Arts Builder or AlsaPlayer.

    Agent Negotiation

    Supervisor: Roger Clarke
    E-mail: Roger.Clarke@anu.edu.au
    Phone: (02) 6288 1472

    Investigate the practicability of implementing agents, and negotiations between agents, in particular by measuring the increase in the complexity of code as the complexity of the interactions between the two agents increases.

    The suggested implementation context is the W3C's Platform for Privacy Preferences (P3P) specification. This defines how client-side software can store people's privacy preferences, and server-side software can store privacy policy statements by corporations and government agencies. The two agents can then negotiate with one another in order to permit a transaction to be entered into (such as the provision of shoe-size and credit-card details), to prevent the transaction, or to refer a mismatch to the consumer for a decision.
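    As a toy illustration of that negotiation step (a hedged sketch: the category names and rule structure below are invented for illustration and are not the P3P vocabulary), a client-side agent might compare a site's declared purposes against the user's stored preferences like this:

        # Hypothetical sketch of preference/policy matching.
        ACCEPT, REJECT, ASK_USER = "accept", "reject", "ask user"

        user_preferences = {
            # data category       -> purposes the user is willing to accept
            "shoe-size":            {"current-transaction"},
            "credit-card-details":  {"current-transaction"},
            "email-address":        set(),                    # never release
        }

        site_policy = {
            # data category requested -> purposes the site says it will use it for
            "shoe-size":            {"current-transaction"},
            "credit-card-details":  {"current-transaction", "marketing"},
        }

        def negotiate(policy, preferences):
            decisions = {}
            for item, purposes in policy.items():
                allowed = preferences.get(item)
                if allowed is None:
                    decisions[item] = ASK_USER      # no stored preference: refer to the user
                elif purposes <= allowed:
                    decisions[item] = ACCEPT        # every stated purpose is acceptable
                else:
                    decisions[item] = REJECT        # a mismatch: block or refer for a decision
            return decisions

        print(negotiate(site_policy, user_preferences))
        # {'shoe-size': 'accept', 'credit-card-details': 'reject'}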

    It is envisaged that a succession of prototypes of increasing completeness would be implemented, and key process and product factors would be measured. This would depend on a thorough appreciation of theories relating to agents, P3P, software development and maintenance, software complexity, and development and maintenance productivity.

    Some background reading is at:


    Conception, Design and Implementation of Nyms

    Supervisor: Roger Clarke (Visiting Fellow)
    http://www.anu.edu.au/people/Roger.Clarke
    E-mail: Roger.Clarke@anu.edu.au
    Phone: (02) 6288 1472

    Most network protocols deal in Entities and Identifiers.

    An Entity (which covers things as diverse as a person, a company, a network-connected device, and a process) has precisely one Identity, which is, very roughly speaking, its 'essence'. In the case of a device or process, that might be operationalised as the specification of the functions it performs.

    An Entity has one or more Identifiers, each of which is a data-item or group of data-items which reliably distinguishes it from other Entities, especially those of the same class.

    In complex networks, this model is too simplistic, and two additional concepts are necessary.

    A Role is a particular presentation of an Entity. An Entity may have many Roles; and a Role may be associated with more than one Entity. As examples, think of a SIM Card as an Entity, and the multiple Mobile-Phone housings into which it is successively placed as Roles; and then there are the many Roles that you play yourself, as student, worker, sportsperson, voter, dole-bludger, scout-master, tax-payer, lover ... There are various ways in which you can accidentally or on purpose enable someone else to adopt one of your Roles (e.g. give them your password; although there are juicier examples than that).

    A Nym is a data-item or group of data-items which reliably distinguishes a Role. However, because a Role is not reliably related to an Entity, there is no reliable mapping between a Nym and the underlying Entity or Entities (i.e. the mapping is not only m:n, but it's not determinable).
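    To make the relationships concrete, here is a minimal sketch of the model in Python (illustrative only; the field names and example values are assumptions):

        # Hypothetical sketch of the Entity / Role / Nym model described above.
        from dataclasses import dataclass, field

        @dataclass
        class Entity:
            identifiers: set = field(default_factory=set)   # reliably distinguish the Entity

        @dataclass
        class Role:
            nyms: set = field(default_factory=set)          # reliably distinguish the Role
            # Deliberately absent: any reference back to the Entity (or Entities)
            # currently presenting this Role, so a Nym cannot be mapped reliably
            # to an underlying Entity.

        sim_card = Entity(identifiers={"IMSI 505-01-234567890"})
        handset_a = Role(nyms={"number as presented in handset A"})
        handset_b = Role(nyms={"number as presented in handset B"})
        # The Entity-Role association is made and unmade outside the model,
        # e.g. by physically moving the SIM card between housings.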

    I'm particularly interested in Nyms because of the vital part they are going to play in keeping us reasonably sane and reasonably free, as the State and the Hypercorps increasingly abuse personal data in order to impose themselves more and more on individuals. But of course they could turn out to be important within open networks too, quite independently of privacy concerns. I'd like to see some serious research done, drawing on the emergent literature, and performing some laboratory experimentation.

    Here are some starting materials (these happen to be on my own site, but they point to plenty of other sources as well):
    Concepts; human identity; tools; digital persona; PKI; Notes on the relevant section of the Computers, Freedom & Privacy Conference in 1999; some references; Intro (needs some updating); Inet; tracking crims.


    File-Sharing Technologies

    Supervisor: Roger Clarke (Visiting Fellow)
    http://www.anu.edu.au/people/Roger.Clarke
    E-mail: Roger.Clarke@anu.edu.au
    Phone: (02) 6288 1472

    If you didn't do COMP3410 - Information Technology in Electronic Commerce in Semester 2, 2000, the assignment that I set was: "Enormous tensions currently exist between, on the one hand, the need for musicians and music publishers to earn revenue and, on the other, the desire of consumers to get their music for free. Provide constructive suggestions as to how technology might be used to address these problems".

    There are some leads in the slides for Lecture 5, at: http://www.anu.edu.au/people/Roger.Clarke/EC/ETIntro.html#LOutline.

    File-sharing technologies started out as centralised repositories, then became centralised directories of dispersed repositories (the Napster model), and are rapidly maturing into forms in which both the repositories and the directories are dispersed (the Gnutella model).

    But do they work? On the one hand, are there still choke-points? And on the other, is it feasible to run both anarchic, revenue-denying schemes and paid services all using the one architecture? Are there significant differences among the emergent products? Is a taxonomy feasible? And does such a taxonomy lead to the discovery of variants that no-one's implemented yet?

    Here are some starting materials:

    http://www.anu.edu.au/people/Roger.Clarke/EC/FDST.html; http://www.anu.edu.au/people/Roger.Clarke/EC/KingEP.html.

    A (still growing, probably incomplete) catalogue of technologies: http://www.anu.edu.au/people/Roger.Clarke/EC/FDST.html#Friends.

    http://www.anu.edu.au/people/Roger.Clarke/EC/Bled2K.html.

    Centralised Storage of Biometrics

    This topic was taken by a student.

    Supervisor: Roger Clarke
    E-mail: Roger.Clarke@anu.edu.au
    Phone: (02) 6288 1472

    Investigate the extent to which the centralised storage of biometric measures of humans creates the risk of masquerade.

    Biometrics is a generic term encompassing a wide range of measures of human physiography and behaviour. Measures of relatively stable aspects of the body include fingerprints, thumb geometry, aspects of the iris and ear-lobes, and DNA. Dynamic measures of behaviour include the process (as distinct from the product) of creating a hand-written signature, and the process of keying a password.

    Schemes can be devised that apply biometrics in such a manner that the measure is only ever known to a chip held by the individual, and the device currently measuring the person concerned. (This is analogous to the mechanism used for protecting secure PINs input on ATM and EFT/POS keyboards).

    It is very common, however, for proposals for biometric schemes to involve central storage of the biometrics, as police fingerprint records do now, and as current proposals by the Australian Government would do in relation to DNA records. This raises the question as to whether a person who gains access to the store could masquerade as that individual. Possible uses would be to gain access to buildings, software or data, to digitally sign messages and transactions, to capture the person's identity, to harm the person's reputation, or to `frame' the person.

    It is envisaged that this would involve an analysis of a range of literature relating to identification and biometrics, and investigation of the extent to which abstract representations of biometric data can be used to produce artefacts that will satisfy biometric measurement devices. Because the task is likely to be partly dependent on information that is not in the public domain, it is probably necessary to conduct interviews with biometrics technology providers, and perhaps also physicists, biologists and engineers.

    Some background reading is at:


    Image Watermarking

    One of the important copyright issues in electronic commerce is the copyright of images downloadable from the Internet. Copyright law may prevent people from copying others' work, but someone could claim a downloaded image as their own creation by modifying it slightly.

    Watermarking techniques can be used to identify whether two different copies of an image are substantially the same, and can also carry the originator's information. A well-watermarked image should retain its watermark under small modifications, unless the modification substantially distorts the image.

    There has been some study of image watermarking, but mainly on raw images, and the theory is far from well developed. This project will mainly focus on watermarking images which can be put on web pages, such as GIF and JPEG files.
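    As a concrete, if naive, starting point (a hedged sketch: a least-significant-bit watermark is easy to implement but is not robust to JPEG recompression, which is exactly why transform-domain schemes are of interest for web images), embedding and extraction can look like this:

        # Hypothetical sketch: naive LSB watermark embedding and recovery with NumPy.
        import numpy as np

        def embed(image, watermark_bits):
            """Overwrite the least significant bit of the first len(bits) pixels."""
            flat = image.astype(np.uint8).ravel().copy()
            bits = np.asarray(watermark_bits, dtype=np.uint8)
            flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
            return flat.reshape(image.shape)

        def extract(image, nbits):
            return image.astype(np.uint8).ravel()[:nbits] & 1

        image = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
        mark = np.random.randint(0, 2, size=128)
        stamped = embed(image, mark)
        assert np.array_equal(extract(stamped, 128), mark)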

    Extensions could include watermarking Internet-based animations or, more generally, multimedia, which has a strong commercial impact. This is not covered by this project, however, and is left for future consideration after the project is finished.

    Candidates should be familiar with image programming and the theory of GIF and JPEG coding, or have the enthusiasm to study these on their own.

    Competitive candidates should have good programming skills and good knowledge in mathematics.

    Contact: Chuan-Kun Wu
    Chuan.Wu@cs.anu.edu.au   (02) 6125 5692


    Image Cryptography

    A new direction in cryptographic research has appeared in recent years. The idea is to split an image into two or more seemingly random ("garbage") images. Instead of copying or re-implementing what has already been done, this project will focus on (1) splitting an image into two or more images, with one or more of the split images meaningful, and (2) reducing the size of the "key", the piece of image to be kept secret. A further step is to migrate the technique to JPEG and GIF images.
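    The simplest splitting scheme is XOR secret sharing, sketched below (a hedged illustration only: here both shares are noise-like, whereas the project's harder goal of making a share itself meaningful needs a different construction):

        # Hypothetical sketch: split an image into two shares by XOR secret sharing.
        import numpy as np

        def split(image):
            rng = np.random.default_rng()
            share1 = rng.integers(0, 256, size=image.shape, dtype=np.uint8)   # the "key" share
            share2 = image ^ share1                                           # looks like noise too
            return share1, share2

        def combine(share1, share2):
            return share1 ^ share2              # XOR of the shares restores the image

        image = np.random.randint(0, 256, size=(32, 32), dtype=np.uint8)
        s1, s2 = split(image)
        assert np.array_equal(combine(s1, s2), image)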

    Candidates should have image programming skills and knowledge of GIF and JPEG coding, or have the enthusiasm to study these on their own.

    Competitive candidates should have good programming skills and good knowledge in mathematics.

    (Note: the candidate doing this project can collaborate with the candidate, if there is one, doing "Image Watermarking" on image programming and GIF/JPEG coding.)

    Contact: Chuan-Kun Wu
    Chuan.Wu@cs.anu.edu.au   (02) 6125 5692


    Processing XML in ML

    Supervisor: Clem Baker-Finch
    E-mail: Clem.Baker-Finch@cs.anu.edu.au

    XML (Extensible Markup Language) is a fairly recent means of adding structural information to the text of a document. It is extensible, meaning that the vocabulary of the markup is not fixed - each document can reference a meta-document, called a DTD (Document Type Definition), which describes the particular markup capabilities used.

    The use of XML is not restricted to the traditional idea of a document. Many organisations propose to use XML as an interchange format for pure data produced by applications like graph-plotters, spreadsheets and relational databases.

    The general tree structure of XML documents can be conveniently represented as algebraic data types in functional programming languages like ML or Haskell. Similarly, DTDs can be translated into ML or Haskell data type definitions.

    The aim of this project is to develop and embed a domain-specific language for XML in ML (or Haskell), and to investigate the application of functional programming language features such as strong typing and higher-order functions to the processing of XML documents.


    Random Testing of ML Programs

    Supervisor: Clem Baker-Finch
    E-mail: Clem.Baker-Finch@cs.anu.edu.au

    Random data sets have been shown to be an effective means of conducting conformance and regression tests on software systems.

    In functional programming languages such as ML and Haskell, it is possible to specify laws or correctness conditions that the software engineers expect to be true of functions, directly in the programming language itself. It is also possible to write a (higher-order, polymorphic) function that exercises the program components on a set of random data, checking whether the specified correctness conditions are satisfied.
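    The mechanism can be illustrated outside ML as well; the sketch below (in Python, purely for illustration, since the project itself targets ML) shows a higher-order checker that applies a stated law to batches of random inputs and reports the first counterexample:

        # Hypothetical illustration of property checking over random data.
        import random

        def check(law, generator, trials=200):
            for _ in range(trials):
                case = generator()
                if not law(case):
                    return f"falsified by {case!r}"
            return f"passed {trials} trials"

        def random_list():
            return [random.randint(-50, 50) for _ in range(random.randint(0, 20))]

        # A law the software engineer expects to hold of the component under test.
        print(check(lambda xs: list(reversed(list(reversed(xs)))) == xs, random_list))
        # A deliberately false law, to show a counterexample being reported.
        print(check(lambda xs: sorted(xs) == xs, random_list))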

    The aim of this project is to develop and embed in ML a domain-specific language of testable specifications, which the tester uses to define expected properties of the function under inspection.


    Multi-format Document Standards

    Contact: Tom Worthington, Ian Barnes, Roger Clarke, Ramesh Sankaranarayana

    Investigate open standards to allow an academic "paper" and accompanying audio-visual presentation to be prepared as one electronic document. Implement an open source software prototype demonstrating features similar to a word processor, web tool, presentation package and AV package. All functions should work on the one document, rendered as a typeset printed document, as a web page, as a live "slide show" and as a pre-recorded audio-visual presentation with audio, video and synchronised slides. The software should generate documents incorporating accessibility features for the disabled in conformance with the W3C Web Content Accessibility Guidelines. Document text, images and other content would be shared by all tools (for example, the text of the word-processor document would be the default notes for the slide show and also the default captions for the deaf on the video).

    Part of the Scholarly Communications System Prototype. See: http://www.tomw.net.au/2000/scsp.html


    Server/Browser Protocols for Available Bandwidth

    Contact: Tom Worthington, Ian Barnes, Roger Clarke, Ramesh Sankaranarayana

    Investigate open standards for web servers and browsers to negotiate content formats to suit the user's requirements and the bandwidth available. Implement an open source demonstration. Implement content translation tools for cases where servers and browsers do not support suitable formats. As an example, the resolution of images would be reduced to suit small screens and low-bandwidth links, and video would be converted to low-resolution still key frames and synchronised audio. The system would be capable of displaying a multimedia presentation with audio and "talking head" video in real time on a hand-held device with a medium-speed wireless Internet connection and on a set-top box web browser, as well as on more conventional desktop computers. Accessibility features, as described in the W3C Web Content Accessibility Guidelines, would be integrated with the bandwidth and multimedia features (for example, the notes of a live presentation would be the default closed captions for the video presentation and would also replace the audio where bandwidth was limited).

    Part of the Scholarly Communications System Prototype. See: http://www.tomw.net.au/2000/scsp.html


    Automatic Web Page Layout

    Contact: Tom Worthington, Ian Barnes, Roger Clarke, Ramesh Sankaranarayana

    Investigate artificial intelligence algorithms for automatically laying out web pages and produce open source prototype software. Document layout "hints" for different renderings of the document (print, web, slideshow and AV) would be explicitly encoded in the document (using XML or a similar format) or would be inferred from an existing screen layout. Documents would be rendered to suit the user's requirements and the capabilities of their display device and communications link, through features in the display device and/or in a server (for low-capability display devices). As an example, multiple frames would be used on large screens and one frame with links on small screens. The software would generate documents incorporating accessibility features for the disabled as described in the W3C Web Content Accessibility Guidelines. Multiple renderings of information objects (for example, multiple language versions for text, text captions for images) would be available.

    Part of the Scholarly Communications System Prototype. See: http://www.tomw.net.au/2000/scsp.html


    Site and Search Engine Summarisation

    Potential supervisors: Nick Craswell or David Hawking

    Write and evaluate a site summariser. Optionally apply the same principles to subject-specific search services.

    Yahoo! [1] is an example of a successful company which performs manual site summarisation: all their listed sites have hand-written summaries. InvisibleWeb [2] does the same job as Yahoo!, but summarises subject-specific search services. Either type of company might find automatic summarisation techniques useful.

    There are many document summarisation algorithms, including those based on the content of one or many documents, and those based on hyper-links [3]. The project is to find out which summarisation methods are best at describing the content of a site (and/or subject-specific search service). Evaluation experiments would build on the good work of a 2000 ANU DCS honours student.
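    One of the simplest content-based methods can be sketched in a few lines (a hedged illustration with a toy stopword list, not a proposed solution): score each sentence by the average frequency of its terms across the whole text and keep the top-scoring sentences.

        # Hypothetical sketch: frequency-based extractive summarisation.
        import re
        from collections import Counter

        STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}

        def summarise(text, n_sentences=2):
            sentences = re.split(r"(?<=[.!?])\s+", text.strip())
            words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
            freq = Counter(words)

            def score(sentence):
                terms = [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]
                return sum(freq[t] for t in terms) / (len(terms) or 1)

            best = sorted(sentences, key=score, reverse=True)[:n_sentences]
            return " ".join(s for s in sentences if s in best)      # keep original order

        print(summarise("ANU is a university in Canberra. The weather is nice today. "
                        "ANU offers computer science honours projects in Canberra."))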

    [1] http://www.yahoo.com
    [2] http://www.invisibleweb.com
    [3] http://www.mri.mq.edu.au/~einat/incommonsense/

    Web information acquisition

    Potential supervisors: Nick Craswell or David Hawking

    Technically, the first stage in Web search is finding some candidate documents. Search engines do this with large-scale Web crawls, of up to one billion documents [1]. However, there are other approaches. Focussed crawlers locate a smaller number of documents, relevant to a particular subject area. Meta-searchers locate documents by querying existing search interfaces. Once acquired, the documents can be indexed or otherwise processed to find information relevant to the user.

    An interesting honours topic is to compare different methods for information acquisition, to see which works best. Picking a particular subject area (for example, Genealogy, Linux, Tech Stocks, Australiana or Acid Jazz) the student could compare the efficiency and effectiveness of several approaches (for example, full crawl [1], "efficient" crawl [2], focussed crawl [3] and meta-search [4]). We have an extensible Java crawler built for precisely this type of experiment, so willingness to work with Java would be a plus.
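    For reference, the plainest of these strategies, a breadth-first crawl from a seed page, can be sketched with the standard library alone (a hedged sketch; a focussed crawler would additionally score pages for topic relevance before following their links, and a meta-searcher would replace the fetch loop with queries to existing search interfaces):

        # Hypothetical sketch: minimal breadth-first Web crawler.
        from collections import deque
        from html.parser import HTMLParser
        from urllib.parse import urljoin
        from urllib.request import urlopen

        class LinkCollector(HTMLParser):
            def __init__(self):
                super().__init__()
                self.links = []
            def handle_starttag(self, tag, attrs):
                if tag == "a":
                    for name, value in attrs:
                        if name == "href" and value:
                            self.links.append(value)

        def crawl(seed, limit=20):
            seen, queue, pages = {seed}, deque([seed]), {}
            while queue and len(pages) < limit:
                url = queue.popleft()
                try:
                    html = urlopen(url, timeout=10).read().decode("utf-8", errors="replace")
                except OSError:
                    continue
                pages[url] = html
                parser = LinkCollector()
                parser.feed(html)
                for link in parser.links:
                    absolute = urljoin(url, link)
                    if absolute.startswith("http") and absolute not in seen:
                        seen.add(absolute)
                        queue.append(absolute)
            return pages

        # pages = crawl("http://www.anu.edu.au/")   # fetches up to 20 pages from the seed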

    [1] http://info.webcrawler.com/mak/projects/robots/faq.html
    [2] http://citeseer.nj.nec.com/cho98efficient.html
    [3] http://www7.scu.edu.au/programme/fullpapers/1849/com1849.htm
    [4] http://dbpubs.stanford.edu:8090/pub/2000-36

    Effective intranet search

    Potential supervisors: Nick Craswell or David Hawking

    There are several different types of Web search. Some of them are well understood, having been the subject of research for thirty years. One of the most useful, but least researched, is home page finding: for example, typing the query "ANU" to find http://www.anu.edu.au/. (Other interesting search types include fact finding, online service finding and finding people with subject expertise within a large organisation.)

    The project is to conduct an experiment using Panoptic software, comparing different search algorithms, to find which is most successful. We already have crawls of over 200 Australian organisations to work with.


    Global Shared Memory and Parallel I/O Models for Scientific Computation

    Contact: Alistair Rendell (Please note I will be away from 28th Dec until 16th January).

    This year's award of a Gordon Bell prize to the ANU BUNYIP cluster clearly demonstrates that it is possible to build high-performance computers from inexpensive commodity parts. As well as the raw CPU power provided by such clusters, the provision of large aggregate memory and disk makes clusters attractive for many scientific applications. The ability to exploit these last two resources in a convenient and portable manner is, however, non-trivial. In this project we will look at models for using the global memory and disk of massively parallel processors, concentrating largely (but not exclusively) on the "Global Array" and "Chemio" libraries.

    Global Arrays is a library that allows the user to define and manipulate "array objects" on distributed (and shared) memory parallel processors. An important feature of the library is its ability to access memory that may physically reside on a remote processor in a "one-sided" fashion: that is, the remote memory is accessed without cooperative message passing. We note that the concept of "one-sided" messages was introduced in the Message Passing Interface-2 (MPI-2) standard and is now supported by a number of computer vendors.
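    The flavour of one-sided access can be sketched with MPI-2 RMA through mpi4py (a hedged illustration only, and an assumption on my part; the Global Array library itself provides a much higher-level interface than raw windows):

        # Hypothetical sketch: rank 0 reads rank 1's memory without rank 1 posting a receive.
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()

        local = np.full(4, float(rank))            # the memory this rank exposes
        win = MPI.Win.Create(local, comm=comm)     # register it in an RMA window

        win.Fence()                                # open an access epoch
        if rank == 0 and comm.Get_size() > 1:
            remote = np.empty(4)
            win.Get(remote, 1)                     # one-sided fetch of rank 1's array
        win.Fence()                                # close the epoch

        if rank == 0 and comm.Get_size() > 1:
            print("rank 0 read from rank 1:", remote)
        win.Free()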

    Chemio is a library that grew out of the scalable I/O initiative in the US. It has been designed based on the requirements of chemical applications, but it can of course be used in any application area. It primarily provides functionality in three areas: i) the capability to map global arrays to disk, ii) support for private files that are local to a particular node (processor and attached disk), and iii) support for shared distributed files. As with Global Arrays, MPI-2 also supports some parallel I/O capabilities, and an important part of the project will be contrasting the MPI-2 I/O functionality with Chemio.

    Both Global Arrays and Chemio have been used as part of the NWChem chemistry package. Information on both libraries is available at the NWChem web site.

    The initial phase of the project will be to look at implementing both libraries on BUNYIP and other ANU platforms. During the course of the project you will acquire skills in C, UNIX environment programming, MPI and Fortran. The project will involve interaction with the new APAC national facility, and probably also with colleagues at Pacific Northwest National Laboratory.


    Keyword-Based Approaches To Text Comparison

    Supervisor: Peter Strazdins
    E-mail: Peter.Strazdins@anu.edu.au
    Phone: (02) 6125 5041

    Text documents, which include computer programs, reports and essays, can often be characterised by the types and distribution of the (key) words that they contain. In the case of computer programs, these words can include identifiers, which have no predefined meaning. Keyword characteristics can thus be used to define a metric of similarity between documents. Practical uses of this include determining whether two documents have the same main subject, and/or whether they have a common origin.

    Most existing means of computing the similarity between such documents rely purely on the spelling of the words, without utilising any predefined meaning that a word may have. Another method (so far only partially developed), which is also reasonably successful, uses the frequency counts of key words which have some predefined meaning. In this case, the concept of `synonyms' of the key words may be useful in certain contexts.
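    The frequency-count idea reduces to a short computation (a hedged sketch with a toy synonym table): represent each document by its word counts, optionally folding synonyms together, and compare the resulting vectors, for instance by cosine similarity.

        # Hypothetical sketch: keyword-frequency profiles compared by cosine similarity.
        import math
        import re
        from collections import Counter

        SYNONYMS = {"essay": "report", "paper": "report"}     # illustrative only

        def profile(text):
            words = re.findall(r"[a-z']+", text.lower())
            return Counter(SYNONYMS.get(w, w) for w in words)

        def cosine(a, b):
            dot = sum(a[w] * b[w] for w in set(a) & set(b))
            norm_a = math.sqrt(sum(c * c for c in a.values()))
            norm_b = math.sqrt(sum(c * c for c in b.values()))
            return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

        print(cosine(profile("a short report on sorting algorithms"),
                     profile("an essay about sorting and searching algorithms")))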

    However, much more information can be extracted from the words in a document. One approach is to apply the frequency-counts idea to words without predefined meaning, i.e. program identifiers, which introduces some new challenges. Others include the occurrences of small groups of words (with some reasonable definition of locality), and the distribution of particular words throughout the document.

    This project seeks to investigate simple but useful methods to analyse and compare documents based on their composition of (key) words, and to demonstrate their usefulness on collections of documents from a variety of different areas. Its relation to text retrieval methods, which are also keyword based and have concepts of locality, frequency and synonyms, should also be investigated. As such it has a strong research component, and the diversity of its applications means that there will be strong interest in the academic community in any significant results.


    Efficient Task Scheduling on Cluster Computers

    Supervisor: Peter Strazdins
    E-mail: Peter.Strazdins@anu.edu.au
    Phone: (02) 6125 5041

    Details are here.


    Computer Systems Development on the AP+

    Supervisor: Peter Strazdins
    E-mail: Peter.Strazdins@anu.edu.au
    Phone: (02) 6125 5041

    Details are here.


    Evaluation and Optimization of Communication Patterns on a Beowulf

    Supervisor: Peter Strazdins
    E-mail: Peter.Strazdins@anu.edu.au
    Phone: (02) 6125 5041

    Details are here.


    Web Teleoperation of a Mobile Robot

    Potential supervisors: David Austin and Alex Zelinsky

    A robot arm was made controllable using web browser technology (http://telerobot.mech.uwa.edu.au) at the University of Western Australia in 1994 and has been under development ever since, using an iterative approach based on analysis of operator behaviour. This is a forerunner of what is likely to be a wide range of future teleoperated applications, including remote operation of mining equipment and web control of machine tools for remote manufacturing.

    This project is to set up the mobile robot owned by the Robotic Systems Lab at the ANU for web-based teleoperation using the software developed for the arm at UWA. The project involves adapting existing software for a mobile application, devising strategies for transmitting robot state information effectively to operators, developing strategies for keeping the robot online, and monitoring operator behaviour.

    The telerobot arm at UWA is continually in use, attracting more than 5000 operators a month, and the developer of the mobile teleoperated robot will also gain the satisfaction of seeing their work in use by the web community.

    Students undertaking this project are expected to have good programming skills in Java/C++, as well as an interest in developing web-based programming skills. Scholars will work closely on a day-to-day basis with the project supervisors and PhD students.

    See also:

    [1] http://www.syseng.anu.edu.au/rsl/
    [2] http://telerobot.mech.uwa.edu.au/


    Autonomous Submersible Robot

    Potential supervisors: Chanop Silpa-Anan and Alex Zelinsky

    Australia has one of the longest coastlines in the world. Its continental shelf contains enormous natural resources including coral reefs, marine fisheries, and oil and gas reserves. These resources must be managed and protected. The efficient management of our coastal resources will require autonomous machines that are capable of routine monitoring, inspection and maintenance tasks, such as inspecting and repairing weld joints on oil rigs.

    The Robot Systems Lab at the ANU has built a submersible robot, called Kambara, which will serve as a test bed for experimental research in autonomous sub-sea operations. This robot will allow researchers to study the fundamental problems of controlling the operations of an underwater vehicle. Most present-day submersible robots work via tele-operation, where a human operator (on a remote ship) controls every aspect of the robot's operations. This is inefficient both in terms of human resources and the execution speed of tasks. Our aim is to incorporate on-board sensors such as sonar, gyros, depth sensors and computer vision that will allow the robot to make its own on-board decisions. The ANU submersible has a maximum operational depth of 20 m, a mass of 100 kg and a volume of less than one cubic metre. Its five thrusters provide five independent degrees of freedom controllable by on-board systems.

    Projects include working on the development of a vision system for the submersible; building a system of sensors, models and algorithms that estimates the robot's position, orientation and velocity in the water; and developing a graphical operator interface that receives state information from the robot and presents it in a graphically intuitive form to the operator. Testing of the submersible is done in an on-campus tank.
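    To give a flavour of the state-estimation component (a hedged, one-dimensional illustration; the real system would fuse several sensors over all degrees of freedom), a constant-velocity Kalman filter can smooth noisy depth readings into depth and vertical-velocity estimates:

        # Hypothetical sketch: 1-D Kalman filter over noisy depth-sensor readings.
        import numpy as np

        dt = 0.1                                    # seconds between readings (assumed)
        F = np.array([[1.0, dt], [0.0, 1.0]])       # constant-velocity state transition
        H = np.array([[1.0, 0.0]])                  # we observe depth only
        Q = 0.01 * np.eye(2)                        # process noise covariance
        R = np.array([[0.25]])                      # depth-sensor noise covariance

        x = np.zeros(2)                             # state: [depth, vertical velocity]
        P = np.eye(2)

        rng = np.random.default_rng(0)
        true_depth = 5.0 + 0.5 * np.arange(0, 10, dt)              # robot descending at 0.5 m/s
        for z in true_depth + rng.normal(scale=0.5, size=true_depth.size):
            x = F @ x                               # predict
            P = F @ P @ F.T + Q
            y = z - H @ x                           # update with the new measurement
            S = H @ P @ H.T + R
            K = P @ H.T @ np.linalg.inv(S)
            x = x + K @ y
            P = (np.eye(2) - K @ H) @ P

        print("estimated depth %.2f m, vertical velocity %.2f m/s" % (x[0], x[1]))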

    Students undertaking this project are expected to have good programming skills in C/C++, as well as having an interest in learning how distributed computer systems are built and programmed. Scholars will work closely on a day-to-day basis with the project supervisors, PhD and Masters students.

    See also: http://www.syseng.anu.edu.au/rsl/sub/


    Autonomous Off-road Vehicle

    Potential supervisors: David Austin and Peter Brown

    The Robotic Systems Laboratory is in the process of developing an autonomous off-road vehicle. A standard 4 wheel drive vehicle is being modified to be "driven by wire". Once operational the vehicle is expected to provide a valuable research platform for mobile robotics and applied computer vision experiments.

    The vehicle will have a number of 'smart' actuators which will control acceleration, steering and braking. Sensors will measure the steering angle, and relative motion will be measured using a gyroscopic sensor.

    The aim of this project is to develop a C++ class interface to control the vehicle. Once developed, the interface will allow researchers to control the vehicle in software without specific knowledge of how the vehicle is actually controlled. With this interface the vehicle will be a valuable and easy-to-use research platform. To complete this project, the developer is expected to become familiar with the operation and control of each intelligent actuator and sensing device, including gathering metrics from field experiments, and then to develop a C++ class and a set of sub-classes to unify the control of these devices using a sound C++ design methodology.

    Students undertaking this project are expected to have good programming skills in C++ and an interest in controlling physical devices with computers.


    Implementing a Prover for Transitive Tense Logics using the Logics Work Bench

    Supervisor: Dr Rajeev Goré, DCS and CSL

    Project: The Logics Work Bench (http://www.lwb.unibe.ch/) is a suite of efficient theorem provers for classical and various nonclassical propositional logics. All procedures are based upon sequent or tableau calculi. The LWB contains a programming language with which implementors can write further procedures for their own favourite propositional logic. Recent research has led to the definition of decision procedures for transitive tense logics which model time as a sequence of branching or linear points. Such a model of time is of fundamental importance in applications in Artificial Intelligence, Hardware Verification, and Hybrid Systems. The project is to use the in-built programming language of the LWB to implement these new decision procedures for tense logic Kt.S4.

    Background: You will need a strong background in theoretical computer science or mathematics. A grounding in logic would be useful but is not essential as there is plenty of local expertise in this area. This project is ideal for students who wish to pursue a PhD in any area of theoretical computer science.

    Details: You will need to become familiar with sequent and tableau proof calculi for various nonclassical logics, and become familiar with modal logic (local expertise abounds). You will need to become familiar with the inner workings of the LWB (non-local expertise available). You will have to understand the theoretical algorithms, and translate them into a working prototype to be included in the LWB.


    Neural Networks for Structured Data

    This topic was taken by a student.

    Supervisor: Professor John Lloyd, CSL

    Traditionally, machine learning techniques have used the attribute value language (AVL) to represent individuals. While the AVL allows efficient learning systems to be constructed, it places limits on the applicability of these learning machines in domains where the individuals have complex internal structure. In view of this, there is good motivation to extend current learning techniques to handle individuals of these more complicated data types. In recent years, extensive research has been conducted on the application of a typed, higher-order logic in decision tree learning, with a good measure of success. There are strong reasons to believe that other learning techniques, for example artificial neural networks, support vector machines, instance-based learning, etc., will benefit from similar extensions.

    This project will look at extending neural networks to handle structured data that includes tuples, sets, multisets, lists, trees, graphs, and so on. There are plenty of important and interesting application domains to which these extended neural networks can be applied, including text mining and bioinformatics. Several of these applications will be investigated.


    Adaptive Neural Network Architectures

    This topic was taken by a student.

    Supervisor: Professor John Lloyd, CSL

    This project involves researching techniques for neural networks to adapt as part of their training, thereby making the training process more complex than simply setting weights. This is by analogy with the growth of the human brain, where individual links between nodes are created as a function of the training data. Nodes will also have the capability to multiply, again with this growth being dependent on the training data for the network, so as to maximise the capabilities of the network while minimising its complexity, the risk of overfitting the data, and the number of degrees of freedom.

    The adaptation algorithm for neural networks will be designed in a novel way, implemented, and then tested on various datasets to demonstrate the effectiveness of the approach.


  • Updated: 18 February 2011