
Honours topics

Half of your time as an honours student is spent working on a project. But first you have to find a project topic.

The "official" reference for projects proposed by potential supervisors is the CECS projects database.

Projects are available there for all levels of research, including short projects, summer scholarship projects, Honours projects, and Masters and PhD projects. All potential research students at any level are urged to browse it.

If you see a project that looks interesting, email the potential supervisor about it. Don't be afraid to discuss possible variations to the listed project: what appears on the web site is generally more of a suggestion than a rigid specification.

You don't have to be constrained by what you see on the project website. If you have something that you would like to work on as a project, then feel free to discuss it with the honours convener to see if it could form the basis of an honours project and to identify a possible supervisor. Look at the web pages of Computer Science staff to find out their research interests. Remember that projects may also be supervised by people outside the College, or even outside the University: from CSIRO or NICTA, for instance.

Former Project Topics

For your interest, here is an archive of Honours project proposals from previous years. Some of them are quite ancient now, of course, but they may help to give you ideas of the kind of thing which is suitable.
  • 2009 Proposed Project Topics
  • 2008 Proposed Project Topics
  • 2007 Proposed Project Topics
  • 2006 Proposed Project Topics
  • 2005 Proposed Project Topics
  • 2004 Proposed Project Topics
  • 2003 Proposed Project Topics
  • 2002 Proposed Project Topics
  • 2001 Proposed Project Topics


    2003 Honours project proposals



    Induction in a large temporal domain - the intelligent file pre-fetcher for Linux

    Supervisor: Eric McCreath

    Machine learning may be defined as learning a description of a concept from a set of training examples. This description may then be used for prediction. This project involves investigating the application of machine learning within an operating system to pre-fetch files. The central outcome of this research will be an analysis of machine learning approaches to such a task. This would also involve forming a theoretical model of the domain in question, which would be vital for the analysis.

    Details:

    Operating Systems often employ read-ahead windowing approaches for sequentially accessed files. Basically, the next part of a file is pre-fetched into memory even before it is requested, which greatly improves the performance of the system.

    Often when we interact with computers the same files are required during our interaction. For example, we may always sit down and start up our mail reader. If the computer could learn this pattern of usage and pre-fetch these files, the responsiveness of the system could be improved. This project involves developing such a system and investigating its potential.
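
    As a purely illustrative sketch (not part of the project, and with made-up file names), a first-order predictor learnt from a recorded access trace might look like this in Python:

        from collections import defaultdict, Counter

        class NextFilePredictor:
            """First-order model: count which file tends to follow which."""

            def __init__(self):
                self.follows = defaultdict(Counter)
                self.last = None

            def observe(self, path):
                # Record one file access from the trace.
                if self.last is not None:
                    self.follows[self.last][path] += 1
                self.last = path

            def prefetch_candidates(self, path, k=3):
                # The k files most often seen immediately after `path`.
                return [f for f, _ in self.follows[path].most_common(k)]

        # Hypothetical trace: the user habitually opens the mail reader after logging in.
        trace = ["/bin/login", "/usr/bin/mailer", "/home/u/inbox",
                 "/bin/login", "/usr/bin/mailer", "/home/u/inbox"]
        model = NextFilePredictor()
        for p in trace:
            model.observe(p)
        print(model.prefetch_candidates("/bin/login"))   # ['/usr/bin/mailer']

    A real pre-fetcher would of course sit inside the kernel and have to cope with the issues discussed below.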

    Within machine learning there are a number of issues that make this project both challenging and interesting. These include the temporal nature of the data, the large number of examples, and how the learnt hypothesis could be applied to recommend which files to pre-fetch. Also within Operating Systems there are a number of challenges, such as how the system hooks into the kernel and how to measure the performance of the system when it is functioning with the operating system.

    The results of this research have the potential to be published in both Operating Systems and Machine Learning venues. The project would also form a good starting point for graduate study.

    Students will need to be able to reason logically and should have done well in theory subjects (either within Mathematics or Computer Science). Successful completion will require both determination and imagination. Finally, students considering embarking on this project must have good programming skills and will need to gain an understanding of the Linux kernel.

     

    Parallelisation, Machine Learning and a Graphical User Interface for a Data Cleaning and Matching System

    Supervisor: Peter Christen
    E-Mail: Peter.Christen@anu.edu.au

    Many organisations today collect massive amounts of data in their daily businesses. Examples include credit card and insurance companies, the health sector (e.g. Medicare), police/intelligence or telecommunications. Data mining techniques are used to analyse such large data sets in order to find new, unexpected and potentially valuable information (like patterns and rules, or outliers).

    Often several data sets have to be linked to obtain more detailed information. As most data is not primarily collected for data analysis purposes, a common unique identifier (like a patient or customer identification number) is missing in many cases, and so probabilistic techniques have to be applied to link data sets. Data or record linkage - also called data matching, data integration, data cleansing or data scrubbing - is a rapidly growing field with applications in many areas (business mailing lists, health and biomedical research, fraud and crime detection, census statistics, etc.) and it is an important initial step in many data mining projects.
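
    To make the idea concrete, a toy Python sketch of probabilistic-style matching is shown below; the fields, weights and scoring are invented for illustration and are much simpler than what the prototype software actually does:

        def field_agrees(a, b):
            # Very crude field comparison: exact match after normalisation.
            return a.strip().lower() == b.strip().lower()

        def match_score(rec_a, rec_b, weights):
            """Sum agreement/disagreement weights over the compared fields,
            in the spirit of probabilistic record linkage.  The weights
            here are invented for illustration."""
            score = 0.0
            for field, (agree_w, disagree_w) in weights.items():
                score += agree_w if field_agrees(rec_a[field], rec_b[field]) else disagree_w
            return score

        weights = {"surname": (4.0, -2.0), "dob": (5.0, -4.0), "postcode": (2.0, -1.0)}
        a = {"surname": "Smith", "dob": "1970-01-01", "postcode": "2601"}
        b = {"surname": "SMITH", "dob": "1970-01-01", "postcode": "2602"}
        print(match_score(a, b, weights))   # 4.0 + 5.0 - 1.0 = 8.0

    Record pairs scoring above a threshold would be treated as links, with uncertain pairs referred to a human reviewer.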

    The ANU Data Mining Group is currently working in collaboration with the NSW Department of Health, Centre for Epidemiology and Research on the improvement of probabilistic techniques for record linkage. We are mainly interested in developing high-performance techniques for linkage that can be run on parallel computers like the APAC National Facility (a Compaq supercomputer with 480 processors), or clusters of workstations or PCs. Prototype software has been developed and published as open source software. See http://sourceforge.net/projects/febrl and the project web page at http://datamining.anu.edu.au/linkage.html for more details.

    Student research in this project would include:

    • Exploration of parallelisation and development of techniques for distributed data access, (dynamic) load balancing, fault tolerance and recovery within the framework of the developed prototype software.

    • Exploration of machine learning and data mining techniques to improve the linkage quality. One of the disadvantages of most machine learning techniques is their need for training data, which is rarely available in the given application area. While some initial work has been done by various researchers, there are still many open questions and issues.

    • As the users of a record linkage system will most likely not be computer experts, but rather biomedical researchers, statisticians, etc., a graphical user interface needs to be developed. Research issues in this area include, for example, how to efficiently support a user when human intervention is needed on the linkage status of a record pair.

    A student working on this project would be involved through

    • participation in an exciting research project
    • object-oriented cross-platform programming (using Python, Tkinter, wxPython, etc.)
    • exploration and development of machine learning and data mining techniques
    • distributed parallel computing on SMPs, COWs and supercomputers (APAC)
    • open source software development (see http://sourceforge.net/projects/febrl).

    For more information, publications, and further links please visit the project web page at http://datamining.anu.edu.au/linkage.html.

     

    Large Scale Data Grid projects (Internet Futures)


    Contact: Markus.Buchhorn@anu.edu.au , CSIT-N244, 6125-8810

    The Internet Futures group of the ANU is working with high-speed networks, advanced network services, visualisation and telecollaboration technologies, and large scale data facilities, to support a diverse range of user communities. Many of its activities have a national focus, e.g. through the GrangeNet project.

    In the large-scale-data area, the users include Physics, Astronomy, BioInformatics and Cultural Archives (e.g. audio, spoken and music, or video). Archives can easily be many Terabytes (1000s of Gigabytes) in size, need to be delivered to users on networks ranging from multi-Gigabit down to slow dial-up modem speeds, and may need to interface to large-scale computing systems.

    The IF group has several potential topics suitable for an honours project. These include:

    • development of new Bio-Mirror (www.bio-mirror.net) archive services, to greatly reduce their bandwidth needs, while increasing their performance and usability to users. This project would have international impact, as a good solution would be adopted by many biomirror sites.
    • development of Virtual Observatory elements, to support astronomers' real-time access to large scale observational data sets
    • and others, depending on your interests...

    Technologies in these fields include grid toolkits (Globus for example), metadata standards, various network protocols and performance issues, large scale databases and database/directory access methods, and web services and SOAP. Programming skills in C, C++, Java, Perl, Python, are all useful.

    You would get to work with a range of users across the ANU, on real-world problems. You'd also get access to the APAC Mass Store facility (a 1.2 Petabyte (1200 Terabyte) tape robot) and associated computing hardware including possibly the APAC National Facility supercomputer, the ANU Supercomputer Facility's Visualisation Laboratory (VizLab) , and the Internet Futures' Access Grid (AG) (telecollaboration) nodes in various locations on campus.

    You can go as deep into these areas as you like (although we have some ideas), and maybe even come back for a PhD if you're really keen :-)

     

    TeleCollaboration (videoconferencing on steroids) projects (Internet Futures)


    Contact: Markus.Buchhorn@anu.edu.au, CSIT-N244, 6125-8810

    The Internet Futures group of the ANU is working with high-speed networks, advanced network services, visualisation and telecollaboration technologies, and large scale data facilities, to support a diverse range of user communities. Many of its activities have a national focus, e.g. through the GrangeNet project.

    TeleCollaboration can be seen as very high-end videoconferencing, with many additional features. Some of these features require the user's space to be intelligent or flexible, others require advanced network or application services. Any kind of collaboration, for research and for education can be supported. This could include music teaching (masterclasses), architecture, history or medical training with visualisation or even virtual reality interfaces, and large-scale data browsers and instrument consoles (e.g. in Astronomy or TeleMicroscopy).

    The Internet Futures' telecollaboration activities revolve mainly around Access Grid (AG) nodes. There are currently (Dec/Jan 2003) 3 full-size nodes being built at the ANU (including one in the CS&IT building), and several additional nodes are being discussed. There are also plans for smaller-scale (even desktop) nodes, and ports to a variety of systems and other technology spaces (e.g. the Wedge virtual reality/visualisation facility). Around Australia 15 nodes are currently in planning or construction, and there are over 130 nodes around the world at the moment (Jan 2003). ANU users will include Astronomers at Mt Stromlo Observatory (20km west of Canberra), and the new ANU Medical School (on campus).

    The underlying architecture of these systems is open and flexible, and the ANU has the potential to become a worldwide focus of AG development, with strong international collaboration.

    The IF group has several potential topics suitable for an honours project. These include:

    • high-quality Audio/Video capture and delivery e.g.
      • tiled (multi-camera) video capture for higher resolutions,
      • broadcast quality video capture,
      • low latency Audio/Video codecs
    • intelligent A/V capture (using the steerable cameras, and possibly audio-capture interfaces). Some applications, e.g. music teaching, require very specific views of students as they play music. For general workshop/meeting interaction, the steerable cameras could be used to track a speaker as they wander around the room, or to focus on specific audience members when they ask questions.
    • integration of other inputs, e.g. laptops and dedicated applications (which aren't multicast aware, or are part of the "standard" AG toolkit), as well as intelligent/electronic whiteboards, wireless PDA's, etc.
    • cluster systems, for codec performance and robustness
    • layered encodings to support differential quality audio/video (e.g. to support high-bandwidth and low-bandwidth clients simultaneously).
    • integration with H323/SIP (the two major standard videoconferencing-over-IP technologies) and streaming systems (for one-way delivery to non-AG clients). This will involve some interesting design issues, e.g. H.323 is designed for essentially one-view-at-a-time, where the AG is designed for multi-site viewing in parallel.
    • User Interface design; the AG user interface is pretty ordinary and unfriendly to new users. There is a lot of untapped information in AG streams that could be used to greatly enhance the user experience. These include audio localisation, visual feedback of audio activity, and audio-video synchronisation ("lip-sync").
    • physical security of AG spaces; protecting the (expensive) hardware in AG spaces when it is not being used. Given multiple steerable cameras, sensitive microphones, and possibly other sensors, it should be possible to create a space that secures itself (and screams if attacked :-) ).
    • and others...

    Technologies in these fields include IP multicast, hardware interfaces, various network protocols and performance issues, user-interface design, and audio/video codecs. Programming skills in C, C++, Java, Perl, Python, are all useful.

    You can go as deep into these areas as you like (although we have some ideas), and maybe even come back for a PhD if you're really keen :-) Possibly more than one area could be looked at during the project.

     

    Extending the Sparc-Sulima Simulator for Clusters

    Supervisor: Peter Strazdins
    Email: Peter.Strazdins@anu.edu.au
    Phone: (02) 6125 5140
     

    For a description of this project please see the web page titled: Extending the Sparc-Sulima Simulator for Clusters.

     

    Checkpointing in a complex object oriented software system

    Supervisor: Bill Clarke
    Email: Bill.Clarke@anu.edu.au
    Phone: (02) 6125 5687
     

    For a description of this project please see the web page titled: Checkpointing in a complex object oriented software system.

     

    Hyperlinks, PageRank and Page Quality

    Potential supervisors: Trystan Upstill, Nick Craswell, David Hawking
    E-Mail: nick.craswell@csiro.au

    This project would suit a student interested in the evaluation of Web search engine technologies.

    The "link recommendation" assumption is that each hyperlink is a vote for the target page being of high quality. The most famous example is Google's PageRank [1], but most search engines probably use some form of link recommendation to boost the ranking of pages with high indegree.

    While the use of link recommendation is widespread there has been no published evidence to support its use. For example, nobody has tested whether, given the home pages of two things (products, companies, CS departments) the one with the highest PageRank tends to be the best or most important one. Yet for us to be convinced that link recommendation is useful in search engines, something like this must be true.

    The project is to compare PageRank to "real world" notions of quality such as company reputation (using S&P [2]), movie popularity (using IMDB [3]) or sales ranks (using bestseller listings). The question is whether PageRank as calculated by Google predicts such rankings, or which rankings are best predicted by PageRank. There would also be room for the student to implement their own link recommendation measure and evaluate it in the same way, depending on their interest.
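
    For readers unfamiliar with the computation, a minimal power-iteration sketch of PageRank over a toy link graph is given below (illustrative only; real engines operate at an entirely different scale):

        def pagerank(links, d=0.85, iters=50):
            """Power-iteration PageRank over a dict {page: [outgoing links]}."""
            pages = list(links)
            n = len(pages)
            rank = {p: 1.0 / n for p in pages}
            for _ in range(iters):
                new = {p: (1 - d) / n for p in pages}
                for p, outs in links.items():
                    if outs:
                        share = rank[p] / len(outs)
                        for q in outs:
                            if q in new:
                                new[q] += d * share
                    else:
                        # Dangling page: spread its rank evenly over all pages.
                        for q in pages:
                            new[q] += d * rank[p] / n
                rank = new
            return rank

        toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
        print(pagerank(toy_web))   # "c", with the most in-links, scores highest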

     

    Data Cleaning on the Web

    Potential supervisors: Francis Crimmins, Nick Craswell, David Hawking
    E-Mail: nick.craswell@csiro.au

    This project would suit a student interested in studying Web search technology.

    Many Web search systems rely on a crawler to gather pages, for example Google crawling billions of pages per month and search.anu.edu.au crawling hundreds of thousands of ANU pages per week. Some information on crawlers is in the Web Robots FAQ.

    Unfortunately, when crawling, all kinds of redundant pages tend to be gathered, lessening the efficiency and effectiveness of the search system. Types of redundancy include:

    • Mirroring: The same pages on two hosts
    • Aliasing: Two pages or directories might deliberately appear in different locations on a host.
    • Case insensitivity: The same page might be available with upper- and lower-case URLs (while on other servers maintaining correct case is crucial).
    • Parameterised URLs: By varying the order of CGI parameters in the URL or adding ones which do not affect content (e.g. a user id) the same content can appear at hundreds of URLs in a crawl.
    • Crawler traps: The simplest form of crawler trap leads to URLs such as: http://xyz.anu.edu.au/hons/hons/hons/hons/hons/topics.html. It occurs when topics.html links to the subdirectory "hons/" and that is a valid URL due to some soft link or server misconfiguration.
    • Applications: For example, a calendar application if completely crawled might give monthly calendar pages for a hundred years, even if only the current year has any calendar entries.
    • Error pages: Error pages which have an incorrect HTTP status code (200) also lead to redundant pages being included in the crawl.

    Crawling these redundant pages wastes network and machine resources. Also, since crawls are usually halted after a certain number of URLs or time limit, good pages can be crowded out of the crawl by these redundant pages. Reducing redundancy also frees up disk space on the search machine, improves index/query processing time and can improve retrieval effectiveness (user satisfaction with results).

    The project is to investigate the sort of redundancy which occurs in crawls and methods for eliminating it, perhaps starting with search.anu.edu.au. Possible approaches include pre-GET, post-GET or post crawl redundancy elimination, based on content [Broder et al] or on URLs [Bharat and Broder]. The project could measure the impact of such detection, study its implementation in a real search system and/or investigate improved URL-based detection methods, particularly in the area of parameterised (CGI) URLs.
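
    As a small illustration of URL-based redundancy elimination, the sketch below canonicalises URLs by lower-casing the host, sorting CGI parameters and dropping parameters assumed (hypothetically) not to affect content:

        from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

        # Hypothetical list of CGI parameters assumed not to affect page content.
        IGNORED_PARAMS = {"sessionid", "userid"}

        def canonical_url(url):
            """Normalise a URL so that trivially different forms of the same page
            compare equal: lower-case scheme and host, drop content-neutral
            parameters, sort the remaining query parameters, drop the fragment."""
            scheme, netloc, path, query, _fragment = urlsplit(url)
            params = sorted((k, v) for k, v in parse_qsl(query, keep_blank_values=True)
                            if k.lower() not in IGNORED_PARAMS)
            return urlunsplit((scheme.lower(), netloc.lower(), path, urlencode(params), ""))

        u1 = "http://XYZ.anu.edu.au/cal?month=3&userid=42&year=2003"
        u2 = "http://xyz.anu.edu.au/cal?year=2003&month=3"
        print(canonical_url(u1) == canonical_url(u2))   # True

    Content-based methods go further by comparing fingerprints of the fetched pages themselves.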

    References:

    1. http://www.google.com/
    2. http://panoptic2.anu.edu.au/search/search.cgi?collection=anu_search_external
    3. http://www.robotstxt.org/wc/faq.html
    4. http://www.w3.org/Protocols/HTTP/HTRESP.html
    5. http://www.scope.gmd.de/info/www6/technical/paper205/paper205.html
    6. http://www8.org/w8-papers/4c-server/mirror/mirror.html
     

    XML search system

    Potential supervisors: Nick Craswell, David Hawking
    E-Mail: nick.craswell@csiro.au

    A project for students interested in XML technologies, Web technologies, information management and/or coding.

    XML is a simple technology which has far-reaching implications and an increasing number of real-world applications. One way of accessing XML documents, particularly complex or large datasets, is via search engine technology. However, supporting rich queries over arbitrarily complex XML raises a number of research questions. In particular, restricting the results to a particular DOM subtree is something that is difficult to do without an additional index structure.

    This project could be a flat out engineering and measurement exercise, adding DOM subtree support to a simple inverted index. Alternatively it could involve studying alternate solutions and overall XML search requirements. It could be based on Panoptic (http://www.panopticsearch.com/) or an open source inverted indexing system.
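
    One obvious index structure for the subtree-restriction problem is to record, for each term, the element paths under which it occurs. The Python sketch below is illustrative only and is not based on Panoptic or any particular system:

        import xml.etree.ElementTree as ET
        from collections import defaultdict

        def index_by_path(xml_text):
            """Toy inverted index recording, for every term, the element paths
            under which it occurs, so queries can later be restricted to a
            given subtree."""
            root = ET.fromstring(xml_text)
            index = defaultdict(set)

            def walk(elem, path):
                here = path + "/" + elem.tag
                for word in (elem.text or "").split():
                    index[word.lower()].add(here)
                for child in elem:
                    walk(child, here)

            walk(root, "")
            return index

        doc = "<book><title>Data Cleaning</title><chapter>cleaning crawls</chapter></book>"
        idx = index_by_path(doc)
        print(idx["cleaning"])   # {'/book/title', '/book/chapter'}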

     

    Site and Search Engine Summarisation

    Potential supervisors: Nick Craswell, David Hawking
    E-Mail: nick.craswell@csiro.au

    Write and evaluate a site summariser. Optionally apply the same principles to subject-specific search services.

    Yahoo! [1] is an example of a successful company which performs manual site summarisation: all their listed sites have hand-written summaries. InvisibleWeb [2] do the same job as Yahoo!, but summarising subject-specific search services. Either type of company might find automatic summarisation techniques useful.

    There are many document summarisation algorithms, including those based on the content of one or many documents, and those based on hyper-links [3]. The project is to find out which summarisation methods are best at describing the content of a site (and/or subject-specific search service).
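
    As one simple content-based baseline (purely illustrative, not a recommendation of any particular algorithm), sentences can be scored by the document-wide frequency of their words:

        import re
        from collections import Counter

        def summarise(text, n_sentences=2):
            """Naive summariser: score each sentence by the document-wide
            frequency of its words and keep the top sentences in order."""
            sentences = re.split(r'(?<=[.!?])\s+', text.strip())
            freq = Counter(re.findall(r'\w+', text.lower()))
            scored = [(sum(freq[w] for w in re.findall(r'\w+', s.lower())), i, s)
                      for i, s in enumerate(sentences)]
            top = sorted(sorted(scored, reverse=True)[:n_sentences], key=lambda t: t[1])
            return " ".join(s for _, _, s in top)

        # Keeps the two most "central" sentences of the toy passage.
        print(summarise("Search engines crawl web pages. Crawled web pages are indexed "
                        "for search. Users query the search index. The weather in "
                        "Canberra was nice."))

    Hyper-link based methods would instead summarise a site from the anchor text of pages that link to it.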

     

    Peer to peer postcards: multimedia delivery for wireless handset devices

    Supervisor: Chris Johnson

    Details:

    Mobile phone networks are a very expensive way to deliver pictures to mobile phone handsets, but the handsets are now being made with cameras and display screens. We can also see devices that combine remote network phone communications with local Bluetooth communications.

    Voice communications over the phone network are still needed, because voice over IP is not yet widely available with sufficient quality and convenience - and the phone network is much more extensive in geographical reach and number of customers; but where two people have compatible devices and have some access to short-range wireless networks, they should be able to exchange pictures over IP.

    It is proposed to make the two networks complement each other: to send short messages and interactive voice communications over the instant, mobile phone network, and larger multimedia objects with some delay over the cheap, local Bluetooth or WiFi network. Bluetooth networks are short range, and the devices will need to form ad hoc, peer to peer connections to hand the multimedia chunks along the chain to make the distance. (WiFi networks are more centralised but the same peer to peer behaviour is also possible).

    The problem is to discover, design, install and implement the appropriate protocols, taking into account users' social and geographical requirements. Experimental demonstration using mobile wireless devices such as laptops and PDAs in conjunction with normal mobile phones is expected - without deep integration of the two modes of communication. Some extension work would be desirable towards analysis of network reach, and the density of partygoers needed to make the service work in practice.

    References

    The problem is mentioned in a report on a talk by Henning Schulzrinne at Mobicom 2002, as reported in IEEE Internet Computing, Nov/Dec 2002, p. 12.

    Peer to peer networks, Bluetooth, and WiFi are hot topics at present, and the student will have to find their way into the gossip and separate hard facts from hyped fancy.

    Student skills required

    Interest in distributed computing application protocols; interest and ability to search out existing software systems; analytical awareness of user needs and social realities as well as technologies. Self-starting person, at home with finding, installing and experimenting with real world software. An ability to analyse and write about the implications of the technical work is desirable.

    Project support will come from the Smart Internet Technology Intelligent Environments group, in both technology and conceptual buzz.

     

    Interval Arithmetic: The New Floating-Point Paradigm?

    Supervisor: Alistair Rendell

    From using a calculator we are all familiar with the fact that many numbers must, at some level, be truncated and rounded. Thus 2/3 may appear as 0.67, 0.66667, or 0.6666667, etc., depending on the precision of your calculator. More generally, representing the continuum of real numbers by the discrete, finite floating-point representations used in all modern computers gives rise to rounding errors, overflows, underflows, etc. Failure to account for these limitations can give rise to spectacular (expensive!!) results, e.g. the failure of some Patriot missiles to hit their targets during the Gulf War or the explosion of the Ariane 5 space rocket (see disasters).

    Interval arithmetic is a fundamental paradigm shift that attempts to address these issues. It defines an "interval" as a continuum of real numbers bounded by its end-points and performs all arithmetic operations with respect to intervals rather than discrete floating point numbers. As a consequence of this, errors are automatically propagated and accounted for. Moreover, the mathematical properties of intervals can be exploited to develop entirely new algorithms.
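
    A minimal sketch of the idea in Python is shown below; it ignores the directed rounding of the interval end-points that real implementations perform, so it only illustrates how interval operations propagate bounds:

        class Interval:
            """Closed interval [lo, hi]; every operation returns an interval
            containing the true result (end-point rounding is ignored here)."""

            def __init__(self, lo, hi=None):
                self.lo, self.hi = lo, (lo if hi is None else hi)

            def __add__(self, other):
                return Interval(self.lo + other.lo, self.hi + other.hi)

            def __sub__(self, other):
                return Interval(self.lo - other.hi, self.hi - other.lo)

            def __mul__(self, other):
                products = [self.lo * other.lo, self.lo * other.hi,
                            self.hi * other.lo, self.hi * other.hi]
                return Interval(min(products), max(products))

            def __repr__(self):
                return f"[{self.lo}, {self.hi}]"

        # 2/3 is not exactly representable; bracket it instead of rounding it.
        two_thirds = Interval(0.666666, 0.666667)
        print(two_thirds * Interval(3))   # a narrow interval containing the true value 2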

    The concept of intervals is actually not new, but was considered by Moore in 1965. What is relatively new is the fact that some hardware vendors are now providing direct support for interval arithmetic within their compilers/hardware.

    The aim of this project will be to investigate the use of interval arithmetic for computational science applications. The initial goal will be to use intervals to study error propagation for some "problematic" algorithms found within the Gaussian computational chemistry code. A second goal will be to investigate the use of intervals for nonlinear global optimization, e.g. determining the most stable conformation (geometry) of a large molecular system (e.g. a protein).

    For more information on interval arithmetic see the interval home page.

     

    Performance Analysis of Cache Oblivious Algorithms

    Supervisors: Alistair Rendell, Peter Christen and Peter Strazdins

    Although clock speed is important, on modern computers it is vital to maximize cache re-use in order to obtain a high percentage of the theoretical peak performance. As a consequence much effort has been directed to developing algorithms that are, in some way, blocked so that they spend the majority of their time working on data that is cache resident. The classic example of this is a matrix multiplication algorithm in which large matrices are broken down into sub-matrices small enough to reside in cache. These sub-matrices are then multiplied together and the results used to construct the overall result. A disadvantage of these approaches, however, is that they usually contain explicit parameters relating to the cache size of the computer being used. This limits the portability of the code, possibly even between CPUs from the same manufacturer.

    An alternative to explicit cache blocking is to construct your algorithm from the outset in such a way that, at some point, it naturally becomes cache resident. Such algorithms are referred to as "cache-oblivious" and have been strongly advocated by Matteo Frigo in a series of recently published papers. Ideal candidates for cache-oblivious algorithms are recursive algorithms, where frequently the data set is successively divided in half, and thus at some stage becomes small enough to reside in cache.
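
    The sketch below illustrates the cache-oblivious style with a recursive matrix multiply in Python/NumPy (illustrative only; a serious implementation would be written in a compiled language and measured with hardware counters, as the project intends):

        import numpy as np

        def rec_matmul(A, B, C, cutoff=32):
            """Cache-oblivious style multiply: recursively halve the largest
            dimension until the blocks are small, then multiply directly.
            C accumulates the product A @ B in place."""
            n, k = A.shape
            _, m = B.shape
            if max(n, k, m) <= cutoff:
                C += A @ B
                return
            if n >= k and n >= m:          # split A (and C) by rows
                h = n // 2
                rec_matmul(A[:h], B, C[:h], cutoff)
                rec_matmul(A[h:], B, C[h:], cutoff)
            elif m >= k:                   # split B (and C) by columns
                h = m // 2
                rec_matmul(A, B[:, :h], C[:, :h], cutoff)
                rec_matmul(A, B[:, h:], C[:, h:], cutoff)
            else:                          # split the shared dimension
                h = k // 2
                rec_matmul(A[:, :h], B[:h], C, cutoff)
                rec_matmul(A[:, h:], B[h:], C, cutoff)

        A, B = np.random.rand(100, 80), np.random.rand(80, 60)
        C = np.zeros((100, 60))
        rec_matmul(A, B, C)
        print(np.allclose(C, A @ B))   # True

    No cache-size parameter appears anywhere: the recursion reaches cache-sized blocks on any machine, which is exactly the portability argument made above.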

    Unfortunately to date most of the work on cache oblivious algorithms has been theoretical rather than practical. The aim of this project is to implement some cache oblivious algorithms and use hardware performance counters both to compare their performance characteristics with more traditional algorithms, and to study their performance on a range of different hardware platforms.

     

    OpenMP Futures

    Supervisors: Alistair Rendell and Peter Strazdins

    OpenMP is an Application Program Interface (API) for multi-platform shared-memory parallel programming in C/C++ and Fortran. It is jointly defined by a group of hardware and software vendors and, via cOMPunity, the wider community. The first OpenMP specification was released in 1997, while version 2.0 for C/C++ was released in 2002. Since its initial release it has gained increasing use, particularly in the scientific community.

    Currently there is much debate within the OpenMP community as to what should be included in future OpenMP specifications. We are very interested in these issues as part of the CC-NUMA project. In particular we are keen to explore possible

    • Support for Non-Uniform Memory Access (NUMA) architectures
    • Task queue models
    With respect to the first issue, OpenMP currently assumes a uniform memory access model. However, it is well known that all modern large scale shared memory parallel machines have some level of NUMA. Some vendors, such as SGI and Compaq (now HP), have extended their OpenMP implementations to include NUMA support. However, there is no consensus on how this should be done. Similarly, both KAI (now Intel) and Sun have put forward interesting, but different, proposals for including a task queue model in OpenMP. As an honours project you would explore one of these issues in the context of a variety of kernel algorithms and on a selection of high performance computing platforms (from SGI, HP and Sun).

    A further opportunity exists to consider OpenMP on cluster environments. Although OpenMP is a shared memory programming paradigm, there has been some work to enable OpenMP on clusters (similar to software distributed shared memory models). Of particular note is the work of the PC cluster consortium in Japan. Work in this area would involve using their SCore cluster development kit on the Bunyip Cluster and the APAC SC system.


    Data Warehousing and OLAP Project (Small ARC-funded project)

    Supervisor: Weifa Liang
    Email: wliang@cs.anu.edu.au
    Phone: 6125 3019

    One of the fundamental problems in the design of data warehouses is the formation of effective methods for selecting which views to materialize, and which materialized views to replace. In other words, given the resource constraint and query load to be supported in a data warehouse, the selection problem is usually to select a set of views to materialize such that the sum of query response time and view refreshment (or maintenance) time is minimized. Though previous work has offered significant insight into the nature of the problem, all of the existing work assumes that the initial data warehouse is empty. In normal operation, the data warehouse grows incrementally. Also, the query and update frequencies of materialized views are subject to change over time.

    Another related challenging problem in data warehousing is the view refreshment (view maintenance) problem under the constraint of a given refreshment time window. In order to keep the content of a materialized view in the warehouse consistent with the contents of the remote sources, the materialized view must be updated immediately or periodically in response to source updates. Usually such updates are done incrementally. In summary, the following work will be conducted in this project:

    • design of new algorithms for view selection and replacement in the dynamic data warehouse environment, and evaluation of new strategies for such applications through implementation and performance evaluation;
    • examination of existing view refreshment policies, and development of new view refreshment algorithms that shorten view refreshment time for the data warehouse.
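
    To illustrate the flavour of static view selection (the dynamic, incremental setting is exactly what this project would go beyond), a greedy benefit-per-space heuristic might look like the following sketch; the sizes and benefits are assumed to come from a cost model and are invented here:

        def greedy_select(views, capacity):
            """Greedy view selection in the spirit of classic static approaches:
            repeatedly materialise the view with the best estimated benefit per
            unit of space until the space budget is exhausted.  `views` maps a
            view name to (size, estimated_benefit)."""
            chosen, used = [], 0
            remaining = dict(views)
            while remaining:
                name, (size, benefit) = max(remaining.items(),
                                            key=lambda kv: kv[1][1] / kv[1][0])
                del remaining[name]
                if used + size <= capacity:
                    chosen.append(name)
                    used += size
            return chosen

        views = {"sales_by_month": (40, 120), "sales_by_store": (70, 150),
                 "sales_by_region": (20, 90)}
        print(greedy_select(views, capacity=100))   # ['sales_by_region', 'sales_by_month']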

     

    Routing Algorithms for WDM Optical Networks (ARC Discovery Funded Project)

    Supervisor: Weifa Liang
    Email: wliang@cs.anu.edu.au
    Phone: 6125 3019

    Optical networking technology plays a key role in next-generation networks. In particular, wavelength-division-multiplexing (WDM) networks have emerged as a promising candidate for next-generation networks by providing large available bandwidth and connectivity. The routing and wavelength assignment problem is one of the fundamental problems in WDM optical networks.

    Designing efficient routing protocols is a fundamental problem for any network, and WDM optical networks are no exception. By the nature of WDM networks, traditional routing algorithms are ill-suited to this setting. It is thus necessary to devise new algorithms and algorithmic methodologies for it.

    In this project we primarily focus on devising efficient routing algorithms for two fundamental routing problems in WDM networks: the all-to-all routing and robust routing problems. Incorporating various known techniques, practical algorithms and new techniques for solving them will be developed, and the performance of the developed algorithms will be analyzed through experimental simulation. Our objectives are to:

    • devise efficient routing algorithms for several fundamental routing problems in WDM optical networks, in particular all-to-all routing and robust routing, through the use of a collection of algorithmic skills and graph techniques including path-coloring, network flow and maximum matching, with performance analyzed through experimental simulation;
    • develop a generic methodology for the design of "efficient" routing protocols for WDM networks; to effectively utilize the network resources, an efficient routing protocol should be able to optimize multiple objectives simultaneously in terms of the consumption of different sorts of network resources;
    • investigate various known techniques to achieve efficient routing in WDM optical networks, and develop new concepts and techniques to solve the routing problems. Practical algorithms to solve some real problems will be developed, implemented, and tested.
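
    As a baseline for comparison only (not one of the algorithms to be developed), a classical first-fit routing-and-wavelength-assignment heuristic on shortest paths can be sketched as follows:

        from collections import deque

        def shortest_path(graph, src, dst):
            """Breadth-first shortest path in an undirected graph {node: [neighbours]}."""
            prev, queue = {src: None}, deque([src])
            while queue:
                u = queue.popleft()
                if u == dst:
                    path = []
                    while u is not None:
                        path.append(u)
                        u = prev[u]
                    return path[::-1]
                for v in graph[u]:
                    if v not in prev:
                        prev[v] = u
                        queue.append(v)
            return None

        def assign_wavelengths(graph, demands, n_wavelengths):
            """Route each demand on a shortest path and give it the first
            wavelength free on every link of that path (first-fit RWA)."""
            used, assignment = {}, {}
            for src, dst in demands:
                path = shortest_path(graph, src, dst)
                links = [frozenset(e) for e in zip(path, path[1:])]
                for w in range(n_wavelengths):
                    if all((l, w) not in used for l in links):
                        for l in links:
                            used[(l, w)] = True
                        assignment[(src, dst)] = (path, w)
                        break
            return assignment

        ring = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}
        print(assign_wavelengths(ring, [(1, 3), (2, 4), (1, 2)], n_wavelengths=2))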

     

    Power-Aware Routing for Ad Hoc Mobile Networks

    Supervisor: Weifa Liang
    Email: wliang@cs.anu.edu.au
    Phone: 6125 3019

    Mobile multihop radio networks, also termed peer-to-peer, all-wireless, or wireless ad hoc networks, are expected to fulfill a critical role in applications in which wired backbone networks are not available or not economical to build. These span the commercial, public and tactical communications sectors. The networks provide the only solution in situations where instant infrastructure is needed and no central system backbone and administration (like base stations and wired backbone in a cellular system) exist. Some of the applications of the networks include mobile computing in areas where other infrastructure is unavailable, law enforcement operations, disaster recovery situations, as well as ad hoc networks for large events such as sporting events or congresses when it is not economical to build a fixed infrastructure for short temporary usage. In tactical battlefield communications the hostility of the environment prevents the application of a fixed backbone network. We consider a peer-to-peer mobile network consisting of a large number of mobile nodes that create a network on demand and may communicate with each other via intermediate nodes in a multihop mode, i.e., every node can be a router.

    In this project we focus on new power-aware routing protocols to extend the life of the nodes and of the network. Based on homogeneous and heterogeneous mobility models, a new power-aware routing protocol will be proposed that takes the capacities of different types of mobile devices into consideration.
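
    A toy illustration of the idea: run a shortest-path computation whose link costs grow as the sending node's battery drains, so that traffic avoids nearly exhausted nodes. The cost model below is invented for illustration and is not the protocol to be developed:

        import heapq

        def min_energy_path(links, residual, src, dst):
            """Dijkstra over transmission costs, penalising links whose sender
            has little battery left.  `links` maps (u, v) to a transmission
            cost; `residual` maps a node to its remaining battery fraction.
            Assumes dst is reachable from src."""
            def cost(u, v):
                return links[(u, v)] / residual[u]   # cost rises as the battery drains

            dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
            while heap:
                d, u = heapq.heappop(heap)
                if u == dst:
                    break
                if d > dist.get(u, float("inf")):
                    continue
                for (a, b) in links:
                    if a != u:
                        continue
                    nd = d + cost(a, b)
                    if nd < dist.get(b, float("inf")):
                        dist[b], prev[b] = nd, a
                        heapq.heappush(heap, (nd, b))
            path, node = [], dst
            while node != src:
                path.append(node)
                node = prev[node]
            return [src] + path[::-1]

        links = {("s", "a"): 1.0, ("a", "d"): 1.0, ("s", "b"): 1.2, ("b", "d"): 1.2}
        residual = {"s": 1.0, "a": 0.2, "b": 0.9, "d": 1.0}   # node "a" is nearly flat
        print(min_energy_path(links, residual, "s", "d"))      # ['s', 'b', 'd']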

     

    Memory Management Toolkit

    Supervisor: Steve Blackburn
    Email: Steve.Blackburn@anu.edu.au
    Phone: +61 2 6125 4821

    JMTk is a new memory management toolkit for Java, which I have co-developed with Perry Cheng of IBM Research and Kathryn McKinley of the University of Texas. JMTk has proved very popular and there is significant interest in allowing it to work in other language environments including Microsoft's CLI, and the Haskell runtime. This project will explore the issues associated with interfacing JMTk to these environments. For exceptional students, the possibility exists of an internship at Microsoft Research as part of the project.

     

    Performance analysis of Java

    Supervisor: Steve Blackburn
    Email: Steve.Blackburn@anu.edu.au
    Phone: +61 2 6125 4821

    Java is very important commercially, and yet the runtime systems required to support it and other languages such as C# are relatively immature. This project will involve measurement and analysis of the performance of Java programs, with a view to better understanding where there are opportunities for performance improvement. This work will involve using and modifying JMTk, a memory management toolkit for JikesRVM.

     

    Emergency Web

    Supervisor: Tom Worthington, Ian Barnes
    Email: Tom.Worthington@tomw.net.au
    Phone: 0419 496 150

    The World Wide Web provides a useful way to provide information to the general public and organization staff. However, in an emergency, web sites can be overloaded by demand. Investigate how web-based systems have coped with emergencies and suggest techniques to allow web-based systems to work better. Techniques might include use of accessibility guidelines and design to enhance the use of caching. Recommend changes, including prototypes, for the ACT Government and the Sentinel Fire Mapping System, as used in the ACT bushfire emergency of January 2003.

     

    Simplifying XML for E-commerce

    Supervisor: Tom Worthington, Ian Barnes
    Email: Tom.Worthington@tomw.net.au
    Phone: 0419 496 150

    Web Services provide a way to define a set of transactions and the format of the data on which those transactions can be performed. In theory the XML format used makes e-commerce simple to implement, but in practice the take-up has been slow. The Australian Taxation Office provides free XML-based e-commerce software for small businesses to submit electronic business activity statements (eBAS) on-line, but most still use paper forms. It has been demonstrated that it is possible to transform the eBAS format using XSLT, to be displayable by a web browser without the need for additional software. Investigate enhancements to the eBAS format to allow processing by a web browser, without additional software. Propose changes to the eBAS XML format, and if needed to the relevant web services and XML standards, to make this possible.
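
    As a purely illustrative sketch of the XSLT approach (the file names are hypothetical, and the real eBAS schema and stylesheets are not reproduced here), a transformation can be applied with the lxml library; a browser could equally apply the same stylesheet itself via an <?xml-stylesheet?> processing instruction, which is the "no additional software" path alluded to above:

        # Hypothetical file names: ebas.xml (the statement), ebas-to-html.xsl (the stylesheet).
        from lxml import etree

        source = etree.parse("ebas.xml")
        stylesheet = etree.XSLT(etree.parse("ebas-to-html.xsl"))
        html = stylesheet(source)                  # plain HTML a browser can display

        with open("ebas.html", "wb") as out:
            out.write(etree.tostring(html, pretty_print=True))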

     

    Standardising Office Document Formats

    Supervisor: Tom Worthington, Ian Barnes
    Email: Tom.Worthington@tomw.net.au
    Phone: 0419 496 150

    An XML based standard format for office documents containing text, spreadsheets, charts, and graphics has been proposed. The standard is to be based on OpenOffice.Org's file format. However, the OOO format is complex, contains errors, is not compatible with available web browsers and does not make full use of the latest XML standards. Investigate options for a file format which is simpler than OOO, uses existing standards and is compatible with web browsers. Implement the proposed standard as an extension to the OOO package.

     

    Making Movies with Metadata

    Supervisor: Tom Worthington, Ian Barnes
    Email: Tom.Worthington@tomw.net.au
    Phone: 0419 496 150

    The Australian Creative Resources Archives (ACRA) is an ARC funded pilot project to make “waste” materials (such as video, animation, audio and text) from Australian cultural production processes available to researchers. The pilot infrastructure will consist of commercial digital video equipment, interfaced to GrangeNet. Some possible areas for research:

    • Multimedia design principles: Extend the OpenOffice.org package to work with XML encoded film scripts and interface it to the archive. Investigate user interfaces for displaying the scripts (text, video timeline, or a tree of objects). Implement a prototype allowing the archive's content to be manipulated by OOO as a film script and played back as high resolution video over GrangeNet.

    • Database and interface design: Investigate the metadata definitions built into the archive's commercial software, compare these with available standards and the project's requirements to suggest what needs to be added. Implement an interface to the archive to allow metadata (including additional items) to be entered, searched and extracted.

     

    Automatic Web Page Layout

    Supervisor: Tom Worthington, Ian Barnes
    Email: Tom.Worthington@tomw.net.au
    Phone: 0419 496 150

    Investigate artificial intelligence algorithms for automatically laying out web pages and produce open source prototype software. Document layout "hints" for different renderings of the document (print, web, slideshow and AV) would be explicitly encoded in the document (using XML or a similar format) or would be inferred from an existing screen layout. Documents would be rendered to suit the user's requirements and the capabilities of their display device and communications link, through features in the display device and/or in a server (for low-capability display devices). As an example, multiple frames would be used on large screens and one frame with links on small screens. The software would generate documents incorporating accessibility features for the disabled as described in the W3C Web Content Accessibility Guidelines. Multiple renderings of information objects (for example multiple language versions for text, or text captions for images) would be available.

     

    Declarative theorem-proving

    Supervisor: Michael Norrish
    Email: Michael.Norrish@cs.anu.edu.au

    This project would suit a student keen to work in an area involving logic and proof, and the development of infrastructure to support these activities.

    When specifying systems formally, a significant error-finding benefit can accrue simply from writing everything down in an expressive logic, and getting it all to type-check. Working in this way allows users to quickly deal with the “low-hanging fruit”; important bugs that are easy to find.

    The HOL system currently doesn’t allow for this model of work very well. Part of the reason for this is that script files for HOL inter-mingle SML (which is the theorem-prover’s implementation language) and embedded bits of logic. We would like to check the logic without having to be able to parse (and execute) SML as well.

    Therefore, the first aim of this project is to design a simple file format in which users could write “SML-less HOL”, and to then implement at least two tools for manipulating this format. The first tool would be the type-checker. It would make sure that the definitions and other logical material in the file actually made sense at the level of types. The second tool would do all this and also produce a genuine HOL theory, using the various definitional facilities already present in HOL. This file format would need to include mechanisms for referencing other files, where previous logical developments had been built up.

    The final stage of the project would be to implement a simple declarative language for writing proofs. The issues here are similar to those for definitions; to escape the use of files that include (SML) code, and move to formats that are easy to parse and manipulate as data.

     

    Implementing proof kernels

    Supervisor: Michael Norrish
    Email: Michael.Norrish@cs.anu.edu.au

    At the core of LCF systems like HOL, there is a small “kernel” that implements the primitive rules of inference of the logic. There are a number of interesting issues relating to the design of these kernels. For example

    • How to represent free and bound variables?
    • Optimising common calculations (such as calculating the set of free variables)
    The project would be to explore this design space by implementing first a very simple-minded kernel for HOL, and then testing a set of possible optimisations on top of this initial design. This problem comes with its own test suite, and admits a very explicit and clear specification, because of its position as part of HOL.
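
    As one illustration of the design space for the first question, de Bruijn indices replace bound-variable names with numbers, which makes operations like collecting free variables straightforward. The sketch below is generic and is not HOL's actual kernel representation:

        from dataclasses import dataclass

        @dataclass
        class Var:            # bound variable, as a de Bruijn index
            index: int

        @dataclass
        class Free:           # free variable, kept by name
            name: str

        @dataclass
        class Lam:            # binder: its body refers to it as Var(0)
            body: object

        @dataclass
        class App:
            fn: object
            arg: object

        def free_vars(term):
            """Collect free-variable names; with de Bruijn indices there is no
            name capture to worry about."""
            if isinstance(term, Free):
                return {term.name}
            if isinstance(term, Var):
                return set()
            if isinstance(term, Lam):
                return free_vars(term.body)
            return free_vars(term.fn) | free_vars(term.arg)

        # \x. f x   ==>   Lam(App(Free("f"), Var(0)))
        print(free_vars(Lam(App(Free("f"), Var(0)))))   # {'f'}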

     

    Using Model Checkers to Automatically Verify the Correctness of Complex B2B protocols

    Supervisors: Dean Kuo and Sylvie Thiebaux
    E-Mails: Dean.Kuo@csiro.au, Sylvie.Thiebaux@anu.edu.au
    Phones: (02) 6216 7066, (02) 6125 8678

    Suitability: This project would suit a student interested in Software Engineering research. It is best conducted as a 4th year BSE individual research project or as a BInfTech honours thesis. If necessary, it may be possible to approach it as a 4th year BSE team project, but this would require negotiations as to how to appropriately modify its scope given below.

    Prerequisites: Knowledge and interest in distributed computing, logic and formal specification and verification.

    Project Description:
    Application integration technologies have been needed ever since the earliest days of commercial computing, since no single application could ever hope to meet the complete needs of an organisation. Integration can mean many things, from simple file-based exchanges (e.g. FTP) to tightly coupled systems using remote procedure calls. The need for application integration has given rise to technologies such as EAI (Enterprise Application Integration) and B2Bi (Business to Business integration) and has most recently been the driving force behind the development and adoption of Web Services. These technologies provide solutions to the low-level issues of connectivity, data interchange and cross-application invocation. They make it possible to integrate autonomous and independent applications with some degree of ease.

    The interactions between autonomous applications often require quite complex B2B protocols - that is, the interaction defines a complex stateful exchange between applications that crosses organisational boundaries. An example of a protocol is e-procurement: after the customer sends a purchase order to the merchant, the merchant notifies the customer whether the order has been accepted or rejected; if the merchant accepts the order, the merchant will later send a shipping notice, and after the customer has received the goods, it sends an acknowledgement that the goods have been received. In addition, the merchant sends an invoice to the customer, the customer is then expected to send payment before its due date, and the merchant, on receipt of payment, needs to send a receipt to the customer. The protocol becomes extremely complex when the customer or merchant cancels an order, since goods may need to be returned and payment refunded. In the real world, we do not have the luxury of a global architecture and design when integrating autonomous applications, as these have typically been designed and implemented separately. Thus, one crucial question that needs to be answered before we can reliably integrate applications is whether they are compatible. An example of two incompatible applications is a customer who is unwilling to send payment before goods have been received and a merchant who refuses to deliver the goods until payment has been received.

    Model-checking is an area of computer science which aims at developing tools for verifying that a system (or a set of communicating systems) satisfies a specification usually given in temporal logic, by showing that this (set of) system(s) is a logical model for the temporal logic formulae. Model-checking techniques have proven extremely successful in verifying hardware, and their applicability to other areas, including distributed protocols, has been the subject of much investigation for the past 10 years.
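
    As a toy illustration of the compatibility question (far simpler than what SPIN does), two message-labelled state machines can be explored in lockstep and any reachable non-final state with no common next message reported as a deadlock:

        def compatible(proc_a, proc_b, start, finals):
            """Explore the synchronous product of two message-labelled state
            machines and report reachable non-final states from which no
            shared message can be exchanged (deadlocks)."""
            seen, stack, deadlocks = set(), [start], []
            while stack:
                state = stack.pop()
                if state in seen:
                    continue
                seen.add(state)
                a, b = state
                moves = [(msg, (na, nb))
                         for msg, na in proc_a.get(a, {}).items()
                         for m2, nb in proc_b.get(b, {}).items() if m2 == msg]
                if not moves and state not in finals:
                    deadlocks.append(state)
                stack.extend(nxt for _, nxt in moves)
            return deadlocks

        # Customer refuses to pay before delivery; merchant refuses to ship before payment.
        customer = {"ordered": {"ship": "received"}, "received": {"pay": "done"}}
        merchant = {"ordered": {"pay": "paid"}, "paid": {"ship": "done"}}
        print(compatible(customer, merchant, ("ordered", "ordered"), {("done", "done")}))
        # [('ordered', 'ordered')]  -- the incompatibility described above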

    This research project is to investigate the following:

    • Can we use the model checker SPIN to automate the process of verifying that two or more applications are compatible to perform a given business process? The B2B protocol that we will verify is a protocol for e-procurement, derived from one of our consultancies. For simple cases, we are confident that this is possible. However, it is not clear whether more complex cases can be handled by SPIN. If not, we want to identify why not and what sort of extensions are required to be able to handle these cases.
    • Identify the strengths and weaknesses of using SPIN to verify that applications are compatible.

     
