Honours topics
These projects topics were offered in 2003. While these - or
similar - topics may not offered this year, they do serve as a
guide to the diverse range of topics that will be available.
If you wish you could work on a topic like one of these, then why
not write to the person who offered it. They might be persuaded to
let you pursue a similar project under their supervision this
year.
Please note that some of the projects below are very specific in
nature. However, you may be able to discus minor changes to the
proposal with the supervisor. Other proposals are, in fact, too broad
in scope to describe an individual honours project. You should take
these descriptions as an indication of the areas of interest of the
supervisor. If you are interested in the area then you can discuss
potential specific projects with the supervisor.
If the projects proposed for this year don't inspire you,
check out the projects proposed in previous years as well:
In no particular order, here are the topics proposed so far for 2003:
* This project was undertaken by a student and is no longer available. If you are interested in working in the area please speak with the listed supervisor as related projects may be available.
Supervisor:
Eric McCreath
Machine learning may be defined as learning a description of a
concept from a set of training examples. This description may then be
used for prediction. This project involves investigating the
application of Machine Learning with an operating system to pre-fetch
files. The central outcomes of this research will be an analysis of
machine learning approaches to undertake such a task. This would also
involve forming a theoretical model of the domain in question, which
would be vital for the analysis.
Details:
Operating Systems often employ read-ahead windowing
approaches for sequentially accessed files. Basically the
next part of a file is pre-fetched into memory even before
it is requested. This greatly improves performance of the system.
Often when we interact with computers the same files are
required during our interaction. For example we may always sit
down and start-up our mail reader. If the computer could
learn this pattern of usage and pre-fetch these files
the responsiveness of the system could be improved.
This project involves developing such a system and investigating
its potential.
Within machine learning there are a number of issues that make
this project both challenging and interesting. These include:
the temporal nature of the data, the large number of examples,
how the leant hypothesis could be applied to recommend which files
to pre-fetch. Also within Operating Systems there are a number of
challenges, such as how the system hooks into the kernel and how
to measure the performance of the system when it is functioning
with the operating system.
The results of this research have the potential to be published in
both Operating Systems and Machine Learning publication. The project
would also form a good starting point for graduate study.
Students will need to be able to reason logically and should have
done well in theory subjects (either within Mathematics or Computer
Science). Successful completion will require both determination and
imagination. Finally, student considering embarking on this project
must have good programming skills and will need to gain an
understanding of the Linux kernel.
Supervisor: Peter
Christen
E-Mail: Peter.Christen@anu.edu.au
Many organisations today collect massive amounts of data in their
daily businesses. Examples include credit card and insurance
companies, the health sector (e.g. Medicare), police/intelligence or
telecommunications. Data mining techniques are used to analyse such
large data sets in order to find new, unexpected and potentially
valuable information (like patterns and rules, or outliers).
Often several data sets have to be linked to obtain more detailed
information. As most data is not primarily collected for data
analysis purposes, a common unique identifier (like a patient or
customer identification number) is missing in many cases, and so
probabilistic techniques have to be applied to link data sets. Data
or record linkage - also called data matching, data integration, data
cleansing or data scrubbing - is a a rapidly growing field with
applications in many areas (business mailing lists, health and
biomedical research, fraud and crime detection, census statistics,
etc.) and it is an important initial step in many data mining
projects.
The ANU Data Mining Group
is currently working in collaboration with the
NSW
Department of Health, Centre for Epidemiology and Research on the
improvement of probabilistic techniques for record linkage. We are
mainly interested in developing high-performance techniques for
linkage that can be run on parallel computers like the APAC National Facility (a Compaq
supercomputer with 480 processors), or clusters of workstations or
PCs. Prototype software has been developed and published under an open
source software. See
http://sourceforge.net/projects/febrl and the project web page
at
http://datamining.anu.edu.au/linkage.htmlfor more details.
Student research in this project would include:
- Exploration of parallelisation and development of techniques
for distributed data access, (dynamic) load balancing, fault
tolerance and recovery within the framework of the developed
prototype software.
- Exploration of machine learning and data mining techniques
to improve the linkage quality. One of the disadvantages of
most machine learning techniques is their need of training
data, which is hardly available in the given application area.
While some initial work has been done by various researchers,
there are still many open questions and issues.
- As users of a record linkage system most likely will not be
computer experts, but rather biomedical researchers,
statisticians, etc.), a graphical user interface needs to be
developed. Research issues involved in this area are for
example how to efficiently support a user when human
intervention is needed on the linkage status of a record pair.
A student working on this project would be involved through
- participation in an exciting research project
- object-oriented cross-platform programming
(using Python,
Tkinter,
wxPython, etc.)
- exploration and development of machine learning and data
mining techniques
- distributed parallel computing on SMPs, COWs and
supercomputers (APAC)
- open source software development
(see
http://sourceforge.net/projects/febrl).
For more information, publications, and further links please visit the
project web page at
http://datamining.anu.edu.au/linkage.html.
Contact: Markus.Buchhorn@anu.edu.au , CSIT-N244, 6125-8810
The Internet Futures group of the ANU is working with high-speed networks, advanced network services, visualisation and telecollaboration technologies, and large scale data facilities, to support a diverse range of user communities. Many of its activities have a national focus, e.g. through the GrangeNet project.
In the large-scale-data area, the users include Physics, Astronomy, BioInformatics and Cultural Archives (e.g. audio, spoken and music, or video). Archives can be easily many Terabytes (1000's of Gigabytes) in size, need to be delivered to users on networks ranging from multi-Gigabit down to slow dial-up modem speeds, and may need to interface to large-scale computing systems.
The IF group has several potential projects feasible for an honours project. These include
- development of new Bio-Mirror (www.bio-mirror.net) archive services, to greatly reduce their bandwidth needs, while increasing their performance and usability to users. This project would have international impact, as a good solution would be adopted by many biomirror sites.
- development of Virtual Observatory elements, to support astronomers' real-time access to large scale observational data sets
- and others, depending on your interests...
Technologies in these fields include grid toolkits (Globus for example), metadata standards, various network protocols and performance issues, large scale databases and database/directory access methods, and web services and SOAP. Programming skills in C, C++, Java, Perl, Python, are all useful.
You would get to work with a range of users across the ANU, on real-world problems. You'd also get access to the APAC Mass Store facility (a 1.2 Petabyte (1200 Terabyte) tape robot) and associated computing hardware including possibly the APAC National Facility supercomputer, the ANU Supercomputer Facility's Visualisation Laboratory (VizLab) , and the Internet Futures' Access Grid (AG) (telecollaboration) nodes in various locations on campus.
You can go as deep into these areas as you like (although we have some ideas), and maybe even come back for a PhD if you're really keen :-)
Contact: Markus.Buchhorn@anu.edu.au, CSIT-N244, 6125-8810
The Internet Futures group of the ANU is working with high-speed networks, advanced network services, visualisation and telecollaboration technologies, and large scale data facilities, to support a diverse range of user communities. Many of its activities have a national focus, e.g. through the GrangeNet project.
TeleCollaboration can be seen as very high-end videoconferencing, with many additional features. Some of these features require the user's space to be intelligent or flexible, others require advanced network or application services. Any kind of collaboration, for research and for education can be supported. This could include music teaching (masterclasses), architecture, history or medical training with visualisation or even virtual reality interfaces, and large-scale data browsers and instrument consoles (e.g. in Astronomy or TeleMicroscopy).
The Internet Futures' telecollaboration activities revolve mainly around Access Grid (AG) nodes. There are currently (Dec/Jan 2003) 3 full-size nodes being built at the ANU (including one in the CS&IT building), and several additional nodes are being discussed. There are also plans for smaller-scale (even desktop) nodes, and ports to a variety of systems and other technology spaces (e.g. the Wedge virtual reality/visualisation facility). Around Australia 15 nodes are currently in planning or construction, and there are over 130 nodes around the world at the moment (Jan 2003). ANU users will include Astronomers at Mt Stromlo Observatory (20km west of Canberra), and the new ANU Medical School (on campus).
The underlying architecture of these systems is open and flexible, and the ANU has the potential to become a worldwide focus of AG development, with strong international collaboration.
The IF group has several potential projects feasible for an honours project. These include
- high-quality Audio/Video capture and delivery e.g.
- tiled (multi-camera) video capture for higher resolutions,
- broadcast quality video capture,
- low latency Audio/Video codecs
- intelligent A/V capture (using the steerable cameras, and possibly audio-capture interfaces). Some applications, e.g. music teaching, require very specific views of students as they play music. For general workshop/meeting interaction, the steerable cameras could be used to track a speaker as they wander around the room, or to focus on specific audience members when they ask questions.
- integration of other inputs, e.g. laptops and dedicated applications (which aren't multicast aware, or are part of the "standard" AG toolkit), as well as intelligent/electronic whiteboards, wireless PDA's, etc.
- cluster systems, for codec performance and robustness
- layered encodings to support differential quality audio/video (e.g. to support high-bandwidth and low-bandwidth clients simultaneously).
- integration with H323/SIP (the two major standard videoconferencing-over-IP technologies) and streaming systems (for one-way delivery to non-AG clients). This will involve some interesting design issues, e.g. H.323 is designed for essentially one-view-at-a-time, where the AG is designed for multi-site viewing in parallel.
- User Interface design; the AG user interface is pretty ordinary and unfriendly to new users. There is a lot of untapped information in AG streams that could be used to greatly enhance the user experience. These include audio localisation, visual feedback of audio activity, and audio-video synchronisation ("lip-sync").
- physical security of AG spaces; protecting the (expensive) hardware in AG spaces when it is not being used. Given multiple steerable cameras, sensitive microphones, and possibly other sensors, it should be possible to create a space that secures itself (and screams if attacked :-) ).
- and others...
Technologies in these fields include IP multicast, hardware interfaces, various network protocols and performance issues, user-interface design, and audio/video codecs. Programming skills in C, C++, Java, Perl, Python, are all useful.
You can go as deep into these areas as you like (although we have some ideas), and maybe even come back for a PhD if you're really keen :-) Possibly more than one area could be looked at during the project.
Supervisor: Peter Strazdins
Email: Peter.Strazdins@anu.edu.au
Phone: (02) 6125 5140
For a description of this project please see the web page titled:
Extending the Sparc-Sulima Simulator for Clusters.
Supervisor: Bill Clarke
Email: Bill.Clarke@anu.edu.au
Phone: (02) 6125 5687
For a description of this project please see the web page titled: Checkpointing in a complex object oriented software system.
Potential supervisors: Trystan Upstill, Nick Craswell, David Hawking
E-Mail: nick.craswell@csiro.au
This project would suit a student interested in the evaluation of
Web search engine technologies.
The "link recommendation" assumption is that each hyperlink is a
vote for the target page being of high quality. The most famous
example is Google's PageRank [1], but most search
engines probably use some form of link recommendation to boost the
ranking of pages with high indegree.
While the use of link recommendation is widespread there has been
no published evidence to support its use. For example, nobody has
tested whether, given the home pages of two things (products,
companies, CS departments) the one with the highest PageRank tends to
be the best or most important one. Yet for us to be convinced that
link recommendation is useful in search engines, something like this
must be true.
The project is to compare PageRank to "real world" notions of
quality such as company reputation (using S&P [2]), movie popularity (using
IMDB [3]) or sales ranks (using
bestseller listings). The question is whether PageRank as calculated
by Google predicts such rankings, or which rankings are best
predicted by PageRank. There would also be room for the student to
implement their own link recommendation measure and evaluate it in
the same way, depending on their interest.
Potential supervisors: Francis Crimmins, Nick Craswell, David Hawking
E-Mail: nick.craswell@csiro.au
This project would suit a student interested in studying Web
search technology.
Many Web search systems rely on a crawler to gather pages, for
example Google crawling billions
of pages per month and
search.anu.edu.au crawling hundreds of thousands of ANU pages per
week. Some information on crawlers is in the Web Robots FAQ.
Unfortunately, when crawling, all kinds of redundant pages tend to
be gathered, lessening the efficiency and effectiveness of the search
system. Types of redundancy include:
- Mirroring: The same pages on two hosts
- Aliasing: Two pages or directories might deliberately appear in
different locations on a host.
- Case insensitivity: The same page might be available with an
upper and lower case URLs (while on other servers maintaining correct
case is crucial).
- Parameterised URLs: By varying the order of CGI parameters in the
URL or adding ones which do not affect content (e.g. a user id) the
same content can appear at hundreds of URLs in a crawl.
- Crawler traps: The simplest form of crawler trap leads to URLs
such as: http://xyz.anu.edu.au/hons/hons/hons/hons/hons/topics.html.
It occurs when topics.html links to the subdirectory "hons/" and that
is a valid URL due to some soft link or server misconfiguration.
- Applications: For example, a calendar application if completely
crawled might give monthly calendar pages for a hundred years, even
if only the current year has any calendar entries.
- Error pages: Error pages which have an incorrect HTTP status code
(200) also lead to redundant pages being included in the crawl.
Crawling these redundant pages wastes network and machine
resources. Also, since crawls are usually halted after a certain
number of URLs or time limit, good pages can be crowded out of the
crawl by these redundant pages. Reducing redundancy also frees up
disk space on the search machine, improves index/query processing
time and can improve retrieval effectiveness (user satisfaction with
results).
The project is to investigate the sort of redundancy which occurs
in crawls and methods for eliminating it, perhaps starting with
search.anu.edu.au. Possible approaches include pre-GET, post-GET or
post crawl redundancy elimination, based on content [Broder
et al] or on URLs [Bharat and
Broder]. The project could measure the impact of such detection,
study its implementation in a real search system and/or investigate
improved URL-based detection methods, particularly in the area of
parameterised (CGI) URLs.
References:
- http://www.google.com/
-
http://panoptic2.anu.edu.au/search/search.cgi?collection=anu_search_external
- http://www.robotstxt.org/wc/faq.html
- http://www.w3.org/Protocols/HTTP/HTRESP.html
- http://www.scope.gmd.de/info/www6/technical/paper205/paper205.html
- http://www8.org/w8-papers/4c-server/mirror/mirror.html
Potential supervisors: Nick Craswell, David Hawking
E-Mail: nick.craswell@csiro.au
A project for students interested in XML technologies, Web
technologies, information management and/or coding.
XML is a simple technology which has far-reaching implications and
an increasing number of real-world applications. One way of accessing
XML documents, particularly complex or large datasets, is via search
engine technology. However, supporting rich queries over arbitrarily
complex XML raises a number of research questions. In particular,
restricting the results to a particular DOM subtree is something that
is difficult to do without an additional index structure.
This project could be a flat out engineering and measurement
exercise, adding DOM subtree support to a simple inverted index.
Alternatively it could involve studying alternate solutions and
overall XML search requirements. It could be based on Panoptic
(http://www.panopticsearch.com/) or an open source inverted indexing
system.
Potential supervisors: Nick Craswell, David Hawking
E-Mail: nick.craswell@csiro.au
Write and evaluate a site summariser. Optionally apply the same
principles to subject-specific search services.
Yahoo! [1] is an example of a
successful company which performs manual site summarisation: all
their listed sites have hand-written summaries. InvisibleWeb [2] do the same job as Yahoo!,
but summarising subject-specific search services. Either type of
company might find automatic summarisation techniques useful.
There are many document summarisation algorithms, including those
based on the content of one or many documents, and those based on
hyper-links [3]. The project
is to find out which summarisation methods are best at describing the
content of a site (and/or subject-specific search service).
Supervisor:
Chris Johnson
Details:
Mobile phone networks are very expensive way to deliver pictures
to mobile phone handsets, but the handsets are now being made with
cameras and display screens.
We can also see devices that combine remote network phone communications with local Bluetooth
communications.
Voice communications over the phone network are still needed, because
voice over IP is not yet widely available with sufficient quality and
convenience - and the phone network is much more extensive in
geopgraphical reach and number of customers; but where two people have
compatible devices and have some access to short-range wireless
networks, they should be able to exchange pictures over IP.
It is proposed to make the two networks complement each other: to
send short messages and interactive voice communications over the
instant, mobile phone network, and larger multimedia objects with some
delay over the cheap, local Bluetooth or WiFi network.
Bluetooth networks are short range, and the devices will need to form
ad hoc, peer to peer connections to hand the multimedia chunks along
the chain to make the distance. (WiFi networks are more centralised
but the same peer to peer behaviour is also possible).
The problem is to discover, design, install, implement the
appropriate protocols, taking into account users' social and
geographical requirments. Experimental demonstration using mobile
wireless devices such as laptops and PDAs in conjunction with normal
mobile phones is expected - without deep integration of the two modes
of communcation. Some extension work would be desirable towards
analysis of network reach, and the density of partygoers needed to
make the service work in practice.
References
The problem is mentioned in a report on a talk by Henning Schulzrinne, at Mobicom 2002] - as
reported in IEEE Internet Computing, Nov/Dec 2002, p 12.
Peer to peer networks, Bluetooth, and WiFi are hot topics at present,
and the student will have to find their way into the gossip and
separate hard facts from hyped fancy.
Student skills required
Interest in distributed computing application protocols; interest and
ability to search out existing software systems; analytical awareness
of user needs and social realities as well as
technologies. Self-starting person, at home with finding, installing and
experimenting with real world software. An ability to analyse and
write about the implications of the technical work is desirable.
Project support will come from the Smart Internet
Technology Intelligent Environments group, in both technology and
conceptual buzz.
Supervisor:
Alistair Rendell
From using a calculator we are all familiar with the fact that many
numbers must, at some level, be truncated and rounded. Thus 2/3 may
appear as 0.67, 0.66667, or 0.6666667 etc depending on the precision
of your calculator. More generally representing the continuum of floating
point numbers by the discrete finite representations used in all
modern computers gives rise to rounding errors, overflows, underflows
etc. Failure to account for these limitations can give rise to
spectacular (expensive!!) results, e.g. the failure of some Patriot
missiles to hit their targets during the Gulf War or the
explosion of the Ariane 5 space rocket (see
disasters)
Interval arithmetic is a fundamental paradigm shift that attempts to
address these issues. It defines an "interval" as a continuum of real
numbers bounded by its end-points and performs all arithmetic
operations with respect to intervals rather than discrete floating
point numbers. As a consequence of this errors are automatically
propagated and accounted for. Moreover the mathematical properties of
intervals can be exploited to develop entirely new algorithms.
The concept of intervals is actually not new, but was considered by
Moore in 1965. What is relatively new is the fact that some
hardware vendors are now providing direct support for interval
arithmetic within their compilers/hardware.
The aim of this project will be investigate the use of interval
arithmetic for computational science applications. The initial goal
will be to use intervals to study error propagation for some
"problematic" algorithms found within the Gaussian computational
chemistry code. A second goal will be to investigate the use of
intervals for nonlinear global optimizations, eg determining the
most stable conformation (geometry) of a large molecular system (eg
protein).
For more information on interval arithmetic see the
interval home page.
Supervisors:
Alistair Rendell,
Peter Christen and
Peter Strazdins
Although clock speed is important, on modern computers in order to
obtain a high percentage of the theoretical peak performance
it is vital to maximize cache re-usage. As a consequence much effort
has been directed to developing algorithms that are, in some way,
blocked so that they spend the majority of their time working on data
that is cache resident. The classic example of this is a matrix
multiplication algorithm in which large matrices are broken down into
sub-matrices small enough to reside in cache. These sub-matrices are
then multiplied together and the results used to construct
the overall result. A disadvantage of these
approaches, however, is that they usually contain explicit parameters
relating to the cache size of the computer being used. Thus limiting
portability of the code, possibility even between CPUs from the same
manufacturer.
An alternative to explicit cache blocking is to construct your
algorithm from the outset in such away that, at some point, it
naturally becomes cache resident. Such algorithms are referred to as
"cache-oblivious" and have been strongly advocated by Matteo Frigo in
a series of papers recently published. Ideal candidates for
cache-oblivious algorithms are recursive algorithms, where frequently
the data set is successively divided into half, and thus at some stage
becomes small enough to reside in cache.
Unfortunately to date most of the work on cache oblivious algorithms
has been theoretical rather than practical. The aim of this
project is to implement some cache oblivious algorithms and use
hardware performance counters both to compare their performance
characteristics with more traditional algorithms, and to study their
performance on a range of different hardware platforms.
Supervisors:
Alistair Rendell and
Peter Strazdins
OpenMP is an Application Program
Interface (API) for multi-platform shared-memory parallel programming
in C/C++ and Fortran. It is jointly defined by a group of hardware and
software vendors and, via cOMPunity
, the wider community. The first OpenMP specification was released
in 97, while version 2.0 for C/C++ was released in 02. Since its
initial release it has gained increasing use, particularly in the
scientific community.
Currently there is much debate within the OpenMP community as to what
should be included in future OpenMP specifications. We are very
interested in these issues as part of the
CC-NUMA project. In
particular we are keen to explore possible
- Support for Non-Uniform Memory Access (NUMA) architectures
- Task queues models
With respect to the first issue, OpenMP currently assumes a uniform memory
access model. However, it is well know that all modern large scale
shared memory parallel machines have some level of NUMA. Some vendors,
such as SGI and Compaq (now HP) have extended their OpenMP
implementations to include NUMA support. However, there is no consensus
on how this should be done. Similarly both KAI (now Intel) and Sun
have put forward interesting, but different, proposals for including a
task queue model in OpenMP. As an honours project you would explore
one of these issues in the context of a variety of kernel algorithms
and on a selection of high performance computing platforms (from SGI,
HP and Sun).
A further opportunity exists to consider OpenMP on cluster
environments. Thus, although OpenMP is a shared memory
programming paradigm there has been some work to enable OpenMP on
clusters (similar to software distributed shared memory models).
Of particularly note is the work of the PC cluster
consortium in Japan. Work in this area would involve using their SCore
cluster development kit on the Bunyip Cluster and the
APAC SC system.
Supervisor: Weifa Liang
Email: wliang@cs.anu.edu.au
Phone: 6125 3019
One of the fundamental problems in the design of data
warehouses is the formation of effective methods for selecting which views
to materialize,
and which materialized views to replace. In other words, given
the resource constraint and query load to be supported in a data
warehouse, the
selection problem usually is to select a set of views to
materialize such that the sum of query response time and view refreshment
(or
maintenance) time is minimized. Though the previous work has
offered significant insight into the nature of the problem, in all of the
existing work,
it is assumed that the initial data warehouse is empty. In
normal operation, the data warehouse grows incrementally. Also, the query
and update
frequencies of materialized views are subject to changes with
time.
Another related challenging problem in data warehousing is the
view refreshment (view maintenance) problem under the constraint of a
given
refreshment time window. In order to keep the content of a
materialized view in the warehouse consistent with the contents of the
remote sources,
the materialized view in the warehouse must be updated
immediately or periodically in response to the source updates. Usually
such updates are
done incrementally. In summary, the following work will be
conducted in this project.
design of new algorithms for view selection and replacements
in the dynamic data warehouse environment, and evaluation of new
strategies
for such applications through implementation and performance
evaluation.
examination of the existing view refreshment policies,
development of new view refreshment algorithms that shorten view
refreshment time
for the data warehouse.
Supervisor: Weifa Liang
Email: wliang@cs.anu.edu.au
Phone: 6125 3019
Optical network technique plays a key role to the
next-generation networks. In particular, wavelength-division-multiplexing
(WDM) networks have
emerged as a promising candidate for next-generation networks
in providing large available bandwidth and connectivity. Routing and
wavelength
assignment problem is one of the fundamental problems in WDM
optical networks.
Designing efficient routing protocols is a fundamental problem
for any network. There is no exception for the WDM optical networks. By
the nature
of WDM network, the traditional routing algorithms are
ill-designed for this setting. It is thus necessary to devise new
algorithms and algorithmic
methodology for it.
In this project we primarily focus on devising efficient
routing algorithms for two fundamental routing problems in WDM networks.
They are the
all-to-all routing and robust routing problems. Incorporated
with various known techniques, practical algorithms and new techniques for
solving
them will be developed. The performance of the developed
algorithms will also be analyzed through experimental simulation. Our
objectives are to devise efficient routing algorithms for several fundamental
routing problems in WDM optical networks. In particular we will focus on
devising
routing algorithms for all-to-all routing and robust routing
problems through the use of a collection of algorithmic skills and graph
techniques
including path-coloring, network flow, maximum matching, etc.
The performance of the developed routing algorithms will be analyzed
through
experimental simulation.
develop a generic methodology for the design of ``efficient''
routing protocols for WDM networks. To effectively utilize the network
resources,
an efficient routing protocol should be able to optimize
multiple objectives simultaneously in terms of different sorts of network
resources
consumption.
investigate various known techniques to achieve efficient
routing in WDM optical networks, and develop new concepts and techniques
to
solve the routing problems in WDM optical networks. Practical
algorithms to solve some real problems, will be developed, implemented,
and
tested.
Supervisor: Weifa Liang
Email: wliang@cs.anu.edu.au
Phone: 6125 3019
Mobile multihop radio networks, also termed peer-to-peer, all
wireless, or wireless ad hoc networks, are expected to fulfill a critical
role in
applications in which wired backbone networks are not available
or not economical to build. These span the commercial, public as well as
tactical
communications sectors. The networks provide the only solution
in situations where instant infrastructure is needed and no central system
backbone and administration (like base stations and wired
backbone in a cellular system) exist. Some of the applications of the
networks include
mobile computing in areas where other infrastructure is
unavailable, law enforcement operations, disaster recovery situations, as
well as ad-hoc
networks for large events such as sporting events or congresses
when it is not economical to build a fixed infrastructure for a short
temporary
usage. In tactical battlefield communications the hostility of
the environment prevents the application of a fixed backbone network. We
consider a
peer-to-peer mobile network consisting of a large number mobile
nodes that create a network on demand an may communicate with each other
via
intermediate modes in a multihop mode, i.e., every node can be
a router.
In this project we focus on new power-aware routing protocols
to extend the life of the nodes and the network, based on homogeneous and
heterogeneous mobile models, a new power-aware routing protocol
will be proposed, which takes the capacities of different types of mobiles
into consideration.
Supervisor: Steve Blackburn
Email: Steve.Blackburn@anu.edu.au
Phone: +61 2 6125 4821
JMTk is a new memory management toolkit for Java, which I have
co-developed with Perry Cheng of IBM Research and Kathryn McKinley of
the University of Texas. JMTk has proved very popular and there is
significant interest in allowing it to work in other language
environments including Microsoft's CLI, and the Haskell runtime. This
project will explore the issues associated with interfacing JMTk to
these environments. For exceptional students, the possibility exists
of an internship at Microsoft Research as part of the project.
Supervisor: Steve Blackburn
Email: Steve.Blackburn@anu.edu.au
Phone: +61 2 6125 4821
Java is very important commercially, and yet the runtime systems
required to support it and other languages such as C# are relatively
immature. This project will involve measurement and analysis of the
performance of Java programs, with a view to better understanding
where there are opportunities for performance improvement. This work
will involve using and modifying JMTk, a memory management toolkit
for JikesRVM.
Supervisor: Tom Worthington, Ian Barnes
Email: Tom.Worthington@tomw.net.au
Phone: 0419 496 150
The World Wide Web provides a useful way to
provide information to the general public and organization staff.
However, in an emergency, web sites can be overloaded by demand.
Investigate how web based systems have coped with emergencies and
suggest techniques to allow web based systems to work better.
Techniques might include use of accessibility
guidelines and design to enhance the use of caching. Recommend
changes, include prototypes for the ACT
Government and the Sentinel
Fire Mapping System, as used in the ACT
bush fire emergency of January 2003.
Supervisor: Tom Worthington, Ian Barnes
Email: Tom.Worthington@tomw.net.au
Phone: 0419 496 150
Web Services provides a way to define
a set of transactions and the format of the data which those
transactions can be performed on. In theory the XML format used makes
e-commerce simple to implement, but in practice the take-up has been
slow. The Australian Taxation Office provides free
XML based e-commerce software for small businesses to submit
electronic business activity statements (eBAS) on-line, but most
still
use paper forms. It has been demonstrated that it is possible to
transform the eBAS
format using XSLT, to be displayable by a web browser without the
need for additional software. Investigate enhancements to the eBAS
format to allow processing by a web browser, without additional
software. Propose changes to the eBAS XML format, and if needed to
the relevant web services and XML standards to make this possible.
Supervisor: Tom Worthington, Ian Barnes
Email: Tom.Worthington@tomw.net.au
Phone: 0419 496 150
An XML
based standard format for office documents containing text, spreadsheets, charts, and
graphics has been proposed. The standard is to be based on
OpenOffice.Org's file format. However, the OOO
format is complex, contains errors, is not compatible with
available web browsers and does not make full use of the latest XML
standards. Investigate options for a file format which is simpler
than OOO, uses existing standards and is compatible with web
browsers. Implement the proposed standard as an extension to the OOO
package.
Supervisor: Tom Worthington, Ian Barnes
Email: Tom.Worthington@tomw.net.au
Phone: 0419 496 150
The Australian Creative Resources Archives (ACRA)
is an ARC funded pilot project to make “waste” materials
(such as video, animation, audio and text) from Australian cultural
production processes available to researchers. The pilot
infrastructure will consist of commercial digital video equipment,
interfaced to GrangeNet.
Some possible areas for research:
Multimedia design principles:
Extend the OpenOffice.org
package to work with XML encoded film scripts and interface it
to the archive. Investigate user interfaces for displaying the
scripts (text, video timeline, or a tree of objects). Implement a
prototype allowing the archive's content to be manipulated by OOO as
a film script and played back as high resolution video over
GrangeNet.
Database and interface design:
Investigate the metadata definitions built into the archive's
commercial software, compare these with avaliable
standards and the projects requirements to suggest what needs to
be added. Implement an interface to the archive to allow metadata,
including additonal items) to be entered, searched and extracted.
Supervisor: Tom Worthington, Ian Barnes
Email: Tom.Worthington@tomw.net.au
Phone: 0419 496 150
Investigate artificial intelligence algorithms for automatically laying
out web pages and produce an open source prototype software. Document layout
"hints" for different renderings of the document (print, web, slideshow and
AV) would be explicitly encoded in the document (using XML or similar
format) or would be inferred from an existing screen layout. Documents would
be rendered to suit the user's requirements and the capabilities of their
display device and communications link, through features in the display
device and/or in a server (for low capability display devices). As an
example multiple frames would be used on large screens and one frame with
links on small screens. The software would generate documents incorporating
accessibility features for the disabled as described in W3C Web Content Accessibility
Guidelines. Multiple renderings of information objects (for example
multiple language versions for text, text captions for images) would be
available.
Supervisor: Michael Norrish
Email: Michael.Norrish@cs.anu.edu.au
This project would suit a student keen to work in an area involving
logic and proof, and the development of infrastructure to support
these activities.
When specifying systems formally, a significant error-finding benefit
can accrue simply from writing everything down in an expressive logic,
and getting it all to type-check. Working in this way allows users to
quickly deal with the “low-hanging fruit”; important bugs
that are easy to find.
The HOL system currently
doesn’t allow for this model of work very well. Part of the reason
for this is that script files for HOL inter-mingle SML (which is the
theorem-prover’s implementation language) and embedded bits of logic.
We would like to check the logic without having to be able to parse
(and execute) SML as well.
Therefore, the first aim of this project is to design a simple file
format in which users could write “SML-less HOL”, and to
then implement at least two tools for manipulating this format. The
first tool would be the type-checker. It would make sure that the
definitions and other logical material in the file actually made sense
at the level of types. The second tool would do all this and also
produce a genuine HOL theory, using the various definitional
facilities already present in HOL. This file format would need to
include mechanisms for referencing other files, where previous logical
developments had been built up.
The final stage of the project would be to implement a simple
declarative language for writing proofs. The issues here are similar
to those for definitions; to escape the use of files that include
(SML) code, and move to formats that are easy to parse and manipulate
as data.
Supervisor: Michael Norrish
Email: Michael.Norrish@cs.anu.edu.au
At the core of LCF systems like HOL,
there is a small “kernel” that implements the primitive
rules of inference of the logic. There are a number of interesting
issues relating to the design of these kernels. For example
- How to represent free and bound variables?
- Optimising common calculations (such as calculating the set of
free variables)
The project would be to explore this design space by
implementing first a very simple-minded kernel for HOL, and then
testing a set of possible optimisations on top of this initial design.
This problem comes with its own test suite, and admits a very explicit
and clear specification, because of its position as part of HOL.
Supervisors: Dean Kuo
and Sylvie Thiebaux
E-Mails: Dean.Kuo@csiro.au,
Sylvie.Thiebaux@anu.edu.au
Phones: (02) 6216 7066, (02) 6125 8678
Suitability: This project would suit a student interested in Software
Engineering research. It is best conducted as a 4th year BSE individual
research project or as a BInfTech honours thesis. If necessary, it may
be possible to approach it as a 4th year BSE team project, but this would
require negotiations as to how to appropriately modify its scope given below.
Prerequisites: Knowledge and interest in distributed computing,
logic and formal specification and verification.
Project Description:
Application integration technologies have been needed ever since the earliest
days of commercial computing since no single application was ever hope to
meet the complete needs of an organisation. Integration can mean many things,
from simple file based exchanges (e.g. FTP) to tightly coupled systems using
remote procedure calls. The need for application integration has given rise
to technologies such as EAI (Enterprise Application Integration) and B2Bi
(Business to Business integration) and most recently has been the driving
force the development and adoption of Web Services. These technologies provide
solutions to the low-level issues of connectivity, data interchange and cross-application
invocation. They make it possible to integrate autonomous and independent
applications with some degree of ease.
The interactions between autonomous applications often require quite complex
B2B protocols - that is, the interaction defines complex stateful interaction
between the applications that cross organisational boundaries. An example
of a protocol is e-procurement: after the customer sends a purchase order
to the merchant, the merchant notifies the customer either the order has been
accepted or rejected; if the merchant accepts the order, the merchant will
later send a shipping notice and after the customer has received the goods,
it sends an acknowledgement that goods have been received. In addition, the
merchant sends an invoice to the customer, the customer is then expected to
send payment before its due date and then the merchant, on receipt of payment,
needs to send a receipt to the customer. The protocol becomes extremely complex
when the customer or merchant cancels an order since goods may need to be
returned and payment refunded. In the real world, we do not have the luxury
of a global architecture and design when integrating autonomous applications,
as these have been typically designed and implement separately. Thus, one
crucial question that needs to be answered before we can reliably integrate
applications is whether they are compatible. An example of two applications
that are not compatible is if the customer is unwilling to send payment before
goods have been received while the merchant refuses to deliver the goods
until payment has been received.
Model-checking is an area of computer science which aims at developing
tools for verifying that a system (or a set of communicating systems) satisfies
a specification usually given in temporal logic, by showing that this (set
of) system(s) is a logical model for the temporal logic formulae. Model-checking
techniques have proven extremely successful in verifying hardware, and their
applicability to other areas, including distributed protocols, has been the
subject of much investigation for the past 10 years.
This research project is to investigate the following:
- Can we use the model checker SPIN to automate the process of verifying
that two or more applications are compatible to perform a given business process?
The B2B protocol that we will verify is a protocol, derived from one of our
consultancies, for e-procurement. For simple cases, we are confident that
it is possible. However, it is not clear whether more complex cases can be
handled by SPIN. If not, we want to identify why not and what sort of extensions
is required to be able to handle these cases.
- Identify strength and weakness of using SPIN to verify that applications
are compatible.
References: