Title: Embedding one language within another

* Appropriate Courses: COMP8720 (SwEng), COMP8760 (CS), COMP8790 (SwEng)
* Orientation: software engineering
* Status: proposition
* Student:
* Supervisor(s)/Client: Patrik Haslum (ANU/NICTA)
* Research Area(s): Programming languages, compiler technology
* Technical Difficulty Level: medium
* Conceptual Difficulty Level: easy

Description

There exists several useful programming libraries whose functionality is
accessed by means of a language. To give one example, the sqlite library
(sqlite.org) provides a database engine, and offers an API that is based
mainly on passing SQL statements as strings. Using one of the many available
wrappers, executing a query from within a C++ program can look something
like:

Query* the_query =
new Query(db, "select name, number from tbl1 where number < 5");
while (the_query->has_row()) {
string the_name = the_query->value_as_string(0);
int the_number = the_query->value_as_int(1);
// ...
the_query->next_row();
}
delete the_query;

What is wrong with this?

* The C++ compiler sees the SQL query only as a string, so it will not check
that the query is syntactically correct SQL, or that the database schema
(if known) contains the tables/fields it refers to. Thus, a statically
committed error will not be noticed until runtime.

* In making use of the result of the query, we're forced to refer to the
result fields by index (0, 1, ...) instead of their names. For larger SQL
queries, this becomes another source of easy-to-make mistakes, which also
will not be noticed until the program is run (and then only when the query
is executed).

What we would like to do is embed the language used by the library API
(in this case, SQL) directly within the language of the program calling
the API, i.e., write something like:

for (select name, number from tbl1 where number < 5) {
string the_name = tbl1.name;
int the_number = tbl1.number;
}

and have this automatically translated to the form above (API calls using
strings), and checking the SQL statement for syntactic correctness (and
maybe compliance with the database schema as well) in the process.

Something like this, for SQL, exists in Comega
(http://research.microsoft.com/en-us/um/cambridge/projects/comega/; see also
http://en.wikipedia.org/wiki/C%CF%89), an experimental extension of C#.

But there is no reason why it should be limited to the particular case of
embedding SQL in C# (or why it should only work with one particular
compiler). My conjecture is that much of the desirable properties of such
an embedding can be achieved by simple macro processing.

The purpose of this project is to test this hypothesis, by implementing some
such language embeddings. The more general the better: ideally, significant
parts of the code developed for, for instance, embedding SQL in C++ should
be reusable for embedding SQL in Java/Lisp/Python/whatever (ok, maybe it
doesn't make so much sense for an interpreted language), as well as for
embedding, say, XPath, XSLT or regexps in C++, Java, Lisp, Python, ...

See also:

* http://clsql.b9.com/ (looks like someone has already done it for Lisp...)

Title: Projects in Automated Planning

* Appropriate Courses: COMP8740 (AI)
* Orientation: research
* Status: proposition
* Student:
* Supervisor(s)/Client: Patrik Haslum (ANU/NICTA)
* Research Area(s): AI planning
* Technical Difficulty Level: variable
* Conceptual Difficulty Level: variable

Description

Planning is a branch of AI concerned with automating the reasoning
that goes on in formulating plans, in the sense of a series of steps
to be taken, each of which has some effect on the state of (a highly
abstract model of) the world, so as to bring about some desired end
state. As a prototypical example, one may take of single-player games
("puzzles") such as the Freecell card game or the old Rubik's Cube.
Possible applications of automated planning methods, aside from solving
puzzles, are many and diverse: examples include airport ground traffic
control, computing genome edit distances, testing protocols for logical
flaws, or controlling a printer.

The aim of research in planning, as in many other branches of AI, is
to construct domain-independent ("universal") solutions for this kind
of problem. That is, rather than solving each application problem
individually, a general AI planning system should be able to solve any
one of them, provided a formal specification of the problem as input.
Several approaches to achieving this have been tried, such as different
variations of search, or recasting the problem as another kind of
general reasoning (e.g., a CSP, or logical deduction). Yet, efficient
general automated planning remains a challenge.

Within the area of AI planning, a wide variety of projects are possible,
ranging from highly theoretical to practical, implementation-oriented,
and anywhere in between. New projects based on student's ideas can also
be considered. Students interested in AI planning are encouraged to
enquire.

Title: Various Combinatorial Optimisation Problems

* Appropriate Courses: COMP8740 (AI), COMP8740 (CS)
* Orientation: research
* Status: proposition
* Student:
* Supervisor(s)/Client: Patrik Haslum (ANU/NICTA)
* Research Area(s): Combinatorial optimisation, AI.
* Technical Difficulty Level: variable
* Conceptual Difficulty Level: variable

Description

Although research in AI planning aims to construct fully general,
domain-independent, planning systems, systems are often evaluated
and compared by testing them on a collection of benchmark domains,
accumulated over time by the research community. Many of these
benchmarks resemble well-known combinatorial optimisation problems,
such scheduling/sequencing, allocation, etc., or at least contain
parts that do, but each also has its own peculiarities and tweaks.
For a majority of benchmark domains, it is known that finding optimal
solutions is difficult (NP-hard or worse) while just finding any
solution is relatively easy (there exists efficient, and often simple,
domain-specific algorithms), providing further evidence that the
underlying problems encoded in these benchmark domains are, at heart,
about optimisation.

This project consists in picking up one (or more, depending on
project time frame, problem difficulty, etc) of the commonly used
planning benchmark domains that encode optimisation problems, and
attack the underlying problem using whatever methods/tools seem most
suitable, including for example LP/MILP, CSP/COP/SAT, local search
methods, or anything else. The aim is to obtain a better understanding
of the problems that underlie planning benchmarks -- how hard are
they (in a practical, rather than complexity-theoretical, sense) and
what instance parameters determine that hardness? how well do
domain-independent planning systems compare, in terms of effectiveness
and solution quality, to a domain-specific solution? -- but also to
learn something about the different combinatorial optimisation
technologies -- how difficult are they to apply to a given problem?
which is more suited to what kind of problem?