The Australian National University
ANU College of Engineering and Computer Science
Department of Computer Science

COMP8400 - Course content and schedule

  • Check the current COMP8400 lecture and tutorial/lab times from ANU timetabling.

  • There will be one two-hour lecture per week (ten two-hour lectures - 20 hours - in total).

  • There will be four tutorials (based on selected readings).

  • There will be four practical laboratory sessions.

  • There will be two assignments (programming / data analysis projects).

  • There will be one student presentation of a research paper.

Note: The course lecturer, Peter Christen, will be away in the first week after the mid-semester break (27 April to 1 May) to attend the PAKDD'09 conference in Bangkok. There will be no lecture in week 8 (30 April).

Course content

The course will cover the following topics:

  1. Course introduction and data mining overview   (1 hour)
  2. Data mining process   (1 hour)
  3. Data issues in data mining (including data warehouses) and data pre-processing   (2 hours)
  4. Data integration and linkage   (2 hours)
  5. Mining frequent patterns and associations   (2 hours)
  6. Cluster analysis   (2 hours)
  7. Classification and prediction: Decision trees, Bayes classification, neural networks, support vector machines, predictive modelling, accuracy and evaluation measures, model selection   (4 hours)
  8. Mining time series and data streams   (1 hour)
  9. Privacy-preserving data mining   (1 hour)
  10. Text data mining   (1 hour)
  11. Web data mining   (1 hour)
  12. End-to-end data mining (guest speaker)   (1 hour)
  13. Data mining trends, social impacts and course review   (1 hour)

Course schedule (last updated 18 May 2009)

Lectures: Computer Science and Information Technology Building, N101 (building number 108; map GH54, grid reference G4).
Tutorials / Labs: Computer Science and Information Technology Building, N115/N116 (building number 108; map GH54, grid reference G4).

| Semester week (year week) | Date | Lecture | Tutorials / Labs | Assignments |
|---|---|---|---|---|
| 1 (8) | 26 Feb | Course introduction and data mining overview; The data mining process | | |
| 2 (9) | 5 Mar | Data issues in data mining; Data pre-processing | | |
| 3 (10) | 12 Mar | Data integration and data linkage | Laboratory 1 (Introduction to Rattle) | |
| 4 (11) | 19 Mar | Mining frequent patterns and associations | Tutorial 1 (Rahm and Do, 2000: Data cleaning; Winkler, 2004: Data quality) | |
| 5 (12) | 26 Mar | Cluster analysis | Laboratory 2 (Association rules in Rattle) | |
| 6 (13) | 2 Apr | Classification and prediction (1) | Tutorial 2 (Agrawal and Srikant, 1994: Fast Algorithms for Mining Association Rules; Tan, Kumar and Srivastava, 2002: Selecting the Right Interestingness Measure for Association Patterns) | |
| 7 (14) | 9 Apr | Classification and prediction (2) | Laboratory 3 (Decision trees in Rattle) | |
| Mid Semester Break (Friday 10 April - Sunday 26 April) | | | | |
| 8 (17) | 30 Apr | No lecture (Peter is attending PAKDD'09) | | |
| 9 (18) | 7 May | Mining data streams and time series; Privacy-preserving data mining | Tutorial 3 (Salzberg, 1997: Comparing classifiers; Hand, 2006: Classifier technology) | Assignment 1 due (Wednesday 6 May, 5 pm) |
| 10 (19) | 14 May | Text data mining; Web data mining | Laboratory 4 (SVM and other classifiers in Rattle). New: Moved to week 11 (21 May) | |
| 11 (20) | 21 May | Paper presentations (1) | Tutorial 4 (Verykios et al., 2004: Privacy-preserving data mining; Patman and Thompson, 2003: Names: A new frontier in text mining). New: Moved to week 12 (28 May) | |
| 12 (21) | 28 May | Paper presentations (2) (9-10); New: End-to-end data mining (guest speaker: Graham Williams, ATO / Togaware data mining) (10-11) | Extra tutorial, discussion, catch-up lecture, etc. | Assignment 2 due (Wednesday 27 May, 5 pm) |
| 13 (22) | 4 Jun | New: Paper presentations (3) (9-10); Data mining trends, social impacts and course review | | |


Last modified: 18/05/2009, 09:50