The Australian National University
ANU College of Engineering and Computer Science
Department of Computer Science

COMP8400 - Paper presentation

This presentation is worth 20% of your total course mark. It will be marked out of 20 as indicated below.

Please note: Students are not expected to choose a paper for their presentation until after the semester break!

Objectives

The objective of this paper presentation is for students to learn about recent research in data mining, by selecting and reading a scientific paper published in a relevant data mining conference or journal, and to summarise and present the paper in both a short (10 minutes) talk and a report that addresses various aspects of the paper.

We expect you to spend around 20 hours in total on this assignment (1 hour per mark). This includes browsing the papers on the conference and journal Web sites listed below, selecting a paper that interests you, reading and summarising it, possibly reading further material to clarify certain aspects of the paper, writing the report, and producing the slides for the presentation.

Submission

You must submit three documents: your report, your slides and the paper you have selected (all three in PDF format, not in MS Word or MS PowerPoint!). These three documents must be named:

  • u1234567-report.pdf
  • u1234567-slides.pdf
  • u1234567-paper.pdf
(please replace u1234567 with your ANU university ID).
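Before sending your submission, you may want to check that the file names follow this convention. The following is a minimal sketch only, not an official checker: the function name `missing_documents` is hypothetical, and it assumes that an ANU ID is the letter "u" followed by seven digits, as in the example u1234567.

```python
import re

# Assumption: an ANU ID is "u" followed by seven digits, as in the example u1234567.
NAME_PATTERN = re.compile(r"^(u\d{7})-(report|slides|paper)\.pdf$")


def missing_documents(uid, filenames):
    """Return the sorted list of required documents not found for the given ID."""
    required = {"report", "slides", "paper"}
    found = set()
    for name in filenames:
        match = NAME_PATTERN.match(name)
        # Only count files that match the pattern and carry the expected ID.
        if match and match.group(1) == uid:
            found.add(match.group(2))
    return sorted(required - found)
```

For example, calling `missing_documents("u1234567", ["u1234567-report.pdf", "u1234567-slides.ppt"])` would report that the paper and the slides are still missing, because the slides file is not a PDF.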

You need to submit your slides before your presentation, while the report and paper can be submitted afterwards (details further below).

Submission details:

  1. Your presentation may be at most ten (10) minutes long and should consist of five (5) slides, as detailed below.

  2. Your report may be at most four (4) A4 pages long, and must contain the sections discussed below. Don't use a font smaller than 12 points.

Important:

  • Please submit your slides file (i.e. u1234567-slides.pdf) at least two days before your presentation is scheduled, so that I can upload them onto my laptop and check the correctness/compatibility of your slides.

    If you submit later I might NOT have the time to upload and check your slides.

  • Do not bring your laptop to the presentation! I will copy all slide files onto my laptop in order to reduce the time needed between presentations.

  • You have to submit your report and paper files (i.e. u1234567-report.pdf and u1234567-paper.pdf) by 5 pm on Friday 29 May.

  • You are not allowed to choose one of the eight COMP8400 tutorial papers for your presentation.

  • You have to send your submission to comp8400@cs.anu.edu.au.

Penalties

Penalties for late submissions of your report (if you submit it after Friday 29 May 5 pm) are as follows:

How late                Penalty from 10 marks
less than 6 hours       -0.5
6 to 24 hours           -1
24 to 48 hours          -2
48 to 72 hours          -4
more than 72 hours      -8 (forget it!)

Penalties for reports that are longer than 4 pages are as follows:

Number of pages         Penalty from 10 marks
5                       -1
6                       -2
7 or more               -4
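
To see how the two penalty tables above affect a report mark, here is a small illustrative sketch. It is not an official marking tool, and it rests on three assumptions that the tables do not state: a submission exactly on a boundary (for example exactly 24 hours late) falls into the higher bracket, the two penalties add up, and the resulting mark cannot go below zero. All function names are hypothetical.

```python
def late_penalty(hours_late):
    """Marks deducted (from 10) for a report submitted hours_late after the deadline."""
    if hours_late <= 0:
        return 0.0
    if hours_late < 6:
        return 0.5
    if hours_late < 24:   # assumption: exactly 24 hours falls into the next bracket
        return 1.0
    if hours_late < 48:
        return 2.0
    if hours_late < 72:
        return 4.0
    return 8.0


def length_penalty(pages):
    """Marks deducted (from 10) for a report with the given number of pages."""
    if pages <= 4:
        return 0.0
    if pages == 5:
        return 1.0
    if pages == 6:
        return 2.0
    return 4.0


def report_mark(raw_mark, hours_late=0.0, pages=4):
    """Apply both penalties (assumed additive) and floor the result at zero."""
    return max(0.0, raw_mark - late_penalty(hours_late) - length_penalty(pages))
```

Under these assumptions, a report that would otherwise earn 8 marks, submitted 30 hours late and 5 pages long, ends up with `report_mark(8.0, hours_late=30, pages=5)`, that is 8 - 2 - 1 = 5 marks.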

Plagiarism

We do take plagiarism seriously! You should read the chapter in the Department of Computer Science Student Handbook that discusses assessment (Chapter 6, pages 18-24), particularly the sections headed Misconduct in examinations (which also applies to assignments and other forms of assessment) and Collaborations versus misconduct in assignments.

If you include material from other documents (e.g. graphics, figures, tables or formulas extracted from a paper, a book or a Web site), then you must clearly attribute it, for example by writing the name of the paper, book, etc. that it came from on your slides, or by adding a reference to your report.


Tasks

  1. Select a paper from one of the sources provided below. You should browse around and choose a paper that interests you. Once you have found a paper you would like to present, e-mail me, the course coordinator, the title, author(s) and conference/journal of this paper. Please either attach the paper as a PDF file to the e-mail, or include the URL from which you obtained the paper.
    I will check whether this paper is appropriate, and whether it has already been selected by another student (please check the list of already selected papers below).

    Alternatively, if you already have a paper you would like to present from another source (not from the list given below), please send the paper (as a PDF or PostScript file) as well as the URL where you obtained it to the course coordinator and ask if this paper is appropriate.

    Please also e-mail me if you prefer to give your presentation in the first session (currently scheduled for Thursday 21 May, 9-11) or in the second session (currently scheduled for Thursday 28 May, 9-11).

  2. Report (10 marks)

    • The report on your selected paper may be at most four (4) A4 pages long. Please do not use fonts smaller than 12 points.

    • The first page should contain at the top your name and ANU university ID, the paper title and its author(s), publication year, and the conference or journal where the paper was published.

    • Your report has to contain the following sections:
      1. Summary: In your own words (not the paper abstract!) summarise the topic, content, contributions and outcomes or results of the paper, as well as its citation count (for example taken from Google Scholar).
      2. Area and techniques: Give a short description of what area this paper covers (like data cleaning, transformation, pre-processing; clustering; associations; prediction; classification; privacy; etc.) and what techniques are used or proposed.
      3. Experiments and data sets: If the paper contains experimental results, what data sets are used for these experiments (synthetic or real data, sourced from where, publicly available or not, etc.).
      4. Measurements: What quality measures are used (for example accuracy, precision, recall, support, confidence, etc.)?
      5. Criticism: What are the main points of criticism of this paper? These might include, for example, only limited experiments on small data sets, claims that are not supported by theory or experiments, etc.

  3. Presentation (10 marks: 5 for slides and 5 for oral presentation)

    • You should summarise your selected paper in a short presentation of maximum 10 minutes duration.

    • Your presentation should have the following 5 slides:
      1. Your name and ANU student ID, the paper title and its author(s), publication year, and the conference or journal where the paper was published, as well as its citation count (for example taken from Google Scholar).
      2. A summary of the paper (for example as a dot list of its main content and contributions).
      3. The area and techniques used or proposed in the paper, as well as data sets used for experiments.
      4. Outcomes and/or results of the paper.
      5. Your criticism of the paper, for example if it does not provide an experimental evaluation, only gives experiments on small data sets, has parts that are unclearly written, or does not give enough detail.

Presentation schedule

  • We will have two two-hour sessions this year, with five presentations per hour (ten per session). They are scheduled to be on:

    1. Thursday 21 May, 9-11.
    2. Thursday 28 May, 9-11.

  • All student presentations will be held in the CSIT seminar room N101 (where the COMP8400 lectures are held).


Sources of papers (conference and journal repositories)

You should be able to access electronic versions of papers from these conferences and journals from within the ANU. If you have problems, please let the course coordinator know.


Selected papers

The following papers have already been selected by a student:

  1. Title: Dynamics of a Collaborative Rating System
    Authors: Kristina Lerman
    Conference: WebKDD 2007

  2. Title: Mining frequent patterns without candidate generation
    Authors: Jiawei Han, Jian Pei, Yiwen Yin
    Conference: ACM SIGMOD 2000

  3. Title: Probabilistic Latent Semantic Visualization: Topic Model for Visualizing Documents
    Authors: Tomoharu Iwata, Takeshi Yamada, and Naonori Ueda
    Conference: ACM SIGKDD 2008, Las Vegas

  4. Title: OntoDM: An Ontology of Data Mining
    Authors: Panov, P.; Dzeroski, S.; Soldatova, L.
    Conference: ICDMW 2008 Workshops, Pisa, Italy

  5. Title: Application of Data Mining Techniques for Medical Image Classification
    Authors: Maria-Luiza Antonie, Osmar R. Zaiane and Alexandru Coman
    Conference: ACM SIGKDD 2001 Workshops

  6. Title: Efficient Anonymity Preserving Data Collection
    Authors: Justin Brickell and Vitaly Shmatikov
    Conference: ACM SIGKDD, Philadelphia, 2006

  7. Title: Schema exchange: Generic mappings for transforming data and metadata
    Authors: Paolo Papotti and Riccardo Torlone
    Journal: Elsevier Data & Knowledge Engineering

  8. Title: Combined Association Rule Mining
    Authors: Huaifeng Zhang, Yanchang Zhao, Longbing Cao, Chengqi Zhang
    Conference: PAKDD 2008, Osaka

  9. Title: Mining Quality-Aware Subspace Clusters
    Authors: Ying-Ju Chen, Yi-Hong Chu, Ming-Syan Chen
    Conference: PAKDD 2008, Osaka.

  10. Title: Multi-label Lazy Associative Classification
    Authors: Adriano Veloso, Wagner Meira Jr., Marcos Goncalves and Mohammed Zaki
    Conference: PAKDD 2007, Nanjing.

  11. Title: Mining Frequent Itemsets from Uncertain Data
    Authors: Chun-Kit Chui, Ben Kao, and Edward Hung
    Conference: PAKDD 2007, Nanjing.

  12. Title: Distributed Data Clustering can be Efficient and Exact
    Authors: George Forman and Bin Zhang
    Journal: SIGKDD Explorations, vol. 2, 2000.

  13. Title: Cost-efficient mining techniques for data streams
    Authors: Mohamed Gaber, Shonali Krishnaswamy, Arkady Zaslavsky
    Conference: Workshop on Australasian Information Security, Data Mining and Web Intelligence, and Software Internationalisation, 2004.

  14. Title: Influence and correlation in Social Networks
    Authors: Aris Anagnostopoulos, Ravi Kumar, Mohammad Mahdian
    Conference: ACM SIGKDD 2008, Las Vegas.

  15. Title: Clustering of Streaming Time Series is Meaningless
    Authors: Jessica Lin, Eamonn Keogh and Wagner Truppel
    Conference: ACM DMKD 2003.

  16. Title: SPARCL: Efficient and Effective Shape-based Clustering
    Authors: Vineet Chaoji, Mohammad Hasan, Saeed Salem, and Mohammed J. Zaki
    Conference: IEEE ICDM, Pisa, 2008.

  17. Title: Relational Data Pre-Processing Techniques for Improved Securities Fraud Detection
    Authors: Andrew Fast, Lisa Friedland, Marc Maier, Brian Taylor, David Jensen, Henry G. Goldberg, John Komoroske
    Conference: ACM SIGKDD 2007, San Jose.

  18. Title: An Innovative Concept for Image Information Mining
    Authors: Mihai Datcu and Klaus Seidel
    Conference: MDM/KDD 2002: International Workshop on Multimedia Data Mining (with ACM SIGKDD 2002)

  19. Title: Ranking and classifying attractiveness of photos in folksonomies
    Authors: Jose San Pedro, Stefan Siersdorfer
    Conference: WWW '09: Proceedings of the 18th international conference on World wide web


Last modified: 26/05/2009, 08:45