ANU The Australian National University




COMP2100/2500
Assignment 1 Marking Guide

Background

Students were asked to add code to or modify the Java oops program to carry out the following seven tasks:

  1. Remove paragraphs that consist of a single data element made up only of hyphens or only of underscores (by adding a new class TreeFixer).

  2. Modify the TextRenderer to implement list continuations (the text:continue-numbering="true" attribute on ordered lists).

  3. Gather information about the parent styles of automatically generated paragraph styles (by adding a new class StyleDecoder).

  4. Extract title, author and affiliation metadata from documents (by adding a new class MetadataExtractor).

  5. Produce HTML output (by adding a new class HtmlRenderer).

  6. Fix the extra spaces appearing in the plain text output (by modifying class TextRenderer).

  7. Rewrite the scanner with a new design and mode of operation.

You will need to read the Assignment 1 specification and the Assignment 1 Hints and FAQ carefully before you start marking. If you find any inconsistencies, please let me know.

I want the mark on each assignment to reflect the overall level of achievement attained, and I hope the guide below points you in that direction. It is important that a mark of 24/30 or higher truly reflects a piece of work at High Distinction level, that a mark of 21 reflects work at Distinction level, that 18 is a Credit and that 15 is a Pass.

Assign marks as follows:

  Q1       3
  Q2       2
  Q3       4
  Q4       4
  Q5       5
  Q6       2
  Q7       4
  Style    6
  Total   30

But make sure that the overall mark really reflects their level of achievement.


The printout

The printout you will receive for each student consists of:

  1. The usual header with information about lateness etc and a space for you to write some feedback comments and a final mark.

  2. A section indicating any changes made to their code in the marker script. These are as follows:

    • Many students never ran their program on a document containing an anchor element, despite these being mentioned specifically in the FAQ discussion of Q5, and there clearly being a visitAnchor routine in every visitor. So they never found the error in the require assertion, which checked for text:anchor instead of text:a. This is really slack testing. Deduct two marks from any assignment where my script had to fix this.

    • One pair of students (u2562890) managed to mess up creating the “jarball” by omitting the name of the jar file. This caused the jar file to overwrite the first Java source file in the list: Assert.java, causing (of course) a compilation failure. I replaced this with the original Assert.java. One mark penalty.

    • Two pairs of students submitted assignments with incomplete (non-working) attempts at Q7. They consulted with me about this first. In one case (u4112785) the new scanner compiled OK but crashed, so I renamed it NewScanner.java and copied the original scanner in for compilation and testing. In the other case (u3355411) the new scanner doesn't compile, so I renamed the file ScannerNew.java.txt to prevent the compiler from looking at it. No penalty. In both cases please take a little time to look at the code (both for the scanner class and for its helpers, the various filters) and see if there is something in there worth a few marks.

    • One student (u3360551) did not complete Q5, and submitted an HTML renderer that doesn't compile. I renamed this to HtmlRenderer.java.txt so that the compiler wouldn't see it. No penalty (special circumstances). Please see if there is anything in the HTML renderer that is worth any marks.

  3. A calculation of the number of new and changed lines of code. Most students who did not attempt Q7 are in the range 700–1000 lines. A few students have submitted over 2000 lines of code. I think this is probably excessive and should be penalised, particularly in the case of the group who submitted 3100 new or changed lines of code.

  4. A complete listing of every new class, plus a diff listing for each changed class. The diff listing has the switches -w -U20, which means that whitespace is ignored in determining whether lines match, and that there are 20 lines of context around each change. I ran the diff output through grep -v ^- so that the original lines that have been removed or changed are not printed. (I found this confusing in the past.) Each new or changed line is marked with a + in the first column. (I've done this with the new classes too, so that every line the students wrote is marked with a +, whether it's in a new file or not. Let me know if you think this helps or not.)

  5. A record of what happened when the marker script attempted to compile their program. Usually this says very little, but if there were problems, the output might help you. Almost everyone gets a warning about unchecked operations. This just means there is old (pre Java 1.5) code that uses collections without the new generics syntax. No penalty.

  6. The results of running their program on a short test document test.sxw. The students did not have access to this document. The printout first shows what the program printed on the standard output, then the text file sample8.txt and then the HTML file sample8.html.

    Note that this test document is fairly short and simple. Just because a student's program deals with it correctly does not mean they should automatically get full marks. Some quite poor programs may still perform well on this document.


1. Tree fixer (3 marks)

Their tree fixer should be a new class that implements Visitor. It should traverse the tree removing all paragraphs that satisfy the following conditions:

  1. They have exactly one child; AND

  2. That child is a data node; AND

  3. The content of that data node is either all hyphens or all underscores.
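To make the intended logic concrete when reading student code, here is a minimal sketch of the removal test. The class SketchNode is a hypothetical stand-in for the oops tree classes (the real class and method names will differ), and treating an empty data string as "not all hyphens" is my assumption, not something stated in the specification.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the oops tree node classes; real names differ.
class SketchNode {
    final String text;                                  // non-null only for data nodes
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String text) { this.text = text; }

    boolean isData() { return text != null; }
}

class ParagraphRule {
    // True if s is non-empty and every character equals c.
    // Rejecting the empty string is an assumption of this sketch.
    static boolean allOf(String s, char c) {
        if (s.isEmpty()) return false;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) != c) return false;
        }
        return true;
    }

    // Exactly one child AND that child is a data node
    // AND (its content is all hyphens OR all underscores).
    static boolean shouldRemove(SketchNode paragraph) {
        if (paragraph.children.size() != 1) return false;
        SketchNode only = paragraph.children.get(0);
        if (!only.isData()) return false;
        return allOf(only.text, '-') || allOf(only.text, '_');
    }
}
```

Note that a string mixing hyphens and underscores fails both tests, so such a paragraph is kept. A student's visitor would apply a test like this to each paragraph it visits.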

Consider the following points:


2. List continuations (2 marks)

This only applies to the plain text renderer, not the HTML renderer. The first thing is to check the plain text output. The relevant section should look exactly like this:

 1. Item one of the list.

This is an interruption in the middle of the list.

 2. Item two of the list.

 3. Item three of the list.

If item two has the number 1 in front of it, then the student has not succeeded with this part.
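One way the numbering could be carried across the interruption is sketched below. This is only an illustration: the class ListNumbering and its method names are hypothetical, it ignores nested lists, and students may well have solved this differently inside TextRenderer.

```java
// Hypothetical helper; not part of the supplied oops code.
class ListNumbering {
    private int lastNumber = 0;   // number given to the most recent list item

    // Call when an ordered list starts. The flag mirrors the
    // text:continue-numbering="true" attribute on the list element.
    void startList(boolean continueNumbering) {
        if (!continueNumbering) {
            lastNumber = 0;       // a fresh list starts again from 1
        }
    }

    // Call once per list item; returns the label to print, e.g. " 2. ".
    String nextLabel() {
        lastNumber++;
        return " " + lastNumber + ". ";
    }
}
```

The point to check in student code is the same as in the sketch: the counter must survive the end of the first list, and only be reset when a list starts without the continuation attribute.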

A few points to consider:


3. Style decoder (4 marks)

This was in COMP2100 Assignment 2 in 2003, but in Eiffel.

Here they had to write a new visitor whose only task is to compile a lookup table of style “inheritance”. (See the assignment sheet.) This is pretty straightforward once you understand the task, but it's hard to get started.
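For markers who want a picture of what the lookup table amounts to, here is a minimal sketch. The class StyleTable and the style names in the usage note are illustrative assumptions; in particular, whether the assignment needs chains of parents followed (rather than a single lookup) is something to check against the assignment sheet.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a style "inheritance" table.
class StyleTable {
    private final Map<String, String> parentOf = new HashMap<>();

    // Called by the visitor each time it sees a style with a parent.
    void record(String style, String parent) {
        parentOf.put(style, parent);
    }

    // Follows parent links until reaching a style with no recorded parent.
    // The seen set guards against a cycle in malformed input.
    String resolve(String style) {
        Set<String> seen = new HashSet<>();
        String current = style;
        while (parentOf.containsKey(current) && seen.add(current)) {
            current = parentOf.get(current);
        }
        return current;
    }
}
```

For example, after record("P1", "Title"), resolve("P1") gives "Title", and a name that was never recorded resolves to itself.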

Points to consider:


4. Metadata extractor (4 marks)

This was also in the 2003 assignment.

Again they had to write a new visitor class for this question. The task was to traverse the tree looking for paragraphs representing the document title, the author's name and the author's affiliation. This requires using the style inheritance lookup table from Question 3. Output should be written on the standard output after the visitor has finished traversing the tree. Their code has to be able to handle multiple authors, multiple affiliations etc, but I haven't tested that in the test document. The output should look like this:

Metadata information collected:
Title         = "A Short Paper"
Author's Name = "Ian Barnes"
Affiliation   = "Australian National University"
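The collect-then-report structure can be sketched as follows. This is an assumption-laden illustration: the class MetadataReport is hypothetical, the three parent style names it matches are placeholders rather than names taken from the specification, and printing one line per value for multiple authors or affiliations is my guess at a reasonable format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real visitor would call add() while traversing.
class MetadataReport {
    private final List<String> titles = new ArrayList<>();
    private final List<String> authors = new ArrayList<>();
    private final List<String> affiliations = new ArrayList<>();

    // parentStyle is the parent style found via the Question 3 table.
    // The style names tested here are placeholders, not from the spec.
    void add(String parentStyle, String text) {
        if ("Title".equals(parentStyle)) {
            titles.add(text);
        } else if ("Author".equals(parentStyle)) {
            authors.add(text);
        } else if ("Affiliation".equals(parentStyle)) {
            affiliations.add(text);
        }
    }

    // Builds the report only after traversal has finished.
    String report() {
        StringBuilder out = new StringBuilder("Metadata information collected:\n");
        appendAll(out, "Title", titles);
        appendAll(out, "Author's Name", authors);
        appendAll(out, "Affiliation", affiliations);
        return out.toString();
    }

    // Pads each label to 13 columns so the equals signs line up.
    private static void appendAll(StringBuilder out, String label, List<String> values) {
        for (String value : values) {
            out.append(String.format("%-13s = \"%s\"\n", label, value));
        }
    }
}
```

Keeping the values in lists is what lets the same code cope with several authors or affiliations, which the test document does not exercise.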

Some points to look out for:


5. HTML renderer (5 marks)

This was probably the biggest part (except for Q7). Students had to write a new visitor that produces an HTML version of the document.

Some points to look out for:


6. Fix bad spaces (2 marks)

The text renderer they were given breaks data elements up into words, then adds a space character after every word it prints. This is usually OK, but wrong sometimes, particularly at the beginnings and ends of spans. The correct way to do the output is to only put a space in the output if there was one in the input. Then sometimes you replace a space character with a newline if the line is too long. Line breaks are only allowed where there was a space.
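The rule described above can be sketched in a few lines. This is not my solution or the supplied TextRenderer, just a hypothetical standalone method (SpaceWrap.wrap) that works on one string; a real renderer would have to keep the column count in a field, because a line of output is built up from several data elements.

```java
// Hypothetical sketch of space-preserving line breaking.
class SpaceWrap {
    // Copies text to the output unchanged, except that a space is turned
    // into a newline when the word following it would not fit in width.
    // No space is ever invented, and breaks happen only at spaces.
    static String wrap(String text, int width) {
        StringBuilder out = new StringBuilder();
        int column = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c != ' ') {
                out.append(c);
                column++;
                continue;
            }
            // Measure the word that follows this space.
            int end = i + 1;
            while (end < text.length() && text.charAt(end) != ' ') {
                end++;
            }
            int wordLength = end - (i + 1);
            if (column + 1 + wordLength > width) {
                out.append('\n');   // the space becomes the line break
                column = 0;
            } else {
                out.append(' ');    // there was a space in the input, so keep it
                column++;
            }
        }
        return out.toString();
    }
}
```

A word longer than the width is left unbroken, since there is no space inside it at which a break would be allowed.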

This may be hard to mark, partly because the output will be wrong if they just do Q6 but not Q7.

Here is part of what my text output looks like with Q6 but not Q7 done:

This is a paragraph of Initial Body Text. Here is aspan in
italicsand here is aspan in boldface.This paragraph is long so
that we can check that the line breaking algorithm works
correctly in the text renderer. Here is an ordered list with an
interruption and continuation.

...

Here is a line break.
Here is an italic span containing only underscores (should not
be removed):___and here is a bold span containing only hyphens
(also should not be removed):-----.Here is a hyperlink to
theCOMP2100 Home Page.

Note the missing spaces in a few places. These occur because the supplied scanner incorrectly calls trim() on all data strings. Some students may have made that one-line change to the supplied

Some points to look out for:


7. Scanner redesign/rewrite (4 marks)

This is much longer and harder than any of the other parts of the assignment. It was intended only for students who are aiming at getting very high marks in this course. The students had much more freedom in how they implemented this than in the other parts.

Look at my solution to get an idea of how this should work (although students may well have come up with alternative implementations of the details). The scanner no longer stores the complete input as a string. Instead it reads one token at a time from its input.

The input is done with the Decorator pattern, with a series of filters wrapped around the Reader it is passed on creation. These filters remove comments, processing instructions, and Doctype declarations from the input, as well as normalising whitespace. In order that whitespace around a comment, PI or Doctype declaration gets merged correctly, the whitespace filter needs to be last in the chain (closest to the scanner, furthest from the actual input stream).

Since there is no string to search for tokens in, the parsing of tags, attributes etc needs to be rewritten a bit. It's not really that hard, unless students try to allow for every possible error... This wasn't required. They can basically assume that the input is correct XML and let the program crash or (preferably) throw an exception if it isn't.
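To show what a filter chain of this kind might look like, here is a rough sketch using java.io.FilterReader. It is not my solution: the class names CharFilter, CommentFilter and WhitespaceFilter are hypothetical, only comments and whitespace are handled (no processing instructions or Doctype declarations), and whitespace is collapsed everywhere, which is a simplification.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;

// Routes array reads through read(), so each filter defines one method.
abstract class CharFilter extends FilterReader {
    CharFilter(Reader in) { super(in); }

    @Override
    public abstract int read() throws IOException;

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        int n = 0;
        int c;
        while (n < len && (c = read()) != -1) {
            buf[off + n++] = (char) c;
        }
        return n == 0 ? -1 : n;
    }
}

// Drops everything from "<!--" up to and including "-->".
class CommentFilter extends CharFilter {
    CommentFilter(Reader in) { super(new PushbackReader(in, 3)); }

    @Override
    public int read() throws IOException {
        PushbackReader source = (PushbackReader) in;
        int c = source.read();
        if (c != '<') return c;
        // Look ahead up to three characters to see whether a comment starts.
        char[] look = new char[3];
        int n = 0;
        int d;
        while (n < 3 && (d = source.read()) != -1) {
            look[n++] = (char) d;
        }
        if (n == 3 && look[0] == '!' && look[1] == '-' && look[2] == '-') {
            int dashes = 0;
            while ((d = source.read()) != -1) {
                if (d == '-') dashes++;
                else if (d == '>' && dashes >= 2) break;   // found "-->"
                else dashes = 0;
            }
            return read();            // continue with whatever follows
        }
        if (n > 0) source.unread(look, 0, n);   // not a comment: put it back
        return '<';
    }
}

// Collapses each run of whitespace into a single space.
class WhitespaceFilter extends CharFilter {
    private boolean lastWasSpace = false;

    WhitespaceFilter(Reader in) { super(in); }

    @Override
    public int read() throws IOException {
        int c;
        while ((c = in.read()) != -1) {
            if (!Character.isWhitespace(c)) {
                lastWasSpace = false;
                return c;
            }
            if (!lastWasSpace) {
                lastWasSpace = true;
                return ' ';
            }
            // Otherwise skip: a space was already emitted for this run.
        }
        return -1;
    }
}
```

With the chain built as new WhitespaceFilter(new CommentFilter(reader)), the spaces on either side of a removed comment reach the whitespace filter as one run and are merged. With the filters in the opposite order, each side would be normalised separately before the comment was removed, leaving two spaces behind.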

Some points to look out for:


Style (6 marks)

As well as looking at the correctness of their solution, I want you also to consider their coding style. We haven't discussed a coding standard in class yet, so there is no formal standard to judge them against, but they have been told about the following, either last year or by me (or both):

The main principle here is clarity. If you find it hard to understand their code, then give them a lower mark for style. Try to identify and give them feedback on what it is that makes their code hard to understand, so that they can improve next time.


Late Penalty

If an assignment was handed in late, this will be shown on the printout. The late penalty is simple: late assignments will be penalised six (6) marks.

Do not penalise assignments that are less than fifteen minutes late. What with this generosity and extensions granted, I think this only leaves one late assignment (u4131114).


Responsibilities

As a marker for this assignment, you are required to:

Don't forget to photocopy the front pages.

Do not enter zero for a missing assignment, just leave the field blank.

I can only pay you to spend about seven and a half hours marking the 17 assignments given to you. That works out at about twenty-five minutes per student. If you find yourself needing more time than this, let me know.


Copyright © 2005, Ian Barnes, The Australian National University
Version 2005.1, Monday, 18 April 2005, 15:20:51 +1000
Feedback & Queries to comp2100@cs.anu.edu.au