COMP2100
Lab Exam and sample solutionsHere are the lab exam questions together with some sample solutions. Note that there were often several ways to do them, particularly with the scripts, so just because your solution doesn't look like ours, doesn't mean it's wrong.
Question 1
In this question you will write two simple bash scripts.
Q 1(a)
Write a bash script called words1 to count the number of distinct words in a text file.
A word is a contiguous sequence of non-blank characters bordered by white space (or the beginning or end of the file). White space characters are spaces, tabs, form feeds, and new lines.
Two words are considered to be equivalent to each other if and only if they are the same length, and corresponding characters of the words are identical, disregarding the case of letters. For example, the words abc!De(F and abc!de(f are equivalent. Two words are considered to be distinct from each other if and only if they are not equivalent.
The script shall take the name of the file as its only command-line argument. After scanning the file it shall write one line to standard output containing the number of distinct words in that file, and the file's name.
For example, if the file file1.txt looks like this:
abc( def) gh@i jkl aBc( deF) GH@I mnorunning the script should work as follows:
comp2100@partch ./words1 file1.txt 5 file1.txtbecause there are five distinct words.
Hint: Use the tr program to (a) replace lower-case letters with the equivalent upper-case letters, and (b) replace all spaces, tabs and form feeds with new lines (so that each word is on a line by itself), then use sort to find distinct lines, then use grep to select all non-empty lines. Either pipe the result of this into wc or use the count option in grep.
Hint: You will probably need to consult the manpages for tr, sort, and grep. What regular expression matches any character?
Hint: You can write a tab as \t, a form feed as \f, and a new line as \n.
Note: The name of the script must be words1; the last character is the digit '1' (one).
[5 marks]
Solution
Here's one possible solution:
#!/bin/bash words=$(cat ${1} | tr 'a-z\t\f ' 'A-Z\n\n\n' | sort -u | grep -c '.') echo ${words} ${1}This is not the only way to do it. An obvious variant is to split the two jobs done by the one call to tr, so that tr 'a-z\t\f ' 'A-Z\n\n\n' is replaced by tr 'a-z' 'A-Z' | tr '\t\f ' '\n'.
If you didn't know about the -c option for grep, you could have replaced grep -c '.' with grep '.' | wc -l.
If you didn't know that the pattern '.' matches any non-empty line, you could have done the same thing by using the -v option for grep (which reverses the logic of the selection) and saying grep -v '^$'. Since ^ represents the start of a line, and $ represents the end of a line, the pattern '^$' matches only empty lines. Reversing that with -v selects all non-empty lines.
Q 1(b)
Write a new script called words2 that loops through all the files given on the command line, counting distinct words and printing one line of output for each file, as in part (a). After it has done that, if there was more than one input file, it shall leave a blank line and then print a summary line giving the number of files read and the total number of distinct words across all input files.
You don't have to worry about what to do if a file doesn't exist.
For example, if the file file1.txt looks like this:
abc( def) gh@i jkl aBc( deF) GH@I mnoand the file file2.txt looks like this:
aBc( deF) GH@I mno pqr stu vwxrunning the script should work as follows:
comp2100@partch ./words2 *.txt 5 file1.txt 7 file2.txt 2 files, 8 distinct wordsNote that 5 + 7 ≠ 8. We conclude that there were 5 + 7 - 8 = 4 distinct words that occur in both file1.txt and file2.txt.
Hint: If you have succeeded with part (a), make a copy of your solution and use that as your starting point.
Hint: You might like to use some temporary files. If you do, it might be nice to include ${$} somewhere in the file name(s). Don't forget to clean up at the end of the script.
Hint: If you can't do part (a), you can still get marks for this part. Just count the number of words in each file with wc -w.
Hint: You may find this part easier than part (a), or not.
[5 marks]
Solution
Here's Richard's solution:
#!/bin/bash rm -f tmp${$} tmptotal${$} for f in ${*} do cat ${f} | tr 'a-z\t\f ' 'A-Z\n\n\n' | sort -u > tmp${$} words=$(cat tmp${$} | grep -c '.') echo ${words} ${f} cat tmp${$} >> tmptotal${$} done if [ ${#} -gt 1 ] then echo echo ${#} files, $(cat tmptotal${$} | sort -u | grep -c '.') distinct words fi rm -f tmp${$} tmptotal${$}This uses a couple of temporary files tmp${$} and tmptotal${$}. The variable ${$} expands to the process ID of this shell. It's nice to use that as part of the file names, because then even if there are two processes running this same script at the same time, they won't clobber each other's temporary files because they will have different process IDs.
You dind't need to use temporary files if you didn't want to. An alternative solution for getting the totals would be to cat together all the files on the command line and then do the same thing to the resulting stream as you did to each file. (There's a minor issue here if a file doesn't end with a newline character.)
Question 2
In this question you will write two simple Eiffel programs that take data from a DICTIONARY and print it to the standard output. Imagine you are compiling the index of a book. For each entry you want in the index, you add an element to a dictionary with the key being the word or phrase, and the value being the page number on which it can be found.
In your home directory you will find two files. The first, index.txt, is a data file. The second, index0.e, is an Eiffel program that reads data from a file named on the command line and stores it in a DICTIONARY. Use this program as a starting point for this question.
Q 2(a)
Write an Eiffel program called index1 that prints out all the information in the data dictionary.
For example, if the file index.txt looks like this:
17 Bash scripting 23 Cascading Style Sheets 49 Eiffel 62 Assignment attempt 70 Data structuresthen running the program should give something like this:
comp2100@partch index1 index.txt Data structures 70 Cascading Style Sheets 23 Eiffel 49 Bash scripting 17 Assignment attempt 62Notice that the order is all scrambled. Your program does not have to print the entries in any particular order, but it must print them all.
Your program must use an object of class ITERATOR from the Eiffel library. It must not loop through the data structure using a normal loop with a control variable. Have your iterator loop over the keys in the dictionary, not over the values.
Hint: You will need to look at the short forms of classes ITERATOR and DICTIONARY. How do you create an iterator for the keys in the dictionary?
Hint: Some of you may notice that class ITERATOR is an expanded class. Don't let this worry you. This makes no difference for the purposes of this exercise.
Note: The name of your class must be INDEX1; the last character is the digit '1' (one). This class must be in a file called index1.e. The name of the creation routine must be make.
[5 marks]
Solution
First, here's the "starting" version of the program, class INDEX0:
class INDEX0 -- Starting point for 2004 COMP2100 Lab Exam Q2 creation make feature make is -- Create a dictionary, read data into it from the file -- named on the command line, then print it out. local data: DICTIONARY [INTEGER, STRING] input: TEXT_FILE_READ value: INTEGER key: STRING do create data.make create input.connect_to (argument (1)) from input.read_integer until input.end_of_input loop value := input.last_integer input.skip_separators input.read_line key := clone (input.last_string) data.add (value, key) input.read_integer end -- Insert code here to print the contents of -- "data" to standard output. end end -- class INDEX0Here's a solution, with comments. First you need to declare an iterator. Since the keys are strings, it needs to be an iterator over strings. You could make it an attribute of the class, or a local in make. Probably a local is better, since it's not going to be used anywhere else.
iterator: ITERATOR [STRING]Now here's the loop you need:
from iterator := data.get_new_iterator_on_keys iterator.start until iterator.is_off loop key := iterator.item value := data.at (key) std_output.put_string (key + " " + value.to_string + "%N") iterator.next endThe first thing you needed to find was the feature get_new_iterator_on_keys in the documentation for class DICTIONARY. After that, it was just a matter of reading and understanding the documentation for class ITERATOR.
Here's the completed solution:
class INDEX1 -- Solution to 2004 COMP2100 Lab Exam Q2(a) creation make feature make is -- Create a dictionary, read data into it from the file -- named on the command line, then print it out. local data: DICTIONARY [INTEGER, STRING] input: TEXT_FILE_READ value: INTEGER key: STRING iterator: ITERATOR [STRING] do create data.make create input.connect_to (argument (1)) from input.read_integer until input.end_of_input loop value := input.last_integer input.skip_separators input.read_line key := clone (input.last_string) data.add (value, key) input.read_integer end from iterator := data.get_new_iterator_on_keys iterator.start until iterator.is_off loop key := iterator.item value := data.at (key) std_output.put_string (key + " " + value.to_string + "%N") iterator.next end end end -- class INDEX1
Q 2(b)
Write a new Eiffel program called index2 that does exactly the same as index1 except that it prints out the index entries in alphabetical order.
For example if the file index.txt looks like this:
17 Bash scripting 23 Cascading Style Sheets 49 Eiffel 62 Assignment attempt 70 Data structuresthen running the program should give this:
comp2100@partch index2 index.txt Assignment attempt 62 Bash scripting 17 Cascading Style Sheets 23 Data structures 70 Eiffel 49Notice that the entries are now listed in alphabetical order.
You may want to use your index1.e as a starting point. If you do, make sure that you copy it into a new file called index2.e before you start. Do not continue editing index1.e.
You will need to get all the keys from the dictionary into an array, then sort them. You must sort them using an object of class COLLECTION_SORTER from the Eiffel library. Do not write your own sorting routine.
Once you have sorted the keys, you must use an iterator to loop through them.
Hint: There is an easy way to get all the keys into an array, if you look at the documentation hard enough.
Hint: You will certainly want to look up the short form of class COLLECTION_SORTER.
Hint: You may notice that class COLLECTION_SORTER is an expanded class. Don't let this worry you. The only difference is that you don't have to create the sorter before you use it. Look at the example code at the top of the short form.
Hint: Class STRING inherits from class COMPARABLE, which means that it has an ordering already defined on it. If you find yourself writing complicated code to compare two strings, stop and think again. You don't need to do that; it's already done for you in class STRING.
Hint: You will need to look at the short form of class ARRAY to find out how to get an iterator for the elements of the array.
Note: The name of your class must be INDEX2. It must be in a file called index2.e. The name of the creation routine must be make.
[5 marks]
Solution
Here there are three steps.
Get an array of strings containing the keys.
Sort that array into alphabetical order.
Loop through the sorted array and print out the keys and values.
For the first step, you need to declare an array of strings to hold the keys. Again this could be an attribute or a local in make. Probably better as a local, since it's not used anywhere else.
keys: ARRAY [STRING]Now there are two ways to get the keys into this array. The easy way was to use the feature key_map_in of class DICTIONARY:
create keys.with_capacity (data.count, 1) data.key_map_in (keys)If you didn't spot that, you would have had to do it the hard way, by looping through the keys and copying the references one by one:
from create keys.make (1, 0) iterator := data.get_new_iterator_on_keys iterator.start until iterator.is_off loop keys.add_last (iterator.item) iterator.next endThe second step is to sort the array using a COLLECTION_SORTER. First you have to declare one, again as a local in make. The things we're sorting are strings, so the declaration has to reflect that.
sorter: COLLECTION_SORTER [STRING]Now it is just one line to do the sorting:
sorter.sort (keys)The final step is to print out the sorted keys and corresponding values. This is very similar to part (a), but you have to loop through the array, not the dictionary.
from iterator := keys.get_new_iterator iterator.start until iterator.is_off loop key := iterator.item value := data.at (key) std_output.put_string (key + " " + value.to_string + "%N") iterator.next endThroughout all this, I've used local variables key of class STRING and value of class INTEGER, but you don't have to do that. The body of that last loop could just as well have been written:
std_output.put_string (iterator.item) std_output.put_character (' ') std_output.put_integer (data.at (iterator.item)) std_output.put_new_line iterator.nextIt's a matter of taste as to which you think is clearer.
Here's the completed program for part (b):
class INDEX2 -- Solution to 2004 COMP2100 Lab Exam Q2(b) creation make feature make is -- Create a dictionary, read data into it from the file -- named on the command line, then print it out. local data: DICTIONARY [INTEGER, STRING] input: TEXT_FILE_READ value: INTEGER key: STRING keys: ARRAY [STRING] sorter: COLLECTION_SORTER [STRING] iterator: ITERATOR [STRING] do create data.make create input.connect_to (argument (1)) from input.read_integer until input.end_of_input loop value := input.last_integer input.skip_separators input.read_line key := clone (input.last_string) data.add (value, key) input.read_integer end create keys.with_capacity (data.count, 1) data.key_map_in (keys) sorter.sort (keys) from iterator := keys.get_new_iterator iterator.start until iterator.is_off loop key := iterator.item value := data.at (key) std_output.put_string (key + " " + value.to_string + "%N") iterator.next end end end -- class INDEX2
Copyright © 2004, Ian Barnes & Richard Walker, The Australian National University
Feedback & Queries to
comp2100@iwaki.anu.edu.au
Version 2004.2, 4 June 2004, 19:05:36