ANU The Australian National University




COMP2100/2500
Lecture 23: Shell Programming III

Summary

More Unix commands for scripting

Functions in bash

sed and awk

Command Line Interface vs Graphical User Interface.

Aims


Already known commands

(forgotten how they are used? Me too! Look up the man pages)

Other useful commands


Functions in bash

Shell scripting languages allow effective code reuse in the form of function definitions. A function is defined in a code block (statements enclosed in curly brackets). A bash function can act as a "black box": variables declared with local inside it are invisible to the rest of the script (variables not declared local are shared with the whole script). If a script contains repetitive code for a task that is subject to slight variations, bash functions can be useful. The syntax is:

function fn() {
command1...
command2...
  ...
}

where either the keyword function or the round brackets after the function name can be omitted. The form fn() { ... } without the keyword is the POSIX one and the more portable of the two. A function is invoked from the rest of the script by its name, but the invocation can only follow the definition; it cannot precede it. For example, I put the following into the script my_script:

#!/usr/local/bin/bash

exclamation="Vivat, I won!"          # script-wide (global) variable
echo -n "exclaiming from outside: "
echo "${exclamation}"
f(){
    local exclamation                # visible only inside f
    exclamation="Alas, I lost."
    echo -n "exclaiming from inside: "
    echo "${exclamation}"
}
f                                    # call f after its definition
echo -n "again exclaiming from outside: "
echo "${exclamation}"                # the global value is unchanged

When run, the output will be

abx@eudyptula:~/comp2100/cvs/lectures$ ./my_script
exclaiming from outside: Vivat, I won!
exclaiming from inside: Alas, I lost.
again exclaiming from outside: Vivat, I won!
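
For contrast, here is a minimal sketch of what happens when the local declaration is left out. The function name g is made up for this illustration; everything else follows the script above.

```shell
exclamation="Vivat, I won!"
g() {
    # no "local" declaration: this assignment changes the script-wide variable
    exclamation="Alas, I lost."
}
g
echo "after calling g: ${exclamation}"
```

This prints "after calling g: Alas, I lost.", because without local the function and the rest of the script share the same variable.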

Functions may process arguments passed to them and return an exit status to the script for further processing.
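
As a rough sketch of both points: inside a function, $1, $2, ... and $# refer to the function's own arguments, and return sets its exit status. The function larger is a hypothetical example, not something from the course scripts.

```shell
# larger prints the larger of two integers and returns 0;
# it returns 1 (failure) unless it is given exactly two arguments.
larger() {
    if [ "$#" -ne 2 ]; then
        echo "usage: larger N1 N2" >&2
        return 1
    fi
    if [ "$1" -gt "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
    return 0
}

result=$(larger 3 7)                  # capture what the function prints
echo "larger of 3 and 7: ${result}"

status=0
larger 42 2>/dev/null || status=$?    # one argument only; stderr is discarded
echo "exit status with one argument: ${status}"
```

The script can then branch on the status, for example with if larger "$a" "$b"; then ... fi.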

Bash has only limited built-in support for regular expressions. To make your script recognise and use REs, one can use the Unix utilities sed and awk.


sed and awk

sed

sed (stream editor) is a powerful filter program. The syntax for this command is

sed 'list of sed commands' filenames

sed processes the input files line by line, each line independently of the others. The quotes are almost always needed to protect the sed metacharacters from the shell (where they can also have a meaning). The output is written to standard output (stdout). The sed commands are usually matching and substitution instructions (they use a simple variant of the regular expression language) according to which the processed lines are modified. Consider an example. Suppose I manually edited (after enduring considerable pain) the assignment group list for the move from a1 to a2, taking into account your requests for group changes, but I didn't change the "a1" in the file itself:

abx@eudyptula:~/comp2100/cvs/lectures$ cat groups_a2.txt
[a1:/branches/group03] u4125148, u4222523
[a1:/branches/group04] u4222980, u4223071
[a1:/branches/group05] u4210391, u4234643
 .......

Not to worry! I need not even open a text editor. First I rename the file

abx@eudyptula:~/comp2100/cvs/lectures$ mv groups_a2.txt groups_a1.txt

Then I run it through sed

abx@eudyptula:~/comp2100/cvs/lectures$ sed 's/a1/a2/' groups_a1.txt > groups_a2.txt
abx@eudyptula:~/comp2100/cvs/lectures$ cat groups_a2.txt
[a2:/branches/group03] u4125148, u4222523
[a2:/branches/group04] u4222980, u4223071
[a2:/branches/group05] u4210391, u4234643
   .....

and I have the correct group list.

To create the Subversion directory for a group's assignment project in comp2100, one can use the command

svn copy http://svn/comp2100/a2/trunk/ http://svn/comp2100/a2/branches/groups01/

But for 50-odd groups, it would be smarter to use a script. First, I construct the group list group03, group04, ..., group46 from the file groups_a2.txt, using the following command:

groups=$(sed -e 's/].*//' -e 's/.*group/group/' groups_a2.txt)

The -e option means that the next token is to be treated as a sed command, not as the name of a file from which sed reads lines. Using the list $groups in a short bash script with a for loop, one can easily create all those svn directories.
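
Such a loop might look like the sketch below. It is a dry run: it only prints the svn commands instead of executing them. The three-line sample file stands in for the real groups_a2.txt, and the function name make_branches is invented for the illustration.

```shell
# Build a small stand-in for groups_a2.txt in a temporary directory.
workdir=$(mktemp -d)
cat > "${workdir}/groups_a2.txt" <<'EOF'
[a2:/branches/group03] u4125148, u4222523
[a2:/branches/group04] u4222980, u4223071
[a2:/branches/group05] u4210391, u4234643
EOF

# Same sed command as in the lecture: keep only the groupNN part of each line.
groups=$(sed -e 's/].*//' -e 's/.*group/group/' "${workdir}/groups_a2.txt")

make_branches() {
    local g
    for g in ${groups}; do    # unquoted on purpose: split the list on whitespace
        # echo makes this a dry run; remove it to really run svn
        echo svn copy http://svn/comp2100/a2/trunk/ "http://svn/comp2100/a2/branches/${g}/"
    done
}
make_branches
rm -r "${workdir}"
```

A real run would presumably also need a log message (svn copy -m "..."), since a copy between two repository URLs is committed immediately.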

Apart from substitution, sed can be instructed to perform (or not to perform) a certain action when it finds a pattern in the input. The following example shows an implementation of a command newer, which lists all files in a directory that are newer than a specified file (it also shows that a script's command-line arguments can be spliced into the sed command, much as you use them elsewhere in bash):

ls -t | sed '/^'$1'$/q'
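(closing and reopening the single quotes around $1 exposes it to the shell, which replaces it with the file name; the q command makes sed quit once it has printed the matching line, so the specified file itself is the last name listed).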
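
The same "address, then action" form works with other sed commands. The sketch below uses a made-up three-line listing in place of real ls -t output so that the result does not depend on the files in a directory; the function name listing is hypothetical.

```shell
# listing stands in for "ls -t" output (newest file first).
listing() { printf '%s\n' c.txt b.txt a.txt; }

# q: print lines up to and including the first match, then quit
listing | sed '/^b\.txt$/q'

# d: delete the matching line, print everything else
listing | sed '/^b\.txt$/d'

# -n suppresses the default printing; p prints only the matching line
listing | sed -n '/^b\.txt$/p'
```

The first command prints c.txt and b.txt, the second c.txt and a.txt, the third only b.txt. Note that the dot is escaped here: an unescaped dot in a file name would match any character.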

sed is good: it is fast, easy to use, and it can handle very long inputs. But it does everything line by line; multi-line processing is hard and awkward. Here, to its aid comes

awk

awk (named after Aho, Weinberger and Kernighan) is a powerful stream processor and formatter, with a syntax similar to C. In some respects it is more powerful than the shell (eg, it has floating-point arithmetic, which bash lacks). The usage is similar to sed:

awk [options] 'program' filenames

awk also reads its input one line at a time, but you can define what a line is (in awk it is called a record, and it can span several actual lines). Each record is automatically parsed into fields (the field separator can also be set before or in the middle of processing), which can be dealt with separately. Like sed, awk does not alter the content of the input files. The awk program has a different form from sed's:

pattern { action } 
pattern { action } 
   .....

For each pattern that matches the line, the corresponding action is performed. The pattern can be a regex or a boolean expression; a pattern without an action prints the matching line. The fields are referenced in the same way as command-line arguments in shell scripts: $0 is the entire input line, $1 is the first field, etc. (the number of fields is held in the built-in variable NF, the counterpart of the shell's $#). The following script prints the list of users who have no password (imagine this nowadays):

awk -F: '$2 == ""' /etc/passwd
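
To try the pattern/action forms without touching the real /etc/passwd, here is a small sketch on invented data. The function users and its three entries are made up for the illustration.

```shell
# users prints a made-up, passwd-style sample (fields separated by colons).
users() {
cat <<'EOF'
root:x:0:0:root:/root:/bin/bash
guest::1001:1001:Guest:/home/guest:/bin/sh
alice:x:1002:1002:Alice:/home/alice:/bin/bash
EOF
}

# boolean pattern with an explicit action: login names with an empty password field
users | awk -F: '$2 == "" { print $1 }'

# regex pattern matched against a single field: users whose shell ends in "bash"
users | awk -F: '$7 ~ /bash$/ { print $1, $NF }'

# NF is the number of fields in the current record
users | awk -F: 'NR == 1 { print NF }'
```

The first command prints guest, the second prints root and alice together with their shells, and the third prints 7.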

awk has two special patterns, BEGIN and END. Followed by blocks in curly brackets, they define actions performed before the first input line is read and after the last one has been processed. They are very convenient for setting the built-in variables and for performing some action over all the processed data. Eg,

awk 'END { print NR }' files

does the same as cat files | wc -l.
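
A slightly larger sketch combines BEGIN, a per-line action and END, and also shows awk's floating-point arithmetic. The function name mean_report and the sample marks are invented for the illustration.

```shell
# mean_report reads "id mark" pairs from stdin and reports the mean mark.
mean_report() {
    awk 'BEGIN { print "id mark" }                  # runs before any input is read
               { sum += $2; print $1, $2 }          # runs for every input line
         END   { if (NR > 0) printf "mean %.2f\n", sum / NR }'
}

printf '%s\n' 'u1 7.5' 'u2 9' 'u3 6.25' | mean_report
```

The last line of the output is "mean 7.58". The NR > 0 test guards against division by zero on empty input.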

The awk script

         { nc += length($0) + 1
           nw += NF }
    END  { print NR, nw, nc }

counts lines, words and characters like "full" wc. The full list of the built-in variables in awk

By controlling RS and ORS (the input and output record separators) one can process complex formatted input and produce custom formatted output.
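
As a sketch of this, setting RS to the empty string makes awk treat blank-line-separated paragraphs as records. With FS set to a newline, each line of a paragraph becomes a field. The address data and the function name first_and_last are made up; OFS (the output field separator) is another built-in variable.

```shell
# first_and_last prints the first and last line of every paragraph on stdin.
first_and_last() {
    awk 'BEGIN { RS = ""; FS = "\n"; OFS = ", "; ORS = ";\n" }
               { print $1, $NF }'
}

printf '%s\n' 'Alice' '12 High St' 'Canberra' '' 'Bob' '3 Low Rd' 'Sydney' | first_and_last
```

This prints "Alice, Canberra;" and "Bob, Sydney;" on separate lines: OFS is placed between the two printed items and ORS after each record.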

With a syntax similar to C, awk has a full set of control structures (if-else, for and while loops). It has arrays: an awk script backwards

         { line[NR] = $0 }
    END  { for (i = NR; i > 0; i--) print line[i] }

when called as awk -f backwards file, will print the lines of file in reverse order. In fact, all awk arrays are associative: the subscript can be any string, not only a number. awk also has a set of built-in functions (such as length(), seen above; they include mathematical functions).
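
A typical use of string subscripts is counting. The sketch below counts how often each word occurs; word_count is a hypothetical name. Because the order in which for (w in count) visits the subscripts is unspecified, the output is piped through sort.

```shell
# word_count reads text from stdin and prints each word with its frequency.
word_count() {
    awk '    { for (i = 1; i <= NF; i++) count[$i]++ }    # one array slot per word
         END { for (w in count) print w, count[w] }' | sort
}

printf '%s\n' 'to be or not' 'to be' | word_count
```

For this input it prints "be 2", "not 1", "or 1" and "to 2", one per line.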


Advanced Bash-Scripting Guide

This is an on-line book (or very long tutorial, the genre is unclear), very helpful, very detailed, very free. Use it!


CLI and GUI doing the same task

The following examples were taken from "The Pragmatic Programmer" by Hunt and Thomas.

Find all .java files modified more recently than your Makefile

Shell: find . -name '*.java' -newer Makefile -print
GUI:   Open the FileManager, navigate to the correct directory.
       Click on the Makefile, and note the modification time.
       Bring up Tools/Find, and enter *.java for the
       file specification. Select the date tab, and enter the date you
       noted for the Makefile in the first date field. Click OK.

Construct a zip/jar/tar archive of a project's source files

Shell: zip archive.zip *.java –or–
       jar cvf archive.jar *.java –or–
       tar cvf archive.tar *.java
GUI:   Bring up the WinZip utility, select Create New Archive in the menu.
       Enter its name, select the sources directory in the adding dialog.
       Set the filter to *.java. Click Add. Close the archive.

Which Java files have not been changed in the last week?

Shell: find . -name '*.java' -mtime +7 -print
GUI:   Click and navigate to Find files, click the Named field
       and type in '*.java'. Select the Date Modified tab.
       Select Between. Click on the starting date and type in
       the starting date of the beginning of the project. Click on
       ending date and type in the date of a week ago today
       (you may need to check with a calendar). Click on Find Now.

Of those files, which use the awt library?

Shell: find . -name '*.java' -mtime +7 -print |
       xargs grep 'java.awt'
GUI:   Load each file in the list from the previous example
       into an editor, and search for the string "java.awt".
       Write down the name of each file containing a match.

Not lurid enough, you say? Then consider this...

Create a list of all unique package names explicitly imported by your code

Shell: grep '^import ' *.java |
       sed -e 's/.*import *//' -e 's/;.*$//' |
       sort -u > list
GUI:   (dreadful to even contemplate; Microsoft share price plummets)

A parable from Eric S. Raymond (The Art of Unix Programming)

Master Foo Discourses on the Graphical User Interface

One evening, Master Foo and Nubi attended a gathering of programmers who had met to learn from each other. One of the programmers asked Nubi to what school he and his master belonged. Upon being told they were followers of the Great Way of Unix, the programmer grew scornful.

"The command-line tools of Unix are crude and backward", he scoffed. "Modern, properly designed operating systems do everything through a graphical user interface."

Master Foo said nothing, but pointed at the moon. A nearby dog began to bark at the master's hand.

"I don't understand you!" said the programmer.

Master Foo remained silent, and pointed at an image of Buddha. Then he pointed at a window.

"What are you trying to tell me?" asked the programmer.

Master Foo pointed at the programmer's head. Then he pointed at the rock.

"Why can't you make yourself clear?" demanded the programmer.

Master Foo frowned thoughtfully, tapped the programmer twice on the nose, and dropped him in a nearby trashcan.

As the programmer was attempting to extricate himself from the garbage, the dog wandered over and piddled on him.

At that moment, the programmer achieved enlightenment.


GUI: hallmark of technological progress, problem for mankind

CLI and programming languages in general are adequate tools for capturing and using abstraction. We need complex notions and concepts because we need to solve complex problems. Abstractions are not just a means to grasp these complex notions and concepts; abstraction is the hallmark of our mental ability to cope with them. A GUI, which lacks the ability to express abstraction, therefore degrades our ability to handle complex problems.

You can argue that visual means (including GUIs) do allow capturing abstractions. Consider abstract art, for example! Indeed, look at this picture. What is it? The answer depends on additional information (part of which can be due to perception, and therefore subjective). If this were a simple geometrical drawing, the interpretation would be simple, and resolving the abstraction would be easy. But this is, in fact, a piece of fine art, a rather famous picture. To understand its meaning ("how to read it") requires knowledge and understanding of the context (cultural, historic, etc.). The image alone does not reveal its context.

Images are poor carriers of generalised (abstract) meaning. When they do have an abstract meaning, it can only be due to an associated context which is not part of the image.

By the way, this is the artist's self-portrait. Note that the head is also a black square, only smaller (microcosm and macrocosm).


Copyright © 2006, Alexei Khorev (standing on the shoulders of giants), The Australian National University
Version 2006.5, Wednesday, 10 May 2006, 15:33:06 +1000
Feedback & Queries to comp2100@cs.anu.edu.au