ANU The Australian National University




COMP2100/2500
Lecture 21: The C Programming Language I

Summary

This is the first of three lectures on the C language. Today's lecture will concentrate on small-scale aspects of C, Lecture 22 will discuss the division of C programs into files and modules, and Lecture 23 will cover pointers, arrays and memory allocation.

The overall aim of the three lectures is to give you enough understanding of C to read C code and, after Lecture 26, to write Java code that interfaces to existing C code.


Outline of today's lecture


Further Reading:

The C Programming Language, 2nd (ANSI C) edition.
Brian W. Kernighan & Dennis M. Ritchie.
Prentice-Hall, 1988.


Why are we studying C in COMP2100?

Richard's note: only the first and last reasons seem compelling to me, but they are very compelling.


The History of C

C history diagram

Note that there are three main versions of C. The original language, designed by Dennis Ritchie and described in the first edition of Kernighan and Ritchie's book, is known as K&R C, and is rarely seen these days. The language was standardized first by ANSI and then by ISO as ISO/IEC 9899:1990, Programming Languages - C. We'll call this C90; it is what is described in the book and what most C compilers accept. A new revision of the language was published in 1999 as ISO/IEC 9899:1999, Programming Languages - C. We'll call this C99. We'll be using gcc, which by default accepts C90 plus some parts of C99, but can accept most of C99 with a compiler option (-std=c99).


Why is C So Popular?

By 1970s standards C is


What does C look like?

C syntax is the basis of much of Java, and some C programs can be written in a way that makes them almost impossible to read. But it doesn't have to be like that.

#include <assert.h>

/* Find the maximum value of array[0...len-1].
 * PRE: 0 < len <= length of array */

int maximum(int array[], int len)
{ 
  int i, result;

  assert(0 < len);
  result = array[0];
  for (i=1; i!=len; i=i+1) {
    /* INVARIANT: result is max value of array[0...i-1]. 
     * VARIANT: len - i */
    if (array[i] > result) {
      result = array[i];
    }
  }
  return result;
}

Java and C, What's the Difference?

C is missing some features of Java:

But C does have:

Warning: Although there are many similarities, there are also many syntactic differences between the languages.


Hello World in C

1  #include <stdio.h>
2  int main(void) {
3    printf("Hello World\n");
4    return 0;
5  }

Note: There is no enclosing class { ... } or similar.

Line 1

Tell the preprocessor to include the interface to the standard I/O library. This is like an import statement in Java.

Line 2

The main function returns an integer (its return code) to the operating system and takes no arguments.

Line 3

Print the string `Hello World', followed by a newline character. Notice that every instruction must end with a semicolon.

Line 4

Return status (success) to the operating system.


Compiling C

It's a two-step process:

  1. Compile the source files to object files.

  2. Link the object files into an executable. (Merge object files, and connect definitions in one to uses in another)

For a single-file program we can do both stages in one step.

gcc -Wall -o foo foo.c

Notes:

gcc       the GNU C compiler
-Wall     enable most of the commonly useful warnings (despite the name, not every possible warning)
-o foo    output the executable to file foo
foo.c     the C source code file

For a multi-file program we perform the two stages in two steps.

  1. gcc -c -Wall foo_1.c
    ...
    gcc -c -Wall foo_n.c
  2. gcc -o foo foo_1.o ... foo_n.o

Notes:

-c        just compile, don't link

Comments

Characters between /* and the next */ are ignored.

/* INV: result is maximum value of array[0...i-1]. 
 * VAR: len - i */
if (array[i] > result) {
  result = array[i];
}

Note: Comments don't nest. So if you wanted to ``comment out'' that whole block of code, this won't work:

/*
/* INV: result is maximum value of array[0...i-1]. 
 * VAR: len - i */
if (array[i] > result) {
  result = array[i];
}
*/
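One common workaround, offered here as an aside rather than something the lecture prescribes, is to let the preprocessor skip the block with #if 0 ... #endif, which is not confused by comments inside it. The function name disabled_update is hypothetical.

```c
/* Hypothetical example: the code between #if 0 and #endif is removed
 * by the preprocessor, even though it contains a comment of its own.
 * With the block disabled, the function returns result unchanged. */
int disabled_update(int result, int value)
{
#if 0
  /* INV: result is the maximum value seen so far. */
  if (value > result) {
    result = value;
  }
#endif
  return result;
}
```

Changing the 0 to 1 switches the block back on.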

The // style of comment is not part of C90 but is part of C99. It's best not to use it. (It is possible to construct (bizarre) examples using // that are legal C90 and C99 but with different meanings according to each standard.)
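A minimal sketch of one such bizarre example follows. The function name slash_slash is hypothetical, and the result depends on which standard the compiler is following (with gcc, the -std=c90 and -std=c99 options should select one or the other).

```c
/* Under C90, the two slashes are a division sign followed by the start
 * of an ordinary comment, so the initialiser is 10 / 2.
 * Under C99, the two slashes start a comment that runs to the end of
 * the line, so the initialiser is just 10. */
int slash_slash(void)
{
  int x = 10 //* divide? */ 2
    ;
  return x;
}
```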


Basic C Data Types

Comments:


Identifiers

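For our purposes, these are the same as Java. The one thing to be aware of is that with some very old compilers, as few as 6 characters may be significant! (C90 guarantees only 6 significant characters for names shared between files, and 31 for other names. So identifiers are often heavily abbreviated.) C99 requires that at least 63 characters be significant, or 31 for names shared between files.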

By convention:


Literals


Expressions

One important thing to know about C is that many things you might think of as instructions are really expressions. An assignment, an increment or a call to a non-void function is an expression that yields a value. The surrounding code may choose to ignore that value, but it is always there. This (as we'll see) can lead to some very strange errors.

Arithmetic Expressions:

For our purposes, these are just like Java.

Increment and Decrement expressions:

These are instructions, but also expressions.

C syntax    What it means
++i         increment i, then return its new value
i++         return the current value of i, then increment it
--i         decrement i, then return its new value
i--         return the current value of i, then decrement it
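The difference between the prefix and postfix forms can be seen by capturing the value of each expression. This is only a sketch; the function names are hypothetical.

```c
/* Returns the value of the expression ++i, where i starts at start. */
int pre_increment(int start)
{
  int i = start;
  int value = ++i;    /* i becomes start + 1; value is the new i */
  return value;
}

/* Returns the value of the expression i++, where i starts at start. */
int post_increment(int start)
{
  int i = start;
  int value = i++;    /* value is the old i; i then becomes start + 1 */
  return value;
}
```

So pre_increment(5) gives 6, while post_increment(5) gives 5.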

Exercise: What does this mean? a[i]=i++;

Relational Expressions

Like Java, including:

Note: The expression/instruction a = b assigns to a the value of b (as in Java) and then returns that value. A very common error for Java programmers writing C is to write

int a, b;

. . .

if (a=b) {
  /* do something */
} else {
  /* do something else */
}

Exercise: What will happen?
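One way to explore the exercise is to wrap the fragment in a function and call it with different values. The sketch below does that; the name assign_in_condition is hypothetical, and the doubled parentheses are there only to stop gcc -Wall warning about the assignment.

```c
/* Returns 1 if the "then" branch is taken, 0 otherwise.
 * The condition assigns b to a and then tests the assigned value,
 * so the original value of a plays no part in the result. */
int assign_in_condition(int a, int b)
{
  if ((a = b)) {
    return 1;    /* taken whenever b is nonzero */
  } else {
    return 0;    /* taken whenever b is zero */
  }
}
```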

Boolean Expressions

Remember that before C99 there was no boolean type, and very few people use _Bool. The following operators all operate on integers, to increase the potential for confusion.

C syntax    What it means
!a          not a
a && b      a and b
a || b      a or b

Note that a single `&' or a single `|' are also legal operators, but they perform bitwise logical operations on their arguments. This is another potential source of really strange errors in C programs.

Note also that `&&' and `||' are short-circuit operators as in Java.
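The following sketch shows how the single and double forms can disagree, and how short-circuit evaluation is typically put to use. The function names are hypothetical.

```c
/* Logical and: 1 if both a and b are nonzero, otherwise 0. */
int logical_and(int a, int b)
{
  return a && b;
}

/* Bitwise and: combines a and b one bit at a time. */
int bitwise_and(int a, int b)
{
  return a & b;
}

/* Short-circuit: n % d is only evaluated when d is nonzero,
 * so this never divides by zero. */
int divides(int n, int d)
{
  return d != 0 && n % d == 0;
}
```

Here logical_and(1, 2) is 1 because both arguments are nonzero, but bitwise_and(1, 2) is 0 because 1 and 2 have no bits in common.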

Special assignment operations

C syntax    What it means
i += j      i = i + j
i -= j      i = i - j
i *= j      i = i * j
i /= j      i = i / j
i %= j      i = i % j


Statements

As in Java, the semicolon is used as a statement terminator. (Some other languages, such as Pascal, use a semicolon as a statement separator.)

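It is also possible to join expressions with a comma: in e1, e2 the expression e1 is evaluated first and its value discarded, then e2 is evaluated and gives the value of the whole. The comma binds more tightly than the semicolon. This is most commonly used when writing for loops.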
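A typical use of the comma in a for loop might look like the sketch below, which steps two indices towards each other. The function name reverse is hypothetical.

```c
/* Reverse array[0...len-1] in place.
 * PRE: 0 <= len <= length of array */
void reverse(int array[], int len)
{
  int i, j, tmp;

  /* The comma lets the loop initialise and update two variables. */
  for (i = 0, j = len - 1; i < j; i++, j--) {
    tmp = array[i];
    array[i] = array[j];
    array[j] = tmp;
  }
}
```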


Block Statements

If S1, ... Sn are all statements, then {S1 ... Sn} is a block statement.


Variable Declarations

Variable declarations can be made in any block. In C90, they must occur at the beginning of the block, whereas in C99, as in Java, they can be interspersed with other statements. The variables only exist within that block. (To say that more formally: the lifetime of a variable is the immediately enclosing block.)

The syntax for variable declarations in C is:

(Declaration syntax diagram)

Examples: int x; char c = 'c', d;

Variable declarations may also be made at the top level, i.e. outside the scope of a block. These are called global variables (though that name is a little misleading).
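As a small illustration of these rules, the sketch below declares one variable at the top level and two inside nested blocks. The names counter and scope_demo are hypothetical.

```c
int counter = 0;      /* top-level ("global") variable */

/* Returns the value of the outer x after the inner block has finished. */
int scope_demo(void)
{
  int x = 1;

  {
    int x = 2;        /* a new variable that hides the outer x */
    counter = counter + x;
  }                   /* the inner x no longer exists here */

  return x;           /* the outer x is still 1 */
}
```

Both declarations of x come at the beginning of their block, so this is legal C90 as well as C99.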


Conditional Statements

As in Java.

Example:

if (x > y) z = x; else z = y;

Multi-way Conditionals using if

 if (i > 0)  { 
   printf("i is positive\n");
 } else if (i < 0) {
   printf("i is negative\n");
 } else {
   printf("i is zero\n");
 }

Multi-way Conditionals using switch

As in Java.

/* Place uppercase version of low in up. */
switch (low) {
  case 'a': up = 'A'; break;
  case 'b': up = 'B'; break;
   ...
  case 'z': up = 'Z'; break;
  default: up = low;
}

Note 1: Case values must be constant expressions (no variables). They are evaluated at compile time.

Note 2: As in Java, those break instructions are necessary, otherwise control drops through to the next case and executes it also. This is a `feature', not an error in the language...
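The effect of a missing break can be seen in this sketch, which counts how many of the case bodies are executed. The function name drop_through is hypothetical.

```c
/* Returns the number of case bodies executed for n.
 * Case 1 has no break, so control drops through into case 2. */
int drop_through(int n)
{
  int count = 0;

  switch (n) {
    case 1: count = count + 1;
    /* fall through */
    case 2: count = count + 1; break;
    case 3: count = count + 1; break;
    default: break;
  }
  return count;
}
```

Calling drop_through(1) executes two case bodies, whereas drop_through(2) and drop_through(3) execute one each.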


while Loops

As in Java.

 while (expression)
   statement

Execute statement while expression is true (that is, nonzero).

There is also a corresponding do loop:

do
  statement
while (expression);
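The practical difference between the two is that a do loop tests its expression after the body, so the body always runs at least once. A minimal sketch, with hypothetical function names:

```c
#include <assert.h>

/* Count the decimal digits of n using a do loop.
 * PRE: n >= 0 */
int count_digits_do(int n)
{
  int digits = 0;

  assert(n >= 0);
  do {
    digits = digits + 1;
    n = n / 10;
  } while (n != 0);   /* body runs at least once, so 0 has 1 digit */
  return digits;
}

/* The same count using a while loop. The test comes first, so for
 * n == 0 the body never runs and the result is 0.
 * PRE: n >= 0 */
int count_digits_while(int n)
{
  int digits = 0;

  assert(n >= 0);
  while (n != 0) {
    digits = digits + 1;
    n = n / 10;
  }
  return digits;
}
```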


for Loops

As in Java.

for (statement1; expression; statement2)
   statement3

An empty expression equates to true, so for(;;) is an infinite loop. Equivalently, you could write while(1).

In C99, as in Java, you can write for (int i = . . .). You can't do this in C90.
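As a rough guide, a for loop can be read as shorthand for a while loop (the two differ only if the body uses continue). The sketch below writes the same sum both ways; the function names are hypothetical and the comments refer to the parts of the for template above.

```c
/* Sum of 1 + 2 + ... + n using a for loop. */
int sum_for(int n)
{
  int i, total = 0;

  for (i = 1; i <= n; i = i + 1) {
    total = total + i;
  }
  return total;
}

/* The same loop written with while. */
int sum_while(int n)
{
  int i, total = 0;

  i = 1;                 /* statement1 */
  while (i <= n) {       /* expression */
    total = total + i;   /* statement3 */
    i = i + 1;           /* statement2 */
  }
  return total;
}
```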


The break and continue Statements

Saying break gives an early exit from a loop. (More precisely: it causes an immediate exit from the innermost enclosing loop or switch.) Saying continue skips immediately to the end of a loop body. (More precisely: it causes an immediate jump to just before the end of the body of the innermost enclosing loop.)
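Both statements appear in the sketch below, which adds up the positive values in an array but gives up at the first zero. The function name is hypothetical.

```c
/* Sum the positive values of array[0...len-1], stopping at the first zero.
 * PRE: 0 <= len <= length of array */
int sum_positive_until_zero(int array[], int len)
{
  int i, total = 0;

  for (i = 0; i != len; i = i + 1) {
    if (array[i] == 0) {
      break;             /* leave the loop altogether */
    }
    if (array[i] < 0) {
      continue;          /* skip the rest of the body; i = i + 1 still runs */
    }
    total = total + array[i];
  }
  return total;
}
```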


The goto Statement

This causes immediate transfer of control to a labelled location somewhere else in the code. This is almost never OK. Anything more than the most sparing use renders code incomprehensible, unpredictable, impossible to analyse...

The only conceivable acceptable use is in error handling. (Modern languages such as Java use exceptions.)
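For illustration only, here is a sketch of the error-handling idiom that is usually meant. It assumes the library functions malloc and free (memory allocation is covered in Lecture 23), and the function name make_and_use_buffers is hypothetical.

```c
#include <stdlib.h>

/* Allocate two buffers of n ints, use them, and release them.
 * Returns 0 on success and -1 on failure. Every exit path passes
 * through the labels at the end, so nothing allocated here is leaked. */
int make_and_use_buffers(size_t n)
{
  int status = -1;
  int *first = NULL;
  int *second = NULL;

  if (n == 0) goto done;

  first = malloc(n * sizeof *first);
  if (first == NULL) goto done;

  second = malloc(n * sizeof *second);
  if (second == NULL) goto free_first;

  first[0] = 1;
  second[0] = first[0] + 1;
  status = (second[0] == 2) ? 0 : -1;

  free(second);
free_first:
  free(first);
done:
  return status;
}
```

The jumps only ever go forwards, to cleanup code at the end of the function, which is what keeps this use readable.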

Optional: read Knuth's classic 1974 paper Structured programming with go to statements. What do you think?


Copyright © 2005, Jim Grundy & Ian Barnes & Richard Walker, The Australian National University
Version 2005.5, Monday, 2 May 2005, 13:34:29 +1000
Feedback & Queries to comp2100@cs.anu.edu.au