COMP2310/COMP6310:
Concurrent and Distributed Systems
Semester 2 2012

Laboratory 7

Fighting Zombies with Forks and Pipes


You are expected to use the man pages! Remember that man pages with the same title may appear in multiple sections. If you want to search all sections use man -a.


Process Identifiers


Every Unix process is associated with a unique non-negative integer, its process ID (PID). Some process IDs are special: process ID 0 is usually associated with the scheduler, while process ID 1 (remember this number) is usually the init process, which is invoked by the kernel at the end of the bootstrap procedure. Two useful functions that we will use in this lab are:
pid_t getpid(void);
pid_t getppid(void);
ACTIONS

The fork() Function


The only way a new process is created by the Unix kernel is when an existing process calls the fork() function. This function is called once but returns twice, once in the parent and once in the child process. The two processes are near-identical copies, and the way to tell them apart is the return value: in the parent fork() returns the PID of the child, while in the child it returns 0. (If no child could be created, fork() returns -1 to the caller.) A small example is given below:
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
int main(void) {
  pid_t fork_pid;
  printf("Hello from main\n");
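  fork_pid = fork();   /* returns twice: once in the parent, once in the child */
  /* both processes execute this line; pid_t is cast for a portable printf */
  printf("After fork, fork_pid=%ld\n", (long) fork_pid);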
  return 0;
}
ACTIONS

The wait() and waitpid() Functions


When a process terminates, either normally or abnormally, the kernel notifies the parent by sending it the SIGCHLD signal. (What's a signal you ask - do man 7 signal, and/or ask your tutor). The parent can choose to ignore this signal, or it can provide a function that is called when the signal occurs (a signal handler function). The default action is for this signal to be ignored. The functions wait() and waitpid() permit the parent to check on the termination status of its children. These functions do one of the following: block, if all of the children are still running; return immediately with the termination status of a child, if a child has terminated and is waiting for its status to be collected; or return immediately with an error, if the process has no children.
ACTIONS
In the following code, we call fork() so that there are two processes. The child process is forced to sleep for some period (5 seconds - see man 3 sleep). To verify that the process is sleeping we have inserted a number of timestamps.

Zombies


In Unix terminology, a process that has terminated, but whose parent has not yet waited for it, is called a zombie, i.e. it is a process that is not quite dead - it can no longer execute instructions, but information concerning this process is still taking up space in the process tables. The following code creates a zombie child.

ACTIONS
Consider the following program:

#include <unistd.h>
#include <stdio.h>
int main(void) {
  printf("Hello from main \n");

  if (fork() != 0) {
    /* THIS IS THE PARENT */
    printf("Hello from Parent\n");
    sleep(4);
  } else {
    /* THIS IS THE CHILD */
    printf("Parent PID of Child before sleep %ld\n", (long) getppid());
    sleep(2);
    printf("Parent PID of Child after sleep  %ld\n", (long) getppid());
  }
  }
  return 0;
}

The exec Functions


As demonstrated above, fork() enables the user to create new processes. These processes are, however, clones of the calling process. If we want to create a new process running a different executable we have to follow the initial fork() call with a call to one of the exec family of functions. Members of this family differ essentially in how they locate the executable, how they take command line arguments, and how they treat the current execution environment (do man exec to view the details). When a process calls one of the exec functions, that process is completely replaced by the new program, and the new program starts executing at its main() function. The process ID does not change, but the contents of the current process (its text, data, heap and stack segments) are replaced with a brand new program from disk.

The following gives a short example of the use of exec().

ACTIONS


The pipe() Function


"Unix IPC (Interprocess Communication) has been, and continues to be, a hodgepodge of different approaches, few of which are portable across all Unix implementations. ... About the only form of IPC that we can count on, regardless of the Unix implementation, is half duplex pipes." (Stevens, Advanced Programming in the UNIX Environment).

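Although common to all flavors of Unix, pipes have two limitations: they are half duplex, so data flows in only one direction, and they can be used only between processes that have a common ancestor.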

(make sure you know what the above means - it was part of an exam question).

A pipe is like an I/O buffer administered by the kernel through which data can flow. We create a pipe by a call to the pipe() function. Two file descriptors are returned, one that is used for reading and one that is used for writing to the pipe. The following illustrates a pipe in a single process.

The program declares two character strings that are initialized to some random strings. The program writes character string string1 to the pipe and then reads from the pipe into character string string2.

ACTIONS

In a single process a pipe is next to useless. Normally pipes are used together with fork() to provide an IPC channel between a parent and a child process. To do this two pipes (A and B) are created before the process issues a fork() call - this gives rise to 4 file descriptors: fd_A_read, fd_A_write, fd_B_read and fd_B_write. After forking, both the parent and the child have copies of all these file descriptors. To turn pipe A into a channel that provides communication from parent to child, the parent closes fd_A_read and the child closes fd_A_write; to make it a channel from child to parent, the parent closes fd_A_write and the child closes fd_A_read.

Thus with two pipes we can use one for communication from parent to child (e.g. pipe A) and the other (e.g. pipe B) for communication from child to parent. This is illustrated in the following program.
ACTIONS

Next Lab

Select and Sockets in Unix