COMP2310/COMP6310:
Concurrent and Distributed Systems
Semester 2 2013

Laboratory 7

Sockets and UNIX Select


You are expected to use the man pages! Remember that man pages with the same title may appear in multiple sections. If you want to search all sections, use man -a.


Earlier in the course you were introduced to the Ada rendezvous communication model. This provided a message passing based communication primitive involving two tasks. Recall that the Ada rendezvous was asymmetric, in that one task posts an accept, while any other task could call on the associated entry. This communication pattern typifies that required by the client/server model.

In this lab we explore the Unix equivalent of rendezvous, namely sockets. You will also use the Unix equivalent of the Ada select statement, also conveniently called select(). However, you will find that implementing sockets and/or using Unix select is considerably more involved than it is with Ada.


Sockets


Server Side

In the previous lab, you encountered pipes as a means of providing interprocess communication (IPC) within the Unix environment. To establish a pipe connection it was required, however, that the processes involved had common ancestry. Sockets remove this requirement, enabling point to point connections between processes that may reside on either the same, or physically separate, computer systems.

Creating and using a socket involves several steps. First you must create it with a call to socket() (do man 2 socket):

       int socket(int domain, int type, int protocol);
As you may deduce from the above, sockets come in several flavors, selected by the domain, type and protocol arguments. The return value is a socket descriptor that is subsequently used to refer to this socket.
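
As a minimal sketch of this first step, the fragment below creates the kind of socket used in the rest of this lab. The helper name make_stream_socket is hypothetical, and the choice of AF_INET with SOCK_STREAM is an assumption based on the discussion that follows.

```c
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: create an IPv4 (AF_INET), connection-oriented
 * (SOCK_STREAM) socket. A protocol of 0 selects the default protocol
 * for that domain/type pair.
 * Returns the socket descriptor, or -1 on failure. */
int make_stream_socket(void)
{
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        perror("socket");
    return sockfd;
}
```

The returned descriptor is an ordinary small integer, just like the file descriptors returned by open() or pipe().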

Having created a socket we must associate it with an address. This is done using the bind() call and can be likened to assigning a telephone number to a telephone.

       int bind(int sockfd, const struct sockaddr *my_addr, socklen_t addrlen);
In the above we pass a pointer to an address structure. With the AF_INET address family two pieces of information are critical: (i) the host address and (ii) the port number. (You may remember port numbers from COMP2300; the file /etc/services lists the port numbers assigned to well-known services.) When we bind a socket we will leave it up to the O/S to assign us a free port number.
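
One way to let the O/S choose the port, sketched here under stated assumptions, is to bind to port 0 and then ask which port was assigned using getsockname(). The helper name bind_to_free_port is hypothetical and is not taken from the lab code.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: bind sockfd to any local interface and let the
 * O/S pick a free port.
 * Returns the assigned port in host byte order, or -1 on failure. */
int bind_to_free_port(int sockfd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY); /* any local interface */
    addr.sin_port = htons(0);                 /* 0 asks the O/S to choose */

    if (bind(sockfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        return -1;
    }
    /* Read back the address actually bound, including the chosen port. */
    if (getsockname(sockfd, (struct sockaddr *)&addr, &len) < 0) {
        perror("getsockname");
        return -1;
    }
    return ntohs(addr.sin_port);
}
```

A server would typically print the returned port number so that it can be passed on to a client.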

For SOCK_STREAM sockets, a number of incoming connection requests can be queued, after which point further requests may be refused or ignored. Normally the default number of queued requests is 5, but there is no requirement that this be true. As a consequence it is good practice to always explicitly set the maximum number of queued requests. To do this we use the listen() function.

       int listen(int s, int backlog);
With the socket created, bound to an address and with a non-zero incoming queue limit, we are now in a position to establish point to point communications with other processes. This is done using the accept() call:

       int accept(int s, struct sockaddr *addr, socklen_t *addrlen);
This extracts the first connection request on the queue of pending connections, creates a new connected socket with mostly the same properties as the original, and allocates it a new file descriptor. Note that accept() can be either blocking or non-blocking, depending on the attributes of the socket. For our purposes, we will use only blocking sockets.
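
The listen() and accept() steps might be wrapped as in the sketch below. The names start_listening, accept_client and BACKLOG are hypothetical, and the queue limit of 5 is only an illustrative value. The socket passed in is assumed to have been created and bound already.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define BACKLOG 5 /* illustrative queue limit, an assumption of this sketch */

/* Hypothetical helper: mark a bound socket as willing to accept
 * connections. Returns 0 on success, -1 on failure. */
int start_listening(int sockfd)
{
    if (listen(sockfd, BACKLOG) < 0) {
        perror("listen");
        return -1;
    }
    return 0;
}

/* Hypothetical helper: block until a client connects.
 * Returns the new connected descriptor, or -1 on failure.
 * The listening socket itself stays open for further accept() calls. */
int accept_client(int sockfd)
{
    struct sockaddr_in peer;      /* filled in with the client's address */
    socklen_t len = sizeof(peer);
    int connfd = accept(sockfd, (struct sockaddr *)&peer, &len);
    if (connfd < 0)
        perror("accept");
    return connfd;
}
```

Note that all subsequent communication with the client uses the descriptor returned by accept(), not the original listening socket.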

Client Side

In the above, the server has created a socket and is waiting to accept incoming connections. For the client to connect to that socket it must first create its own socket in an identical fashion. Explicitly binding the client socket to an address and port number is optional; if bind() is omitted, the O/S assigns a local address and port when the connection is made. To contact the server, we then use the connect() function:

       int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen);
In this call sockfd is the socket descriptor on the client side, while serv_addr is the address (host id and port number) on the server side. The obvious question - that we will return to, but you may want to consider - is how does the client know the port number of the server if that port number was dynamically assigned?
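
A client-side sketch of these steps is given below. It assumes the server runs on the same machine and is reached through the loopback address INADDR_LOOPBACK, and that the server's port number is already known to the caller. The helper name connect_to_local_port is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: connect to a server on this machine that is
 * listening on `port` (given in host byte order).
 * Returns a connected descriptor, or -1 on failure. */
int connect_to_local_port(unsigned short port)
{
    struct sockaddr_in serv_addr;
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) {
        perror("socket");
        return -1;
    }

    memset(&serv_addr, 0, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* 127.0.0.1 */
    serv_addr.sin_port = htons(port);

    if (connect(sockfd, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) < 0) {
        perror("connect");
        close(sockfd);
        return -1;
    }
    return sockfd;
}
```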

Communicating

Having established a socket connection, data can be communicated using a variety of system calls such as send() and recv(), or write() and read(). These are essentially analogous to the calls used for reading and writing a file. Moreover, just as with file I/O, we can terminate the socket connection by issuing a close() system call.
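
To show that connected socket descriptors behave like file descriptors, the sketch below writes a string to one end of a connected pair and reads it back from the other. For brevity it obtains the pair with socketpair(), which is a convenience assumed here and not part of the lab's client/server code. The helper name loopback_message is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: send msg (including its terminating '\0') over
 * one end of a connected socket pair and read it back from the other
 * end into reply.
 * Returns the number of bytes received, or -1 on failure. */
ssize_t loopback_message(const char *msg, char *reply, size_t reply_size)
{
    int sv[2];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        perror("socketpair");
        return -1;
    }
    if (write(sv[0], msg, strlen(msg) + 1) < 0) {
        perror("write");
        n = -1;
    } else {
        n = read(sv[1], reply, reply_size);
        if (n < 0)
            perror("read");
    }
    /* Closing a socket descriptor works exactly as for a file. */
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With a flags argument of 0, send() and recv() could be substituted for write() and read() here. Remember that a stream socket does not preserve message boundaries, so in general a single read() may return fewer bytes than were written.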

A Classic Server Program

The following code creates a socket and binds it to the default Internet interfaces with a port number that is determined by the O/S. It then waits for an incoming connection on that port number. When a connection is established, it receives an incoming character string, prints it, then appends to it and returns the string to the client.

Note that because we are using Internet-based communications, the client and server may run on machines with different byte ordering (big and little endian). An IPv4 Internet address is represented as a 4 byte integer, so there would be obvious problems if one machine stored the IP address of partch.anu.edu.au in memory as 150.203.24.13, while another effectively had it stored as 13.24.203.150. To remove this difficulty, TCP/IP defines what is called network byte ordering (which is in fact big endian) and assumes all IP related data uses this representation. In recognition of this, Unix provides a variety of functions to convert between "host byte ordering" (be that big or little endian) and "network byte ordering". These are used in the following program. For more information do e.g. man ntohs.
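
As a small standalone illustration of these conversions (separate from the lab's server program), the sketch below stores an IPv4 address in network byte order and exposes the bytes as they sit in memory. The helper names address_bytes and port_round_trip are hypothetical.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical helper: build the IPv4 address a.b.c.d as a 32-bit
 * integer, convert it to network byte order, and copy its four bytes,
 * in memory order, into out[0..3]. */
void address_bytes(uint8_t a, uint8_t b, uint8_t c, uint8_t d, uint8_t out[4])
{
    uint32_t host = ((uint32_t)a << 24) | ((uint32_t)b << 16) |
                    ((uint32_t)c << 8)  |  (uint32_t)d;
    uint32_t net = htonl(host);     /* host order -> network order */
    memcpy(out, &net, sizeof(net)); /* inspect the bytes as stored */
}

/* Hypothetical helper: convert a port number to network order and
 * back again; the result equals the input on any host. */
uint16_t port_round_trip(uint16_t port)
{
    return ntohs(htons(port));
}
```

Calling address_bytes(150, 203, 24, 13, out) leaves the bytes 150, 203, 24, 13 in out on both big and little endian hosts, because htonl() always produces the big endian layout.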

A Classic Client Program

The following code is the client companion of the server program. Again we create a socket, but to make a connection we need to know the IP address and the port number of the server program. As we will be running the client and server on the same physical computer we can again use INADDR_ANY as the IP address. The port number is, however, required as command line input. Thus this program can only be run after the server program has been started and has printed out its assigned port number.

ACTIONS


The select Function


Now for the really tricky stuff. The function select() enables a process to wait for a change of status on any one of a variety of file descriptors. It also enables the process to set time limits, similar to the use of "or delay" with the Ada select construction. The file descriptors of interest are grouped into what are called sets, with separate sets for read, write and exceptions. The format is as follows:

       int select(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
The parameter n is one higher than the highest-numbered file descriptor to watch. On return the function reports the number of file descriptors that are ready, and each set is overwritten so that it contains only those descriptors. The user must therefore test each descriptor of interest in turn to determine what has happened, and must rebuild the sets before calling select() again.

To manipulate the file descriptor sets the user is provided with a variety of macros: FD_CLR, FD_ISSET, FD_SET and FD_ZERO. For complete details, do man select.
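
The macros and the timeout argument can be combined as in the following sketch, which waits a limited time for a single descriptor to become readable. The helper name wait_readable is hypothetical, and watching only one descriptor is a simplification made for illustration.

```c
#include <stdio.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to `seconds` for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int seconds)
{
    fd_set readfds;
    struct timeval timeout;
    int ready;

    FD_ZERO(&readfds);      /* start with an empty set */
    FD_SET(fd, &readfds);   /* watch just this descriptor */
    timeout.tv_sec = seconds;
    timeout.tv_usec = 0;

    /* First argument is the highest watched descriptor plus one. */
    ready = select(fd + 1, &readfds, NULL, NULL, &timeout);
    if (ready < 0) {
        perror("select");
        return -1;
    }
    return (ready > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

Passing NULL instead of a timeout makes select() wait indefinitely, while a timeout of zero seconds turns the call into a non-blocking poll.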

Below is an example of using select() with pipes. The program forks two children. Each child sends a request to the parent, who responds by sending them the value of a counter. After receiving the counter, they sleep for either 1 or 2 seconds. The parent terminates everything after the counter has reached a value of 20.

 
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#define MAXBUF 32
#define NUM_CHILD 2
#define COUNT_MAX 20
     
int main(int argc, char* argv[]) {
  int fd_pc[NUM_CHILD][2], fd_cp[NUM_CHILD][2];
  int maxfd;
  fd_set fd_read_set;
  char buffer[MAXBUF];
  int i, counter;
      
  /* ----Create the pipes---- */
  for (i=0; i < NUM_CHILD; i++) {
    if (pipe(fd_pc[i]) || pipe(fd_cp[i])) {
      perror("pipe create");
      exit(-1);
    }
  }
      
  for (i=0; i < NUM_CHILD; i++) {
    /* ----Create child i ---- */
    if (fork()==0) { /* ---- perform child i  ---- */
      int count;
      close(fd_pc[i][1]); /*Close parent to child write*/
      close(fd_cp[i][0]); /*Close child to parent read*/
      sprintf(buffer, "Request from child %d", i);
      do {
        write(fd_cp[i][1], buffer, MAXBUF);
        read(fd_pc[i][0], &count, sizeof(count));
        printf("Value of counter on child %d: %d\n", i, count);
        fflush(stdout);
        sleep(i+1);
      } while (count > 0);
      printf("Child %d terminates\n", i);
      fflush(stdout);
      exit(0);
    }
  }
 
  /* ----Now for the parent---- */
  for (i=0; i < NUM_CHILD; i++) {
    close(fd_pc[i][0]); /*Close parent to child read*/  
    close(fd_cp[i][1]); /*Close child to parent write*/
  }
      
  /* ----What is the maximum file descriptor for select---- */
  maxfd = fd_cp[0][0];
  for (i=1; i < NUM_CHILD; i++)
    if (fd_cp[i][0] > maxfd)
      maxfd = fd_cp[i][0];
  maxfd = maxfd + 1;
      
  counter = 0;
  /* ----Clear the read set---- */
  FD_ZERO(&fd_read_set);
  while (counter < COUNT_MAX) {
    /* ----Set file descriptor set to NUM_CHILD read descriptors---- */
    for (i=0; i < NUM_CHILD; i++) {
      FD_SET(fd_cp[i][0], &fd_read_set);
    }

    /* ----Wait in select until file descriptors change---- */
    if (select(maxfd, &fd_read_set, NULL, NULL, NULL) < 0) {
      perror("select");
      exit(-1);
    }

    for (i=0; i < NUM_CHILD; i++) {
      /* ----Was it child i---- */
      if (FD_ISSET(fd_cp[i][0], &fd_read_set)) {
        read(fd_cp[i][0], buffer, MAXBUF);
        printf("%s\n", buffer);
        counter++;
        write(fd_pc[i][1], &counter, sizeof(counter));
      }
    }
  }
    
  /* ----Terminate the children with waiting---- */
  counter = -1;
  for (i=0; i < NUM_CHILD; i++)
    write(fd_pc[i][1], &counter, sizeof(counter));
  for (i=0; i < NUM_CHILD; i++)
    wait(NULL);
  exit(0);
}

ACTIONS