Unit introduction
  info handout
  Syllabus and topics
Lecture notes
  Module 1
    Lecture 1.1
  Module 2
    Lecture 2
    Lecture 3
    Lecture 4
Assignments
  assignment 1
    specification
    additional notes
    assessment results
  assignment 2
    ... similarly
Survey results
Discipline rules on use of IT services
These are static structuring methods.
Many sites employ more dynamic techniques - and can become so interactive and dynamic that they no longer have a static structural map.
This information site supports the INFS3056 unit at ANU.
INFS3056 Course introduction handout
Syllabus and lecture topics
Lecture notes
Assignments
Laboratory/tutorial Exercises
You may be interested in the entry skill survey of students.
Web pages created by members of the class
Select initial letter of author's name:
A B C D E F G H I J K L ...
- also search index:
This is a search index. Enter search keywords:
See http://info.anu.edu.au/elisa/elibrary/indexes1.html for example.
See also RFC 2068 (HTTP/1.1), Tanenbaum section 7.6.2, and many other books about the Web.
HTTP is a simple protocol with few message types.
It assumes a reliable transport service (usually TCP/IP, but not necessarily).
There are two basic kinds of message: request and response
(all the interest is in the details of the sub-formats within these).
The protocol allows for one request and one response within a connection
established at the transport layer.
Then both client and server close the connection.
client   (CONNECT)       *  REQUEST        *  (CLOSE)
              \         /       \         /
               \       /         \       /
                \     /           \     /
                 \   /             \   /
server             *  (ACCEPT)       *  RESPONSE  (CLOSE)
The request is received by the server which sends a response.
It's not really that simple...
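The exchange in the diagram can be sketched in Python. This is a minimal illustration rather than a real server: `socket.socketpair()` stands in for the transport connection, and the function names `serve_once` and `fetch_once`, the path, and the response body are assumptions made up for the example.

```python
import socket
import threading

def serve_once(conn):
    """Toy server side: read one request, send one response, close."""
    with conn:
        request = b""
        while b"\r\n\r\n" not in request:   # headers end at the blank line
            chunk = conn.recv(1024)
            if not chunk:
                break
            request += chunk
        body = b"<html><body>hello</body></html>"
        conn.sendall(
            b"HTTP/1.0 200 Document follows\r\n"
            b"Content-type: text/html\r\n"
            b"Content-length: " + str(len(body)).encode("ascii") + b"\r\n"
            b"\r\n" + body
        )

def fetch_once(conn, path):
    """Toy client side: send one GET, then read until the server closes."""
    with conn:
        conn.sendall(b"GET " + path.encode("ascii") + b" HTTP/1.0\r\n\r\n")
        response = b""
        while True:
            chunk = conn.recv(1024)
            if not chunk:                   # empty read: the peer has closed
                break
            response += chunk
    return response

# A connected pair of sockets plays the role of the transport connection.
client_end, server_end = socket.socketpair()
server = threading.Thread(target=serve_once, args=(server_end,))
server.start()
response = fetch_once(client_end, "/index.html")
server.join()
print(response.split(b"\r\n", 1)[0].decode("ascii"))
```

The client knows the response is complete only because the server closes its end, which is why each exchange in this model needs its own connection.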
An HTTP message in either direction is in the form
start line *message-header CRLF [message-body]
The start line denotes the type of message:
request or
status (response)
- see below
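As a rough sketch of the message form above, the following Python splits a raw message at the blank line (the lone CRLF) and decides from the start line whether it is a request or a response. The helper names `split_message` and `message_kind` are hypothetical, and the sketch assumes the whole message is already in memory.

```python
def split_message(raw):
    """Split raw HTTP message bytes into (start_line, header_lines, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")   # blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    return lines[0], lines[1:], body

def message_kind(start_line):
    """A status line begins with the HTTP version; a request line ends with it."""
    return "response" if start_line.startswith("HTTP/") else "request"

raw = (b"HTTP/1.0 404 Not Found\r\n"
       b"Server: NCSA/1.5.2\r\n"
       b"Content-type: text/html\r\n"
       b"\r\n"
       b"<BODY><H1>404 Not Found</H1></BODY>")
start_line, header_lines, body = split_message(raw)
print(message_kind(start_line), header_lines, body)
```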
The message-header lines are all in the same form as header lines in
RFC 822 (email):
name ":" [field-value] CRLF
e.g.
Date: Mon, 04 Aug 1997 08:08:02 GMT
Last-modified: Tue, 25 Feb 1997 06:43:22 GMT
Server: NCSA/1.5.2
Content-type: text/html
Content-Length: 8247
and carry various information about the message.
Requests and responses are each expected to carry a somewhat different set of fields.
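A header line in the `name ":" [field-value]` form can be taken apart with a single split at the first colon. The sketch below is illustrative only: `parse_header_line` is a made-up helper, and lower-casing the name reflects the fact that field names are case-insensitive (note the mix of `Content-type` and `Content-Length` in the example above).

```python
def parse_header_line(line):
    """Return (name, value) for one header line, with the name lower-cased."""
    name, sep, value = line.partition(":")    # split at the FIRST colon only
    if not sep:
        raise ValueError("not a header line: %r" % line)
    return name.strip().lower(), value.strip()

lines = [
    "Date: Mon, 04 Aug 1997 08:08:02 GMT",
    "Content-type: text/html",
    "Content-Length: 8247",
]
headers = dict(parse_header_line(line) for line in lines)
print(headers)
```

Splitting only at the first colon matters because values such as dates contain colons themselves.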
A request message is in the form of a Request line, optional header lines, and an optional message body.
For example, a simple request is
GET     /index.html  HTTP/1.1      CRLF
method  URI          HTTP-version  end of line
The URI (Uniform Resource Identifier) here may also be a full URL (Uniform Resource Locator), e.g.
GET http://challender.anu.edu.au/index.html HTTP/1.1
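Both forms of request line can be handled by the same small parser. The following is a sketch under the assumption that the line has exactly three space-separated parts; `parse_request_line` is a hypothetical name, and `urlsplit` from the standard library is used to separate the host from the path when a full URL is given.

```python
from urllib.parse import urlsplit

def parse_request_line(line):
    """Split a request line into method, host (if any), path and version."""
    method, uri, version = line.split()
    parts = urlsplit(uri)
    return {
        "method": method,
        "host": parts.netloc or None,   # None when only a path was sent
        "path": parts.path,
        "version": version,
    }

short_form = parse_request_line("GET /index.html HTTP/1.1")
full_form = parse_request_line(
    "GET http://challender.anu.edu.au/index.html HTTP/1.1")
print(short_form)
print(full_form)
```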
Of the request methods, the most widely used are GET, HEAD and POST.
GET (no message body, no header lines required):
requests a document to be sent from the server.
The response may be the document itself
HTTP/1.0 200 Document follows
Date: Mon, 04 Aug 1997 08:18:10 GMT
Server: NCSA/1.5.2
Last-modified: Tue, 25 Feb 1997 06:43:22 GMT
Content-type: text/html
Content-length: 209

<html><head>
<title>ANU DCS Information for Students</title>
</head>
<body> here is the body of a very simple document </body>
</html>
Note:
- the start line includes the response code (200) and its meaning for human readers (Document follows);
- the header lines describe the document and server properties: date, content type etc.;
- the body of the document is HTML in this case.
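The start line of a response can be split into its three parts as follows. This is a minimal sketch with a hypothetical helper name, `parse_status_line`; it limits the split to two spaces so that a multi-word reason phrase stays in one piece.

```python
def parse_status_line(line):
    """Return (version, code, reason) from a response start line."""
    version, code, reason = line.split(" ", 2)   # reason may contain spaces
    return version, int(code), reason

version, code, reason = parse_status_line("HTTP/1.0 200 Document follows")
print(version, code, reason)
```

A client acts on the numeric code; the reason phrase is only there for people reading the exchange.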
Error responses may still carry a message body: for example, the dreaded 404:
HTTP/1.0 404 Not Found
Date: Mon, 04 Aug 1997 08:29:48 GMT
Server: NCSA/1.5.2
Content-type: text/html

<HEAD><TITLE>404 Not Found</TITLE></HEAD>
<BODY><H1>404 Not Found</H1>
The requested URL /fred.html was not found on this server.
</BODY>
HEAD (no message body, no header lines required):
requests only the header information, usually to determine the date-modified or whether the document exists.
POST:
sends information from the client to the server in the form of a message body (more later).
Response status codes are all three-digit decimal numbers.
The content of the response includes headers (as shown above)
and a message body - usually in the form of a document for display by the client.
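RFC 2068 groups the three-digit codes into five classes according to the first digit. The lookup below is a small sketch of that grouping; the class names are the ones used in the RFC, while `STATUS_CLASSES` and `status_class` are names invented for this example.

```python
# Class of response, keyed by the first digit of the status code (RFC 2068).
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def status_class(code):
    """Return the class name for a three-digit status code."""
    if not 100 <= code <= 599:
        raise ValueError("not a valid status code: %r" % code)
    return STATUS_CLASSES[code // 100]

print(status_class(200), "/", status_class(404))
```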
A server may interpret any URL as referring to either a data file or a program to execute.
The configuration of the server determines what it does with a URL.
The common case is that URLs with a pathname that includes the server directory cgi-bin refer to a program for execution.
For example,
http://challender/cgi-bin/openday/webpages/webpage.pl
is in fact the pathname of an executable script (in the Perl language) that generates the HTML document for display.
The server inspects the type of file and the context, and either returns headers
and the file data as the message,
or executes the program and returns headers (for both GET and HEAD) and the program's output
as the body of the message (GET only).
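The decision just described can be sketched as a small function. It is only an illustration of the cgi-bin convention: a real server takes this from its configuration, and `plan_response` and its return values are assumptions made for the example.

```python
from pathlib import PurePosixPath

def plan_response(method, url_path):
    """Return (action, send_body) for a GET or HEAD request."""
    # Treat any path with a cgi-bin directory component as a program.
    is_program = "cgi-bin" in PurePosixPath(url_path).parts
    action = "execute program" if is_program else "send file data"
    send_body = method == "GET"          # HEAD gets the headers only
    return action, send_body

print(plan_response("GET", "/cgi-bin/openday/webpages/webpage.pl"))
print(plan_response("HEAD", "/index.html"))
```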
A URL may contain an optional query part (best kept short: RFC 2068 cautions that some older implementations may not handle URIs longer than 255 bytes):
http://challender/cgi-bin/dostuff?widget+wadget+boff
The server interprets the query string (following "?") as data
to be given to the executing program
- passed via an environment variable QUERY_STRING in UNIX.
If the item is a data file then the query part is ignored.
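From the program's side, the query arrives as a single string. The sketch below shows how a script written in Python might recover the keywords from the example URL; `query_keywords` is a hypothetical helper, and decoding percent-escapes with `unquote` is an addition not discussed above.

```python
import os
from urllib.parse import unquote, urlsplit

def query_keywords(environ=os.environ):
    """Return the '+'-separated keywords found in QUERY_STRING."""
    query = environ.get("QUERY_STRING", "")
    return [unquote(word) for word in query.split("+") if word]

# What the server would extract from the URL and place in the environment:
url = "http://challender/cgi-bin/dostuff?widget+wadget+boff"
fake_environ = {"QUERY_STRING": urlsplit(url).query}
keywords = query_keywords(fake_environ)
print(keywords)
```

Passing the environment as a parameter keeps the function easy to try out without a web server.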
POST provides a more powerful way for a client to interact with the server.
A POST message includes a message body, which can contain data of unlimited length.
Its interpretation is given in [RFC 2068].
To "provide a block of data to a process", the URL is interpreted by the
server in the same way as for GET (it identifies a program by pathname),
but the data is delivered to the program (if any) via its standard input
stream (like reading from a keyboard or an input file).
As with GET, the response may be a further document, a document dynamically constructed by the program, a simple success response, or an error...
The response is generated by the program writing HTML text to standard output.
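A program on the receiving end of a POST might look like the sketch below. It assumes the usual CGI conventions, which are not spelled out in these notes: the server gives the body length in the `CONTENT_LENGTH` environment variable, and the program writes a `Content-type` line and a blank line before the HTML. The name `handle_post` is hypothetical, and text streams are used instead of bytes to keep the example short.

```python
import html
import io

def handle_post(stdin, stdout, environ):
    """Read the posted data from stdin and write an HTML reply to stdout."""
    # CONTENT_LENGTH is assumed to be supplied by the server (CGI convention).
    length = int(environ.get("CONTENT_LENGTH") or 0)
    data = stdin.read(length)
    stdout.write("Content-type: text/html\r\n\r\n")
    stdout.write("<html><body><p>Received: %s</p></body></html>\n"
                 % html.escape(data))

# Stand-ins for the streams and environment a server would provide.
demo_out = io.StringIO()
handle_post(io.StringIO("name=fred"), demo_out, {"CONTENT_LENGTH": "9"})
print(demo_out.getvalue())
```

In a deployed script the same function would be called with `sys.stdin`, `sys.stdout` and `os.environ`.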
Last modified: Tue Mar 30 11:26:19 EST 1999