TDC 561: Network Programming, Winter 08

 

Assignment 2: Distributed Database System (I/O Multiplexing Implementation)

 

Due: 5:45pm, 28 February 2008

 

 

Assignment Description

In this network programming assignment, you are asked to develop a simple distributed database system that provides a query interface over data repositories distributed across multiple sites, as described in Assignment #1, but using I/O multiplexing instead of multiple processes to provide concurrency. The system consists of a database client and a database server. The database client (simddb2) provides a query interface for users to add, delete, and retrieve information. The database server (simddbd2) implements the database engine that receives and processes the client queries and sends responses. The system may involve multiple servers and multiple clients, and the data may be distributed over multiple simddbd2 servers. The client simddb2 needs to fragment each user query (UQ) into multiple queries (CQ) and send each one to the appropriate server. We will use the database schema of a simple course registration system. The database has three tables: Student Table<Name, SSN>, Course Table<SSN, course_code>, and Location Table<course_code, location>, where the underlined attributes are the key attributes. Each table exists in one simddbd2 server. The following is the design specification of both simddb2 and simddbd2:

 

Distributed Database Client—simddb2: (50%)

 

Simddb2 establishes a TCP connection with each simddbd2 server. The client reads the configuration of the database servers from a file called ddbconfig.txt which is formatted as follows:

 

<number of servers>

<port number of server 1> <db schema in server1>

..

<port number of server n> <db schema in server n>

 

Example of ddbconfig.txt is as follows:

---------------------------------------------

3

5000 name ssn

6000 ssn course_code

7000  course_code location

--------------------------------------------------
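As an illustration, the configuration file could be read along the following lines. This is only a sketch: the names ServerEntry, load_config, and MAX_SERVERS are hypothetical, and it assumes that the first line holds the number of server lines that follow and that each schema consists of exactly two attribute names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_SERVERS 16   /* assumed upper bound; the handout gives none */
#define NAME_LEN    32   /* matches the char[32] attribute size */

/* One server line of ddbconfig.txt: a port plus the two attributes it stores. */
typedef struct {
    int  port;
    char key_attr[NAME_LEN];    /* first attribute on the line */
    char other_attr[NAME_LEN];  /* second attribute on the line */
} ServerEntry;

/* Reads the server count, then one entry per server.
 * Returns the number of entries read, or -1 if the file is malformed. */
int load_config(FILE *fp, ServerEntry *out, int max)
{
    int n, i;

    assert(fp != NULL && out != NULL);
    if (fscanf(fp, "%d", &n) != 1 || n < 1 || n > max)
        return -1;
    for (i = 0; i < n; i++) {
        /* %31s leaves room for the terminating NUL in a char[32] */
        if (fscanf(fp, "%d %31s %31s", &out[i].port,
                   out[i].key_attr, out[i].other_attr) != 3)
            return -1;
    }
    return n;
}
```

The client would call load_config() on the file opened with fopen("ddbconfig.txt", "r") and then open one TCP connection per returned entry.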

 

Then the client connects to each database server in the config file. The client then reads the UQ from STDIN, fragments it into CQs, and sends each one as a message to the corresponding server. The message formats are as follows:

 

Query-Request or Delete-Request = <ReqType><KeyAttributeValue>

Insert-Request or Query-Response = <ReqType><KeyAttributeValue><NumberOfAttributes><Other attributes>

such that

 

<ReqType>::= (integer) -1, 0, 1, 2, 3, or 4, meaning Error, Confirmation, Query, Delete, Insert, and Response respectively

<KeyAttributeValue>::= (char[32]) string value of the key attribute

<NumberOfAttributes>::= (integer) maximum is 10

<Other attributes>::= array of structures each of char[32]

 

The server always sends a confirmation or error message for each Insert or Delete request. The format of these messages is as follows:

 

<ReqType><ErrorDescription>

 

such that

 

<ReqType>::= 0 for Insert and Delete Confirmation, or -1 for Error

<ErrorDescription>::= (char[128]) text that explains the reason for the error. A Confirmation message will have a null string
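One possible way to put a confirmation or error message on the wire is sketched below. The function names and the choice of a 4-byte integer in network byte order are assumptions for illustration; the handout only fixes the field order and the char[128] description.

```c
#include <arpa/inet.h>  /* htonl, ntohl */
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { REQ_ERROR = -1, REQ_CONFIRM = 0 };  /* values from <ReqType> above */
#define DESC_LEN        128
#define STATUS_WIRE_LEN (4 + DESC_LEN)     /* assumed: 4-byte type + description */

/* Writes <ReqType><ErrorDescription> into buf (STATUS_WIRE_LEN bytes).
 * A NULL desc yields the all-zero (null string) description. */
void encode_status(int reqtype, const char *desc, unsigned char *buf)
{
    uint32_t net = htonl((uint32_t)reqtype);

    assert(buf != NULL);
    memcpy(buf, &net, 4);
    memset(buf + 4, 0, DESC_LEN);
    if (desc != NULL)
        strncpy((char *)buf + 4, desc, DESC_LEN - 1);
}

/* Reads the message back; desc must hold DESC_LEN bytes. Returns the ReqType. */
int decode_status(const unsigned char *buf, char *desc)
{
    uint32_t net;

    memcpy(&net, buf, 4);
    memcpy(desc, buf + 4, DESC_LEN);
    desc[DESC_LEN - 1] = '\0';           /* never trust the peer to terminate */
    return (int)(int32_t)ntohl(net);
}
```

Converting the integer with htonl()/ntohl() keeps the message readable if client and server ever run on machines with different byte orders.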

 

After the client connects to all servers, it sends each server the appropriate CQ and reads the response. It uses this response to construct the next CQ, and so on. For example, suppose the user query (UQ) requests the course locations for all courses taken by “Jimmy”. Then CQ#1 (to server 1) will be <Jimmy>, which returns SSN=123; CQ#2 (to server 2) will be <123>, which returns course_code=TDC561; and CQ#3 (to server 3) will be <TDC561>, which returns the location “Luis 1216”. The client continuously reads user queries from STDIN until the user hits <Ctrl><C>. In this case, the client shuts down the TCP connections. The user query interface (UQI) is simply as follows:
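The chaining in the “Jimmy” example can be sketched as a small planning step that decides which servers to contact and in what order. The names Schema and plan_chain are hypothetical, and the sketch assumes that each server is keyed on the first attribute of its schema and that attribute names are compared without regard to case.

```c
#include <assert.h>
#include <strings.h>  /* strcasecmp (POSIX) */

/* Schema of one server: it maps key_attr values to other_attr values. */
typedef struct {
    const char *key_attr;
    const char *other_attr;
} Schema;

/* Finds the sequence of servers that leads from attribute `from` to
 * attribute `to`. Stores server indices in hops[] (room for n entries)
 * and returns the number of hops, or -1 if no chain exists. */
int plan_chain(const Schema *s, int n, const char *from, const char *to,
               int *hops)
{
    const char *cur = from;
    int count = 0;

    while (strcasecmp(cur, to) != 0) {
        int i, found = -1;

        for (i = 0; i < n; i++) {
            if (strcasecmp(s[i].key_attr, cur) == 0) {
                found = i;
                break;
            }
        }
        /* count >= n means we are looping; give up instead of overflowing */
        if (found < 0 || count >= n)
            return -1;
        hops[count++] = found;
        cur = s[found].other_attr;   /* the response feeds the next CQ */
    }
    return count;
}
```

For each hop, the client would send a Query request to that server and use the returned attribute value as the key of the next CQ.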

 

For user query ::= <Command><KeyAttributeName>"="<Value>","<DesiredAttributeName>"?"

For user deletion ::= <Command><KeyAttributeName>"="<Value>"?"

For user insertion ::= <Command><KeyAttributeName>"="<Value>","<OtherAttributeName>"="<Value>"?"

 

 

<Command>::= Q, I, or D to query, insert, or delete records, respectively.

<KeyAttributeName>, <DesiredAttributeName>, <OtherAttributeName> ::= char[32] text string that identifies the name of the attribute

<Value>::= integer or string, depending on the data type of the attribute

 

An example of user input: “Q Name=Jimmy, location?” or “Q Name=Jimmy, course_code?” and so on.
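A minimal sketch of parsing one line of user input with sscanf() is shown below. The names UserQuery and parse_user_query are hypothetical, and the sketch assumes that a deletion carries a key value as in “D name=Ehab?”. It does not verify that the line ends with “?”.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    char cmd;             /* 'Q', 'I' or 'D' */
    char key_name[32];
    char key_value[32];
    char attr_name[32];   /* desired (Q) or other (I) attribute; empty for D */
    char attr_value[32];  /* only filled in for I */
} UserQuery;

/* Removes trailing blanks left over by the scanset conversions. */
static void rtrim(char *s)
{
    size_t n = strlen(s);

    while (n > 0 && (s[n - 1] == ' ' || s[n - 1] == '\t'))
        s[--n] = '\0';
}

/* Parses e.g. "Q Name=Jimmy, location?". Returns 0 on success, -1 otherwise. */
int parse_user_query(const char *line, UserQuery *q)
{
    int n;

    memset(q, 0, sizeof *q);
    /* sscanf stops at the first mismatch, so n tells us how far we got:
     * 3 fields for D, 4 for Q, 5 for I. */
    n = sscanf(line, " %c %31[^= ] = %31[^,?] , %31[^=? ] = %31[^?]",
               &q->cmd, q->key_name, q->key_value,
               q->attr_name, q->attr_value);
    rtrim(q->key_value);
    rtrim(q->attr_value);

    if (q->cmd == 'Q' && n == 4) return 0;
    if (q->cmd == 'I' && n == 5) return 0;
    if (q->cmd == 'D' && n == 3) return 0;
    return -1;
}
```

Values may contain spaces (for example “Luis 1216”), which is why the value scansets stop only at “,” or “?” rather than at whitespace.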

 

 

Concurrent Distributed Database Server—simddbd2: (50%)

 

Simddbd2 first binds to a port passed as a command-line argument as follows: $simddbd2  <PortNumber>. To avoid port conflicts with your classmates, you can use 1024 + the last four digits of your SSN as the default value for PortNumber. The server then reads the db records from a file “<portnumber>.db” (e.g., 5000.db) and loads them into an array of structures. The file format is as follows:

 

“5000.db”:

name    ssn

Jimmy    1234

John        4567

…..
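Loading such a file into an array of structures might look like the sketch below. The names Record, Table, load_table, and lookup, as well as the MAX_ROWS limit, are assumptions for illustration. The sketch treats the first line as the two column names and lets the second column run to the end of the line so that values such as “Luis 1216” survive.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ROWS 256   /* assumed capacity; the handout gives none */

typedef struct {
    char key[32];
    char value[32];
} Record;

typedef struct {
    char   key_name[32];     /* first column name from the header line */
    char   value_name[32];   /* second column name from the header line */
    Record rows[MAX_ROWS];
    int    count;
} Table;

/* Reads the header and then one record per line.
 * Returns the number of records loaded, or -1 if the header is missing. */
int load_table(FILE *fp, Table *t)
{
    assert(fp != NULL && t != NULL);
    t->count = 0;
    if (fscanf(fp, " %31s %31s", t->key_name, t->value_name) != 2)
        return -1;
    while (t->count < MAX_ROWS &&
           fscanf(fp, " %31s %31[^\n]", t->rows[t->count].key,
                  t->rows[t->count].value) == 2)
        t->count++;
    return t->count;
}

/* Linear table lookup; returns the value for key, or NULL if absent. */
const char *lookup(const Table *t, const char *key)
{
    int i;

    for (i = 0; i < t->count; i++)
        if (strcmp(t->rows[i].key, key) == 0)
            return t->rows[i].value;
    return NULL;
}
```

A NULL result from lookup() is the case in which the server would answer with an Error message instead of a Response.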

 

It then accepts TCP connections from one or more clients simultaneously. The server must serve multiple db clients concurrently using I/O multiplexing with select(). When the server receives a query request, it performs a table lookup and sends back the response in the message format specified above. The server maintains each connection as long as the client does not shut it down. The server must be protected from SIGPIPE, and it must also clean up any zombie processes resulting from terminated processes/connections. 
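The core of a select()-based server is one call that waits on the listening socket and every connected client at once. The sketch below shows that step in isolation; wait_readable and ignore_sigpipe are hypothetical helper names, and the surrounding accept/read/respond loop is left out.

```c
#include <assert.h>
#include <signal.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Ignore SIGPIPE so that writing to a closed connection makes write()
 * fail with EPIPE instead of killing the server. Returns 0 on success. */
int ignore_sigpipe(void)
{
    return signal(SIGPIPE, SIG_IGN) == SIG_ERR ? -1 : 0;
}

/* Waits until at least one descriptor in fds[0..n-1] is readable or the
 * timeout expires. Stores the readable descriptors in ready[] and returns
 * how many there are (0 on timeout, -1 on error). */
int wait_readable(const int *fds, int n, int timeout_sec, int *ready)
{
    fd_set rset;
    struct timeval tv;
    int i, maxfd = -1, count = 0;

    FD_ZERO(&rset);
    for (i = 0; i < n; i++) {
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &rset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    /* select() rewrites rset so that only readable descriptors remain set */
    if (select(maxfd + 1, &rset, NULL, NULL, &tv) < 0)
        return -1;
    for (i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &rset))
            ready[count++] = fds[i];
    return count;
}
```

In the server loop, a readable listening socket means accept() will not block, and a readable client socket means a request (or end-of-file, when read() returns 0) is waiting. The descriptor set has to be rebuilt before every call because select() modifies it.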

As a hint, you can modify “TCPmecho.c” and “TCPmechod.c” programs to develop the client and server, respectively.

 

Running Example:

 

$simddbd2 5000

$simddbd2 6000

$simddbd2 7000

 

$simddb2 /* notice that the client gets the port numbers to connect to from the config file */

UserQueryInterface>Q name=Jimmy, ssn?

1234  /* this is the response from the server(s) */

UserQueryInterface>Q name=Jimmy, course_code?

TDC561

UserQueryInterface>Q name=Jimmy, location?

Luis 1216

UserQueryInterface>Q course_code=TDC561, location?

Luis 1216

UserQueryInterface>I name=Ken, ssn= 4567?

Ok

UserQueryInterface>D name=Ehab?

Error: Name Not Found

 <Ctrl><C>

 

NOTES:

Implementing DB “insertion” and “deletion” is mandatory in this assignment. With multiple client operations, the DB must be kept consistent too.
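Insertion and deletion on an in-memory table could be sketched as follows. Table, insert_record, and delete_record are hypothetical names, and rejecting duplicate keys is an assumption rather than a stated requirement. Because a select()-based server handles one request at a time in a single process, each operation finishes before the next request is read, which is one way to keep the table consistent.

```c
#include <assert.h>
#include <string.h>

#define MAX_ROWS 256   /* assumed capacity */

typedef struct {
    char key[32];
    char value[32];
} Record;

typedef struct {
    Record rows[MAX_ROWS];
    int    count;
} Table;

/* Returns the index of key, or -1 if it is not in the table. */
static int find_row(const Table *t, const char *key)
{
    int i;

    for (i = 0; i < t->count; i++)
        if (strcmp(t->rows[i].key, key) == 0)
            return i;
    return -1;
}

/* Adds a record. Returns 0, or -1 if the key exists or the table is full. */
int insert_record(Table *t, const char *key, const char *value)
{
    Record *r;

    assert(t != NULL && key != NULL && value != NULL);
    if (find_row(t, key) >= 0 || t->count >= MAX_ROWS)
        return -1;
    r = &t->rows[t->count++];
    memset(r, 0, sizeof *r);                      /* guarantees NUL padding */
    strncpy(r->key, key, sizeof r->key - 1);
    strncpy(r->value, value, sizeof r->value - 1);
    return 0;
}

/* Removes a record. Returns 0, or -1 if the key was not found. */
int delete_record(Table *t, const char *key)
{
    int i = find_row(t, key);

    if (i < 0)
        return -1;
    t->rows[i] = t->rows[--t->count];  /* fill the hole with the last row */
    return 0;
}
```

A return value of -1 maps naturally to an Error message such as “Name Not Found”, and 0 to a Confirmation.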

 

Extra Credit:

Change the client from the assignment to send requests to all servers first and then wait for their replies. This means that the client should use select() to handle receiving concurrent responses without getting into deadlock. For this, see the example in the textbook.

 

HINT:

(1)     Modify TCPmecho.c and TCPmechod.c to build the client and server

(2)     Start by developing only a single client with a single server

(3)     Develop an iterative version of the db server using only Query Request

(4)     You must use a struct to specify the message format described above for sending and receiving messages

For example, this is how you create a message structure for Insert req:

 

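typedef struct ATTRIBUTE {
    char        name[32];               /* one attribute value */
} Attribute;

struct INSERTQ {
    int         reqtype;                /* 3 = Insert */
    char        key[32];                /* key attribute value */
    int         numberofAtt;            /* entries used below, at most 10 */
    Attribute   ListofAttributes[10];
} IQ;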

 

(5)     To do the message encoding, packetization, reading and writing, refer to slides (PacketizationI.pdf) under Resources in the course home page.
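Because TCP is a byte stream, a single read() or write() may transfer only part of a fixed-size message structure. A common remedy is a pair of helpers that loop until the whole message has been moved. The sketch below uses the conventional names readn and writen; it is an illustration and not necessarily the approach taken in the slides.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Writes exactly n bytes unless an error occurs.
 * Returns n on success, -1 on error. */
ssize_t writen(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t w = write(fd, p, left);

        if (w < 0) {
            if (errno == EINTR)
                continue;            /* interrupted by a signal: retry */
            return -1;
        }
        p += w;
        left -= (size_t)w;
    }
    return (ssize_t)n;
}

/* Reads up to n bytes, stopping early only at end-of-file.
 * Returns the number of bytes read (less than n means the peer closed
 * the connection), or -1 on error. */
ssize_t readn(int fd, void *buf, size_t n)
{
    char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t r = read(fd, p, left);

        if (r < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (r == 0)
            break;                   /* end-of-file */
        p += r;
        left -= (size_t)r;
    }
    return (ssize_t)(n - left);
}
```

A message structure such as IQ could then be sent with writen(fd, &IQ, sizeof IQ) and received with a matching readn() call, provided both ends agree on the structure layout and byte order.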

 

 

Submission Procedure:

Write your name and SSN in the main program and in the README file. Your submission must contain a Makefile and a README file that describes how to run your program. You must use the same naming convention and command-line arguments as specified in the assignment description. After cleaning object and binary files out of your directory, do the following:

  

  • Delete any bin executables or object files, first.
  • $ tar -cvf  <HawkLogin>-<SSNLast4>-hw2.tar  *
  • $gzip  <HawkLogin>-<SSNLast4>-hw2.tar
  • $ ftp to your local desktop in binary mode (just type bin at the ftp prompt)

(if you do not enable the “bin” option, your file will be corrupted)

  • Upload it to DLWEB

For verification: download it from DLWEB into your local Unix directory and untar it to make sure that everything works fine.

 

NOTE: Students are responsible for uploading a working copy in the right slot. Thus you may “download”, “untar”, and then “compile” it to verify that it works. DO NOT forget to exclude any binary or object files from your submission.

  

Grading Policy (READ THIS CAREFULLY):

Late penalty is 10% per day. 30% for the server program, 20% for concurrency, 30% for the client, 10% for cleaning up, and 10% for the README and Makefile files. Use the same naming conventions and command-line arguments as described above.

If you have a question about the homework, then the best thing is to ask during office hours or in class. DO NOT wait until a few days before the deadline to start on the assignment. The assignments are designed to almost fit the given time window.

 

The grader should clearly explain the reason for any deduction. Read the comments, and in case of questions or a dispute, follow this process:

(1)     Send an email to the grader requesting re-grading of your assignment.

(2)     If the grader does not reply or the reply is not sufficient, then send an email to the instructor at the word-reverse of this string (edu.depaul.cs@ehabclass), or better, meet him during office hours. The grader's name and email will be announced in class.

 

We will briefly discuss Makefiles in class, but I expect students to handle Makefile and C language issues individually. If you look under the Resources link on the course home page, you will find tutorials and enough information to get you started with Makefiles and Unix. This course is a lot of fun; the average is normally very high and the experience is highly appreciated. Good luck.