Laboratory Exercise 8

Filtering and Formatting Data

CMP 420: Database Systems
Lehman College, City University of New York
3 April 2003


Employee Database

Last lab, we used the Java interface to SQL, called JDBC. Today, we will populate a simple database using data from the web. The database will contain two tables:
    Employees(ID, Name, Address, StartYear, StopYear, Salary)
Colleges(SchoolName, StudentName)
Telnet into comet and create a database for today's lab (or use the one you created for Lab 7).  You should create tables corresponding to the schema above, with the following constraints:

Populating the Database

One way to enter the information into your database is to type in every line by hand. Another way is to read in the database stored as a file. For today's lab, we will use data from the web to populate the database, found at: http://www.presidentsusa.net/index.html. We will use the information about the presidents found there to populate our database.

Unfortunately, most data on the web cannot be immediately loaded into your database. You first must format it, so, it can be correctly loaded. For example, this page gives most of the information needed for the first table, while information for the second table can be found here. However, it is not in the correct format (for example of the correct format, see the lab2data.txt).

The main part of today's lab is to write a filter program that reads in the data from the presidential webpage (save it as a file in your account), and writes out a file with the "INSERT ... VALUES" statements needed to insert the information about the presidents into the database.

Some hints on how to do this:

Once you have created you data file, you can read it into your database using the "psql -f" option or using JDBC as described in the last lab.

Write a similar program to use the information about the colleges that the presidents attended to populate the Colleges table.

Using your Database

Demonstrate your database works by:

Final Comments

Before leaving the lab, you need to log out and shut down your computer. First, you should quit all applications. Then, go to the K menu in the bottom left hand corner and select Log Out. Choose the Shut Down option, and shut off the monitor when you're done.

Remember to pick up any files you sent to the printer and to take any personal belongings. It is difficult to get back into the lab later (since it's locked when not in use), so, it's worth making sure you haven't forgotten anything.