Today's lab will focus on formatting files.
De Blasio, Bill Bloomberg, Mike Giuliani, Rudy Dinkins, David
The company's database uses SQL commands and the format for new entries is:
insert into customer (first, last) values ('FirstName','LastName')
If there's only a few names, we can easily convert the file to the desired format, by hand:
insert into customer (first, last) values ('Bill','De Blasio') insert into customer (first, last) values ('Mike','Bloomberg') insert into customer (first, last) values ('Rudy','Guiliani') insert into customer (first, last) values ('David','Dinkins')
If there's many names, or if you have to add entries every night, it is much easier to write a program. Here's the basic outline of the program:
# Program to convert CSV file of customer data into SQL insert statements def main(): #Ask user for input and output file names #Open files to be used in the program #For each line, #Separate out the first and last names #Format the insert string and write it to the file #Close files main()
In Lab 6, we opened and closed files, so, let's add those in here:
# Program to convert CSV file of customer data into SQL insert statements def main(): #Ask user for input and output file names inName = input("Enter input filename: ") outName = input("Enter output filename: ") #Open files to be used in the program infile = open(inName,"r") outfile = open(outName,"w") #For each line, #Separate out the first and last names #Format the insert string and write it to the file #Close files infile.close() outfile.close() main()To make sure there's no typing errors, try running the above program with the mayors.txt as the input file. Nothing will print yet, since we have not included any print() statements.
Next, let's go through the input file, line by line. To do this, we will use the readlines() function, since it allows us to access each line (we've included a print statement in our for loop to test that the program is working):
# Program to convert CSV file of customer data into SQL insert statements def main(): #Ask user for input and output file names inName = input("Enter input filename: ") outName = input("Enter output filename: ") #Open files to be used in the program infile = open(inName,"r") outfile = open(outName,"w") #For each line, for line in infile.readlines(): #Separate out the first and last names #Format the insert string and write it to the file print(line) #Close files infile.close() outfile.close() main()When you run the program, the output is doublespaced! Let's remove the extra newline like we did in the last program (i.e. line[:-1]) and then split it in parts using the comma as our delimiter:
# Program to convert CSV file of customer data into SQL insert statements def main(): #Ask user for input and output file names inName = input("Enter input filename: ") outName = input("Enter output filename: ") #Open files to be used in the program infile = open(inName,"r") outfile = open(outName,"w") #For each line, for line in infile.readlines(): #Separate out the first and last names #Format the insert string and write it to the file last, first = line[:-1].split(',') print(first, last) #Close files infile.close() outfile.close() main()Notice that we have the variables ordered: last, first since that's the order that our boss had them in the initial file. Next, we need to format the output using the SQL format he gave us:
insert into customer (first, last) values ('FirstName','LastName')We want to take the string and insert the values of first and last into the positions marked by {0} and {1}:
"insert into customer (first, last) values ('{0}','{1}')"which we can do using the format statement:
"insert into customer (first, last) values ('{0}','{1}')".format(first,last)Let's start by printing that to the screen:
# Program to convert CSV file of customer data into SQL insert statements def main(): #Ask user for input and output file names inName = input("Enter input filename: ") outName = input("Enter output filename: ") #Open files to be used in the program infile = open(inName,"r") outfile = open(outName,"w") #For each line, for line in infile.readlines(): #Separate out the first and last names #Format the insert string and write it to the file last, first = line[:-1].split(',') s = "insert into customer (first, last) values ('{0}','{1}')".format(first,last) print(s) #Close files infile.close() outfile.close() main()Try running this program. Note that we have some extra spaces before the first names. Those will cause problems with the database if we leave them, so, we'll use one of the useful string methods (p 140 in the textbook) to remove the leading whitespace on the first names: first.lstrip():
# Program to convert CSV file of customer data into SQL insert statements def main(): #Ask user for input and output file names inName = input("Enter input filename: ") outName = input("Enter output filename: ") #Open files to be used in the program infile = open(inName,"r") outfile = open(outName,"w") #For each line, for line in infile.readlines(): #Separate out the first and last names #Format the insert string and write it to the file last, first = line[:-1].split(',') s = "insert into customer (first, last) values ('{0}','{1}')".format(first.lstrip(),last) print(s) #Close files infile.close() outfile.close() main()Lastly, we need to print this information to the file (we were only printing to the screen to make it easier for us to debug the program). We'll print a message to screen to let the user know where we are in the processing:
# Program to convert CSV file of customer data into SQL insert statements def main(): #Ask user for input and output file names inName = input("Enter input filename: ") outName = input("Enter output filename: ") #Open files to be used in the program infile = open(inName,"r") outfile = open(outName,"w") #For each line, for line in infile.readlines(): #Separate out the first and last names #Format the insert string and write it to the file last, first = line[:-1].split(',') s = "insert into customer (first, last) values ('{0}','{1}')".format(first.lstrip(),last) print(s, file = outfile) print("\tWrote {0} to {1}.".format(last,outName)) #Close files infile.close() outfile.close() main()This basic structure works for processing most data files. For example, say your boss started collecting addresses from customers. How could you change it to add that information to your database?
If you finish early, you may work on the programming problems.