Today's lab focuses on formatting and file processing as well as finding (and fixing!) errors.

Working With Files

Python has many built-in functions and methods for working with strings and files. In this lab, we will process the contents of files using various string methods.

Let's start with the program printfile.py from the book's website. Try running it. When it asks for a file, type in printfile.py. What does it print out?

Next, let's try it on the file allcaps.txt. Let's change the print line to

	print(data.lower())
What happens? Why?

Printing to a file (instead of to the screen) is very easy: open the file for writing and hand it to print using the file= argument.

Let's do that for the printfile.py program:

def main():
    fname = input("Enter filename: ")
    infile = open(fname,"r")
    outfile = open("out.txt","w")
    data = infile.read()
    print(data, file=outfile)
    # close both files so that everything written is saved
    infile.close()
    outfile.close()

main()
Run the program. Where did it send the output? Modify this program so that all the output is in lowercase and test it on the text file above.
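A common variation, shown here only as a minimal sketch and not as the book's version, uses Python's with statement so that both files are closed automatically. The function name copy_file and the file names sample.txt and copy.txt are made up for this illustration, and the files live in a temporary folder so nothing of yours is overwritten.

```python
import os
import tempfile

def copy_file(in_name, out_name):
    # "with" closes both files automatically when the block ends
    with open(in_name, "r") as infile, open(out_name, "w") as outfile:
        data = infile.read()
        # end="" stops print from adding an extra newline after the data
        print(data, end="", file=outfile)

# Work in a temporary folder (hypothetical file names)
folder = tempfile.mkdtemp()
source = os.path.join(folder, "sample.txt")
target = os.path.join(folder, "copy.txt")

# Create a small input file to copy
with open(source, "w") as f:
    f.write("HELLO\nWORLD\n")

copy_file(source, target)

# Read the copy back to check what was written
with open(target, "r") as f:
    copied = f.read()
print(copied == "HELLO\nWORLD\n")
```

The output ends up in the file named by out_name, not on the screen; only the final check is printed.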

In Python, there are often many different ways to write the same program. Let's write a program that prints the lines of a file using a for loop:

def main():
    print("This program prints a file.")

    # get the input file name
    infileName = input("Please enter name of input file: ")

    # open the input file
    infile = open(infileName, "r")

    # read each line of the input file and print it to the screen
    for line in infile:
        print(line)

main()
What happens when you run this program? Why is it double-spacing the output? When you read a line from a file, it ends with an enter or `newline' character (often represented as `\n'), and print then adds a newline of its own. We can solve this in several different ways. One way to keep the last character of a line from being printed is to use the slice operator to truncate the string:
	print(line[:-1])
The slice line[:-1] says that you would like the string that consists of all the characters in line up to but not including the last character (since the -1 index always refers to the last character of a string, no matter how long the string is).
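As a small sketch of one of the other ways, the string method rstrip("\n") removes only trailing newline characters, while the slice removes the last character whatever it is. The difference matters when the last line of a file does not end with a newline. The sample strings here are made up for the illustration.

```python
line = "Hello, world\n"         # a line as it comes from a file

sliced = line[:-1]               # drop the last character, whatever it is
stripped = line.rstrip("\n")     # drop trailing newline characters only

print(repr(sliced))              # 'Hello, world'
print(repr(stripped))            # 'Hello, world'

# The last line of a file might not end with a newline.
last = "no newline here"
print(repr(last[:-1]))           # 'no newline her'  (a real character is lost)
print(repr(last.rstrip("\n")))   # 'no newline here' (text is left alone)
```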

Modify this second program to print out the file in all lowercase and single-spaced, and test it.

Finding Errors

Finding and fixing errors in your programs is a very useful skill. Let's look at a program with lots of errors and work through how to identify the issues and fix them. The program, errors.py, does not run when loaded into IDLE:
# errors.py is based on dateconvert2.py from Chapter 5 of the Zelle textbook
#     Converts day month and year numbers into two date formats

def main()
    # get the day month and year
    day, month year = eval(input("Please enter day, month, and year numbers: ")

    date1 = str(month)"/"+str(day)+"/"+str(year)

    months = ["January", "February", "March", "April", 
              "May", "June", "July", "August", 
              "September", "October", "November", "December"]
    monthStr = months[month+1]
    date2 = monthStr+" " + str(day) + ",  + str(year)

    print("The date is" date1, "or", date2+".")

main()
Instead, a dialog box pops up and says "invalid syntax":


The red line indicates where the interpreter has found an error. Can you tell what it is? Syntax is another word for grammar, so the problem is most likely missing `punctuation' or a misspelling of some sort. We have spelled def correctly and have the right number of parentheses, so what else is missing?

The answer is that a colon is required after the closing parenthesis of a function definition. Add that in:

    def main():
and try to run the program again.
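Syntax errors are found before any line of the program runs, which is why errors.py never gets as far as asking for input. A minimal sketch of that check, using the built-in compile function on two one-line function definitions (the helper name has_syntax_error is made up for this illustration):

```python
# The start of a function definition, without and with the colon
broken = "def main()\n    pass\n"
fixed = "def main():\n    pass\n"

def has_syntax_error(source):
    # compile() checks the grammar of the code without running it
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return True
    return False

print(has_syntax_error(broken))   # True
print(has_syntax_error(fixed))    # False
```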

Again, we get a dialog box:



Instead of the whole line being highlighted, only the word year is. The Python interpreter was not expecting year and says there is a grammatical mistake. Since year by itself is just a name, we need to look just before the highlighted word to see where the error is. Do you see it?

The answer is lists of variables need commas in between them to distinguish one from the next. Add the comma in:

    day, month, year = ...
and try to run the program again.

Once more we get a dialog box:


It has highlighted date1, the first item on the line. That is a name and looks fine. So, as above, let's look before the highlighted error to see if there's a problem. The line above it is:

    day, month, year = eval(input("Please enter day, month, and year numbers: ")
It did not highlight this line, so, the problem must be at the end. Do you see it?

The answer is that we are missing a closing parenthesis. The line has two left parentheses but only one right parenthesis. Add the right parenthesis in:

    ... and year numbers: "))
and try to run the program again.

Again, we get a dialog box:


The interpreter does not understand the second " on the line. Why? What is this line doing? It's constructing a string and storing it in the variable date1. How do you build a string out of smaller strings?

The answer is that to put smaller strings together (called concatenation), we need to use the plus sign (+). The line is missing a plus sign right before the first quotation mark. Add the plus sign in:

    date1 = str(month) + "/" ...
and try to run the program again.
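As a short sketch of concatenation with the values used later in this lab: each number is first turned into a string with str, and + then joins the pieces. Leaving out str is a different kind of mistake, one that only shows up when the line runs.

```python
day, month, year = 31, 12, 2014

# str() turns each number into text so that + can join the pieces
date1 = str(month) + "/" + str(day) + "/" + str(year)
print(date1)   # 12/31/2014

# Without str(), + between a number and a string fails at run time
try:
    bad = month + "/"
except TypeError:
    bad = None
    print("cannot add a number and a string")
```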

Again, we get a dialog box, but this one has a different message:


EOL means "End of the line", so, the message says that the end of the line was reached before you finished defining the string. How can you fix this?

The answer is to end the string using a quotation mark. The line is missing a quotation mark at the very end. Add the quotation mark:

    ...+ ",  + str(year)"
and try to run the program again.

Our familiar dialog box returns:


We have seen this type of error before. How do you fix it?

The answer is lists of arguments need commas in between them to distinguish one from the next. Add the comma in:

    ... date is", date1 ...
and try to run the program again.

It runs! Now let's make sure it works. Type in at the prompt:

Please enter day, month, and year numbers: 31, 12, 2014
Uh oh, instead of output, we get the following messages:
Traceback (most recent call last):
  File "/Users/stjohn/public_html/teaching/cmp/cmp230/f14/errors.py", line 18, in <module>
    main()
  File "/Users/stjohn/public_html/teaching/cmp/cmp230/f14/errors.py", line 13, in main
    monthStr = months[month+1]
IndexError: list index out of range

When you see messages like this, go to the very last line:

IndexError: list index out of range
It says that the index for our list is out of range. An index is the position of the item in the list that we're accessing. For example, months[1] has index 1 and will give us February, since counting starts at 0. The range of the index for a list is 0 to one less than the length of the list. In the case of months, the range is [0,1,2,...,11]. What went wrong when we entered 12 for our month?

The answer is we used month+1 = 12 + 1 = 13 as the index:

    monthStr = months[month+1]
which is out of range. What do we want instead? Instead of adding 1, we should subtract 1. Change it in the program:
    monthStr = months[month-1]
and try to run the program again.
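The index arithmetic can be checked directly. This sketch uses the same months list, looks up the first and last month with month-1, and shows that an index of 13 raises the same IndexError as in the traceback.

```python
months = ["January", "February", "March", "April",
          "May", "June", "July", "August",
          "September", "October", "November", "December"]

print(len(months))    # 12
print(months[0])      # January
print(months[11])     # December

# Month numbers run from 1 to 12, list indices from 0 to 11
for month in [1, 12]:
    print(month, months[month - 1])

# The original bug: month+1 with month = 12 asks for index 13
try:
    months[12 + 1]
except IndexError as err:
    message = str(err)
    print(message)    # list index out of range
```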

It still runs, but does it work? Let's try the same input again:

Please enter day, month, and year numbers: 31, 12, 2014
The date is 12/31/2014 or December 31,  + str(year).

Something odd is happening at the end: str(year) does not look right. Let's look at the line that builds date2:

date2 = monthStr+" " + str(day) + ",  + str(year)"
The interpreter is treating ",  + str(year)" as a string (instead of evaluating str(year)), so we must have put the quotation mark in the wrong place before. Let's move it:
date2 = monthStr+" " + str(day) + ","  + str(year)
and try to run the program again.

Success! But try a few other inputs, just to make sure. It is always good to try cases that are near the `boundary' of what's allowed, since those are the places we are most likely to make mistakes:

Please enter day, month, and year numbers: 1,1,2015
The date is 1/1/2015 or January 1,2015.

Please enter day, month, and year numbers: 1, 2, 2003
The date is 2/1/2003 or February 1,2003.

Please enter day, month, and year numbers: 4, 7, 1976
The date is 7/4/1976 or July 4,1976.

We have removed all the errors, and the program now runs correctly!
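Boundary checks like these can also be written down so they are easy to rerun. The following is a minimal sketch, not part of errors.py: it moves the corrected string building into a function (the name format_dates is made up for this illustration) and checks the first and last month against the outputs shown above.

```python
MONTHS = ["January", "February", "March", "April",
          "May", "June", "July", "August",
          "September", "October", "November", "December"]

def format_dates(day, month, year):
    # Hypothetical helper: same string building as the corrected errors.py
    date1 = str(month) + "/" + str(day) + "/" + str(year)
    date2 = MONTHS[month - 1] + " " + str(day) + "," + str(year)
    return date1, date2

# Boundary cases: the first and the last month of the year
assert format_dates(1, 1, 2015) == ("1/1/2015", "January 1,2015")
assert format_dates(31, 12, 2014) == ("12/31/2014", "December 31,2014")

# A case from the middle of the range
assert format_dates(4, 7, 1976) == ("7/4/1976", "July 4,1976")

print("all boundary checks passed")
```

If any check fails, Python stops with an AssertionError at that line, which points to the input that needs another look.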