Lab 5, CIS 166 & CMP 167, Introductory Programming, Lehman College, CUNY, Spring 2015

Decoding Text

In Chapter 5, Zelle converts text into its Unicode codes (text2numbers.py):

# text2numbers.py
#     A program to convert a textual message into a sequence of
#         numbers, utilizing the underlying Unicode encoding.

def main():
    print("This program converts a textual message into a sequence")
    print("of numbers representing the Unicode encoding of the message.\n")
    
    # Get the message to encode
    message = input("Please enter the message to encode: ")

    print("\nHere are the Unicode codes:")

    # Loop through the message and print out the Unicode values
    for ch in message:
        print(ord(ch), end=" ")
        
    print() # blank line before prompt

main()

We are going to modify his program to create a Caesar cipher (or shift cipher), a simple code in which each letter is replaced by the letter a fixed distance further down the alphabet. The simplest Caesar cipher shifts every letter to the right by one. So, ABC would be encoded as BCD and I LOVE PYTHON would be J MPWF QZUIPO.

To do this, we will modify the above program. Let's start by printing out the position in the alphabet of each letter that is entered. So, instead of printing out ord(ch), we'll print out ord(ch)-ord('A'), or how far past 'A' the letter is in the alphabet (for now, enter upper case letters only):

# encode.py
#     A program to convert a textual message into a Caesar cipher

def main():
    print("This program converts a textual message into a sequence")
    print("of numbers representing the Unicode encoding of the message.\n")
    
    # Get the message to encode
    message = input("Please enter the message to encode: ")

    print("\nHere are the Unicode codes:")

    # Loop through the message and print out the Unicode values
    for ch in message:
        print(ord(ch)-ord('A'), end=" ")     # CHANGED ONLY THIS LINE
        
    print() # blank line before prompt

main()

Try running the new program with input ABCDE. What is printed out? Let's save that value in a variable called index:

# encode.py
#     A program to convert a textual message into a Caesar cipher

def main():
    print("This program converts a textual message into a sequence")
    print("of numbers representing the Unicode encoding of the message.\n")
    
    # Get the message to encode
    message = input("Please enter the message to encode: ")

    print("\nHere are the Unicode codes:")

    # Loop through the message and print out the Unicode values
    for ch in message:
        index = ord(ch) - ord('A')
        print(ord('A')+index, index, end=" ")     
        
    print() # blank line before prompt

main()

This version prints two numbers for each letter: ord('A')+index, which is the same as the original ord(ch), followed by index. Storing the value in a variable makes the program easier to read. What does the chr() function do? Modify your program as follows to find out:

# encode.py
#     A program to convert a textual message into a Caesar cipher

def main():
    print("This program converts a textual message into a sequence")
    print("of numbers representing the Unicode encoding of the message.\n")
    
    # Get the message to encode
    message = input("Please enter the message to encode: ")

    print("\nHere are the Unicode codes:")

    # Loop through the message and print out the Unicode values
    for ch in message:
        index = ord(ch) - ord('A')
        print(chr(ord('A')+index), index, end=" ")     
        
    print() # blank line before prompt

main()

Note that chr() "undoes" the ord() function: it takes a Unicode number as input and returns the corresponding character. How can we use this to shift each character in our message? If we want to shift by 1, we can simply add 1 to the index before printing:

# encode.py
#     A program to convert a textual message into a Caesar cipher

def main():
    print("This program converts a textual message into a sequence")
    print("of numbers representing the Unicode encoding of the message.\n")
    
    # Get the message to encode
    message = input("Please enter the message to encode: ")

    print("\nHere are the Unicode codes:")

    # Loop through the message and print out the Unicode values
    for ch in message:
        index = ord(ch) - ord('A')+1
        print(chr(ord('A')+index), index, end=" ")    
        
    print() # blank line before prompt

main()
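
As an aside, the relationship between ord() and chr() can be checked on its own, outside the lab program. The following is a small sketch, not part of encode.py; the letters used are arbitrary examples:

```python
# ord() maps a character to its Unicode number; chr() maps the number back.
for ch in "AZ":
    code = ord(ch)
    print(ch, code, chr(code))   # chr(code) is the original character again

# Adding 1 to the code before calling chr() gives the next character.
print(chr(ord('A') + 1))
```

The last line is the same idea the program above uses to shift a letter by one.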

What happens if you enter a Z? For a true Caesar cipher, Z should go to A, which has an index of 0, but ours has an index of 26. To fix this, we will use modular arithmetic to make sure any number 26 or above is "wrapped" around to the beginning of the alphabet:

# encode.py
#     A program to convert a textual message into a Caesar cipher

def main():
    print("This program converts a textual message into a sequence")
    print("of numbers representing the Unicode encoding of the message.\n")
    
    # Get the message to encode
    message = input("Please enter the message to encode: ")

    print("\nHere are the Unicode codes:")

    # Loop through the message and print out the Unicode values
    for ch in message:
        index = (ord(ch)-ord('A')+1) % 26
        print(chr(ord('A')+index), end=" ")    
        
    print() # blank line before prompt

main()
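
The wrap-around arithmetic can also be tried in isolation. This is a minimal sketch under the same assumption as the program above (upper case letters only); the helper name shift_one is made up for illustration and is not part of the lab:

```python
def shift_one(ch):
    """Shift one upper case letter right by 1, wrapping Z around to A.
    (shift_one is a hypothetical helper, not part of encode.py.)"""
    index = (ord(ch) - ord('A') + 1) % 26   # 26 wraps back to 0
    return chr(ord('A') + index)

for ch in "AYZ":
    print(ch, "->", shift_one(ch))
```

For Z, the index before the % is 25 + 1 = 26, and 26 % 26 is 0, which is the index of A.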

And lastly, we will convert all messages to upper case letters to simplify the encoding:

# encode.py
#     A program to convert a textual message into a Caesar cipher

def main():
    print("This program converts a textual message into a sequence")
    print("of numbers representing the Unicode encoding of the message.\n")
    
    # Get the message to encode
    message = input("Please enter the message to encode: ")

    print("\nHere are the Unicode codes:")

    # Loop through the message and print out the Unicode values
    for ch in message.upper():
        index = (ord(ch) - ord('A') + 1) % 26
        print(chr(ord('A')+index), end=" ")    
        
    print() # blank line before prompt

main()

Try encoding several messages to make sure your program works. You can use the same program to decode messages by changing the offset, i.e. how much each letter of plain text is shifted. Ours is currently set to +1, but you can "undo" this by setting it to -1.
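
One way to see encoding and decoding side by side is to make the offset a parameter. The sketch below is an illustration only: the function name shift_message and its offset parameter are assumptions, not part of the lab's program, and it still assumes the message contains only letters. It relies on the fact that in Python the % operator with a positive right-hand side never returns a negative result, so an offset of -1 wraps A around to Z:

```python
def shift_message(message, offset):
    """Shift every letter of message by offset places, wrapping around.
    Hypothetical helper; assumes message contains letters only."""
    result = ""
    for ch in message.upper():
        index = (ord(ch) - ord('A') + offset) % 26
        result = result + chr(ord('A') + index)
    return result

secret = shift_message("PYTHON", 1)
print(secret)                       # QZUIPO
print(shift_message(secret, -1))    # PYTHON
```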

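Note that our program is not designed to handle punctuation or other non-letters. For example, the final version replaces each space with U, because (ord(' ') - ord('A') + 1) % 26 is 20; the earlier version without the % 26 replaces spaces with !.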
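
If you want spaces and punctuation to pass through unchanged, one possible approach is to shift only the characters between 'A' and 'Z'. The sketch below is one way to do it, not the lab's required solution; it uses an if statement, which may be ahead of where the course is, and the function name encode_keeping_others is made up for illustration:

```python
def encode_keeping_others(message, offset):
    """Shift letters by offset; copy every other character unchanged.
    Hypothetical helper, not part of encode.py."""
    result = ""
    for ch in message.upper():
        if 'A' <= ch <= 'Z':
            index = (ord(ch) - ord('A') + offset) % 26
            result = result + chr(ord('A') + index)
        else:
            result = result + ch    # space, digit, punctuation, ...
    return result

print(encode_keeping_others("I love Python!", 1))   # J MPWF QZUIPO!
```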
