The data file statesSummary.csv is from the CDC. Before starting the program, open up the csv file and see what it looks like.
    import matplotlib.pyplot as plt
    import numpy as np
    import csv

    # Open the data file. (On Python 2, mode 'rU' also handles files
    # with old-style Mac line endings.)
    infile = open('statesSummary.csv', 'r')
    reader = csv.reader(infile)

    # The first row holds the years; skip the label in column 0
    yearLine = next(reader)
    years = [int(w) for w in yearLine[1:]]

    # Plot the next five rows (one state each) in increasing shades of green
    for i in range(5):
        line = next(reader)
        stateName = line[0]
        stateValues = [int(w) for w in line[1:]]
        color = (0.0, i/5.0, 0.0)
        plt.scatter(years, stateValues, color=color, label=stateName)

    plt.title("Cases of Lyme Disease")
    plt.xlabel('Years')
    plt.ylabel('Number of Cases')
    plt.legend(loc = 2, fontsize = 'x-small')
    plt.show()
Try changing the color line in the loop to:

    color = (1.0, 0.0, i/5.0)   # Color tuple: 100% red, 0% green, and blue increasing with i

What happens?
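One way to reason about the answer before running the plot is to print the color tuples that each version of the line produces. The sketch below does only that; the helper names green_shades and red_blue_shades are made up for illustration and are not part of the lab files.

```python
def green_shades(n):
    """Color tuples from the first version: only the green part grows with i."""
    return [(0.0, i / float(n), 0.0) for i in range(n)]

def red_blue_shades(n):
    """Color tuples from the modified version: full red, no green, blue grows with i."""
    return [(1.0, 0.0, i / float(n)) for i in range(n)]

# Show the (red, green, blue) tuples side by side for the five plotted states
for i, (g, rb) in enumerate(zip(green_shades(5), red_blue_shades(5))):
    print(i, g, rb)
```

Each tuple is (red, green, blue), with every part between 0.0 and 1.0.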
The basemap webpage has an example of coloring in states by population, fillstates.py. Once you have the 3 files with the shapes of the states, you can run this program to see the map (don't worry too much about all the details; we will go through a simpler one first).
We will use a simpler version of it to map the Lyme Disease data, statesFilled.py.
For each state, we will need the total number of cases over all the years. We start out as before:
    import matplotlib.pyplot as plt
    import numpy as np
    import csv

    infile = open('statesSummary.csv', 'r')
    reader = csv.reader(infile)
    yearLine = next(reader)
    years = [int(w) for w in yearLine[1:]]

For each state, we'll save the name and total to a list:
    stateNames = []
    stateTotals = []
    for row in reader:
        stateNames.append(row[0])
        stateTotals.append(sum([int(r) for r in row[1:]]))
Note: The use of two 'parallel' arrays, stateNames and stateTotals, is not the best programming practice. Instead, since the information is linked (i.e. the ith state total is for the ith state name), we should store them in a linked way, such as a dictionary.
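As a sketch of the dictionary approach mentioned in the note, the code below builds one dictionary that maps each state name to its total. The function name total_by_state and the sample rows (with made-up numbers, including the header label) are illustrative assumptions, not part of the lab files; with the real data you would pass in the csv reader after reading the year line.

```python
import csv
import io

def total_by_state(rows):
    """Return a dictionary mapping each state name to the sum of its yearly counts.

    Each row is expected to look like [stateName, count1, count2, ...].
    """
    totals = {}
    for row in rows:
        totals[row[0]] = sum(int(r) for r in row[1:])
    return totals

# Made-up sample in the same layout the program expects (NOT real CDC numbers)
sample = io.StringIO("State,2003,2004\nAlabama,8,6\nAlaska,3,3\n")
reader = csv.reader(sample)
yearLine = next(reader)            # skip the header row of years
stateTotals = total_by_state(reader)

print(stateTotals["Alabama"])      # look up a total directly by state name
```

With a dictionary, the name and its total cannot drift out of step, and a lookup by name no longer needs a search through a separate list.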
We will scale every state total to be a fraction of the highest total:
    maxCases = float(max(stateTotals))
    scaledTotals = [i/maxCases for i in stateTotals]

Now, let's add in the plotting of each state. The first part is as before:
    import matplotlib.pyplot as plt
    from mpl_toolkits.basemap import Basemap
    from matplotlib.patches import Polygon

    # create the map
    map = Basemap(llcrnrlon=-119,llcrnrlat=22,urcrnrlon=-64,urcrnrlat=49,
                  projection='lcc',lat_1=33,lat_2=45,lon_0=-95)

    # load the shapefile, use the name 'states'
    map.readshapefile('st99_d00', name='states', drawbounds=True)

    ax = plt.gca() # get current axes instance

    # collect the state names from the shapefile attributes so we can
    # look up the shape object for a state by its name
    names = []
    for shape_dict in map.states_info:
        names.append(shape_dict['NAME'])

What changes is how we add colors to each state. The line c = ... sets the color to 100% red, a percentage of green based on the scaled total, and 100% blue. When the scaled total for a state is low, this is close to 100% red, 100% green, and 100% blue, which is white. (On a computer, colors mix like light rather than like paint: as you add more, the result gets lighter instead of darker.) When the scaled total for a state is high, the color is still 100% red and 100% blue but the green decreases, so the color appears more purple:
    # For each state that we have Lyme Disease data:
    for i in range(len(stateNames)):
        print("Plotting " + stateNames[i])
        seg = map.states[names.index(stateNames[i])]
        c = (1.0,1.0-scaledTotals[i],1.0)
        poly = Polygon(seg, facecolor=c,edgecolor='black')
        ax.add_patch(poly)
    plt.show()

(The whole file is in lymeMapped.py).
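The scaling and coloring rule can also be tried on its own, without the map. The sketch below applies the same two steps (divide by the largest total, then subtract from the green part) to three made-up totals; the function name lyme_color and the numbers are assumptions for illustration only.

```python
def lyme_color(scaled_total):
    """Map a scaled total in [0, 1] to (red, green, blue), as in c = (1.0, 1.0 - s, 1.0)."""
    return (1.0, 1.0 - scaled_total, 1.0)

# Made-up totals (NOT real CDC numbers)
sampleTotals = [10, 40, 20]

# Scale each total to a fraction of the largest one
maxCases = float(max(sampleTotals))
scaledTotals = [t / maxCases for t in sampleTotals]

# Convert each fraction into a color tuple
colors = [lyme_color(s) for s in scaledTotals]

for total, s, c in zip(sampleTotals, scaledTotals, colors):
    print(total, s, c)
```

A scaled total of 0.0 would give (1.0, 1.0, 1.0), which is white, and the state with the largest total always gets (1.0, 0.0, 1.0), the strongest purple.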