Homework #8

CMP 464-C401/MAT 456-01:
Topics Course: Data Science
Spring 2016

Topics: Mapping Data & Markov Chains
Deadline: Thursday, 31 March 2016, 10:30am

Data

For this assignment, you will need to download two different data sets from the New York City open data project:

  1. CUNY Locations: the locations of the campuses of the City University of New York on a satellite image of the city:
    https://data.ny.gov/Education/City-University-of-New-York-CUNY-University-Campus/i5b5-imzn
  2. Wifi in New York City (CMP Only): The wireless hotspot locations in the city:
    https://nycopendata.socrata.com/Social-Services/NYC-Wi-Fi-Hotspot-Locations/a9we-mtpn
You will also need the birthday data set you collected for Homework 3.

We will use these data sets for later homework assignments. Since scraping the data takes time, save these data sets so that you can reuse them in future programs.

Assignment

The work to be submitted differs by whether you are enrolled in the computer science or mathematics course.

CMP 464 Homework: MAT 456 Homework:
#1-2 Plot the locations of the campuses of the City University of New York on a satellite image of the city. Make sure to include in the title of your plot the date plotted.

#1: Submit your Python program as a .py file.
#2: Submit a screen shot of the graphics window containing the plot.

Hint: Use the drawcounties() method in basemap and show only the region that includes New York City. Use the csv module to extract the latitude and longitude from the downloaded NYC data file.
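As a rough illustration of the csv part of the hint, here is a minimal sketch. The column names "Latitude" and "Longitude" and the in-memory sample rows are assumptions made for the example; the downloaded file may label or store its coordinates differently, so check its header row first.

```python
import csv
import io

# Stand-in for the downloaded file. The header names here are hypothetical.
sample = io.StringIO(
    "Campus,Latitude,Longitude\n"
    "Lehman College,40.8725,-73.8939\n"
    "Row With Missing Coordinates,,\n"
)

def read_coordinates(handle, lat_field="Latitude", lon_field="Longitude"):
    """Return parallel lists of latitudes and longitudes from a CSV file handle."""
    lats, lons = [], []
    for row in csv.DictReader(handle):
        try:
            lat = float(row[lat_field])
            lon = float(row[lon_field])
        except (KeyError, ValueError, TypeError):
            # Skip rows whose coordinates are missing or not numeric.
            continue
        lats.append(lat)
        lons.append(lon)
    return lats, lons

lats, lons = read_coordinates(sample)
print(lats, lons)
```

For the real data, replace the sample with a handle from open() on the file you saved. Note that longitudes west of Greenwich are negative.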
#3-4 Using the birthday data set collected in Homework 3, plot all the crashes that occurred on your birthday. Bound the regions of your map to include primarily New York City. Make sure to include in the title in your plot.

#3: Submit your Python program as a .py file.
#4: Submit a screen shot of the graphics window containing the plot.
#5-6 (CMP 464) Make a plot that includes all the WiFi locations within one hundredth of a degree of a location of your choice. For example, Lehman College is at 40.8725 degrees North and 73.8939 degrees West. For this location, you would include all hotspots that are within 0.01 degrees in either direction (since the plot will be square). Your plot should color the inside hotspots blue and the outside hotspots green (the GIS location as well as the type are available in the CSV file of hotspots). Make sure to include in the title of your plot the region of the city you are plotting.

#5: Submit your Python program as a .py file.
#6: Submit a screen shot of the graphics window containing the plot.
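The "within one hundredth of a degree in either direction" test can be sketched as a simple bounding-box check. This is only one possible reading of the requirement; the function name in_box and the sample points are made up for illustration, and the default center is the Lehman College location given above (with West longitude written as a negative number).

```python
def in_box(lat, lon, center_lat=40.8725, center_lon=-73.8939, half_width=0.01):
    """True if (lat, lon) lies in the square of side 2*half_width around the center."""
    return (abs(lat - center_lat) <= half_width
            and abs(lon - center_lon) <= half_width)

# Hypothetical hotspot coordinates, not taken from the real data set.
points = [(40.8750, -73.8900), (40.9000, -73.8939), (40.8725, -73.9200)]
nearby = [p for p in points if in_box(*p)]
print(nearby)
```

Only the first sample point passes the check: the second is too far north and the third too far west.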
#5-6 (MAT 456) Assume that people move among three states (New York, California, and Texas) with the following probabilities:

From \ To      New York    California    Texas
New York       70%         25%           5%
California     7%          90%           3%
Texas          15%         25%           60%


#5: Represent the transitions between states as a matrix. Compute the eigenvalues and eigenvectors of this matrix, or show that none exist. Submit your answer as a text file or a scan of a neatly handwritten page.
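One way to set this up in numpy is sketched below. The ordering of the states (NY, CA, TX) and the convention that rows are the "from" state and columns the "to" state are choices made for this example; the transposed convention is equally valid as long as it is used consistently.

```python
import numpy as np

# Rows: state people start in. Columns: state they end up in. Order: NY, CA, TX.
P = np.array([
    [0.70, 0.25, 0.05],
    [0.07, 0.90, 0.03],
    [0.15, 0.25, 0.60],
])

# Sanity check: every row of a transition matrix should sum to 1.
row_sums = P.sum(axis=1)

# Column i of `eigenvectors` is the eigenvector paired with eigenvalues[i].
eigenvalues, eigenvectors = np.linalg.eig(P)

# Largest entry of P v - lambda v for each pair; should be near zero.
residuals = [
    np.abs(P.dot(eigenvectors[:, i]) - eigenvalues[i] * eigenvectors[:, i]).max()
    for i in range(len(eigenvalues))
]
print(row_sums)
print(eigenvalues)
```

Because each row sums to 1, the vector of all ones satisfies P v = v, so 1 is always among the eigenvalues of a matrix built this way.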
#6: Suppose the initial populations are California: 40 million, New York: 20 million, and Texas: 25 million. What is the population of each state after 1 year? After 5 years? After 10 years? Submit your answer as a text file or a scan of a neatly handwritten page. Include a list of any packages (e.g. numpy, Maple) used to solve this.

Hint: The module numpy has many useful functions for computing determinants and eigenvalues in its linear algebra package (numpy.linalg). Note that for numpy arrays, * is element-wise multiplication (not regular matrix multiplication). To multiply two matrices a and b together, use a.dot(b).
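The difference between * and dot() can be seen in a short sketch. It assumes that the probabilities describe moves over one year, that states are ordered NY, CA, TX, and that populations are stored as a row vector multiplied on the left of the matrix; the helper populations_after is a name invented for this example.

```python
import numpy as np

# Rows: "from" state. Columns: "to" state. Order: NY, CA, TX.
P = np.array([
    [0.70, 0.25, 0.05],
    [0.07, 0.90, 0.03],
    [0.15, 0.25, 0.60],
])

# Initial populations in millions, in the same NY, CA, TX order.
p0 = np.array([20.0, 40.0, 25.0])

elementwise = P * P         # squares each entry separately
matrix_product = P.dot(P)   # true matrix product: two steps of the chain

def populations_after(years, start=p0, transition=P):
    """Apply the transition matrix once per year to a row vector of populations."""
    current = start
    for _ in range(years):
        current = current.dot(transition)
    return current

print(populations_after(1))
```

Since no one enters or leaves the three-state system in this model, the total population should stay at 85 million at every step, which is a quick check on the arithmetic.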

Submitting Homework

To submit your homework, log on to the Blackboard system, and go to "Homework". For each part of the homework, there is a separate input box. You may submit the homework as many times as you would like before the deadline.