Program 1: Turnstile Counts. Due 10am, Wednesday, 1 February.
The NYC MTA provides counts of the number of entries and exits through every turnstile in every subway station, as well as daily counts for the entire system.
Write a program that will compute the number of entries at subway stations using the MTA turnstile dataset.
Notes:
Learning Objective: to build competency with dictionaries and string functions of core Python.
Available Libraries: Core Python 3.6+ only.
Data Sources: MTA's Turnstile Data.
Sample Datasets: Week ending 10/29/22 (turnstile_221029.txt), Week ending 6/11/22 (turnstile_220611.txt).
To allow for unit testing, your program should have the following functions:
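make_dictionary(data, kind = "min"): Takes as input data, a list of lines from the turnstile file, and creates a dictionary whose keys are the remote unit ID plus the turnstile unit number. Depending on the kind specified (min, max, or station), the resulting dictionary will store the minimum entry number seen (as an integer), the maximum entry number seen (as an integer), or the station name (as a string). Returns the resulting dictionary.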
get_turnstiles(station_dict, stations = None): If stations is None, returns the names of all the turnstiles stored as keys in the input dictionary. Otherwise, stations is a list of station names, and the function returns the keys whose value in the input dictionary is one of those stations. The expected return value is a list.
compute_ridership(min_dict,max_dict,turnstiles = None): Takes as input two dictionaries, min_dict and max_dict, and a list, turnstiles (possibly None), of turnstile names. Returns the total ridership (the difference between the maximum and minimum values) across all the turnstiles specified. If no value is passed for turnstiles, the default value of None is used (that is, the total ridership for every turnstile in the dictionaries).
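As a hedged illustration of the string and dictionary work that make_dictionary involves, the sketch below splits each line on commas and keeps a running minimum or maximum per turnstile. It assumes the usual MTA column order (C/A, UNIT, SCP, STATION, LINENAME, DIVISION, DATE, TIME, DESC, ENTRIES, EXITS) and builds keys as UNIT,SCP, in the style of the turnstile name R051,02-00-00 used on this page. The rows in sample are invented for illustration and are not taken from the real file. This is one possible approach, not the reference solution.

```python
def make_dictionary(data, kind="min"):
    """Sketch only: map 'UNIT,SCP' to a min/max entry count or a station name."""
    new_dict = {}
    for line in data:
        fields = line.strip().split(",")
        if len(fields) < 10:
            continue  # skip blank or malformed lines
        # Assumed layout: fields[1]=UNIT, fields[2]=SCP,
        # fields[3]=STATION, fields[9]=ENTRIES
        key = f"{fields[1]},{fields[2]}"
        if kind == "station":
            new_dict[key] = fields[3]
            continue
        entries = int(fields[9])  # int() accepts leading zeros
        if key not in new_dict:
            new_dict[key] = entries
        elif kind == "min":
            new_dict[key] = min(new_dict[key], entries)
        else:  # any other kind is treated as "max" in this sketch
            new_dict[key] = max(new_dict[key], entries)
    return new_dict


# Invented rows in the assumed column order (not real MTA counts).
sample = [
    "A002,R051,02-00-00,59 ST,NQR456W,BMT,06/04/2022,00:00:00,REGULAR,0000100,0000050\n",
    "A002,R051,02-00-00,59 ST,NQR456W,BMT,06/04/2022,04:00:00,REGULAR,0000130,0000060\n",
    "R246,R177,00-00-00,68ST-HUNTER CO,6,IRT,06/04/2022,01:00:00,REGULAR,0000500,0000400\n",
    "R246,R177,00-00-00,68ST-HUNTER CO,6,IRT,06/04/2022,05:00:00,REGULAR,0000575,0000420\n",
]

print(make_dictionary(sample, kind="min"))
print(make_dictionary(sample, kind="max"))
print(make_dictionary(sample, kind="station"))
```

Keeping one running value per key means the file is read once per dictionary and no per-turnstile list of readings has to be stored.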
For example, open the data file turnstile_220611.txt, storing the data in data:
file_name = 'turnstile_220611.txt'
#Store the file contents in data:
with open(file_name,encoding='UTF-8') as file_d:
    lines = file_d.readlines()
#Discard first line with headers:
data = lines[1:]
Next, let's use the functions above to set up the three dictionaries:
min_dict = make_dictionary(data, kind = "min")
max_dict = make_dictionary(data, kind = "max")
station_dict = make_dictionary(data, kind = "station")
#Print out the station names, alphabetically, without duplicates:
print(f'All stations: {sorted(list(set(station_dict.values())))}')
gives the output:
All stations: ['1 AV', '103 ST', '103 ST-CORONA', '104 ST', '110 ST', '111 ST', '116 ST', '116 ST-COLUMBIA', '121 ST', '125 ST', '135 ST', '137 ST CITY COL', '138/GRAND CONC', '14 ST', '14 ST-UNION SQ', '145 ST', '149/GRAND CONC', '14TH STREET', '15 ST-PROSPECT', '155 ST', '157 ST', '161/YANKEE STAD', '163 ST-AMSTERDM', '167 ST', '168 ST', '169 ST', '170 ST', '174 ST', '174-175 STS', '175 ST', '176 ST', '18 AV', '18 ST', '181 ST', '182-183 STS', '183 ST', '190 ST', '191 ST', '2 AV', '20 AV', '207 ST', '21 ST', '21 ST-QNSBRIDGE', '215 ST', '219 ST', '225 ST', '23 ST', '231 ST', '233 ST', '238 ST', '25 AV', '25 ST', '28 ST', '3 AV', '3 AV 138 ST', '3 AV-149 ST', '30 AV', '33 ST', '33 ST-RAWSON ST', '34 ST-HERALD SQ', '34 ST-HUDSON YD', '34 ST-PENN STA', '36 AV', '36 ST', '39 AV', '4 AV-9 ST', '40 ST LOWERY ST', '42 ST-BRYANT PK', '42 ST-PORT AUTH', '45 ST', '46 ST', '46 ST BLISS ST', '47-50 STS ROCK', '49 ST', '4AV-9 ST', '5 AV/53 ST', '5 AV/59 ST', '5 AVE', '50 ST', '51 ST', '52 ST', '53 ST', '55 ST', '57 ST', '57 ST-7 AV', '59 ST', '59 ST COLUMBUS', '6 AV', '61 ST WOODSIDE', '63 DR-REGO PARK', '65 ST', '66 ST-LINCOLN', '67 AV', '68ST-HUNTER CO', '69 ST', '7 AV', '71 ST', '72 ST', '72 ST-2 AVE', '74 ST-BROADWAY', '75 AV', '75 ST-ELDERTS', '77 ST', '79 ST', '8 AV', '8 ST-NYU', '80 ST', '81 ST-MUSEUM', '82 ST-JACKSON H', '85 ST-FOREST PK', '86 ST', '86 ST-2 AVE', '88 ST', '9 AV', '90 ST-ELMHURST', '96 ST', '96 ST-2 AVE', '9TH STREET', 'ALABAMA AV', 'ALLERTON AV', 'AQUEDUCT N.COND', 'AQUEDUCT RACETR', 'ASTOR PL', 'ASTORIA BLVD', 'ASTORIA DITMARS', 'ATL AV-BARCLAY', 'ATLANTIC AV', 'AVENUE H', 'AVENUE I', 'AVENUE J', 'AVENUE M', 'AVENUE N', 'AVENUE P', 'AVENUE U', 'AVENUE X', "B'WAY-LAFAYETTE", 'BAY 50 ST', 'BAY PKWY', 'BAY RIDGE AV', 'BAY RIDGE-95 ST', 'BAYCHESTER AV', 'BEACH 105 ST', 'BEACH 25 ST', 'BEACH 36 ST', 'BEACH 44 ST', 'BEACH 60 ST', 'BEACH 67 ST', 'BEACH 90 ST', 'BEACH 98 ST', 'BEDFORD AV', 'BEDFORD PK BLVD', 'BEDFORD-NOSTRAN', 'BERGEN ST', 'BEVERLEY ROAD', 
'BEVERLY RD', 'BLEECKER ST', 'BOROUGH HALL', 'BOTANIC GARDEN', 'BOWERY', 'BOWLING GREEN', 'BRIARWOOD', 'BRIGHTON BEACH', 'BROAD CHANNEL', 'BROAD ST', 'BROADWAY', 'BROADWAY JCT', 'BRONX PARK EAST', 'BROOK AV', 'BROOKLYN BRIDGE', 'BUHRE AV', 'BURKE AV', 'BURNSIDE AV', 'BUSHWICK AV', 'CANAL ST', 'CANARSIE-ROCKAW', 'CARROLL ST', 'CASTLE HILL AV', 'CATHEDRAL PKWY', 'CENTRAL AV', 'CENTRAL PK N110', 'CHAMBERS ST', 'CHAUNCEY ST', 'CHRISTOPHER ST', 'CHURCH AV', 'CITY / BUS', 'CITY HALL', 'CLARK ST', 'CLASSON AV', 'CLEVELAND ST', 'CLINTON-WASH AV', 'CONEY IS-STILLW', 'CORTELYOU RD', 'CORTLANDT ST', 'COURT SQ', 'COURT SQ-23 ST', 'CRESCENT ST', 'CROWN HTS-UTICA', 'CYPRESS AV', 'CYPRESS HILLS', 'DEKALB AV', 'DELANCEY/ESSEX', 'DITMAS AV', 'DYCKMAN ST', "E 143/ST MARY'S", 'E 149 ST', 'E 180 ST', 'EAST 105 ST', 'EAST BROADWAY', 'EASTCHSTER/DYRE', 'EASTN PKWY-MUSM', 'ELDER AV', 'ELMHURST AV', 'EUCLID AV', 'EXCHANGE PLACE', 'FAR ROCKAWAY', 'FLATBUSH AV-B.C', 'FLUSHING AV', 'FLUSHING-MAIN', 'FORDHAM RD', 'FOREST AVE', 'FOREST HILLS 71', 'FRANKLIN AV', 'FRANKLIN ST', 'FREEMAN ST', 'FRESH POND RD', 'FT HAMILTON PKY', 'FULTON ST', 'GATES AV', 'GRAHAM AV', 'GRAND ARMY PLAZ', 'GRAND ST', 'GRAND-NEWTOWN', 'GRANT AV', 'GRD CNTRL-42 ST', 'GREENPOINT AV', 'GROVE STREET', 'GUN HILL RD', 'HALSEY ST', 'HARLEM 148 ST', 'HARRISON', 'HEWES ST', 'HIGH ST', 'HOUSTON ST', 'HOWARD BCH JFK', 'HOYT ST', 'HOYT-SCHER', 'HUNTERS PT AV', 'HUNTS POINT AV', 'INTERVALE AV', 'INWOOD-207 ST', 'JACKSON AV', 'JAMAICA 179 ST', 'JAMAICA CENTER', 'JAMAICA VAN WK', 'JAY ST-METROTEC', 'JEFFERSON ST', 'JFK JAMAICA CT1', 'JKSN HT-ROOSVLT', 'JOURNAL SQUARE', 'JUNCTION BLVD', 'JUNIUS ST', 'KEW GARDENS', 'KINGS HWY', 'KINGSBRIDGE RD', 'KINGSTON AV', 'KINGSTON-THROOP', 'KNICKERBOCKER', 'KOSCIUSZKO ST', 'LACKAWANNA', 'LAFAYETTE AV', 'LEXINGTON AV/53', 'LEXINGTON AV/63', 'LIBERTY AV', 'LIVONIA AV', 'LONGWOOD AV', 'LORIMER ST', 'MARBLE HILL-225', 'MARCY AV', 'METROPOLITAN AV', 'METS-WILLETS PT', 'MIDDLETOWN RD', 'MONTROSE AV', 
'MORGAN AV', 'MORISN AV/SNDVW', 'MORRIS PARK', 'MOSHOLU PKWY', 'MT EDEN AV', 'MYRTLE AV', 'MYRTLE-WILLOUGH', 'MYRTLE-WYCKOFF', 'NASSAU AV', 'NECK RD', 'NEPTUNE AV', 'NEREID AV', 'NEVINS ST', 'NEW LOTS', 'NEW LOTS AV', 'NEW UTRECHT AV', 'NEWARK BM BW', 'NEWARK C', 'NEWARK HM HE', 'NEWARK HW BMEBE', 'NEWKIRK AV', 'NEWKIRK PLAZA', 'NORTHERN BLVD', 'NORWOOD 205 ST', 'NORWOOD AV', 'NOSTRAND AV', 'OCEAN PKWY', 'ORCHARD BEACH', 'OZONE PK LEFFRT', 'PARK PLACE', 'PARKCHESTER', 'PARKSIDE AV', 'PARSONS BLVD', 'PATH NEW WTC', 'PATH WTC 2', 'PAVONIA/NEWPORT', 'PELHAM BAY PARK', 'PELHAM PKWY', 'PENNSYLVANIA AV', 'PRESIDENT ST', 'PRINCE ST', 'PROSPECT AV', 'PROSPECT PARK', 'QUEENS PLAZA', 'QUEENSBORO PLZ', 'RALPH AV', 'RECTOR ST', 'RIT-MANHATTAN', 'RIT-ROOSEVELT', 'ROCKAWAY AV', 'ROCKAWAY BLVD', 'ROCKAWAY PARK B', 'ROOSEVELT ISLND', 'SARATOGA AV', 'SENECA AVE', 'SHEEPSHEAD BAY', 'SHEPHERD AV', 'SIMPSON ST', 'SMITH-9 ST', 'SOUTH FERRY', 'SPRING ST', 'ST LAWRENCE AV', 'ST. GEORGE', 'STEINWAY ST', 'STERLING ST', 'SUTPHIN BLVD', 'SUTPHIN-ARCHER', 'SUTTER AV', 'SUTTER AV-RUTLD', 'THIRTY ST', 'THIRTY THIRD ST', 'TIMES SQ-42 ST', 'TOMPKINSVILLE', 'TREMONT AV', 'TWENTY THIRD ST', 'UNION ST', 'UTICA AV', 'V.CORTLANDT PK', 'VAN SICLEN AV', 'VAN SICLEN AVE', 'VERNON-JACKSON', 'W 4 ST-WASH SQ', 'W 8 ST-AQUARIUM', 'WAKEFIELD/241', 'WALL ST', 'WEST FARMS SQ', 'WESTCHESTER SQ', 'WHITEHALL S-FRY', 'WHITLOCK AV', 'WILSON AV', 'WINTHROP ST', 'WOODHAVEN BLVD', 'WOODLAWN', 'WORLD TRADE CTR', 'WTC-CORTLANDT', 'YORK ST', 'ZEREGA AV']
We can print all the turnstiles from the data:
print(f'All turnstiles: {get_turnstiles(station_dict)}')
Or, a subset, for example, only those for the Hunter College and Roosevelt Island stations:
print(get_turnstiles(station_dict, stations = ['68ST-HUNTER CO','ROOSEVELT ISLND']))
which would print:
['R259,00-00-00', 'R259,00-00-01', 'R259,00-00-02', 'R259,00-00-03', 'R259,00-05-00', 'R259,00-05-01', 'R177,00-00-00', 'R177,00-00-01', 'R177,00-00-02', 'R177,00-00-03', 'R177,00-00-04', 'R177,00-00-05', 'R177,00-00-06', 'R177,00-03-00', 'R177,00-03-01', 'R177,00-03-02', 'R177,00-03-03', 'R177,00-03-04', 'R177,00-03-05', 'R177,00-03-06']
Checking the ridership for a station:
hunter_turns = get_turnstiles(station_dict, stations = ['68ST-HUNTER CO'])
ridership = compute_ridership(min_dict,max_dict,turnstiles=hunter_turns)
print(f'Ridership for Hunter College: {ridership}.')
Together with the single-turnstile check in main() below, this gives the output:
Ridership for turnstile, R051,02-00-00: 3096.
Ridership for Hunter College: 49669.
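The station-level check chains get_turnstiles and compute_ridership. The sketch below shows one way the two functions might behave, using small hand-made dictionaries rather than the real file. The counts are invented, and the sketch assumes every requested turnstile appears in both min_dict and max_dict. It is an illustration under those assumptions, not the reference solution.

```python
def get_turnstiles(station_dict, stations=None):
    """Sketch only: list turnstile names, optionally filtered by station."""
    if stations is None:
        return list(station_dict.keys())
    # Keep the dictionary's own order; a turnstile is kept when its
    # station name is one of the requested stations.
    return [turn for turn, name in station_dict.items() if name in stations]


def compute_ridership(min_dict, max_dict, turnstiles=None):
    """Sketch only: total of (max - min) over the chosen turnstiles."""
    if turnstiles is None:
        turnstiles = list(min_dict.keys())
    total = 0
    for turn in turnstiles:
        # Assumes turn is a key in both dictionaries.
        total += max_dict[turn] - min_dict[turn]
    return total


# Hand-made dictionaries with invented counts (not real MTA data).
station_dict = {
    "R051,02-00-00": "59 ST",
    "R177,00-00-00": "68ST-HUNTER CO",
    "R177,00-00-01": "68ST-HUNTER CO",
}
min_dict = {"R051,02-00-00": 100, "R177,00-00-00": 500, "R177,00-00-01": 200}
max_dict = {"R051,02-00-00": 130, "R177,00-00-00": 575, "R177,00-00-01": 260}

hunter_turns = get_turnstiles(station_dict, stations=["68ST-HUNTER CO"])
print(hunter_turns)
print(compute_ridership(min_dict, max_dict, turnstiles=hunter_turns))
print(compute_ridership(min_dict, max_dict))
```

Because the filter walks the dictionary in its own order, the returned turnstiles follow the order of the data, not the order of the stations list. That matches the sample output above, where the R259 turnstiles are listed before the R177 ones.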
"""
Name: YOUR NAME HERE (as it appears in Gradescope)
Email: YOUR EMAIL HERE (as it appears in Gradescope)
Resources: ANY RESOURCES YOU USED
"""
def make_dictionary(data, kind = "min"):
    """
    Creates a dictionary with a key of the remote unit ID + turnstile unit number.
    Depending on kind, the resulting dictionary will store the minimum entry
    number seen (as an integer), the maximum entry number seen (as an integer),
    or the station name (as a string).
    Returns the resulting dictionary.

    Keyword arguments:
    kind -- kind of dictionary to be created: min, max, station
    """

    #Placeholder-- replace with your code
    new_dict = {}
    return new_dict


def get_turnstiles(station_dict, stations = None):
    """
    If stations is None, returns the names of all the turnstiles stored as keys
    in the input dictionary.
    If stations is not None, returns the keys whose value in the input
    dictionary is one of the listed stations.
    Returns a list.

    Keyword arguments:
    stations -- None or list of station names.
    """

    #Placeholder-- replace with your code
    lst = []
    return lst


def compute_ridership(min_dict,max_dict,turnstiles = None):
    """
    Takes as input two dictionaries and a list, possibly None, of turnstiles.
    If no value is passed for turnstiles, the default value of None is used
    (that is, the total ridership for every turnstile in the dictionaries).
    Returns the ridership (the difference between the maximum and minimum values)
    across all turnstiles specified.

    Keyword arguments:
    turnstiles -- None or list of turnstile names
    """

    #Placeholder-- replace with your code
    total = 0
    return total


def main():
    """
    Opens a data file and computes ridership, using functions above.
    """
    file_name = 'turnstile_220611.txt'
    #Store the file contents in data:
    with open(file_name,encoding='UTF-8') as file_d:
        lines = file_d.readlines()
    #Discard first line with headers:
    data = lines[1:]

    #Set up the three dictionaries:
    min_dict = make_dictionary(data, kind = "min")
    max_dict = make_dictionary(data, kind = "max")
    station_dict = make_dictionary(data, kind = "station")

    #Print out the station names, alphabetically, without duplicates:
    print(f'All stations: {sorted(list(set(station_dict.values())))}')

    #All the turnstiles from the data:
    print(f'All turnstiles: {get_turnstiles(station_dict)}')
    #Only those for Hunter & Roosevelt Island stations:
    print(get_turnstiles(station_dict, stations = ['68ST-HUNTER CO','ROOSEVELT ISLND']))

    #Checking the ridership for a single turnstile
    ridership = compute_ridership(min_dict,max_dict,turnstiles=["R051,02-00-00"])
    print(f'Ridership for turnstile, R051,02-00-00: {ridership}.')

    #Checking the ridership for a station
    hunter_turns = get_turnstiles(station_dict, stations = ['68ST-HUNTER CO'])
    ridership = compute_ridership(min_dict,max_dict,turnstiles=hunter_turns)
    print(f'Ridership for Hunter College: {ridership}.')


if __name__ == "__main__":
    main()