Syllabus
CSci 39542: Introduction to Data Science
Department of Computer Science
Hunter College, City University of New York
Spring 2022

Description: 3 hours, 3 credits: This topics course focuses on computational methods and statistical techniques to analyze data and make inferences. Topics include data collection and cleaning, exploratory data analysis and visualization, and statistical inference and prediction. Students will acquire a working knowledge of data science through hands-on projects with real-world data. Basic proficiency in statistics and Python programming is assumed, as well as experience with abstract data structures.
Prerequisites: CSci 127, Stat 213, and one of: CSci 133 or CSci 235.
Instructors: Dr. Katherine St. John, professor (office hours) and Susan Sun, Tech-in-Residence (TiR) fellow.
Meetings: Mondays, Thursdays, 2:45-4pm, in-person.
Course Email: datasci@hunter.cuny.edu.

Grading Policy

Course Format: This course is taught in person.

Expectations: Completing homework is an essential part of the learning experience. Students are expected to learn both the material covered in class and the material in the textbook and other assigned reading.

Honor Code: You are encouraged to work together on the overall design of the programs and homework. However, for specific programs and homework assignments, all work must be your own. As a general rule, do your own typing. Submitting work of others, or not safeguarding your work from copying, are academic integrity violations. You are responsible for knowing and following Hunter College's Academic Integrity Policy:

Hunter College regards acts of academic dishonesty (e.g., plagiarism, cheating on examinations, obtaining unfair advantage, and falsification of records and official documents) as serious offenses against the values of intellectual honesty. The College is committed to enforcing the CUNY Policy on Academic Integrity and will pursue cases of academic dishonesty according to the Hunter College Academic Integrity Procedures.
All incidents of cheating will be reported to the Office of Student Conduct in the Vice President for Student Affairs and Dean of Students office.

Lecture Participation: Participation in lecture is measured by collected classwork. If you miss a classwork assignment or do poorly on one, your grade on the written final exam will replace the missing or low grade.

Quizzes: Every week, there will be a quiz on the lecture notes, code demonstrations, classwork, and submitted programs. Quizzes are timed coding challenges on the HackerRank platform.

Programming Assignments: Assignments are posted on the class website, usually two weeks before the due date. They reinforce concepts covered in lecture and lab and serve as building blocks for the classwork and the semester-long project.

Project: A final project is optional for this course. The grade for the project is a combination of grades earned on the milestones (i.e., deadlines during the semester to keep the projects on track) and the overall submitted program. If you choose not to complete the project, your written final exam grade will replace its portion of the overall grade.

Final Exam: The final exam has two parts:

Both parts are required and are comprehensive, covering all the material of the course. Sample exam questions will be available during the last weeks of the term.

Grades: The grading for the course will be based on:

We understand that emergencies happen during the term. See the individual sections above for details on how missing or low grades are dropped or replaced. To respect your privacy, there is no need to provide documentation to take advantage of the dropping/replacing of grades described above; it is done automatically. If you are going to miss more than 2 weeks of class and associated work, contact us so we can make arrangements for you to take the course in a future term.
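
As an illustration only, the replacement rule for classwork can be sketched in Python. The function name `apply_replacement`, the 0-100 scale, the sample scores, and the reading of "low" as "lower than the written final exam grade" are all assumptions made for this sketch; it is not the course's actual grading code and does not reflect the course weights.

```python
from typing import List, Optional, Sequence


def apply_replacement(scores: Sequence[Optional[float]], final_exam: float) -> List[float]:
    """Replace each missing (None) or lower score with the written final exam grade.

    Assumption: a grade counts as "low" when it is below the final exam grade.
    """
    replaced = []
    for score in scores:
        if score is None or score < final_exam:
            replaced.append(final_exam)
        else:
            replaced.append(score)
    return replaced


# Hypothetical classwork scores on an assumed 0-100 scale; None marks a missed classwork.
classwork = [90.0, None, 60.0, 85.0]
print(apply_replacement(classwork, final_exam=80.0))  # [90.0, 80.0, 80.0, 85.0]
```

Under these assumptions, a missed classwork and a classwork scored below the final exam are both replaced, while higher scores are left unchanged.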

Materials, Resources and Accommodating Disabilities

This is a zero cost course. All textbook materials are freely available to enrolled students.

Textbook & Readings: The following free on-line books are required for the course:

Additional readings and tutorials are available on the resources page.

Technology: This is a programming-intensive course in the Python programming language. See the resources page for instructions on obtaining Python and the packages used, and for links for submitting assignments and assessments. All software used is freely available.

Computer Access: A computer (capable of running Python 3) is needed to complete the on-line assessments, programming assignments, and projects. Hunter College is committed to all students having the technology needed for their courses. If you are in need of technology, see Student Life's Support & Resources Page.

Accommodating Disabilities: In compliance with the Americans with Disabilities Act of 1990 (ADA) and with Section 504 of the Rehabilitation Act of 1973, Hunter College is committed to ensuring educational parity and accommodations for all students with documented disabilities and/or medical conditions. It is recommended that all students with documented disabilities (Emotional, Medical, Physical, and/or Learning) consult the Office of AccessABILITY. For further information and assistance, see their contact page.

Hunter College Policy on Sexual Misconduct: In compliance with the CUNY Policy on Sexual Misconduct, Hunter College reaffirms the prohibition of any sexual misconduct, which includes sexual violence, sexual harassment, gender-based harassment, and retaliation against students, employees, or visitors, as well as certain intimate relationships. Students who have experienced any form of sexual violence on or off campus (including CUNY-sponsored trips and events) are entitled to the rights outlined in the Bill of Rights for Hunter College.

See the CUNY Policy on Sexual Misconduct.

Course Objectives

At the end of the course, students should be able to:
  1. Acquire data sets from multiple sources and write programs that can extract (scrape) the data into a usable form.
  2. Use data mining to extract new insights about the data.
  3. Understand basic storage techniques and constraints.
  4. Analyze data using standard techniques from statistics and linear algebra.
  5. Visualize data using popular Python modules.