Course Syllabus

Course Title - COGS 109 Modeling and Data Analysis

Summer I 2021

Lecture: MTuWeTh 12:30pm - 1:50pm (Zoom link on Canvas)

Discussion: TuTh 2:00pm - 2:50pm (Zoom link on Canvas)

Course Description

Welcome to COGS 109!

This course is designed to help you translate your everyday experience of making inferences and predictions about real-world events into a formal data analysis project. We will discuss fundamental computational tools for exploring, analyzing, and deriving insights from data. We will also practice presenting and evaluating data analysis projects.

While mathematical skills can be useful in statistical modeling and data analysis, you do not need a strong math background to produce good data analysis projects. We will go over the mathematical intuition behind some computational methods to help you understand the assumptions and limitations of these methods.

The topics we will discuss include, but are not limited to, principles of statistical modeling, prediction, linear regression, classification, and dimensionality reduction.

Course Information

Instructors

Weiqi (Vicky) Zhao (wez025@ucsd.edu) - Instructor - Office Hour: Wed 2-3pm

Qin Li (qil150@ucsd.edu) - TA/instructor for Tuesday discussion sections

Aditya Mishra (admishra@ucsd.edu) - TA/instructor for Thursday discussion sections

Prerequisites

Statistics: Cognitive Science 14B
Math: Mathematics 18 or 31AH
Programming/computation: COGS 18 or CSE 7 or CSE 8A or CSE 11, or consent of instructor

Time and location

Lecture: MTuWeTh 12:30p - 1:50p. Zoom
Discussion: TuTh 2:00p - 2:50p. Zoom

All lectures will be delivered live through Zoom (the Zoom link will be posted on the course calendar). Lectures will include interactive activities such as discussions and small-group exercises. You are strongly encouraged to attend the live lectures, but attendance is not required.

Discussion sections will focus on the programming details of implementing statistical methods in Python. The Tuesday section will cover the material from the Monday and Tuesday lectures, and the Thursday section will cover the material from the Wednesday and Thursday lectures. Because the two sections cover different material, you are encouraged to attend both if you need guidance on programming a particular analysis method.

All instructional recordings will be posted on Canvas within 2 hours of class.

Course material

An Introduction to Statistical Learning with Applications in R (ISLR) by James, Witten, Hastie, and Tibshirani. The book, along with related data sets and code, is available as a free PDF download at: https://www.statlearning.com/

Learning Objectives

By the end of the course, you will be able to:

  • Explain the principles of data science both in precise terminology and using layman's terms
  • Explain the assumptions and limitations of statistical methods
  • Apply data analysis and modeling techniques
  • Use and critically evaluate data analysis projects
  • Understand the conceptual link between different statistical tools

Syllabus

Week | Date | Topics | Chapters in ISLR | Assignments
Week 1 | 06/28 - 06/29 | Introduction: data exploration, modeling, and statistical learning | Ch. 1, 2 | -
Week 1 | 06/30 - 07/01 | Linear regression and multiple regression | Ch. 3 | Homework 1, quiz 1, quiz 2
Week 2 | 07/05 - 07/06 | Multiple regression, classification | Ch. 3, 4 | -
Week 2 | 07/07 - 07/08 | Classification | Ch. 4 | Homework 2
Week 3 | 07/12 - 07/13 | Resampling and cross-validation | Ch. 5 | -
Week 3 | 07/14 - 07/15 | Model selection and regularization | Ch. 6.1, 6.2 | Data analysis project part 1
Week 4 | 07/19 - 07/20 | Regularization and dimensionality reduction; non-linear regression | Ch. 6.3, 6.4 | Project evaluation
Week 4 | 07/21 - 07/22 | Non-linear regression | Ch. 7.2-5 | Data analysis project part 2
Week 5 | 07/26 - 07/27 | Unsupervised learning and clustering | Ch. 10 | -
Week 5 | 07/28 - 07/29 | Review, final project presentation | - | Final project

Assessments and Evaluation

5% x 7 quizzes
You will complete a quiz after each chapter, for 8 quizzes in total. Quizzes will be published on Canvas. Each student’s lowest quiz grade will be dropped, so 7 quizzes count toward the final grade. No late submissions will be accepted without valid official documentation (e.g., a doctor’s note).
10% x 2 homework
There will be two homework assignments, due at the end of week 1 and week 2. They will focus on the programming aspect of data analysis. Homework will be posted on Canvas. Turn in your assignments electronically at gradescope.com.
10% data analysis group project [draft 1]
From week 3 to week 5, you will work on a data analysis project with your classmates. The first draft of the data analysis project is due at the end of week 3 (10%).

To search for teammates, please fill out the "Search for teammates" Google form.

5% project evaluation

After submitting your first draft, you will have the opportunity to evaluate your classmate’s project assignment using the provided grading sheet and give feedback on how the data project can be improved (5%). Your feedback will remain anonymous.

10% data analysis project draft 2
For the week 4 assignment, you will build on your first draft and apply a second modeling technique to address your data question (more details to come). In your second draft, you should also address the comments provided by your classmate.
15% data analysis project draft 3
At the end of week 5, you will submit the final draft of your data analysis project by incorporating the comments from the instructors and your classmate.
5% final week presentation
During the final week, you will give a 5-minute presentation on your data analysis project. All group members are encouraged to present. However, you can select a group representative to present after consulting with your instructors.
Extra credit
You may earn up to 2% extra credit (2 hours of SONA) by participating as a SONA research subject at https://ucsd.sona-systems.com/Default.aspx?ReturnUrl=/ (for instructions on how to sign up for SONA experiments, see http://www.psychology.ucsd.edu/undergraduate-program/undergraduate-resources/sona/index.html).

We will give 1% extra credit to the top 2 Piazza participants (signup link: piazza.com/ucsd/summer2021/cogs109). Participation will be judged on BOTH the quantity and the quality of posts on Piazza. Course-related questions and answers are both encouraged and will count toward participation. We will also give 1% extra credit to the students with the best final project, as judged by the students and the instructors.
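
As a rough illustration of how the weights listed above combine, here is a minimal Python sketch. It is not an official grade calculator: the function name `course_percentage`, the dictionary keys, and the choice to express each score as a fraction between 0 and 1 are assumptions made for this example, and extra credit is simply added on top of the weighted total.

```python
# Illustrative sketch only; names and score format are assumptions, not course policy.
QUIZ_WEIGHT = 5        # percent per counted quiz (7 of the 8 quizzes count)
HOMEWORK_WEIGHT = 10   # percent per homework assignment (2 assignments)
OTHER_WEIGHTS = {      # percent for each remaining component
    "project_draft_1": 10,
    "project_evaluation": 5,
    "project_draft_2": 10,
    "project_draft_3": 15,
    "presentation": 5,
}


def course_percentage(quizzes, homework, other, extra_credit=0.0):
    """Combine scores (fractions from 0 to 1) into a course percentage."""
    if len(quizzes) != 8 or len(homework) != 2:
        raise ValueError("expected 8 quiz scores and 2 homework scores")
    counted_quizzes = sorted(quizzes)[1:]  # drop the single lowest quiz
    total = QUIZ_WEIGHT * sum(counted_quizzes)
    total += HOMEWORK_WEIGHT * sum(homework)
    total += sum(weight * other[name] for name, weight in OTHER_WEIGHTS.items())
    return total + extra_credit


if __name__ == "__main__":
    perfect = {name: 1.0 for name in OTHER_WEIGHTS}
    # One missed quiz (score 0.0) is dropped, so it does not lower the result.
    print(course_percentage([1.0] * 7 + [0.0], [1.0, 1.0], perfect))  # 100.0
```

The weights add up to 100 (35 for quizzes, 20 for homework, 45 for the project components and presentation), so full marks on every counted item gives 100 before extra credit.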

Grading Scale

[Image: grading scale (screenshot not available)]

Programming requirements

COGS 109 is not a programming class, although we will use programming skills to work with data. We assume that all students have some basic experience with at least one programming language. If you are not familiar with programming for data analysis, we encourage you to attend the discussion sections, where TAs will provide live programming demos. Students may choose to complete assignments in Python, MATLAB, or R.

Tips for programming

  • Read the documentation. Every Python/MATLAB/R function has docs and examples you can try.
  • Feel free to read online resources (forums, Stack Exchange, etc.), as long as you write and execute your code yourself.
  • Reading documentation or “googling” unfamiliar syntax is not a sign of ignorance; on the contrary, advanced programmers spend a large part of their time reading documentation and forums. Learning to take advantage of documentation and resources to solve new coding problems is an essential skill. Learn to be a Master Googler, i.e., to efficiently search for, understand, and digest information from online resources.
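
For Python, one way to follow the first tip is to read a function's built-in documentation directly from the interpreter. The sketch below uses the standard-library function `statistics.mean` purely as an example; the choice of function and the input list are assumptions for illustration. In an interactive session, `help(statistics.mean)` displays the same text.

```python
import inspect
import statistics

# inspect.getdoc returns a function's docstring as a cleaned-up string.
doc = inspect.getdoc(statistics.mean)
print(doc.splitlines()[0])  # the first line is a one-sentence summary

# Docstrings often include small examples; try a similar call yourself.
result = statistics.mean([1, 2, 3, 4])
print(result)  # 2.5
```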

Academic Integrity

Instructors and students are expected to honor the UCSD Policy on Integrity of Scholarship (https://senate.ucsd.edu/Operating-Procedures/Senate-Manual/Appendices/2). The following passages are selected from the full policy:

“Students' Responsibility

To uphold academic integrity, students shall:

Complete and submit academic work that is their own and that is an honest and fair representation of their knowledge and abilities at the time of submission.
Know and follow the standards of the class and the institution.

Instructors' Responsibility

The instructor shall state in writing how graded assignments and exams will contribute to the final grade in the course. If there are any course-specific rules required by an instructor for maintaining academic integrity, the instructor shall also inform students of these in writing.”
