Welcome to CSCI 0451!

Prof. Phil Chodrow
Department of Computer Science
Middlebury College








Machine learning is the theory and practice of algorithmically learning patterns in data.







Machine learning is used for…

…automated consumer recommendations for content and shopping.












Machine learning is used for…

…generating realistic synthetic text, images, and code.







Machine learning is used for…

…predictions and recommendations for life-changing decisions: housing, healthcare, criminal justice.







Machine learning is used for…

…search engines, smart homes, computer vision, speech-to-text, scientific discovery, driver assistance systems…








Can you list the times you interacted with a machine learning system yesterday?

Big Messages



This class is about something that is already impacting your life, and is likely to do so more in the future.

We are going to grow in math, coding, technical writing, and critical awareness.

This class works by giving you opportunities to push yourself.

This class is fun and rewarding but not easy.









What are we going to learn in this class?

CSCI 0451 is…

Coding
  • Numerical array programming
  • Object-oriented interfaces
  • Experiments and visualization
Math
  • Linear algebra
  • Optimization (\(\implies\) calculus)
  • A bit of probability
Reading, writing, discussion
  • Technical methods
  • Bias, fairness, and impact of ML




NYT, 1957



What We Are Actually Talking About

\[\mathbf{w}^{(t+1)} = \mathbf{w}^{(t)} + \mathbb{1}(y_i \langle \mathbf{w}^{(t)}, \mathbf{x}_i \rangle < 0)y_i \mathbf{x}_i\]
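The update above can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's reference implementation: the function names, the toy data, the random choice of the index i, and the stopping check are all assumptions made for the example. Labels are assumed to be in {-1, +1}. The starting vector is random because, with the strict inequality in the indicator, a zero vector would never be updated.

```python
import numpy as np

def perceptron_step(w, x_i, y_i):
    """Apply one update: add y_i * x_i to w only if (x_i, y_i) is misclassified."""
    # The indicator 1(y_i <w, x_i> < 0) from the formula, as 0.0 or 1.0.
    indicator = float(y_i * np.dot(w, x_i) < 0)
    return w + indicator * y_i * x_i

def perceptron(X, y, max_steps=1000, seed=0):
    """Repeat the update on randomly chosen points (hypothetical training loop)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])  # random start; w = 0 would never change
    for _ in range(max_steps):
        # Stop once no point satisfies y_i <w, x_i> < 0.
        if np.all(y * (X @ w) >= 0):
            break
        i = rng.integers(X.shape[0])  # pick a random data point
        w = perceptron_step(w, X[i], y[i])
    return w

# Toy, linearly separable data (made up for illustration).
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-3.0, -1.0]])
y = np.array([1, 1, -1, -1])

w = perceptron(X, y)
print("learned weights:", w)
print("signed scores y_i <w, x_i>:", y * (X @ w))
```

On data like this, which can be separated by a line through the origin, the loop stops after a small number of updates. On data that cannot be separated, it simply runs until `max_steps`.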

NYT, 2022

What We Are Actually Talking About


xkcd

My Approach

I want you to learn stuff in this class that is hard to learn from the internet.

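from sklearn.linear_model import LogisticRegression

LR = LogisticRegression()        # create the model
LR.fit(predictors, target)       # learn from training data
LR.predict(new_predictors)       # predict on new data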

We are going to learn this workflow in a day, then do more interesting things.

Special Focus: Disparity, Fairness, and Impact

Automated decision systems have a history of reproducing structural privilege and oppression, especially in relation to race, gender, class, and sexuality.

What does it mean for an automated decision system to be fair? This is a hard question which we will discuss from multiple perspectives.








Rough, tentative plan for the semester

Fundamentals of Prediction (~2 weeks)
  • Data science workflow.
  • Score-based prediction, linear models, decision theory.
Fairness in Machine Learning (~2-3 weeks)
  • Legitimacy of automated decision systems.
  • Formal definitions of bias and fairness.
  • Limitations of formal methods.
Algorithms (math) (~4 weeks)
  • Empirical risk minimization, convexity, optimization.
  • Controlling features: regularization and kernels.
Deep Learning (~2 weeks)
  • Image classification, text classification, word embedding.
  • We are not doing generative language models – take 457.









Ok…so what are we going to do?








Most Days



Warmup Activity
  • Complete ahead of time.
  • Reinforces content from readings and connects them to lecture.
  • Present in groups of 5-6.
  • Random presenter presents to the group.
Lecture
  • Math
  • Live-coding + experiments
  • Your questions and ideas!








Activities and assignments

Blog Posts
  • Substantial projects! Usually require >5 hours.
  • Involve implementation, experiments, and discussion.
  • Published on your blog.
Daily Warmup Activities
  • Relatively quick when you’ve done the readings.
  • One (random) person each day will present to your team.
  • Connects readings to lecture.
Project
  • In groups of your choosing.
  • Work on it throughout the semester; presentations in the last week.
  • We’ll have activities etc. to help you pick a path.



Blog Posts

  • Perform experiments in Jupyter notebooks.
  • Create figures, add expository prose, etc.
  • (Sometimes) Implement algorithms in source (.py) files.
  • Render your notebooks into a blog using the Quarto publishing engine.
  • Host source code and rendered blog on GitHub.

Blog Post Feedback

  • E: You have demonstrated excellent and thorough learning in this blog post. You should definitely move on.
  • M: You have demonstrated learning in this blog post, but may have missed some opportunities. You could learn either by moving on or by revising this post.
  • R: You have demonstrated some learning, but have missed some important ideas or techniques. I recommend that you focus your learning on revising this assignment.

These “grades” are always accompanied by written feedback on where to revise if applicable.





Readings and Warmups

Do them!

Readings should be completed ahead of time.

Notes are for use in class.

Let’s practice a warmup activity

Your Affinity Vegetable



1. Split into teams

2. Go around and share your name and:

If you were a vegetable, which vegetable would you be and why?

Your Affinity Vegetable



3. Team leader: lead your team in finding a delicious dish that incorporates all of your vegetables.

Be ready to share!









Collaborative Grading








Collaborative Grading

Initialization:
  • You set goals for your learning and achievement (in week 2).
Main Loop:
  • You attend class, participate in activities, and complete assignments.
  • You get feedback on your assignments from me and the TAs, and you revise.
At End Of Course:
  • You propose a letter grade that reflects your learning and achievement, and discuss it with me.
The feedback you get on individual assignments is data for your proposal. There is no formula.

Collaborative Grading



No points, no averages
  • Opportunity: You can focus on feedback and set your own goals.
  • Challenge: You need to motivate yourself based on your interest in the class.
Resubmit assignments
  • Opportunity: One of the best ways to learn.
  • Challenge: You need to read feedback and prioritize time for revisions.
Can skip assignments
  • Opportunity: No busy work – work on what’s valuable to you.
  • Challenge: You still need to work enough to learn and meet your goals.
No hard due dates
  • Opportunity: Don’t ask for extensions; take the time you need.
  • Challenge: You need to keep yourself on pace to achieve your goals.

What a Grade Sounds Like…

A: I am ready to take the theory, techniques, and ideas of this course into my endeavours outside this classroom: future classes, projects, hobbies, career.

B: With help or review, I might be able to take some of what I learned outside this classroom.

C: I showed up and did stuff, but I don’t really see any ways to take what I learned outside this classroom.

D-F: I didn’t really show up or do much.

I am very likely to accept your proposed grade in the course if you EITHER:

  • Complete most assignments to a high standard (including revisions) OR
  • Work for ~10 productive hours per week outside of class OR
  • Do some of the assignments I give you and also some other things (that you propose) that are relevant to the course learning goals.






What is something that makes you feel excited or empowered about collaborative grading?

What is something that makes you feel nervous or confused about collaborative grading?

The Wisdom Of Those Who Have Gone Before

Stay on top of the blog posts and do the daily warmups. also go to office hours if you are confused, Phil is helpful and there will likely be CS0451 peers there to talk through assignments with.

Review after each class using lecture notes so that you have a solid understanding of the concepts taught in class.

get to know quarto blogs and watch threeblueonebrown essence of linear algebra on Youtube to review some ideas

Be realistic in your goal setting and focus on what you want to get out of the course.

Focus on learning and growing instead of the grade. Be curious and think hard.








Based on what you know about the course so far, what are some ways that success might look like for you?