Python for Data Science

Module 1: Introduction to Python

May 11, 2020

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Example Data Set [click here]
Example Code [click here]

Module 2: Loops in Python

May 13, 2020

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Activities [click here]
Homework [click here]

Module 3: Functions in Python

May 15, 2020

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Activities [click here]
Activities Solutions [click here]

Module 4: Lists and Tuples in Python

May 18, 2020

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Activities [click here]
Activities Solutions [click here]
Python file used [click here]

Module 5: Strings and String Processing

May 20, 2020

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Homework [click here]
Example Code [click here]

Module 6: Dictionaries and Memoization

May 22, 2020

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Homework [click here]
Example Code [click here]
'alice.txt' [click here]

Module 7: Comprehension

May 25, 2020

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Python Code [click here]

Module 8: File Processing and Data Cleaning

May 27, 2020

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Python Code [click here]
Link to UCI Machine Learning Repository's data sets [click here]

Module 9: Practicum: Denver Car Accidents Statistics

May 29, 2020. Please download the Denver County crime data (112 MB). You will need the comma-separated file and might also find the explanations of the offense codes useful.

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Python code [click here]
Result [click here]
Denver Crime Data External Link

Module 10: Exceptions

June 1, 2020

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Python code [click here]

Module 11: Object Oriented Programming 1

June 3, 2020

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Python code [click here]

Module 12: Simple Classification: Decision Trees

June 5, 2020

Homework: Use the decision tree technique to develop a decision tree for either the blood donation data set or the Pima Indian diabetes data set. You can use either Gini impurity or entropy as the splitting criterion. If you want to test the accuracy of your model, you can split the data set randomly into ~80% of the records for training and ~20% for testing.
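As a starting point, the two impurity measures and the random split could be sketched as below. This is a minimal illustration, not the course's reference code: the function names `gini`, `entropy`, and `train_test_split` are hypothetical, and the sketch assumes the class labels have already been read into a plain Python list.

```python
import math
import random
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over the classes present."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def train_test_split(records, train_fraction=0.8, seed=None):
    """Shuffle a copy of the records and cut it at train_fraction."""
    rng = random.Random(seed)  # a fixed seed makes the split repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

Both measures are 0 for a node that contains a single class and are largest when the classes are evenly mixed, so a candidate split is usually scored by the size-weighted impurity of the child nodes it produces.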

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Python code [click here]
Iris data set [click here]
Blood donation data set [click here]
Pima Indian diabetes data set [click here]

Module 13: Object Oriented Programming 2

June 8, 2020

doc-strings, Address example with internationalization, k-nearest-neighbor implementation

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Python code [click here]
Python code address.py [click here]
Iris data set [click here]
Blood donation data set [click here]

Module 14: Object Oriented Programming 3

June 10, 2020

Homework: Write a full implementation of the class Gaussian. A Gaussian integer is a complex number whose real and imaginary parts are both integers. You need to implement:

Initializer, string, and repr
hash function (needs to return an integer)
abs (__abs__)
equality
addition, subtraction, multiplication, exponentiation (__pow__), and division
multiplication by a scalar (use __rmul__)

Use rounding to ensure that the result of an operation is again a Gaussian and not a complex number. Otherwise, the operations are defined just as for complex numbers.
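A partial sketch of how such a class might be laid out follows. It is deliberately incomplete (subtraction and __pow__ are left out), and the attribute names `real` and `imag` as well as the display format chosen for __str__ are assumptions, not requirements from the assignment. Note that Python's built-in round() rounds exact halves to the nearest even integer.

```python
import math


class Gaussian:
    """A Gaussian integer a + bi, where a and b are both integers."""

    def __init__(self, real=0, imag=0):
        self.real = int(real)
        self.imag = int(imag)

    def __repr__(self):
        return f"Gaussian({self.real}, {self.imag})"

    def __str__(self):
        sign = "+" if self.imag >= 0 else "-"
        return f"{self.real}{sign}{abs(self.imag)}i"

    def __eq__(self, other):
        if not isinstance(other, Gaussian):
            return NotImplemented
        return (self.real, self.imag) == (other.real, other.imag)

    def __hash__(self):
        # Equal objects must hash equally, so hash the same pair __eq__ compares.
        return hash((self.real, self.imag))

    def __abs__(self):
        return math.hypot(self.real, self.imag)

    def __add__(self, other):
        return Gaussian(self.real + other.real, self.imag + other.imag)

    def __mul__(self, other):
        # (a + bi)(c + di) = (ac - bd) + (ad + bc)i
        return Gaussian(self.real * other.real - self.imag * other.imag,
                        self.real * other.imag + self.imag * other.real)

    def __rmul__(self, scalar):
        # Called for scalar * Gaussian after the scalar's own __mul__ gives up.
        return Gaussian(round(scalar * self.real), round(scalar * self.imag))

    def __truediv__(self, other):
        # Multiply by the conjugate of other, divide by |other|^2, then round
        # each part so that the quotient is again a Gaussian integer.
        denom = other.real ** 2 + other.imag ** 2
        if denom == 0:
            raise ZeroDivisionError("division by the zero Gaussian")
        re = self.real * other.real + self.imag * other.imag
        im = self.imag * other.real - self.real * other.imag
        return Gaussian(round(re / denom), round(im / denom))
```

For example, Gaussian(1, 2) * Gaussian(3, 4) gives Gaussian(-5, 10), and dividing that product by Gaussian(3, 4) returns Gaussian(1, 2).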

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Python code [click here]
Python code complex [click here]

Module 15: Numpy 1

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Python code 1 [click here]
Python code 2 [click here]
Python code 3 [click here]

Module 16: Numpy 2

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Python code 1 [click here]
Python code 2 [click here]

Module 17: Minimization and Curve Fitting with SciPy

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Python code example [click here]
Python code ex_min [click here]
Python code curve.py [click here]

Homework Week 6

Homework [click here]
Data for Exercise 3 [click here]

Module 18: Pandas

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Python code 1 [click here]
Python code 2 [click here]
Data set SF Salaries [click here]
Apple data [click here]
Google data [click here]
Car theft data [click here]

Module 19: Web Scraping

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Alice in Wonderland [click here]
Milwaukee PD call log copy [click here]
Lawler [click here]
Python code [click here]

Module 20: Statistics

A nice TED talk on statistics [click here] (external link)
Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Code example [click here]
Code example [click here]

Module 21: Visualization with Pandas

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Code example [click here]

Homework

Homework (pdf) [click here]

Module 22: Visualization 2

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
Code example [click here]
Code example [click here]
Code example [click here]
Code example [click here]
Code example [click here]
california_cities.csv [click here]
births.csv [click here]
in.csv [click here]

Module 23: Forecasting and Time Series

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
airline-passengers.csv [click here]
airline.py [click here]
Chennai.csv [click here]
Milk.csv [click here]
pdexample.py [click here]
pdexample2.py [click here]

Module 24: Linear Regression

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
code.py [click here]
code2.py [click here]
code3.py [click here]
code4.py [click here]
code5.py [click here]
code6.py [click here]
code1a.py [click here]
brain-size.txt [click here]
cereals.txt [click here]
housing-prices.txt [click here]
kc_house-data.csv [click here]
studentt.py [click here]

Homework Week 8

hw.pdf [click here]

Module 25: Forecasting and Time Series 2

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
code.py [click here]
code2.py [click here]
code3.py [click here]
code4.py [click here]
code5.py [click here]
cairHW.py [click here]
cairL.py [click here]
codeSARIMA.py [click here]
cgoogle.py [click here]
AusBeer.csv [click here]
elecequip.csv [click here]
MAPCPI.csv [click here]
MOPCPI.csv [click here]

Module 26: Visualization again

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
code.py [click here]
code2.py [click here]
code3.py [click here]
code4.py [click here]
code5.py [click here]
code6.py [click here]
code7.py [click here]

Module 27: Naive Bayesian Inference

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
code.py [click here]
code2.py [click here]
irisbayes.py [click here]

Homework

Homework (pdf) [click here]
data1.csv [click here]
data2.csv [click here]

Module 28: Logistic Regression

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
code.py [click here]
code2.py [click here]
code3.py [click here]
code4.py [click here]
quality.csv [click here]
framingham.csv [click here]

Module 29: Support Vector Machines

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
code.py [click here]
code2.py [click here]
code3.py [click here]
code4.py [click here]

Module 30: Principal Component Analysis

Presentation Video (mp4) [click here]
Presentation (pdf) [click here]
Presentation (key) [click here]
Presentation (pptx) [click here]
code.py [click here]
code2.py [click here]
code3.py [click here]
code4.py [click here]
code5.py [click here]
code6.py [click here]
code7.py [click here]

Final Homework

Use principal component analysis on the Iris data set and display it in two dimensions.
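One possible way to approach the projection step is sketched below with NumPy alone. The function name `pca_project` is hypothetical, and the random array only stands in for the Iris measurements (150 rows, 4 columns), because how the course's Iris file is laid out is not stated here. The sketch centers the data, takes the singular value decomposition, and keeps the first two principal directions.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto the first n_components principal directions."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)  # PCA works on mean-centered data
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, S, Vt = np.linalg.svd(centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = S[:n_components] ** 2 / (X.shape[0] - 1)
    return centered @ components.T, components, explained_variance


# Placeholder data only: replace with the four Iris measurement columns.
rng = np.random.default_rng(0)
placeholder = rng.normal(size=(150, 4))
projected, components, variance = pca_project(placeholder)
print(projected.shape)  # (150, 2)
```

The two columns of `projected` can then be drawn as a scatter plot, with one color per species, to display the data set in two dimensions.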