Introduction
This course quickly introduces students with previous IT experience to the basics of Python and to the Python extensions useful for data science and machine learning. Because the class covers a lot of ground, I decided not to include neural networks, but in my experience, learning and using Keras should not be difficult. Because there is so much material, students should consult the instructor whenever they have doubts. I will provide small projects that are best done in groups, even though working conditions will of course not be simple in the middle of a pandemic.
Because the material already developed is in English, the course will be conducted in English, but interactions can be in Spanish.
Nota bene: The materials will be replaced gradually as the course contents are adjusted. In particular, the presentation videos will be replaced; the old ones are from a class given in the summer.
Times and Contact Information
This class will be given via Zoom on Tuesdays and Thursdays at 19:00 (7 pm) Chicago time, which is roughly the same as Mexico City time. Each class lasts between 1:30 and 1:50 hours. I'll try to use Zoom security features to deter Zoom pirates. You can reach me at tschwarz at calprov dot org.
https://us02web.zoom.us/j/88496384434?pwd=bzdnSS80bmRNWExnYWhWOHYxTDNuQT09
Meeting ID: 884 9638 4434
Passcode: 1946-
Erik Rene Bojorges Valdez is graciously opening up his Google Meet to all students, from Monday 18:00 to 20:00 Mexico time. I understand that he is willing to help all who need help, and I am very thankful for that.
https://meet.google.com/wxi-umsw-kcz
Module 1: Introduction to Python
August 18, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Example Data Set [click here]
- Example Code [click here]
- Exercises (pdf) [click here]
Module 2: Loops in Python
August 20, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Activities [click here]
- Homework [click here]
Module 3: Functions in Python
August 25, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Activities [click here]
- Activities Solutions [click here]
Module 4: Lists and Tuples in Python
August 27, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Activities [click here]
- Activities Solutions [click here]
- Python file used [click here]
Module 5: Strings and String Processing
September 1, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Homework [click here]
- Example Code [click here]
Module 6: Dictionaries and Memoization
September 3, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Advanced Activities [click here]
- Homework [click here]
- Example Code [click here]
- 'alice.txt' [click here]
Module 7: Comprehension
September 8, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Python Code [click here]
Module 8: File Processing and Data Cleaning
September 10, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Python Code [click here]
- Link to UCI Machine Learning Repository's data sets [click here]
Module 9: Practicum: Denver Car Accidents Statistics
September 15, 2020. Please download the Denver County crime data (112 MB). You will need the comma-separated file and might also find the explanations of the offense codes useful.
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Python code [click here]
- Result [click here]
- Denver Crime Data External Link
- Proposal for Homework [click here]
- Examples [click here]
Module 10: Exceptions
September 17, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Python code [click here]
- Exercises for Exceptions [click here]
Module 11: Object Oriented Programming 1
September 22, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Python code [click here]
Module 12: Simple Classification: Decision Trees
September 24, 2020
Homework: Use the decision tree technique to develop a decision tree for either the blood donation data set or the Pima Indian diabetes data set. You can use either Gini or entropy as the criterion. If you want to test the accuracy of your model, you can split the data set randomly into ~80% of the records for training and ~20% for testing.
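The two impurity measures and the random split mentioned in the homework can be sketched as follows. This is a minimal illustration using only the standard library, not the code from the presentation; the function names (`gini`, `entropy`, `weighted_impurity`, `train_test_split`) and the `seed` parameter are assumptions made for this example.

```python
import math
import random
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class labels, in bits."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def weighted_impurity(groups, measure=gini):
    """Impurity of a candidate split: size-weighted average over the groups."""
    total = sum(len(group) for group in groups)
    return sum(len(group) / total * measure(group) for group in groups)


def train_test_split(records, test_fraction=0.2, seed=None):
    """Shuffle a copy of the records and cut off roughly test_fraction for testing."""
    rng = random.Random(seed)  # hypothetical seed parameter, for repeatable splits
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

A decision tree builder would evaluate `weighted_impurity` for each candidate split of the training records and keep the split with the lowest value; the held-out test records are only used afterwards to estimate accuracy.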
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Python code [click here]
- Iris data set [click here]
- Blood donation data set [click here]
- Pima Indian diabetes data set [click here]
Module 13: Object Oriented Programming 2
September 29, 2020
doc-strings, Address example with internationalization, k-nearest-neighbor implementation
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Python code [click here]
- Python code address.py [click here]
- Iris data set [click here]
- Blood donation data set [click here]
Module 14: Object Oriented Programming 3
October 1, 2020
Homework: Write a full implementation of the class Gaussian. A Gaussian is a complex number whose real and imaginary parts are both integers (a Gaussian integer). You need to implement:
- Initializer, string, and repr
- hash function (needs to return an integer)
- abs (__abs__)
- equality
- addition, subtraction, multiplication, exponentiation (__pow__), and division
- multiplication with a scalar, which you do using __rmul__
Use rounding to ensure that the result of an operation is again a Gaussian and not a general complex number. Otherwise, the operations are defined just as for complex numbers.
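To illustrate two of the less obvious points above, rounding after division and scalar multiplication via `__rmul__`, here is a deliberately partial sketch. It is not the reference solution: the attribute names `real` and `imag` are assumptions, and addition, subtraction, `__abs__`, `__pow__`, and `__str__` are left out.

```python
class Gaussian:
    """A Gaussian integer a + bi with integer a and b (partial sketch)."""

    def __init__(self, real, imag=0):
        # Assumes the caller passes integers; no validation in this sketch.
        self.real = real
        self.imag = imag

    def __repr__(self):
        return f"Gaussian({self.real}, {self.imag})"

    def __eq__(self, other):
        if not isinstance(other, Gaussian):
            return NotImplemented
        return (self.real, self.imag) == (other.real, other.imag)

    def __hash__(self):
        # Hashing the tuple of parts returns an integer and agrees with __eq__.
        return hash((self.real, self.imag))

    def __mul__(self, other):
        if isinstance(other, int):
            return Gaussian(self.real * other, self.imag * other)
        if isinstance(other, Gaussian):
            return Gaussian(self.real * other.real - self.imag * other.imag,
                            self.real * other.imag + self.imag * other.real)
        return NotImplemented

    def __rmul__(self, other):
        # Python calls this for `3 * g` after int.__mul__ declines the operand.
        return self.__mul__(other)

    def __truediv__(self, other):
        if not isinstance(other, Gaussian):
            return NotImplemented
        norm = other.real ** 2 + other.imag ** 2
        if norm == 0:
            raise ZeroDivisionError("division by the zero Gaussian")
        # Multiply by the conjugate of the divisor, then round each part
        # so that the quotient is again a Gaussian.
        re = self.real * other.real + self.imag * other.imag
        im = self.imag * other.real - self.real * other.imag
        return Gaussian(round(re / norm), round(im / norm))
```

Note that Python's built-in `round` rounds exact halves to the nearest even integer, so `round(2.5)` is 2; whether that is the rounding rule wanted for the homework is a design decision.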
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Python code [click here]
- Python code complex [click here]
Module 15: Numpy 1
October 6, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Python code 1 [click here]
- Python code 2 [click here]
- Python code 3 [click here]
- earth.jpg [click here]
Module 16: Numpy 2
October 8, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Python code 1 [click here]
- Python code 2 [click here]
- Example: Keras [click here]
- Example: Keras (pdf) [click here]
- Example Code: Keras (py) [click here]
Module 17: Minimization and Curve Fitting with SciPy
October 13, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Python code example [click here]
- Python code ex_min [click here]
- Python code curve.py [click here]
Module 18: Pandas
October 15, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Python code 1 [click here]
- Python code 2 [click here]
- Data set SF Salaries [click here]
- Apple data [click here]
- Google data [click here]
- Car theft data [click here]
Module 19: Web Scraping
October 20, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Alice in Wonderland [click here]
- Milwaukee PD call log copy [click here]
- Lawler [click here]
- Python code [click here]
Module 20: Statistics
October 22, 2020
- A nice TED talk on statistics Click here: external link
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Code example [click here]
- Code example [click here]
Module 21: Visualization with Pandas
October 27, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Code example [click here]
Homework
- Homework (pdf) [click here]
Module 22: Visualization 2
October 29, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- Code example [click here]
- Code example [click here]
- Code example [click here]
- Code example [click here]
- Code example [click here]
- california_cities.csv [click here]
- births.csv [click here]
- in.csv [click here]
Module 23: Forecasting and Time Series
November 3, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- airline-passengers.csv [click here]
- airline.py [click here]
- Chennai.csv [click here]
- Milk.csv [click here]
- pdexample.py [click here]
- pdexample2.py [click here]
Module 24: Linear Regression
November 5, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- code.py [click here]
- code2.py [click here]
- code3.py [click here]
- code4.py [click here]
- code5.py [click here]
- code6.py [click here]
- code1a.py [click here]
- brain-size.txt [click here]
- cereals.txt [click here]
- housing-prices.txt [click here]
- kc_house-data.csv [click here]
- studentt.py [click here]
Homework Week 8
- hw.pdf [click here]
Module 25: Forecasting and Time Series 2
November 10, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- code.py [click here]
- code2.py [click here]
- code3.py [click here]
- code4.py [click here]
- code5.py [click here]
- cairHW.py [click here]
- cairL.py [click here]
- codeSARIMA.py [click here]
- cgoogle.py [click here]
- AusBeer.csv [click here]
- elecequip.csv [click here]
- MAPCPI.csv [click here]
- MOPCPI.csv [click here]
Module 26: Visualization again
November 12, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- code.py [click here]
- code2.py [click here]
- code3.py [click here]
- code4.py [click here]
- code5.py [click here]
- code6.py [click here]
- code7.py [click here]
Homework Week 9
- hw.pdf [click here]
Module 27: Naive Bayesian Inference
November 17, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- code.py [click here]
- code2.py [click here]
- irisbayes.py [click here]
Homework:
- Homework (pdf) [click here]
- data1.csv [click here]
- data2.csv [click here]
Module 28: Logistic Regression
November 19, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- code.py [click here]
- code2.py [click here]
- code3.py [click here]
- code4.py [click here]
- spine.py [click here]
- quality.csv [click here]
- framingham.csv [click here]
- spine.csv [click here]
Module 29: Support Vector Machines
November 24, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- code.py [click here]
- code2.py [click here]
- code3.py [click here]
- code4.py [click here]
- code5.py [click here]
- penguins.csv [click here]
Module 30: Principal Component Analysis
November 26, 2020
- Presentation Video (mp4) [click here]
- Presentation (pdf) [click here]
- Presentation (key) [click here]
- Presentation (pptx) [click here]
- code.py [click here]
- code2.py [click here]
- code3.py [click here]
- code4.py [click here]
- code5.py [click here]
- code6.py [click here]
- code7.py [click here]
- peng.py [click here]
Final Homework
Use principal component analysis on the Iris data set and display it in two dimensions.
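One possible way to compute the two-dimensional projection is sketched below with NumPy's singular value decomposition. This is an illustration under assumptions, not the method from the presentation: the function name `pca_project` is hypothetical, and random data stands in for the 150 x 4 Iris measurements, which would need to be loaded from the course's data file.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto their first n_components principal directions."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)  # PCA works on mean-centered data
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = S[:n_components] ** 2 / (len(X) - 1)
    return centered @ components.T, components, explained_variance


# Stand-in data (an assumption): replace with the four Iris measurement columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4)) * np.array([3.0, 1.0, 0.5, 0.1])

projected, components, variance = pca_project(X)
```

The two columns of `projected` can then be used as the x and y coordinates of a scatter plot, with the species as the color of each point.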