Machine learning and data science resources

This post lists some useful resources for learning data science and machine learning. The first two are a good way to start, since they quickly bring you to the point where you can play with real-world data.

Machine Learning by Andrew Ng (Stanford)

The best MOOC ever, and also the one that started Coursera.

Prerequisites: it helps if you are familiar with scalar products and matrix operations, and if you know how to program basic functions.

What’s inside: starts with basic linear regression and ramps up with gradient descent, logistic regression, regularisation, neural networks, clustering, SVMs, recommender systems, PCA …

Data Science Specialization by Roger Peng, Brian Caffo, and Jeff Leek (Johns Hopkins University)

This one is a long hike through many areas of data science (not just machine learning), and offers a very pragmatic introduction to the R programming language.

Prerequisites: familiarity with programming and undergraduate math. The R exercises are quite close to what you will be asked in a data scientist job interview.

What’s inside: how to handle data end to end in R (parse, explore, fit), how to connect to an API or parse data from a website, statistics, machine learning, reproducibility … It has some overlap with Andrew Ng’s course.

Probability and Statistics by Khan Academy

The basics of statistics and probability (and clearer, IMO, than the statistics section of the Johns Hopkins course).

Data Science at Scale Specialization by Bill Howe (University of Washington)

A course about the “Big” in “Big Data”, with data visualisation thrown in.

Prerequisites: Python programming, familiarity with APIs.

What’s inside: SQL, NoSQL, MapReduce, graphs, machine learning …

Prerequisites: these depend a lot on the individual courses, but a bit of background in computer science helps.

Books

I found An Introduction to Statistical Learning very helpful, both for its explanations of the algorithms and for its R exercises. The authors also made a series of videos following the contents of the book (index of the videos here). The advanced version of this book, with All The Math, is The Elements of Statistical Learning.

If you want to make beautiful graphs (and you should), books by Edward Tufte show how.

 

This list is subject to changes and additions as I continue watching courses and reading books!

 
