# Machine Learning Study Group
The `#ml-study-group` channel on the [SRCT Slack](https://srct.slack.com/) is a study group for anyone interested in self-studying machine learning and data science topics.
The curriculum is organized like the college classes we are familiar with, with weekly readings and video lessons. It focuses on practical considerations and on building a theoretical and philosophical underpinning for data science and machine learning.
There are a lot of papers to read, and for good reason. Machine learning is advancing at a rapid pace and a majority of the progress is communicated via papers on arXiv and conference papers. It's very important to get comfortable reading papers, and the study group means that you can ask questions at any time!
## Prerequisites
What do you need for this?
* A functioning computer that can run SSH and read PDFs
* A basic understanding of statistics
* A conceptual understanding of calculus (especially gradients)
I think it's better to learn as you go. If you spend too much time on preliminaries, you'll never get to what you're interested in and you may get disheartened before you get a chance to work on the interesting things. You're also more likely to remember things if you've looked them up as you needed them.
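
To give a sense of what a "conceptual understanding" of gradients looks like in practice, here is a minimal sketch that approximates a gradient numerically and compares it with the answer you would get by hand. The function `f` and the helper `numerical_gradient` are made up for illustration and are not taken from any of the course materials.

```python
def f(x, y):
    """Toy function for illustration: f(x, y) = x**2 + 3*y."""
    return x ** 2 + 3 * y


def numerical_gradient(func, point, h=1e-6):
    """Approximate the gradient of func at point using central differences."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h   # nudge coordinate i up
        backward[i] -= h  # nudge coordinate i down
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return grad


# By hand, the gradient of f is (2x, 3), so at the point (2, 1) it is (4, 3).
grad = numerical_gradient(f, [2.0, 1.0])
print(grad)  # approximately [4.0, 3.0]
```

If you can explain why the two numbers come out close to 4 and 3, you likely have enough calculus to get started.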
## Tips for Reading
* Skim to get an idea of the structure
* Read through the paper once without taking notes. Then write a 1-2 sentence summary. This read-through is just to get a high-level understanding of the paper.
* Read through the paper again and take notes. Write a more thorough 2-3 paragraph summary. Make sure to contextualize this summary. What makes this paper interesting?
* If you're stuck for more than 30-45 minutes, ask questions in `#ml-study-group`. Ask as many as you need.
## Credits
Almost all of the material has come from the following sources:
* Stanford's Spring 2018 [Stats 337: Readings in Applied Data Science](https://github.com/hadley/stats337) taught by Hadley Wickham (of `tidyverse` fame)
* [Deep Learning Papers Reading Roadmap](https://github.com/floodsung/Deep-Learning-Papers-Reading-Roadmap)
* [Practical Deep Learning For Coders, Part 1](https://course.fast.ai/) from [Fast.ai](https://fast.ai/)
* [Introduction to Statistical Learning](http://www-bcf.usc.edu/~gareth/ISL/) (ISL) [[pdf]](http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Seventh%20Printing.pdf) (The famous text, [Elements of Statistical Learning](https://web.stanford.edu/~hastie/ElemStatLearn/), by the same authors is also fantastic, but requires significantly more math/stat preliminaries.)
## Curriculum
### Week 1
Week 1 is an easy week. This should give you some background to get started.
* [Data scientists mostly just do arithmetic and that's a good thing](https://m.signalvnoise.com/data-scientists-mostly-just-do-arithmetic-and-that-s-a-good-thing-c6371885f7f6); Noah Lorang (2016).
* [Enterprise Data Analysis and Visualization: An Interview Study](https://idl.cs.washington.edu/files/2012-EnterpriseAnalysisInterviews-VAST.pdf); Sean Kandel, Andreas Paepcke, Joseph Hellerstein, Jeffrey Heer (2012).
* [50 years of data science](https://www-tandfonline-com.mutex.gmu.edu/doi/abs/10.1080/10618600.2017.1384734); David Donoho (2017).
* Three Giants' Survey: [Deep learning](http://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf); LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton (2015).
* Deep Learning 2018 [Lesson 1: Recognizing Cats and Dogs](https://course.fast.ai/lessons/lesson1.html)
* Introduction to Statistical Learning: Chapter 2 [[pdf]](http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Seventh%20Printing.pdf)
## License
[Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0)