First Steps for Learning Data Science
30 Aug 2020
This list of resources is designed for someone who has completed high school and would like a practical, hands-on introduction to data science/machine learning, then eventually find a job/internship in the field. There are tons of resources and lists of resources online: this one is short and curated (i.e. very biased), with the goal that a sufficiently motivated beginner can do everything on this list within months and be able to apply their skills to real problems.
TL;DR
- Get an overview of data science/machine learning careers from this video.
- Check out Kaggle for introductory micro-courses. I suggest the Python, Pandas, Data Visualization, and Intro to ML courses.
- Check out the fast.ai course for a broad, practical introduction to deep learning.
- Do some projects (the courses have great suggestions)!
- Apply to internships/jobs. Here’s a resume checklist, a more detailed resume guide, and an interview guide.
Overview of Data Science & Career Options
To get a sense of what data science is and what careers in data science look like, I presented a workshop to incoming UWaterloo freshmen on behalf of the Data Science Club, available here. The video briefly describes data science, machine learning, and artificial intelligence (starting at 4:39) as well as careers in the field (12:52). You can also check out the rest of our channel for awesome workshops run by the club.
Courses
Next, you should take a look at the micro-courses created by Kaggle. These provide a practical and succinct introduction to popular techniques in data science. I would recommend starting with the Python, Pandas, Data Visualization, and Intro to ML courses. After those, try a Kaggle competition! The Titanic and House Prices competitions are popular with beginners. Spend some time applying the techniques you’ve learned, and when you get stuck, take a look at the kernels (a.k.a. notebooks) that others have created for inspiration. The top-voted kernels for each competition have detailed explanations that will help you level up your skills.
Before spending too much time on Kaggle, take a look at fast.ai. The practical deep learning course is, in my opinion, the best way for beginners to understand deep learning at a high level and apply it to real problems (or Kaggle!). As Jeremy Howard emphasizes in the course, the best way to learn and retain the concepts taught is to take the code and apply it to problems you’re interested in. The only caveat is that the fast.ai library from the course is not (yet?) commonly used in industry; however, switching to another library will take no more than a couple of days once you understand the concepts taught in the course.
Internships & Jobs
After going through those courses, congrats! You have the skills you need to do solid data science work. The courses I’ve listed provide great suggestions for next steps and projects to pursue to demonstrate your skills. With these projects in your portfolio, you can apply to data science & machine learning jobs/internships. Your resume should tick off every box on this checklist, no questions. Here is a more detailed (and perhaps subjective) guide to creating a great resume and here are some tips for acing technical interviews (both made by experienced UWaterloo students!). The data science technical interview tips are most relevant for data science internships, but the rest is relevant to all types of tech internships and jobs.
Next Steps
Since this guide is designed for absolute beginners, I did not include any resources for rigorously learning about the math behind data science. While theory is crucial to becoming a great data scientist, I find theory often becomes more engaging and intuitive after being motivated by some hands-on experience. To stay true to the purpose of this post, I won’t provide any links to learn theory; there are tons of high-quality resources just a Google search away. Good luck!