Resources

Collection of useful/cool links/resources/stuff. If there’s anything you think I should check out, leave a comment below!

Misc.

LaTeX Design Patterns, made me much more productive in LaTeX.
TEXnique, typesetting game to test/practice LaTeX.
Vim Genius, learn and practice Vim.
Paper Digest, get one-sentence summaries for papers and digests of conferences.
Connected Papers, see citation graphs for papers.
Index of Python 3.9 Standard Library
Guide to Making Nice Tables
Frieda Rong’s list of useful resources
UC Irvine Machine Learning Repository, tons of datasets.
Awesome JAX, a curated list of videos, blog posts, papers, and other resources on JAX.
Measure Theory for Dummies by Maya R. Gupta, a gentle introduction to measure theory for those interested in reading probability papers.
Colab P100, this notebook always gets a 16 GB P100 on Colab. Make a copy of this! UPDATE: this doesn’t seem to work anymore…
Jason Rennie’s LaTeX tips and tricks, nice LaTeX tricks.
The Matrix Cookbook, a reference on matrix decompositions, derivatives, distributions, and more.

Tools

VSCode Live Share, a real game-changer for pair programming.
VSCode Jupyter, lets you interactively program in Python, even files outside of notebooks.
Fire, the most succinct and least intrusive Python CLI library I’ve used (I like argparse, Typer, and fastcore’s script module as well).
FastCore, useful Python utilities, like typed dispatch, an improved list interface, and more.
Tensor Sensor, visualize tensor operations for easy debugging.
Pylustrator, interactively create matplotlib figures.
Zotero, reference manager.
Vim Repl, a nice Vim plugin enabling cool things like this.
Tensors Fitting Perfectly, catch shape errors at compile time in Swift for TensorFlow.
FastDot, wrapper over pydot to make it more pythonic.
Datacrunch, v100s for $0.45/hr???
Chocolatey, Windows package manager.
Altair, declarative visualizations in Python. Makes interactive plots particularly easy.
Finding a Good Learning Rate, 1 Cycle Learning Rate Policy for training neural networks.
Scalene, the best Python CPU/memory profiler out there.
JupyterPlot, plots updated in real time in Jupyter.
Herbie, automatically find and fix floating point issues (stability).
Einops, write more expressive tensor manipulations.
LaTeX equations in PowerPoint, a nice add-on and guide by Jeremy Howard.
PlotNeuralNet, LaTeX generator for beautiful neural network visualizations.
NN-SVG, an SVG generator for neural network visualizations.
Markdown Slides, create beautiful Reveal.js slides in markdown. Supports LaTeX, code blocks, animations, and media!
NBViewer, view Jupyter notebooks on GitHub (GitHub often has trouble loading large notebooks).
Colab-ssh, SSH into Colab sessions.

Courses

Advanced ML/DL Courses (advanced meaning PhD level)
Missing Semester, a course from MIT that teaches a bunch of useful stuff you’ll wish you’d learned sooner, like the shell (+ scripting), editors, data processing, version control (git), debugging, and more.
Problem-Solving for CS Technical Interviews
Robust Statistics

Talks

How to read a paper, some nice advice from Andrew Ng on reading papers as well as finding a career in ML.
Advanced NumPy, really really really good workshop by Juan Nunez-Iglesias on advanced NumPy, particularly stride tricks.
Automatic Differentiation, nice talk by Matthew Johnson about how autodiff works, plus the basics of the autograd library.
Intro to Nsight Systems. Nsight Systems is NVIDIA’s profiler for CPU/GPU programs. Here’s a nice blogpost that complements the talk.
Is It Time to Rewrite the Operating System in Rust?, not only does it cover writing an OS in Rust, but also talks about the history of OSes (with details on why things are the way they are).
Awkward Array: Manipulating JSON-like Data with NumPy-like Idioms (a nice library)
Matplotlib 2.0, “new” features in matplotlib.
Using Types Effectively, make your compiler do your work for you with static types.
Python 3 Metaprogramming.
Non-conforming C++, cool obscure C++ features.
Lex Fridman’s Interview with Jeremy Howard.
Andrei Alexandrescu “Speed Is Found In The Minds of People”, optimizing the sorting of medium-sized arrays in C++ (really interesting and good takeaways).
Creating intuitive APIs, talk by Lars Knoll on designing good APIs based on his experience with Qt.

Blog/Other Posts

Advice for CS PhDs, lots of cool links for people who are aiming for or are currently in a CS PhD.
The Sorcerer’s Apprentice Guide to Training LSTMs
Animations in Matplotlib
Docker Tips
Deep Learning Papers Reading Roadmap, great guide for getting started with papers in many areas of deep learning.
Data Analyst Interview Guide
Performance Optimization in PyTorch.
Gumbel Softmax Trick, differentiably sample from a categorical distribution.
Which GPU for Deep Learning, a guide on choosing a GPU and some intuition about what specs to look for.
PyTorch Internals, post on the implementation of PyTorch (particularly the C++ layer and autograd).
Lavanya.ai’s ML Notes, covers linear algebra, stats, data wrangling and more. The basics section is golden.
Python is Cool, guide on some lesser known Python features.
Bash String Operations.
Google Colab Tips
How to Read a Paper
Eval and Query to Improve Pandas Performance
Viewing Matrices and Probability as Graphs
Object Detection from 9 FPS to 650 FPS, accelerating deep learning inference.
Every Complex DataFrame Manipulation, Explained & Visualized Intuitively
Auto formatters for Python, comparison of the major formatters (autopep8, yapf, black) for Python. I’m very particular about my code format, so without an auto-formatter, I waste a lot of time getting it right. My personal preference is yapf (Google), but black is considered the “best”.
Silvia Sellán posted/wrote some resources about choosing and applying to grad school.
John Schulman’s opinionated guide to ML research
Translating SQL Queries to Pandas Commands
Group-by from scratch.
Tech Interview Handbook, a great guide on getting a job in tech.
Kalpesh Krishna’s grad school resources.
GPT-2 Visualized.
Don’t Select Features, Engineer Them by Demetri Pananos, about how feature selection methods are often unhelpful and how generating more features instead (via splines, in this post) often leads to a better model.
GNU Terminal Docs, learn to be more productive using the GNU Terminal!
Useful Inequalities Cheatsheet by László Kozma.
Kaggle Hidden Gems, a collection of underrated notebooks to learn from.
A Recipe for Training Neural Networks by Andrej Karpathy.
Creating an R package that depends on RStan.
Engineering Trade-Offs in Automatic Differentiation, a great blog post by Christopher Rackauckas explaining different AD implementations (Python/Julia libraries).
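
For the Gumbel Softmax Trick entry above, here is a minimal sketch of the forward computation in plain Python. The function name, the default temperature, and the clamping constant are my own assumptions and not taken from the linked post. In practice you would write this in an autodiff framework so gradients flow through the softmax; this version only shows the sampling arithmetic.

```python
import math
import random


def gumbel_softmax_sample(probs, temperature=0.5, rng=random):
    """Return a relaxed one-hot sample from the categorical distribution `probs`.

    Hypothetical helper for illustration: adds Gumbel(0, 1) noise to the
    log-probabilities, then applies a temperature-scaled softmax.
    """
    eps = 1e-12  # assumed clamp to keep log() away from 0
    scores = []
    for p in probs:
        u = min(max(rng.random(), eps), 1.0 - eps)
        gumbel = -math.log(-math.log(u))
        scores.append((math.log(p + eps) + gumbel) / temperature)

    # Numerically stable softmax over the perturbed scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Lower temperatures push the output toward a one-hot vector, and taking the argmax of the output recovers an ordinary categorical sample (the Gumbel-max trick).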

Little Gems

Disposable Email Addresses
Libby, sign out ebooks and audiobooks with your library card.
Sum numbers in order of increasing magnitude for better accuracy
Git Autostash for pulling even with local changes.
Include YAMLs in other YAMLs (for Python).
Anatomy of Matplotlib
Copy-pasting in TMUX
Latency numbers every programmer should know
Get notified at the end of your Python script.
Sync Outlook and Google Calendar, annoyingly hard to do natively, super useful for seeing work and personal events in one place. Sometimes it’ll mess up the sync if you fiddle with events too much; I just disconnect and then reconnect my account.
Jump to oldest commit of a repo. You’ll learn some useful stuff by reading the earliest commits of well-established projects.
Git rebase from remote fork repo
Git and GitHub fast setup, includes setting name/email and SSH keys.
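
To make the "sum numbers in order of increasing magnitude" gem above concrete, here is a small sketch using only the standard library. The input values and the `naive_sum` helper are my own illustration. It uses an explicit loop because the built-in `sum` may apply compensated summation to floats on newer Python versions, which would hide the effect.

```python
import math


def naive_sum(values):
    """Plain left-to-right float accumulation with no compensation."""
    total = 0.0
    for v in values:
        total += v
    return total


# One huge value followed by many small ones (illustrative data).
values = [1e16] + [1.0] * 1000

largest_first = naive_sum(values)                     # each 1.0 is rounded away
smallest_first = naive_sum(sorted(values, key=abs))   # the small values add up first
exact = math.fsum(values)                             # correctly rounded reference
```

Here `largest_first` stays at `1e16`, while `smallest_first` matches `math.fsum` at `1e16 + 1000`. If you only need an accurate total, `math.fsum` is the simpler choice.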

Helpful Tweets

Useful Git Aliases
ML Testing Tips
difflib.get_close_matches, return “good enough” matches for strings.
jax.mask lets you use ragged arrays with vmap!
Stein’s Paradox, fun statistics paradox.
Visualize Neural Network Architectures.
Better LaTeX defaults.
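
As a quick illustration of `difflib.get_close_matches` from the list above, here is a sketch of a "did you mean?" helper. The command list and the `suggest` function are hypothetical; only `get_close_matches` and its `n` and `cutoff` parameters come from the standard library.

```python
from difflib import get_close_matches

# Hypothetical set of valid subcommands for some CLI.
KNOWN_COMMANDS = ["train", "evaluate", "export", "profile"]


def suggest(typo, candidates=KNOWN_COMMANDS):
    """Return the closest known command, or None if nothing is similar enough."""
    matches = get_close_matches(typo, candidates, n=1, cutoff=0.6)
    return matches[0] if matches else None


print(suggest("trian"))
```

Raising `cutoff` toward 1.0 makes the matching stricter, and `n` caps how many candidates come back, ordered from most to least similar.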

Nicholas Vadivelu

Resources I’ve helped put together: