Here is a collection of resources that I’ve found useful in my development as a data engineer. I’ll add to this page over time.
The best reference book I’ve come across for the full picture of the modern data engineering toolkit and workflow. It has lots of useful example code.
This book completely transformed my relationship to legacy code, and data engineers usually work with a lot of legacy code. If you can compose many trivial, behavior-preserving changes into coherently redesigned and simplified code, you will have no fear.
I especially appreciated its first chapter about IPython.
“learn from many, many mistake grug make over long program life”
Effective engineers are highly self-sufficient. You will answer 90% of your own questions before you have finished reformulating them as smart questions.
This is where I started. I am a big fan of the “hard way” learning philosophy. The main idea is to never read about how to do something without actually doing it yourself as you read.
This tutorial carried me through my first year of work. It is not necessarily relevant to data engineering, but if you want to know how to throw together a web application in Python, it is excellent.