Data Engineering the Left

(Data) Engineering Resources

Here is a collection of resources that I’ve found useful in my development as a data engineer. I’ll add to this page over time.

Books

Data Pipelines Pocket Reference, Densmore

The best reference book I’ve come across for the full picture of the modern data engineering toolkit and workflow. Has lots of useful example code.

Refactoring, Martin Fowler

This book completely transformed my relationship to legacy code, and data engineers usually work with a lot of legacy code. If you can compose many trivial, behavior-preserving changes into coherently redesigned and simplified code, you will have no fear.
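A minimal sketch of what composing small, behavior-preserving changes can look like in Python. The function names and sample rows are made up for illustration and are not taken from the book. Each step is trivial on its own, and the assertion at the end checks that behavior did not change.

```python
def summarize_before(rows):
    # Legacy style: parsing, validation, and aggregation tangled in one loop.
    total = 0.0
    count = 0
    for row in rows:
        parts = row.split(",")
        if len(parts) == 2 and parts[1].strip() != "":
            total += float(parts[1])
            count += 1
    return total / count if count else 0.0


def parse_amount(row):
    # Step 1 (extract function): return the amount, or None for a malformed row.
    parts = row.split(",")
    if len(parts) == 2 and parts[1].strip() != "":
        return float(parts[1])
    return None


def summarize_after(rows):
    # Step 2 (replace loop with pipeline): same result, clearer intent.
    amounts = [a for a in map(parse_amount, rows) if a is not None]
    return sum(amounts) / len(amounts) if amounts else 0.0


# Hypothetical sample data: two valid rows and two malformed ones.
rows = ["a,1.5", "b,2.5", "malformed", "c,"]
assert summarize_before(rows) == summarize_after(rows) == 2.0
```

Keeping the old function around until the new one passes the same checks is one way to make each step safe to take.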

Python Data Science Handbook, VanderPlas

I especially appreciated its first chapter, about IPython.

Articles

“learn from many, many mistake grug make over long program life”

Effective engineers are highly self-sufficient. You will answer 90% of your own questions before you have finished reformulating them as smart questions.

Tutorials

This is where I started. I am a big fan of the “hard way” learning philosophy. The main idea is to never read about how to do something without actually doing it yourself as you read.

This tutorial carried me through my first year of work. It is not necessarily relevant for data engineering, but if you want to know how to throw together a web application in Python, this tutorial is excellent.
