Dear Analyst #64: Architecting revenue data pipelines and switching to a career in analytics with Lauren Adabie of Netlify
Transforming Netlify's data pipeline one SQL statement at a time. Lauren Adabie started her career at Zynga, analyzing data and answering stakeholders' questions about it. As a data analyst at Netlify, she's doing more than exploratory analysis: she's also helping build out Netlify's revenue data pipeline, something she's never done before. We discuss how her team transforms data with SQL, how she builds stakeholder confidence in the data, and the path that led her to a career in data analytics.
Re-architecting a Revenue Pipeline
Lauren joined the Netlify team near the beginning of this revenue pipeline project. The current pipeline is a combination of a few workflows: hourly processes export the data to CSVs, and Databricks jobs load and aggregate the data to produce topic-specific tables. Lauren is currently helping migrate this workflow to dbt. With the current pipeline, if there's a failure downstream, it's hard to find when and where the failure is happening.
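Part of what dbt brings here is explicit lineage: each table is built by a SQL model that declares its upstream sources, so a downstream failure points directly at the model (and raw table) that caused it. Here's a minimal sketch of what a staging model might look like; the source and column names are hypothetical, not Netlify's actual schema:

```sql
-- models/staging/stg_payments.sql (hypothetical model and source names)
-- Pulls raw billing data into the staging layer with light renaming and casting.
with source as (

    -- {{ source(...) }} records the dependency, so dbt can trace
    -- downstream failures back to this raw table
    select * from {{ source('billing', 'raw_payments') }}

),

renamed as (

    select
        id as payment_id,
        customer_id,
        amount_cents / 100.0 as amount_usd,
        created_at
    from source

)

select * from renamed
```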
Lauren's first task was bringing raw data into the "staging" layer (data lake). She initially tackled it by pulling all the data into the staging layer right away; looking back, now that she knows more about the tools and processes, she would have done it differently. The goal is to help her team monitor and catch issues before they reach business stakeholders. As we saw with Canva's data pipeline, this saves both the data team and the people who rely on the data time and frustration.
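In a dbt project, one way to catch those issues automatically is a data test: a SQL query that returns any rows violating an assumption, which fails the run before bad numbers reach a dashboard. A minimal sketch, again with hypothetical names:

```sql
-- tests/assert_no_negative_revenue.sql (hypothetical test)
-- dbt treats any rows returned by this query as failures,
-- so the pipeline flags bad data before stakeholders see it.
select
    payment_id,
    amount_usd
from {{ ref('stg_payments') }}
where amount_usd < 0
```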
A good data pipeline is one that doesn't have many issues. More importantly, when issues do come up, it should be easy for the data team to diagnose them. The impact of this revenue pipeline project is reduced time spent triaging issues, faster and easier access to data, and the ability to analyze data at various levels. Additionally, the team can cut down on communication difficulties with a version-controlled dictionary of their metrics (similar to the data dictionary Education Perfect is creating).
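In dbt, that version-controlled dictionary can live right alongside the models: each model and column gets a description (and tests) in a YAML file that's reviewed like any other code change. A hypothetical snippet, continuing the example above:

```yaml
# models/staging/schema.yml (hypothetical definitions)
version: 2

models:
  - name: stg_payments
    description: "One row per payment, standardized from raw billing data."
    columns:
      - name: payment_id
        description: "Primary key for a payment."
        tests:
          - unique
          - not_null
      - name: amount_usd
        description: "Payment amount converted from cents to US dollars."
```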
Learning the tools of the trade
As a data analyst, you may not be diving into GitHub and the workflows engineers typically use for reviewing and pushing code. Lauren's team, however, is a huge proponent of GitHub Issues for managing internal processes (she had an outstanding GitHub issue to work on as we were speaking). When engineers add new products to Netlify's product line, they open a GitHub issue for Lauren's team to address.
I was curious how Lauren gained the skills for some of the tools she uses every day. When you think of the tools a data analyst uses, you might think of Excel, SQL, R, etc. These are not necessarily tools or platforms you take classes for in college, so what was Lauren's learning path?
Lauren has learned most tools on the job. She learned Python after graduating college.
I learned [Python] partially because I was trying to do things in Excel that were frustrating. I was pushing Excel to do too much with VLOOKUPs, references, etc.
Here's a language you don't hear about every day: in college, Lauren learned Fortran 90 because people in her environmental engineering department were still using it. She ended up learning SQL from a book, solely because she wanted to go into analytics.