Dear Analyst

Dear Analyst


Dear Analyst Episode 25: Structuring data challenge (denormalize data) with Get and Transform

February 18, 2020

This episode is based on a video and Tweet posted by Mr. Excel (Bill Jelen). Bill discusses an Excel challenge someone emailed him about regarding how to "transform" a badly structured table of data into a structure that makes it easy to do PivotTables and other downstream analysis. Interestingly, I received a ticket from a user at work today about a very similar issue on denormalizing a table of data. Here is the Google Sheets file with the problem for you to follow along with.

The person is trying to convert this table of data (notice the quarters and employee sums across the columns):

The actual data set has many more quarters and employee columns along the top

Into a table like this where the employees are listed in a column with corresponding quarters of data:

Original Tweet and video below:

https://twitter.com/MrExcel/status/1226478550552924160

https://www.youtube.com/watch?v=S9ow-1Jq_rE&feature=youtu.be

Get and Transform (Power Query) in Excel

The solution Bill walks through involves the Get and Transform feature in Excel. I'm on a Mac and don't have Get and Transform since you need to be an Office 365 subscriber and my version of Excel is too old, but the method of transforming the data reminds me a lot of the transformations OpenRefine provides you. Major benefit of OpenRefine is that it's free. After you unpivot the columns, Bill's solution involves a bit of text manipulation to get your categories/dimensions to look the way you want.

Personally, I would've approached the problem like this user did with using the INDEX and MATCH formulas to pinpoint the row/column of data you need from the original "bad" dataset and getting the data to show up in your clean table:

INDEX(UGLY_RNG,MATCH(CAT),MATCH(EMPLOYEE#)+QTR#)

I didn't explore this solution entirely, but I prefer solving these data structuring problems with formulas vs. macros and "applied steps" in Get and Transform.

No-Code and keeping your identity small

In the second half of this episode, I mention an episode with Shawn Wang from Netlify on the Software Engineering Daily podcast, and highly recommend checking out a few blog posts related to the talkL:

* Keep Your Identity Small by Paul Graham's * No code is a lie by Shawn Wang

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

* Software Engineering Daily: No Code with Shawn Wang