Dear Analyst
Dear Analyst #48: Working with numbers formatted as text with avocado sales
For spreadsheet newbies, number formatting may seem like a pretty innocuous matter. As you become more familiar with Excel or Google Sheets, you'll find that improper number formats will lead to formulas that don't output what you expect or formulas that straight up don't work. This episode explores what happens when you unknowingly have numbers formatted as text. I also explore the various ways you can try to debug the errors that come from numbers formatted as text. You can copy the Google Sheet here for this episode.
AutoSums and avocados
The topic for this episode came from a question about AutoSum in the Microsoft Excel community forum which has close to 40,000 views and 25 replies (some of them are quite spicy, I might add). I'll be referencing this thread quite a bit in this episode since some of the best ideas come from--you guessed it--the comments. Props to Excel MVP Sergei Baklan for jumping into this thread and trying to help answer a somewhat ambiguous question.
No one: Spending Friday nights perusing the Microsoft Excel forums for interesting questions :)
The data set for this episode is a fun one: avocado sales across different cities (learn more about the dataset on Kaggle here). The prologue for this data set is amazing and will ring true for all millennials out there:
It is a well known fact that Millenials LOVE Avocado Toast. It's also a well known fact that all Millenials live in their parents basements. Clearly, they aren't buying homes because they are buying too much Avocado Toast! But maybe there's hope… if a Millenial could find a city with cheap avocados, they could live out the Millenial American Dream.-Justin Kiggins, Product Manager, Chan Zuckerberg Initiative
One can only make an episode about formatting numbers in Excel/Google Sheets so interesting, so this was my best attempt. And we go on!
Numbers formatted as text mess up formulas
The first thing you'll notice with numbers formatted as text is that they will mess up formulas by giving you an output you would not expect. For instance, in our avocado dataset, cell C53 is simply a sum of all the "numbers" in column C, but the result of the SUM formula is 0:
How is this possible? If you click on the column C header and go to Format->Number, you'll notice that all the cells in this column are formatted as "Plain text."
This means any numbers you type into the cells in this column will be treated as text. Since these cells aren't formatted as numbers, Google Sheets doesn't know how to treat these values which means the SUM formula is trying to sum up a bunch of text values. In Excel, this is similar to the cells being formatted as "Text":
Detecting cells that are formatted as text
If you are really diligent, you can use the ISTEXT formula to test whether a cell is indeed formatted as text. However, when you inherit an Excel model, you are just hoping that the previous modeler did things correctly and you can just plug in numbers and move on with your life. How can you figure out if some cells are improperly formatted before pulling your hair out?
Unfortunately, no answer exists. If you know of one, please comment below. For new spreadsheet users,