Dear Analyst
Dear Analyst #58: Canva’s data warehouse initiative to increase reliability and tooling for analysts with Krishna Naidu
Data warehouses have come a long way since the days of Oracle and MicroStrategy. A data warehouse should be able to grow and expand with the business it supports. At Canva, the amount of data coming into the data warehouse has exploded in the last few years given the platform's surging usage. In this episode, Krishna Naidu, a data engineer at Canva, talks about how his team is making it easier for analysts to get the data they need and the tooling to analyze it. The goal of the data warehouse team at Canva is to maintain reliability, improve access and tooling, and oversee compliance with regulations. At the end of the episode, we also discuss our mutual love for keyboard shortcuts.
A design tool for the rest of us
I never considered myself a "good" designer or artist. I still feel lost sometimes in Photoshop and Figma, but Canva makes the design process super seamless for a newbie like myself. I use the tool at least once a week for a variety of use cases. All the thumbnails on my YouTube channel were created on Canva because I can create a decent design in five minutes or less.
Krishna didn't use Canva before he joined the company. His first foray into Canva was creating a birthday invitation for his daughter. But he quickly saw the power and potential of Canva after seeing family members use the tool and a family friend use it to create marketing brochures. Once Krishna joined Canva, the scope of the mission became clear to him. It's not just about making design easy, but also giving people the ability to get their designs seen by the right people. Like many other SaaS tools, Canva has also added more collaboration features as more teams become distributed.
Canva invitations
Structure of Canva's data team
Given Canva's size (1,500+ employees according to LinkedIn), the data team is quite mature relative to other SaaS companies. They have data analysts, scientists, and engineers.
The data engineering team (where Krishna works) is broken out into three sub-teams:
* Streaming - Internally this team is known as Canvalytics, and they focus on capturing all the clickstream data from the product. This team helps Krishna's team with getting data into the data warehouse.
* Platforms - They manage the data lake and tooling for data scientists.
* Data Warehouse - This is Krishna's team, and they provide tooling for the users of the data warehouse. They also enforce controls and governance of the data warehouse, and their primary business stakeholders are Canva's data analysts.
The data coming into the data warehouse is constantly growing, which is a good sign because it means the number of Canva users is growing. On top of that, new product features being added to the platform mean more clickstream data needs to be captured and transformed in the data warehouse. To handle the expanding data footprint, Krishna's team has architected some interesting solutions.
A sandboxed build environment for analysts
When the data team was smaller, it was easier for all analysts to work in the same data warehouse environment. If an analyst made a change to a dataset, they might work with the data engineering team to roll the change out, and that change would then be communicated to the rest of the analysts.
With more analysts, it becomes easier to step on each other's toes since one analyst might make a change on one dataset (where they are building and testing their models),