Deep dive on Snowflake, the snowballing cloud data platform
Projected IRR 18%
Snowflake has become a full-fledged data platform in the cloud with data warehousing, databasing and machine learning capabilities. Pricing is based on usage so as customers grow, their data grows, and their machine learning workloads grow, Snowflake’s revenues will continue to grow. The company has a strong market position with more than 25% of companies out of the Forbes Global 2000 list already on their platform. As the capability for customers to share access to data with partnering companies is natively built-in, there is an incentive for others in their supply chains to move onto the platform as well. This creates a highly attractive network effect. The shares are now starting to be attractively valued, after a rather exuberant IPO and subsequent covid tech bubble. Taking reasonable assumptions, I can now model out an 18% IRR over the coming five years resulting in a target price of above $310 by the start of ‘28.
Do me a favor and hit the subscribe button. Subscriptions let me know you are interested in research like this, which is a good motivation to publish more of the analysis I’m carrying out. Special thanks to the 400 subscribers so far!
Snowflake is a data platform built for public cloud environments. The founders’ original vision was for the platform to be a data warehouse in the cloud, possessing unlimited scale and instant elasticity. Elasticity refers to scaling resources up and down as needed; think of virtual machines running on servers. This is easily achieved in public cloud environments due to their multi-tenant structure, where millions of customers can rent resources in the hyperscalers’ datacenters as needed.
A data warehouse provides business insights from a company’s data, also called analytics. Historically, data had to be transported from traditional databases such as Oracle to data warehouse environments while undergoing transformations. This required a substantial amount of middleware that needs to be maintained by technically gifted employees. Snowflake simplifies this process, as data can easily be drawn in from a wide variety of sources without the need for reprocessing steps in the middle. So you can just store the raw data in Snowflake’s cloud. This allows for continuous changes in how the data should be processed for analytics later on.
With the advent of the world wide web, there has been an explosion of data in semi-structured formats such as JSON. Traditional databases have two main problems in dealing with this. Firstly, they can only scale vertically: to process and store larger amounts of data, you basically need to buy a larger server with more CPUs, more RAM and bigger hard drives. A traditional SQL database can’t spread its data over multiple machines. Secondly, these databases store data in predefined columns. This makes them ill-suited to handle JSON-like data, which has a variable number of attributes depending on the message being sent.
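To make the schema problem concrete, here is a toy Python sketch (the events and field names are invented for illustration) of what happens when variable-attribute JSON is forced into fixed columns:

```python
import json

# Two events from the same stream with different attribute sets.
events = [
    '{"user": "a", "action": "click", "device": "mobile"}',
    '{"user": "b", "action": "purchase", "amount": 19.99, "currency": "EUR"}',
]

parsed = [json.loads(e) for e in events]

# A fixed relational schema needs a column for every key ever seen:
all_keys = sorted({k for row in parsed for k in row})

# Flattening each event into that schema pads missing attributes with NULLs,
# and every new attribute appearing in the stream forces a schema change.
rows = [{k: row.get(k) for k in all_keys} for row in parsed]
```

The click event ends up with NULL `amount` and `currency` columns it never had, which is exactly the awkwardness a semi-structured data type avoids.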
Traditional data warehouses face similar problems. Due to their limited ability to scale out, customers responded by architecting data silos. This means that each business unit in a company can be running its own data system, making it difficult to draw correlations and insights from across a company’s activities.
With the growth of the internet-of-things, where semi-structured data is continuously being transmitted from a plethora of sensors around the world, data volumes will only continue to increase. Traditional data environments aren’t capable of handling these workloads.
Clearly there is a strong need to move data onto platforms capable of scaling in an unlimited fashion. Snowflake has recently also started adding database capabilities to the platform, allowing it to function as a relational database with ACID guarantees. This basically means that the end-user gets consistency guarantees and that conflicting transactions won’t both be applied. As such, Snowflake is becoming a full-fledged data platform in the cloud.
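As an illustration of what ACID guarantees buy the end-user, here is a small Python sketch using SQLite as a stand-in (SQLite is my choice for a runnable example, not Snowflake’s engine): a transfer that fails halfway is rolled back entirely, so the data never ends up in an inconsistent state.

```python
import sqlite3

# In-memory database with a toy accounts table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

try:
    with conn:  # atomic transaction: all statements commit, or none do
        conn.execute("UPDATE accounts SET balance = balance - 70 WHERE name = 'alice'")
        conn.execute("UPDATE accounts SET balance = balance + 70 WHERE name = 'bob'")
        raise RuntimeError("simulated failure mid-transfer")
except RuntimeError:
    pass  # the whole transfer is rolled back automatically

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
# alice keeps 100 and bob stays at 0: no half-applied transfer
```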
This gives new possibilities to businesses. For example, you can share a selected number of data tables with partners in your supply chain, which enables a real-time sharing of business insights allowing for a tighter collaboration. All while operating in a secure environment as the data doesn’t have to be transferred. Alternatively, you might sell parts of your data via the built-in marketplace. The sharing of covid datasets has been hugely popular on Snowflake’s marketplace.
The other big advantage of running data analytics in the cloud is instant elasticity. Traditional data centers can be hugely wasteful. For example, if you’re only running workloads 6 hours per day, this means that 18 hours per day your installed computing capacity is standing idle. In the cloud you can spin up these resources only when you need them, allowing for huge reductions in cost. Additionally, you can scale up even more resources than you initially had available in your own data center, allowing a six hour workload to be carried out in one hour.
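The cost arithmetic above can be sketched in a few lines (the hourly rate is a made-up figure purely for illustration):

```python
HOURS_PER_DAY = 24
workload_hours = 6

# On-premise: you pay for installed capacity around the clock.
on_prem_utilization = workload_hours / HOURS_PER_DAY  # 25% utilized

rate_per_machine_hour = 2.0  # hypothetical $/machine-hour
on_prem_cost = 1 * HOURS_PER_DAY * rate_per_machine_hour   # 1 machine, 24h
cloud_cost = 1 * workload_hours * rate_per_machine_hour    # 1 machine, 6h

# Elastic scale-out: 6 machines for 1 hour costs the same as 1 machine
# for 6 hours, but the job finishes six times sooner.
scaled_cost = 6 * 1 * rate_per_machine_hour
```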
Games developer King, the company behind Candy Crush, ran one of the biggest Hadoop clusters in Europe in a traditional datacenter environment. However, they were experiencing major challenges in scaling this cluster of resources. Especially when demand was exceptionally high with the launch of a new game. This led the company to move workloads onto BigQuery, Google’s data warehouse running on their GCP platform. Within this new environment, the games developer can now comfortably ingest 50 billion events per day.
I’ve discussed in my previous post on Elastic how companies prefer not to be locked in to one hyperscaler, the main reason being that they can then shift resources to a cheaper provider. With Snowflake, you can host your data in a multi-cloud environment, e.g. part of your data can be at Amazon AWS and the other part at Google GCP. You can also store your data multi-regionally, allowing for fast access in every region of the globe. For example, if your data is continuously being queried from both the US and Asia, you can store it in both geographies. This gives all your customers low latencies when using your app.
The neat trick that Snowflake uses to enable all these capabilities is storing the data in plain vanilla object stores at the hyperscalers, such as S3 at AWS. Then there is a separate computing layer on top of that which can be initialized as needed. Object stores are not only a very cheap form of storage, but also allow for unlimited scaling. This basically means that you can add more and more hard drives horizontally to save additional data as opposed to having to store everything on one bigger machine. Similarly, when you need to process the data for analytics or machine learning, a near unlimited amount of virtual machines can be spun up on the cloud within seconds. This clever architecture allows for tremendous flexibility as well as cost savings.
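Here is a deliberately simplified Python sketch of the storage/compute separation idea. The dictionary stands in for an object store like S3, and the functions for ephemeral virtual warehouses; this is my own toy model, not Snowflake’s actual implementation.

```python
# Shared, cheap storage layer: blobs keyed by path, like S3 objects.
object_store = {
    "sales/2023-01.csv": "region,amount\nus,100\neu,80",
    "sales/2023-02.csv": "region,amount\nus,120\neu,90",
}

def spin_up_warehouse():
    """An ephemeral compute unit: reads shared storage, holds no data itself."""
    def run_query(prefix):
        total = 0
        for key, blob in object_store.items():
            if key.startswith(prefix):
                for line in blob.splitlines()[1:]:  # skip the header row
                    total += int(line.split(",")[1])
        return total
    return run_query

# Two teams scale their compute independently against the same stored data:
finance_wh = spin_up_warehouse()
marketing_wh = spin_up_warehouse()
```

The point of the pattern is that storage grows by adding blobs and compute grows by spinning up more warehouses, with neither constraining the other.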
Amazon AWS was the pioneer who kickstarted the concept of cloud data warehousing with the launch of Redshift in 2012. Snowflake was founded in that same year, and it took around 2 to 3 years to get the initial product up and running. The subsequent innovation that Snowflake pioneered was the separation of compute and storage. This meant that storage and compute resources could be scaled independently.
The platform also allows developers to build apps on top of Snowflake, which can then be sold over the integrated marketplace. This feature was launched last year and will give the company further opportunities to monetize the platform.
Western Union presented at Snowflake’s recent capital markets day, these are some of the highlights: “We looked at BigQuery, Redshift, Cloudera as well as Snowflake. And from an analysis standpoint, Snowflake really was a no-brainer. It took about 3 months to get the first workloads up and running, subsequently we migrated another 34 data warehouses in 18 months. We were able to take out costs and optimize our data footprint. Over 90% of our workloads perform now 2 to 4x faster. Our fraud rate is the lowest in the company’s history. We have over 150 applications globally running on Snowflake. We need to do pricing in 20,000 corridors across 200 countries, Snowflake enables the right pricing at the right time. We're now able to share real-time data with our agents. In the past, we had to actually lift the data out of the warehouses, bundle it and send it as an SFTP transfer to our agents. Obviously, that has concerns around data privacy and security.”
The growth in data is an attractive area to be exposed to. Gartner forecasts the data management market to grow at a double-digit annual rate of 12 to 15% in the near term. However, within this market, data in the cloud is exposed to high growth rates, whereas in on-premise environments, growth is modest. Below you can see the growth rates for these two segments in the global enterprise IT market i.e. software and services.
In the enterprise market, which comprises the biggest spenders in the IT market (think multinational corporations and large governments), Gartner estimates cloud penetration to reach 45% in 2023. But the denominator here is all IT spending which they consider able to transition to the cloud. Other estimates I’ve seen put penetration at only around 10%; I guess these include the small, mid and large corporate markets as well. The exact penetration levels will be guesswork and up for debate. As an investor, I expect growth rates to moderate over time, although given the attractiveness of the data market, I still expect a long runway for growth. Even today, the overall enterprise software market is still growing at an annual rate of 10% based on Gartner data. On top of that, Snowflake is in a good position to take market share, which I’ll illustrate below.
Snowflake disclosed that out of the Forbes Global 2000 list, which ranks the largest publicly traded companies worldwide, more than 500 of them are already using their platform. Obviously, the company is starting to have a strong position in the cloud data market, especially at the high-end with large corporations and enterprises.
Out of all customers who moved to Snowflake, 45% came from other cloud platforms, with the remaining 55% coming from traditional on-premise environments. This means that the company has been taking market share from other cloud platforms, explaining the high growth rate they’ve been generating. At the last earnings call, the CFO commented that “We're still replacing some of those first-generation cloud data warehouses. Think Redshift and things like that.”
Additionally, over 60% of their customers are already leveraging the platform for data sharing with other corporations. This creates a highly attractive network effect. Typically companies are highly connected in their supply chains, especially in the manufacturing of physical products or equipment, such as cars or semiconductors. So if one large company moves onto Snowflake, there’s an incentive for the partnering companies around them to follow.
Below are some highlights from an interview with a manager at Cognizant, the large IT consultancy: “Time travel is one of the most valuable features in Snowflake that really helps us out. If there is some data discrepancy at any point, we can get that data back and time travel via fail-safe features. The feasibility to increase horizontal or vertical scaling is great. You can process a huge amount of data in Snowflake. This becomes a challenge in Teradata. When we were trying to read that data, we had to divide it in chunks to load it into Snowflake. Loading into Snowflake was like a cakewalk. Everything is moving into the cloud now. The pricing is fair for all the features that Snowflake is providing. The way Snowflake has emerged in the past few years is impressive. It was just a beginner in the space, and it has become a leader in the market. For all the features it provides at a basic cost, it is impressive. The technical support is really good. We switched clouds at some point. I was using it on AWS, and now it is on Azure. I would rate it nine out of ten.”
Another director of operations at a smaller firm concludes: “Snowflake requires no maintenance on our part. They handle all that. The speed is phenomenal. The pricing isn't really anything more than what you would be paying for a SQL server license or another tool to execute the same thing. It handles unstructured data extremely well too.”
Clearly Snowflake is not only a great platform but it is also cheaply priced. Once growth matures, I expect there will be scope for price increases in the future.
This is the review Gartner highlighted as a critical review of the platform: “The product is a powerful front-end for database/ data-lake management. Documentation of their flavor of SQL could use some work. Snowflake lacks out of the box data analytics capabilities, but is highly customizable if the developer has the right tech-stack knowledge. It's cheap and integrates with a lot of third-party tools. If it had a stronger open-source community, e.g. people who have developed and made code public on GitHub, it would be a real winner.”
Snowflake is currently adding out-of-the-box analytics tailored to specific verticals, such as health care and supply chain management. Also, with the addition of the marketplace, I suspect that a thriving community of developers will spring up around the platform, offering further functionalities and services. So the issues flagged by this reviewer should get addressed in the near future.
The player that Snowflake is most frequently compared with in the data science world is Databricks. However, Databricks is focused more on advanced analytics and machine learning. The platform has a notebook-based interface for example, which makes it easy to write and share code with other developers. As both Snowflake and Databricks add capabilities, each will move further into the other’s field over time. For example, a recent addition to Snowflake is Snowpark, which lets you write Python code within the platform itself. Python is the prime language for coding machine learning algorithms; Snowpark also supports other languages. Similarly, Databricks has been adding data warehousing and SQL capabilities.
This addition of machine learning capabilities within Snowflake itself can result in large cost savings for clients. Transferring data in the cloud can result in substantial egress and transfer fees, so if you can process your data locally where it’s stored, these costs are avoided. As Snowflake is becoming a huge platform for data storage, I expect machine learning on the platform to become a significant growth driver going forward. To get a feel for the growth we should see in this business, the below chart highlights the number of streaming jobs running in Databricks over time:
Looking at the ’21 Forrester analysis of the cloud data warehousing market, they still see Google, Amazon AWS, and Microsoft as leaders. The capabilities of each player’s current offering are measured on the vertical axis. Snowflake is ranked in the top three on this metric, within very close distance of leaders AWS and Google. I expect the company will continue to move towards the upper right corner over time as they win deals and add capabilities. Teradata and Oracle are legacy players in this market, and their solutions were designed to operate in on-premise environments.
In the cloud database market, Gartner sees as leaders Amazon AWS, Microsoft, Oracle and Google. Snowflake has only recently started integrating database capabilities within their platform such as transactions with ACID guarantees, so it makes sense they are still behind in this area.
One noticeable risk going through their annual report was that of cybersecurity. If a customer has material damage due to data theft, Snowflake can be held liable for this: “We assume liability for data breaches, intellectual property infringement, and other claims, which exposes us to substantial potential liability. There is no assurance that our applicable insurance coverage, if any, would cover, in whole or in part, any such liability or indemnity obligations. We may be liable for up to the full amount of the contractual claims, which could result in substantial liability or material disruption to our business.”
Obviously this is a large tail risk. It’s probably unlikely that Snowflake would have to pay out a seismic amount, but if there were a substantial breach, it could weigh on the company’s results for some years.
Financials - share price at time of analysis is $139 on the New York Stock Exchange under ticker ‘SNOW’
Based on Gartner data, Snowflake estimates their total addressable market to be around $248 billion by 2026. The company’s goal is to generate around $10 billion of product revenues in 2029, so this would give them an overall market share of around 3% in this growing market. As 80% of the company’s revenues are currently coming from North America, there’s plenty of scope to grow in Europe and Asia over the coming decade.
Below are consensus expectations for the coming three years. The sell side expects the company to generate a nearly 4% free cash flow yield in three years’ time, for a company still growing then at a rate of 40%. You can see that on a non-GAAP basis, the company is already becoming a free cash flow machine with 25% FCF margins. However, share based compensation is still running high currently at $860 million. This dilution will be offset in the coming two years by a share buyback program of $2 billion. Looking at the gross margins, the sell side’s projections look very conservative. Snowflake has mentioned several times that due to their increased scale, they’re able to obtain better pricing from the hyperscalers, which they expect to continue. The company’s current guidance is for non-GAAP EBIT margins to go to 20% by 2029, however they’re likely to raise this target at the coming capital markets day.
In the table below, you can see that several peers are generating gross margins of well above 75%. I expect Snowflake’s similarly to go well above 75% over time, given the company’s strong position.
Cloud data companies tend to trade at high multiples, e.g. MongoDB is trading at 9.4x sales while growing at a rate of less than 20%. So if Snowflake’s growth rate in 2029 is still strong - the company is guiding for 30% still at that time - the shares should still be trading at a high multiple then. Even if Snowflake is growing at a rate of only 20% by then, the shares could easily trade on 10x.
Multiples for the cloud names have now normalized again post the covid tech bubble. Typically good cloud stocks tend to trade at 5 to 12x forward revenues. However, during the covid bubble, some of them went to 20 to 45x. Snowflake was the complete outlier and traded on 120x forward sales.
Even when growth matures, I expect Snowflake to trade on a high multiple as the market likes high-quality software businesses with wide moats. You can see in the table below that multiples for names like this can still be around the 8 to 12x revenues range. Given that I expect Snowflake to become a hub for machine learning and data sharing, which should result in attractive margins, I expect that the market will attach a high multiple to this business even when growth matures more.
Below is the absolute product revenue growth (excluding services) which the company has been generating quarter on quarter. Over the last six quarters, revenue growth has averaged around $50 million per quarter. However, we have been in a weaker macro environment, especially since the second half of last year, where most cloud companies have been seeing clients pull back some of their planned spending. This highly likely has been weighing on Snowflake’s growth rate.
With the increased investment in the sales force and the building out of the platform’s capabilities such as data sharing, machine learning, and databasing, I expect that once a normal macro environment returns, growth should be able to accelerate again.
Overall, $10 billion of product revenues in 2029 could be achievable. However, this isn’t necessary to make a good return on the shares. If I’m more conservative and model in $9 billion, this gives $9.5 billion of overall revenues including services. Putting those revenues on a 10x multiple while modelling in 2% of dilution per annum (excluding the first coming two years due to the share repurchase program), this gives an overall upside of 124% in the shares. Annualizing that number gives an IRR of 18%, resulting in a projected share price of around $311 by January 2028 (the end of Snowflake’s 2028 fiscal year). Given the size of the market and the strength of the platform, it is likely that this business will generate revenues of more than $10 billion over time. Therefore I’m reasonably comfortable with these numbers.
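A quick sanity check of this arithmetic, using the article’s own share price, target price, and five-year horizon as inputs:

```python
price_today = 139.0    # share price at time of analysis
target_price = 311.0   # modelled price by January 2028
years = 5

# Total upside over the holding period, then annualized into an IRR.
upside = target_price / price_today - 1          # ~124%
irr = (1 + upside) ** (1 / years) - 1            # ~17.5%, i.e. roughly 18%
```

Note the annualized figure comes out at roughly 17.5%, which rounds to the 18% quoted in the thesis.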
At this valuation Snowflake should be an attractive investment. The shares have fallen a lot since the IPO where they were exuberantly valued in my opinion. There was even one clown on the sell side who had a next-twelve-months’ price target of $575. At around the current levels of $140, it makes a lot more sense to take a look at these shares.
If you enjoy research like this, hit the like button (at the bottom) and subscribe. Also, share a link to the research on social media with a positive comment, it will help the publication to grow.
For day-to-day updates on financial markets and the tech sector, follow me on Twitter.
You can find an overview of all my research here. Also, I’ve highlighted below a number of other publications that I enjoy reading which might be of interest.
Further Reading
Disclaimer - This article doesn’t constitute investment advice. While I’ve aimed to use accurate and reliable information in writing this, it cannot be guaranteed that all information used is of such nature. The projected IRR is a subjective calculation based on what I estimate a likely and reasonable scenario for the shares to be, however, the shares’ future performance remains uncertain and a more negative scenario could play out. The views expressed in this article may change over time without giving notice. Please speak to a financial adviser who can take into account your personal risk profile before making any investment.