Acceldata recently surveyed a group of data engineers, chief data officers, and other enterprise data team members about their data environments and how they use Snowflake as a part of their overall data and business operations. Here’s what they found.
This post on the Snowflake data experience is sponsored and originally appeared on Acceldata.io.
Snowflake is one of the most popular cloud data warehouses today. In just one decade, the company has grown to more than 6,800 enterprise customers and $1.2 billion in annual revenue, more than double the prior year's total.
Like other cloud data warehouses, including Databricks, Amazon Redshift, and Google BigQuery, Snowflake boasts an attractive combination of low start-up costs, constant innovation, and “it just works” manageability. And judging by its breakneck growth and enthusiastic customers, Snowflake delivers on these promises better than its rivals, especially because of its high availability (with near-zero administration) and instantly deployable infrastructure.
As the Snowflake footprint grows, it’s helpful to understand how it is being used and the impact it’s having.
As noted above, the Acceldata team recently surveyed data engineers, chief data officers, and other enterprise data team members about their data environments and how they use Snowflake in their overall data and business operations. The findings shed light on the modern data stack and the concerns data teams have about ensuring data quality, reliability, and manageability.
See also: Multidimensional Data Observability
Here’s a snapshot of our survey respondents, their top concerns, and insight into their Snowflake environment:
- Cost is the biggest concern.
- Most have been using Snowflake for 1-3 years, most have annual contracts, and contract size is typically between $250,000 and $1 million.
- For most, the Snowflake environment comprises between 1 and 50 warehouses.
- Data teams of respondents typically have fewer than 50 data engineers.
- The majority of respondents use other data platforms.
Let’s take a closer look at the survey responses.
Clearly, cost is top of mind for most Snowflake data teams. What’s notable about this metric is that the other top concerns, data quality and performance, are both intrinsically related to cost. When an organization has consistent data quality, it usually sees a dramatic improvement in performance, and both of those support better resource efficiency and a better cost-to-value ratio.
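To make the cost concern concrete, here is a minimal sketch of how a data team might translate warehouse credit consumption into a dollar estimate. The per-credit price and the sample credit figures below are hypothetical; actual pricing depends on your Snowflake edition, region, and contract terms.

```python
# Minimal sketch: estimating Snowflake spend from credit consumption.
# PRICE_PER_CREDIT and the example figures are hypothetical assumptions,
# not actual Snowflake pricing.

PRICE_PER_CREDIT = 3.00  # hypothetical USD per credit


def monthly_spend(credits_used: float,
                  price_per_credit: float = PRICE_PER_CREDIT) -> float:
    """Estimated dollar cost for one month's credit consumption."""
    return credits_used * price_per_credit


def annual_run_rate(monthly_credits: float) -> float:
    """Projected annual spend, assuming usage stays flat month to month."""
    return 12 * monthly_spend(monthly_credits)


# Example: a team consuming a hypothetical 7,000 credits per month
# lands near the low end of the $250,000 - $1 million contract range
# that most survey respondents reported.
print(monthly_spend(7_000))     # monthly estimate in USD
print(annual_run_rate(7_000))   # projected annual spend in USD
```

This kind of back-of-the-envelope projection is also a useful sanity check against the annual contract sizes reported above.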
Snowflake User Personas
It takes a variety of roles to manage Snowflake environments, but it’s clear that data engineers make up the vast majority of Snowflake users.
Number of Snowflake Users
Here again, we have an indication that as organizations grow, their Snowflake environments grow with them. Almost half of respondents have 51 or more Snowflake users in their organization.
Number of Snowflake Warehouses
The numbers here show that about half of respondents have fewer than 10 warehouses, while the other half have more than 10. About 36% have between 11 and 50 warehouses.
Length of Time Using Snowflake
All respondents have been Snowflake users for five or fewer years, while 36% have been using Snowflake at their company for less than one year.
Size of Data Team
There was considerable variation among respondents in team size, with about 20% having teams of more than 20 members.
Other Data Platforms Used
As this graph shows, respondents use a wide range of other data platforms alongside Snowflake.
How Many Applications Use Snowflake?
More than half of respondents use Snowflake for fewer than 10 applications, and we are seeing a trend toward an increasing number of applications running on Snowflake.
Daily Snowflake Queries
About a quarter of respondents perform more than 5,000 queries daily. Within that group, 8% run more than 50,000 daily queries.
Types of Workloads
The survey indicates a wide range of workload types being run on Snowflake.
To learn how to maximize the return on your Snowflake investment with insight into cost, data reliability, and best practices, we encourage you to look at Acceldata’s data observability platform for Snowflake. It combines monitoring, analytics, and automation across multiple dimensions of data operations, including spend intelligence, data pipeline reliability, and data best practices.