The Dell and Starburst partnership marries a fast query engine with high-performance compute and storage platforms.
Late last week, Dell Technologies announced a data virtualization solution, developed in partnership with Starburst, that eliminates data silos to enable fast querying of data sources on-prem and in the cloud.
The new partnership allows Dell to “further walk alongside our customers as they work through their complex data environment and gather valuable insights from a wide variety of newer data types like videos, streams, and more,” said Greg Findlen, Senior Vice President of Product Management of Data Management at Dell Technologies, in a blog announcing the news.
At its heart, the announcement aims to help businesses keep up with the explosion of data and the data silos they routinely encounter. “The cost and complexity associated with data management and analytics are forcing many organizations to store data across clouds and on-premises locations to optimize performance and stay within budgets,” said Justin Borgman, Starburst co-founder and CEO.
Unfortunately, many organizations struggle when their data is scattered across multiple locations. That fragmentation prevents businesses from creating new sources of customer, product, service, and operational value from their data. Gartner has noted[i] that, because of these issues, only 22% of data management resource time is spent on data innovation and monetization.
Data virtualization and federated queries
The partnership with Starburst allows Dell to offer customers the ability to use data virtualization across their multi-cloud environments. Specifically, the new data virtualization solution runs on Dell Technologies hardware (PowerEdge servers) and software-driven storage (ECS object storage), both designed to manage data at scale.
The solution makes use of Starburst’s ability to query data across any database, making it instantly actionable for data-driven organizations. Specifically, Starburst provides a fast and efficient analytics engine for data warehouses, data lakes, or data mesh. It unlocks the value of distributed data by making it fast and easy to access, no matter where it lives.
Starburst is built on top of Trino, the open-source, high-performance distributed SQL engine that’s known for running fast analytic queries against data sources ranging in size from GBs to PBs. (Trino was formerly called PrestoSQL.)
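To make the idea of a federated query concrete, here is a minimal sketch of the kind of SQL such an engine accepts. Trino addresses tables as catalog.schema.table, so one statement can join data held in different systems. The catalog, schema, table, and column names below are hypothetical, and the helper functions are illustrative only; they are not part of any Starburst or Dell tooling.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a Trino-style fully qualified table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def federated_join_sql(left: str, right: str, key: str) -> str:
    """Build one SQL statement that joins tables living in two different catalogs."""
    return (
        f"SELECT o.{key}, count(*) AS order_count\n"
        f"FROM {left} AS o\n"
        f"JOIN {right} AS c ON o.{key} = c.{key}\n"
        f"GROUP BY o.{key}"
    )


# Hypothetical catalogs: 'lake' (object storage) and 'crm' (a relational database).
orders = qualified("lake", "sales", "orders")
customers = qualified("crm", "public", "customers")
query = federated_join_sql(orders, customers, "customer_id")
print(query)
```

The point of the sketch is that the join spans two catalogs in a single statement, without first copying either table into a common store.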
The Starburst Enterprise Platform distribution of Trino was created to help enterprises extract more value from their Trino deployments. It offers global security with fine-grained access controls, stable and reliable releases, connectors, data caching, and enterprise support.
In a technical blog about the partnership, Dell noted, “we chose to partner with Starburst and deploy their software in our labs to evaluate its performance on Dell hardware. We used the industry standard TPC-DS test suite to benchmark Starburst’s performance by measuring the total execution time as well as the per-query execution time. We also varied the hardware resources to model how Starburst’s performance varies. We detailed our set up and experiments for reproducibility in this paper. Our goal was to provide our customers with a validated design reference for deploying Starburst and scale it appropriately as the query volume, concurrency, or data volume scales.”
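The two metrics Dell describes, total execution time and per-query execution time, can be illustrated with a small timing harness. This is a sketch under stated assumptions, not Dell's actual methodology: the function name run_benchmark is hypothetical, and the "queries" are stand-in Python workloads rather than TPC-DS queries run against a cluster.

```python
import time
from typing import Callable, Dict, Tuple


def run_benchmark(queries: Dict[str, Callable[[], object]]) -> Tuple[Dict[str, float], float]:
    """Run each query once; return per-query seconds and total suite seconds."""
    per_query: Dict[str, float] = {}
    suite_start = time.perf_counter()
    for name, run in queries.items():
        start = time.perf_counter()
        run()  # execute the workload
        per_query[name] = time.perf_counter() - start
    total = time.perf_counter() - suite_start
    return per_query, total


# Stand-in workloads (assumptions); a real run would submit SQL to the engine.
workloads = {
    "q1": lambda: sum(range(100_000)),
    "q2": lambda: sorted(range(50_000, 0, -1)),
}
per_query, total = run_benchmark(workloads)
for name, seconds in per_query.items():
    print(f"{name}: {seconds:.4f}s")
print(f"total: {total:.4f}s")
```

Repeating such a run while changing the available hardware is one way to see how performance varies with resources, which is the kind of comparison the quoted passage describes.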
Additionally, Starburst is based on a distributed Coordinator-Worker architecture. In evaluating the solution, Dell ran coordinator and worker nodes of Starburst Enterprise on Dell PowerEdge servers and used unstructured storage such as Dell Elastic Cloud Storage (ECS).
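The coordinator-worker pattern can be sketched in a few lines: a coordinator divides the input into splits, workers process the splits in parallel, and the coordinator merges the partial results. The example below is a toy illustration using local threads and a simple sum; the function names are hypothetical and it does not reflect Starburst's internal implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def make_splits(rows: List[int], n_workers: int) -> List[List[int]]:
    """Coordinator step: divide the input into roughly equal splits."""
    return [rows[i::n_workers] for i in range(n_workers)]


def worker_partial_sum(split: List[int]) -> int:
    """Worker step: compute a partial aggregate over one split."""
    return sum(split)


def coordinator_sum(rows: List[int], n_workers: int = 4) -> int:
    """Plan the splits, fan them out to workers, then merge the partial results."""
    splits = make_splits(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(worker_partial_sum, splits))
    return sum(partials)


result = coordinator_sum(list(range(1, 101)))
print(result)  # 5050
```

In a real deployment the workers are separate server nodes rather than threads, which is why the design can scale by adding worker nodes.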
Dell believes running the solution on its hardware will deliver:
- High-performance computing – delivering up to 43% greater performance by leveraging Intel’s 3rd Gen Xeon Scalable processors.
- Improved throughput via PCIe Gen 4 – doubling the throughput over prior server generations, with eight lanes of data.
- Comprehensive security – with data encryption, root-of-trust protection, and supply chain verification.
- Improved energy efficiency – with the latest cooling technology, offering up to a 60% reduction in power consumption.
- Flexible, autonomous management – delivering up to 85% time savings by freeing up the skilled hands of IT professionals for other vital projects.
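The throughput doubling in the list above follows from the PCIe per-lane signaling rates. The sketch below works through the arithmetic, assuming that "prior server generations" means PCIe Gen 3 (the source does not say so explicitly) and using the published rates of 8 GT/s per lane for Gen 3 and 16 GT/s for Gen 4, both with 128b/130b encoding. The function name is hypothetical.

```python
def pcie_gbytes_per_s(gigatransfers_per_s: float, lanes: int) -> float:
    """Usable one-direction bandwidth in GB/s with 128b/130b line encoding."""
    bits_per_s = gigatransfers_per_s * 1e9 * (128 / 130) * lanes
    return bits_per_s / 8 / 1e9


gen3_x8 = pcie_gbytes_per_s(8.0, 8)    # PCIe Gen 3: 8 GT/s per lane (assumed baseline)
gen4_x8 = pcie_gbytes_per_s(16.0, 8)   # PCIe Gen 4: 16 GT/s per lane
print(f"Gen 3 x8: {gen3_x8:.2f} GB/s")
print(f"Gen 4 x8: {gen4_x8:.2f} GB/s")
print(f"ratio: {gen4_x8 / gen3_x8:.1f}x")
```

These are theoretical link rates per direction; real-world throughput depends on the devices and workload.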
A final word
The partnership and the ensuing solution allow Dell to become a one-stop shop for finding any needle in an organization’s on-premises, private, or public cloud haystack.
[i] Gartner Survey Analysis: Data Management Struggles to Balance Innovation and Control. March 19, 2020.