StreamSets Centralizes Management of Data Pipelines

The firm’s goal is to accelerate application development by providing a cloud service through which data pipelines can be managed.

Written By
Michael Vizard
Oct 25, 2018

StreamSets today moved to bridge the divide between DevOps and DataOps by making it simpler to manage data flows across multiple application pipelines.

The goal is to accelerate development of modern applications by providing a cloud service through which a set of graphical data flows and pipelines can be managed, says Clarke Patterson, head of product marketing for StreamSets. “It provides a cloud-based approach to creating and orchestrating pipelines,” he adds.

This update to the company’s namesake platform adds a continuous integration/continuous deployment (CI/CD) framework for automating frequent changes to data flows, says Patterson.

Other new capabilities include support for Kubernetes clusters and a data flow designer that comes with pre-configured connectors for data sources such as Amazon S3, Elastic MapReduce (EMR) and Redshift; Azure Data Lake Storage, HDInsight and Azure Databricks; Google Cloud Dataproc; and Snowflake.

In addition, StreamSets now supports a data drift handling capability, which automatically reflects updates to source schema in the Amazon Athena, Azure SQL, and Google BigQuery cloud data services. Finally, a StreamSets Data Protector tool allows sensitive data to be detected and the policies attached to it to be enforced.
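
To make the idea of data drift concrete, the following is a minimal Python sketch of how a pipeline might notice a new field in incoming records and reflect it in a target schema. It is a generic illustration, not StreamSets code: the function names `detect_schema_drift` and `apply_drift`, the dictionary-based schema, and the sample record are all assumptions made for this example.

```python
def detect_schema_drift(known_schema, record):
    """Return fields present in the record but missing from the known schema.

    known_schema maps field names to type names (hypothetical representation).
    """
    return {
        name: type(value).__name__
        for name, value in record.items()
        if name not in known_schema
    }


def apply_drift(known_schema, record):
    """Return an updated copy of the schema plus the fields that were added."""
    new_fields = detect_schema_drift(known_schema, record)
    updated = dict(known_schema)  # copy so the caller's schema is not mutated
    updated.update(new_fields)
    return updated, new_fields


# Hypothetical example: the source starts sending a "signup_ts" column.
schema = {"id": "int", "email": "str"}
record = {"id": 7, "email": "a@example.com", "signup_ts": 1540425600}

schema, added = apply_drift(schema, record)
print(added)  # {'signup_ts': 'int'}
```

A real drift handler would also have to deal with type changes and removed fields, and would issue the corresponding schema update against the target service; this sketch covers only the case of newly added fields.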

In general, many of the DevOps concepts and processes originally pioneered to accelerate application development are now being applied to how data gets managed, an approach known as DataOps. Rather than waiting weeks for a database administrator to construct a schema to expose a set of data pipelines, DataOps enables data pipelines to be exposed in a much faster, more agile manner.

Fresh off raising an additional $35 million in funding, StreamSets is applying these concepts at a time when the way applications are being developed and deployed is fundamentally changing thanks to the rise of microservices-based architectures. Rather than having to update an entire monolithic application, new functionality can be added to an application more easily by updating only a limited number of microservices. That microservices approach, in theory, enables IT organizations to build and maintain a much larger portfolio of applications.

But each of those microservices is tapping into a pipeline to process data. Those pipelines are increasingly being connected to platforms that make that data available in real time. Most microservices are being built using containers running mainly on Kubernetes, the container orchestration engine that is quickly emerging as a de facto standard.

Those container-based microservices are all trying to access data at an unprecedented level of concurrent scale. Given the massive volume of data involved, a more consistent approach to managing that data in the form of DataOps is now required. Organizations will clearly need to meld their DevOps and DataOps initiatives to construct applications capable of analyzing data in real time.

StreamSets claims that in the previous four fiscal quarters it has doubled its commercial customer count and tripled its revenues, and that the open source StreamSets Data Collector at the core of the platform has been downloaded well over two million times by thousands of companies. Commercial customers include GSK, Chesapeake Energy and Solera Holdings, and the company notes that more than two-thirds of StreamSets commercial customers subscribe to one or more of its proprietary software offerings.

The challenge now is getting everyone within IT on the same data pipeline page at a time when microservices-based applications will have more dependencies than ever.

Property of TechnologyAdvice. © 2026 TechnologyAdvice. All Rights Reserved