At Strata Hadoop, a Focus on Fast Data, IoT

Big data is getting fast.

Software vendors usually time their biggest news to coincide with marquee industry events, and this year’s Strata Hadoop conference was no exception. News timed for the conference showed an increasing focus on capabilities for real-time analytics, including IoT use cases.

Confluent strengthens streaming platform

At Strata Hadoop World, Confluent, the provider of the first streaming platform powered by Apache Kafka, announced new features for Confluent Enterprise, including multi-datacenter replication, auto data balancing, and cloud migration capabilities. Confluent said the update will be offered in a variety of popular open-source formats and is due at the end of October.

SAP: $2 billion in IoT Investments

SAP announced plans to invest $2 billion in IoT. The company intends to accelerate innovation in its IoT solution portfolio, increase sales and marketing, scale service, support and co-innovation, and grow its ecosystem of partners and startups in the IoT market. SAP is looking to its HANA Cloud Platform for IoT applications and has also acquired IoT startup Plat.One, which specializes in device interoperability and edge processing for applications in smart cities, logistics, manufacturing and farming.

Related: From cloud to edge: Why SAP acquired Plat.One

MapR announces support for event-driven microservices

MapR Technologies announced that its Converged Data Platform now supports event-driven microservices, including those that rely on real-time analytics, rapid response, and automated actions. New features include application monitoring and development. The Converged Data Platform now also supports monitoring of cluster-wide operations and resources in a single view, as well as microservices-specific volumes.

Pythian offers Google BigQuery services to enterprise customers

Pythian, a 400-person global IT services company that helps companies adopt disruptive technologies to better compete, announced that it offers expert consulting and managed services for Google BigQuery’s cloud data warehouse.

Pythian said recently that one of its retail customers needed to ingest and process a stream of live in-store visitor location data from more than 2,500 stores worldwide, with more than 4,000 daily visitors in each location. Pythian said it built “a cost-effective real-time data pipeline on Google Cloud Platform with BigQuery at its center serving analytical queries.”

Cloudera announces Spark 2.0

Cloudera, which provides a data management and analytics platform built on Apache Hadoop, announced support for Apache Spark 2.0, with enhancements to the API experience, performance improvements, and enhanced machine learning capabilities. In addition, Cloudera is working with the community to continue developing Apache Kudu 1.0, which is described as a “high performance columnar store for Hadoop that enabled the powerful combination of fast analytics on fast data.”

Cloudera said its integrations with Apache open-source projects “recognize the growing need for streaming and analyzing real-time data in high-demand workloads, including machine learning models deployed in production by Cloudera’s enterprise customers.”

Spark 2.0 features include what Cloudera describes as “Structured Streaming,” which is designed for better performance and easier ingestion of traditional structured data, including time series, tabular, and IoT data. The release also includes machine learning model and pipeline persistence, and new machine learning libraries.

Related:

Why Apache Spark is so hot

How to apply machine learning to event processing: an online guide

Sue Walsh

About Sue Walsh

Sue Walsh is News Writer for RTInsights, and a freelance writer and social media manager living in New York City. Her specialties include tech, security and e-commerce. You can follow her on Twitter at @girlfridaygeek.
