Hortonworks Rises to Data in Motion Challenge


Hortonworks’ new framework should make it easier for IT teams to track and monitor data flows at a deeper, more granular level.

Data that is regularly in motion is generally much harder to manage than data at rest. To make that challenge more palatable for IT organizations, Hortonworks today added the ability to abstract data flow schemas and programs using a Hortonworks DataFlow (HDF) framework that makes it easier to track and monitor data flow changes at a more granular level.

Based on the open source NiFi Registry software, which provides a central location for managing shared resources, this capability will be critical for organizations constructing data flows that span from edge devices processing data in real time to data warehouses based on Hadoop, says Hortonworks CTO Scott Gnau. NiFi is an Apache Software Foundation (ASF) project that is based on code Hortonworks contributed.
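To illustrate what a central registry for data flows makes possible, here is a minimal Python sketch of the general idea: each saved flow becomes a numbered snapshot, and any two snapshots can be compared to see what changed. It is a toy model only and not the NiFi Registry API. The `FlowRegistry` class, its `commit` and `diff` methods, the flow name, and the processor names and settings are all hypothetical and chosen purely for illustration.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class FlowRegistry:
    """Toy in-memory registry (hypothetical, not the NiFi Registry API)."""
    _flows: dict = field(default_factory=dict)  # flow name -> list of snapshots

    def commit(self, name: str, processors: dict) -> int:
        """Store a new snapshot of a flow and return its 1-based version number."""
        versions = self._flows.setdefault(name, [])
        # Deep copy so later edits to the caller's dict cannot alter history.
        versions.append(copy.deepcopy(processors))
        return len(versions)

    def diff(self, name: str, old: int, new: int) -> dict:
        """Report processors added, removed, or changed between two versions."""
        a = self._flows[name][old - 1]
        b = self._flows[name][new - 1]
        return {
            "added": sorted(b.keys() - a.keys()),
            "removed": sorted(a.keys() - b.keys()),
            "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
        }


registry = FlowRegistry()

# Version 1: an illustrative edge-to-Hadoop flow with two processors.
v1 = registry.commit("edge-to-hadoop", {
    "ReadSensor": {"interval": "1s"},
    "WriteHDFS": {"dir": "/raw"},
})

# Version 2: the polling interval changes and a Kafka publishing step is added.
v2 = registry.commit("edge-to-hadoop", {
    "ReadSensor": {"interval": "5s"},
    "PublishKafka": {"topic": "events"},
    "WriteHDFS": {"dir": "/raw"},
})

changes = registry.diff("edge-to-hadoop", v1, v2)
print(changes)
```

A real registry also has to handle persistence, access control, and promotion of flows between environments, all of which this sketch leaves out.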

See also: Study finds more reliance on streaming analytics

Version 3.1 of HDF also adds, for the first time, support for Apache Kafka messaging software, which can now be managed as part of the larger data flow model.

In general, Gnau says it's now apparent that data frameworks that can be easily extended to new types of data and sources are the foundation on which most digital business initiatives will be built.

“It’s a requirement for any modern data architecture,” says Gnau.

To make the NiFi Registry simpler to manage and secure, Hortonworks has also updated its Apache Ambari management and Apache Ranger governance software. IT managers can add a NiFi node to an existing cluster without manually updating the cluster. Apache Ranger now allows administrators to define group-based policies for NiFi resources.

Version 3.1 of HDF also adds capabilities to improve overall operations when organizations deploy Hortonworks Streaming Analytics Manager (SAM). In test mode, developers can now create mock data for use in unit tests for SAM Apps as part of a larger continuous integration and continuous delivery (CI/CD) framework. Hortonworks has also added tools to the SAM Operations Module to test, debug, troubleshoot, and monitor applications.

Should Simplify Data Governance and Security

Gnau says the most significant impact of HDF will be on simplifying governance and data security. HDF can be integrated with Apache Atlas governance software, as well as Hortonworks SmartSense and Apache Knox software, to better manage and secure the overall Big Data environment, notes Gnau. That integration lets Apache Atlas obtain metadata from NiFi data flows, enabling governance of data both at rest and in motion, he says.

Rather than solely focusing on Hadoop deployments in the data warehouse, Gnau says Hortonworks sees its charter as having been expanded to include data in motion and at rest at the edge of the network as well. Regardless of use case, Gnau says IT organizations are making it clear they want a holistic approach to managing all their data.

Historically, most IT organizations have not been especially adept at managing data. But now that data is viewed as being more of a strategic asset, the focus on tools that enable organizations to maximize the value of their data is growing. Naturally, most organizations are still struggling with putting the right processes in place to manage that data. But as most IT leaders already know, fundamental process change is never going to occur until the right tools get placed in the hands of the IT staff tasked with bringing that transformation about.
