Whether challenges concern security, observability, access to data, data flow, or multicloud, hardware must pay attention to the Open Source world.
Data is value. But merely having usable data doesn’t necessarily translate into taking full advantage of it.
The key question that enterprises are asking is: how can we tap data’s available but untapped potential, furthering our customers’ goals and boosting revenue? And for answers, they’re increasingly turning to open source solutions.
Leading companies are evaluating and maturing open architectures—integrated collections of composable compute, networking, and storage resources. These scalable hardware infrastructures enable processes for continuous integration and deployment of software. Far from a black box, open source software and, increasingly, hardware offer better visibility and control for everyone. The upshot: democratizing the tech industry by removing barriers that stand in the way of collaboration.
An IDC TechScape study found that “most of the important emerging technologies are partially or fully made up of open source components, which makes a bold statement about where the industry goes in the future.”
The benefits are significant:
- The openness of source means auditability and ease of improvement; source code is released for peer review and suggested improvements.
- Another prominent benefit of open source development is R&D cost distribution: companies share expenses of solving problems.
- Security, too, is enhanced in the open environment. When more people are looking at the code, it’s easier to find bugs.
Fluent in Software
The company I work for, Seagate Technology—a global leader providing data storage solutions for over 40 years—belongs in the hardware camp.
And yet, for the same reasons that innovation is at home in the open source world, we go beyond that: We want to be fluent in software, and we enable innovations happening in software.
For several years, Seagate has sponsored software-centric consortia and foundations like The Linux Foundation and the University of California, Santa Cruz’s Center for Research in Open Source Software, as well as open source hardware foundations such as RISC-V and OpenTitan. We’re optimizing our systems for data increasingly stored as objects.
What’s a hardware company doing in the software world? Anything that happens in software reverberates in hardware—and vice versa. Hardware is the yin to software’s yang. Each has to innovate to keep up with the other’s demands. The flow of data requires software and hardware to enable it—in tandem. Experience designing hardware offers insights into how data should be processed. And lessons from the software world should inform hardware design.
The problems that open source tackles are the same that data storage solutions take on. Consider several challenging areas: the rise of the multicloud, data flow, access to data, observability, and security.
Multicloud: As many enterprises shift from public cloud to multicloud, they still expect the features of public cloud in hybrid clouds. Open source projects—like Apache Hadoop and Ceph, which enable scale-out storage—innovate to power private clouds with compute and storage deployment.
How can hardware play a role? By enabling scale-out software ecosystems for the private cloud with workload-optimized clusters. If the application needs lower latency, an all-flash array powered by SSDs is the right solution. If the private cloud requires massive storage, the hardware architecture allows configurable, disaggregated building blocks.
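The latency-versus-capacity tradeoff above can be sketched as a simple selection rule. This is a hedged illustration only; the tier names and thresholds are assumptions for the example, not Seagate product guidance.

```python
# Illustrative sketch: mapping a private-cloud workload to a storage
# building block based on its latency and capacity needs. Thresholds
# and tier names are assumptions, not real product specs.

from dataclasses import dataclass

@dataclass
class Workload:
    p99_latency_ms: float   # latency target the application must meet
    capacity_tb: float      # raw capacity the workload requires

def pick_tier(w: Workload) -> str:
    """Map a workload to a tier in a disaggregated cluster."""
    if w.p99_latency_ms < 1.0:
        return "all-flash array (SSD)"      # latency-sensitive workloads
    if w.capacity_tb > 500:
        return "high-density HDD cluster"   # capacity-driven workloads
    return "hybrid SSD/HDD tier"            # balanced default

print(pick_tier(Workload(p99_latency_ms=0.5, capacity_tb=20)))
print(pick_tier(Workload(p99_latency_ms=10.0, capacity_tb=2000)))
```

The point of the sketch is the composability: because the building blocks are disaggregated, the same cluster can serve both answers without stranding resources.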
Data Flow Issues: Given the rise of the edge, the IoT, and other tech, data is exploding from edge to core. By 2025, the global datasphere will reach 175ZB. Where and how to store and process all this data? Open source software offers building blocks that allow infrastructure architects to develop application-optimized solutions. Examples include solutions that allow streaming of data (e.g., Kafka), those that ingest the data for analysis (e.g., Hive), and those storing data in open source databases (e.g., Redis).
What does this mean for hardware? Which building blocks are combined to ingest desirable data, at what rate, and what tools are used for analysis? All this has a bearing on how the compute and storage components are configured. To facilitate organic growth of cloud infrastructure, a composable and disaggregated approach will yield efficient use of resources (as opposed to a hyperconverged architecture that leaves valuable resources stranded).
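The ingest-analyze-store flow that tools like Kafka, Hive, and Redis implement at scale can be reduced to a minimal sketch. The in-memory queue and dictionary below merely stand in for a streaming topic and a key-value store; the names and the "analysis" step are assumptions for illustration.

```python
# Minimal sketch of the ingest -> analyze -> store pattern. A Queue
# stands in for a Kafka topic and a dict for a Redis key-value store;
# this only illustrates the flow, not the real systems.

from queue import Queue

stream: Queue = Queue()          # stand-in for a streaming topic
store: dict[str, float] = {}     # stand-in for a key-value store

def produce(sensor_id: str, reading: float) -> None:
    """Edge device pushes a reading into the stream."""
    stream.put((sensor_id, reading))

def consume_and_store() -> None:
    """Core-side consumer analyzes each reading and persists a rollup."""
    while not stream.empty():
        sensor_id, reading = stream.get()
        # "analysis": keep the maximum reading seen per sensor
        store[sensor_id] = max(reading, store.get(sensor_id, reading))

produce("edge-01", 21.5)
produce("edge-01", 23.9)
produce("edge-02", 19.2)
consume_and_store()
print(store)  # per-sensor rollups ready for downstream queries
```

How fast data enters the queue and how heavy the analysis step is are exactly the parameters that determine how compute and storage hardware should be configured underneath.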
Access to Data: As data needs increase exponentially, access to data grows more important. As disk capacities increase to provide the density that demand requires, so does the need for greater speed in reading and writing the data—while keeping overall costs down.
How’s hardware helping? Researchers are innovating NAND technology to reduce cost while maintaining latency and bandwidth. Technologies like the dual actuator are providing higher IOPS in higher-capacity devices. This gives architects options, enabling them to configure systems that match the needs of various applications.
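The dual-actuator argument is easiest to see as back-of-envelope arithmetic: as capacity grows while random IOPS stay flat, IOPS per terabyte falls, and a second actuator roughly doubles the IOPS. The numbers below are illustrative assumptions, not published drive specifications.

```python
# Back-of-envelope sketch of access density. Figures are illustrative
# assumptions, not real drive specs.

def iops_per_tb(iops: float, capacity_tb: float) -> float:
    """Access density: random IOPS available per terabyte stored."""
    return iops / capacity_tb

single_16tb = iops_per_tb(170, 16)   # baseline single-actuator drive
single_20tb = iops_per_tb(170, 20)   # capacity up, access density down
dual_20tb   = iops_per_tb(340, 20)   # second actuator restores headroom

print(round(single_16tb, 1), round(single_20tb, 1), round(dual_20tb, 1))
```

The middle number is the problem the paragraph describes; the last is the option dual actuators give the architect.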
Observability: Another aspect of software-hardware integration is the need for information about the system. The software megatrend is to orchestrate, and then autonomously manage, multicloud infrastructure. Container orchestration ecosystems such as Kubernetes (which let teams declare infrastructure as code) integrate with mature open source tools such as Prometheus to advance autonomous manageability.
And in the hardware world? Observability of factors such as temperature and vibration means information that can drive value. Hardware innovations can drive better data telemetry (observable metrics) by creating easy-to-use tools for AI to reduce manual interventions and preempt irregularities. Enterprise devices can expose open logs that provide much more granular information. Field Accessible Reliability Metrics is one such log, giving insight into hard drive health.
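To see how device telemetry can preempt irregularities, consider a minimal anomaly check over a series of readings. This is a hedged sketch of the idea, not a real parser for drive logs; the threshold rule and sample values are assumptions.

```python
# Hedged sketch: flagging outliers in device telemetry (e.g., drive
# temperature readings) so they can be acted on before they become
# failures. The k-sigma rule and data are illustrative assumptions.

from statistics import mean, stdev

def flag_anomalies(samples: list[float], k: float = 2.0) -> list[float]:
    """Return readings more than k standard deviations from the mean."""
    mu, sigma = mean(samples), stdev(samples)
    return [s for s in samples if abs(s - mu) > k * sigma]

temps = [38.0, 38.5, 37.9, 38.2, 55.0, 38.1]  # one spike in the series
print(flag_anomalies(temps))  # → [55.0]
```

A real pipeline would feed granular log data into far richer models, but the shape is the same: open, observable metrics in, early warnings out.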
Security: With regulations like GDPR and the California Consumer Privacy Act, needs grow for better management of the provenance, flow, compute, and storage of data. This creates an affinity with open source solutions because the openness of the source promotes trust.
How does this show up in hardware? Take RISC-V. It’s an open instruction set architecture focused on low cost, low power, and security, which allows companies to build on and develop processor architectures faster through a shared model.
There you have it: Whether challenges concern security, observability, access to data, data flow, or multicloud, hardware must pay attention to the software world.
Because when it comes to the business of data, hardware and software are in it together.