Summit Explores Open-Source Data Issues and Developments

Summit speakers discussed the open-source data ecosystem and its role in modern businesses.

Nov 25, 2023

Data infrastructure is critical as data volumes continue to explode and businesses try to get more value and insight out of their data. Open-source technologies and solutions play an increasingly essential role. These and other themes were the focus of the recent Open Source Data Summit.

According to Onehouse, one of the summit’s sponsors, the live virtual event attracted thousands of registrants from around the world and included more than 30 speakers.

Onehouse Founder and CEO Vinoth Chandar kicked off the day with a keynote address on the role of open source in data infrastructure. Chandar discussed the history of open source and surveyed the tools and technologies in the open data ecosystem, including databases, data lakes, data warehouses, and stream processing. He emphasized the need for a thoughtful strategy when adopting open-source data solutions and highlighted the challenges and considerations involved. The talk concluded with a blueprint for an open data architecture that offers flexibility, interoperability, and control.

The keynote presentation is available to view on demand.


Industry leaders talk open-source data

The one-day summit included speakers from Netflix, Uber, Walmart, LinkedIn, Tesla, Wayfair, Google, Microsoft, and more.

One particularly interesting session covered OneTable, a new open-source project that “unlocks omni-directional interoperability between the popular lakehouse projects Apache Hudi, Delta Lake, and Apache Iceberg.” Speakers included Ashvin Agrawal, Senior Researcher at Microsoft; Tim Brown, Engineering at Onehouse; and Anoop Johnson, Senior Staff Software Engineer at Google.

According to the speakers, OneTable offers lightweight conversion mechanisms that take a source metadata format and sync it into one or more target metadata formats. The session featured a live demo, and participants described how to build open data foundations that could accelerate workloads on a variety of open-source query engines, including Spark, Presto, Trino, and Flink. The session is available on demand.
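The idea of syncing one source metadata format into several target formats can be illustrated with a toy sketch. This is not OneTable's code or API: the function names (`read_source`, `write_target`, `sync`), the `TableSnapshot` type, and the dictionary layouts are all hypothetical placeholders, and real lakehouse metadata is far richer. The sketch only shows the general pattern of translating metadata through a neutral representation while the data files themselves stay untouched.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TableSnapshot:
    """Format-neutral view of a table (hypothetical): schema plus data file paths."""
    schema: Tuple[Tuple[str, str], ...]
    data_files: Tuple[str, ...]


def read_source(metadata: dict) -> TableSnapshot:
    """Parse the source format's metadata into the neutral snapshot."""
    return TableSnapshot(
        schema=tuple(sorted(metadata["columns"].items())),
        data_files=tuple(metadata["files"]),
    )


def write_target(fmt: str, snapshot: TableSnapshot) -> dict:
    """Render the snapshot as placeholder metadata for one target format."""
    return {
        "format": fmt,
        "columns": dict(snapshot.schema),
        # The same data files are referenced; nothing is copied or rewritten.
        "files": list(snapshot.data_files),
    }


def sync(source_metadata: dict, targets: List[str]) -> Dict[str, dict]:
    """Translate one source metadata record into one or more target formats."""
    snapshot = read_source(source_metadata)
    return {fmt: write_target(fmt, snapshot) for fmt in targets}


# Illustrative input only; the format labels are used as plain strings.
source = {
    "format": "hudi",
    "columns": {"id": "long", "name": "string"},
    "files": ["part-0001.parquet", "part-0002.parquet"],
}
synced = sync(source, ["delta", "iceberg"])
```

In this sketch every target ends up pointing at the same file list as the source, which is what makes a metadata-only conversion lightweight compared with rewriting the data.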


Other sessions included talks by:

  • Jordan West, Staff Software Engineer at Netflix, on the practicalities of deploying open-source databases.
  • Patrick McFadin, VP of Developer Relations at DataStax, on a petabyte-scale vector store for generative AI.
  • Ankur Ranjan, Data Engineer III, and Ayush Bijawat, Senior Data Engineer, both from Walmart, on enabling Walmart’s data lakehouse with Apache Hudi.
  • Tun Shwe, VP of Data at Quix, and Jay Clifford, Developer Advocate at InfluxData, on data plumbing basics: building, deploying, and scaling ML models for time series data.
  • Nishith Agarwal, Head of Data & ML Platforms at Lyra Health, on making decisions that are right for your data platform.
  • Siddharth Jain, Senior Engineering Manager at Wayfair, on options for real-time data pipelines.

In addition to these and other sessions, several panel discussions took place throughout the day. One focused on batch, streaming, and real-time data processing for ML, with speakers from Eastern Bank, Intuit, and Tecton. Another examined the growing role of open-source technology in today’s data architectures, with speakers from Onehouse, Microsoft, Confluent, LinkedIn, Starburst, Uber, and Google.

A complete list of the sessions and panels, all of which are available on demand, can be found on the summit’s website.

Salvatore Salamone

Salvatore Salamone is a physicist by training who writes about science and information technology. During his career, he has been a senior or executive editor at many industry-leading publications, including High Technology, Network World, Byte Magazine, Data Communications, LAN Times, InternetWeek, Bio-IT World, and Lightwave, The Journal of Fiber Optics. He is also the author of three business technology books.

