The AresDB analytics engine uses graphics processing units (GPUs) so that real-time analytics can grow at scale.
Uber is rolling out a new analytics engine. AresDB is an open source, real-time analytics engine that runs its queries on GPUs. It’s designed to unify and simplify Uber’s real-time analytics database solutions.
Key features of AresDB include:
Column-based storage: AresDB stores data in a columnar format across two stores, live and archive. Whether each value in a column is valid is recorded in a separate vector, using one bit per value.
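To make the validity vector concrete, here is a minimal Python sketch of a column that keeps its values in one vector and a packed one-bit-per-row validity bitmap in another. The class name `NullableColumn`, its methods, and the sample data are hypothetical and chosen for illustration; this is not AresDB's actual in-memory layout.

```python
class NullableColumn:
    """Hypothetical column: a value vector plus a validity bitmap (1 bit per row)."""

    def __init__(self):
        self.values = []             # value vector; invalid rows hold a placeholder
        self.validity = bytearray()  # packed bits: 1 = valid, 0 = missing

    def append(self, value):
        row = len(self.values)
        if row % 8 == 0:
            self.validity.append(0)  # start a new byte every 8 rows
        if value is not None:
            self.validity[row // 8] |= 1 << (row % 8)
        self.values.append(0 if value is None else value)

    def is_valid(self, row):
        return bool((self.validity[row // 8] >> (row % 8)) & 1)

    def get(self, row):
        # The bitmap, not the stored value, decides whether a row is missing.
        return self.values[row] if self.is_valid(row) else None


fares = NullableColumn()
for v in [12.5, None, 7.0, 0, None, 3.25, 9.0, 1.0, None]:
    fares.append(v)
```

Keeping validity separate lets a genuine zero (row 3 above) be told apart from a missing value, at a cost of one bit per row: nine rows need only two bytes of bitmap.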
Real-time ingestion with upsert support: Clients ingest data through the ingestion API by posting an upsert batch, a specialized format that reduces space overhead while keeping the data randomly accessible. AresDB writes each batch to the redo log, then identifies late records (those older than the pre-set archive cut-off time) and keeps them out of the live store. Late records are added to a backfill queue, and the rest go to the live store.
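The ingestion flow can be sketched roughly as follows. This is an illustrative Python model only: the class `IngestSketch`, the record fields `key` and `event_time`, and the choice to append the batch to the redo log before routing records are all assumptions made for the example, not AresDB's API or wire format.

```python
from collections import deque


class IngestSketch:
    """Hypothetical model of upsert ingestion with a late-record cut-off."""

    def __init__(self, archive_cutoff):
        self.archive_cutoff = archive_cutoff  # event times older than this are late
        self.redo_log = []                    # append-only copy of every batch received
        self.live_store = {}                  # primary key -> latest record (upsert)
        self.backfill_queue = deque()         # late records awaiting backfill

    def ingest(self, batch):
        # Assumption: the batch is logged first so it can be replayed on recovery.
        self.redo_log.append(list(batch))
        for record in batch:
            if record["event_time"] < self.archive_cutoff:
                self.backfill_queue.append(record)       # late: skip the live store
            else:
                self.live_store[record["key"]] = record  # insert or overwrite


db = IngestSketch(archive_cutoff=1000)
db.ingest([
    {"key": "trip-1", "event_time": 1200, "status": "requested"},
    {"key": "trip-2", "event_time": 900, "status": "completed"},
    {"key": "trip-1", "event_time": 1300, "status": "completed"},
])
```

After this batch, the live store holds a single row for `trip-1` with the later status, while `trip-2` sits in the backfill queue because its event time falls before the cut-off.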
GPU-powered query processing: Queries against AresDB are written in the Ares Query Language (AQL), created by Uber, and can be expressed as JSON, YAML, or Go objects. AresDB manages multiple GPU devices through a device manager that models GPU resources in two dimensions: GPU threads and device memory. The database lets users estimate the resources a query will need. A device can run one or several queries at a time, as long as it can provide all the resources they require.
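The two-dimensional resource model lends itself to a small sketch. The Python below assumes a hypothetical `DeviceManager` that admits a query onto the first device with enough free threads and enough free memory; the names, the first-fit policy, and the numbers are illustrative and not taken from AresDB's source.

```python
class Device:
    """Hypothetical GPU device tracked along two dimensions."""

    def __init__(self, free_threads, free_memory):
        self.free_threads = free_threads
        self.free_memory = free_memory  # bytes


class DeviceManager:
    def __init__(self, devices):
        self.devices = devices

    def acquire(self, threads, memory):
        """Reserve resources on the first device that satisfies both
        dimensions; return its index, or None if no device can."""
        for index, dev in enumerate(self.devices):
            if dev.free_threads >= threads and dev.free_memory >= memory:
                dev.free_threads -= threads
                dev.free_memory -= memory
                return index
        return None

    def release(self, index, threads, memory):
        """Give a finished query's resources back to its device."""
        dev = self.devices[index]
        dev.free_threads += threads
        dev.free_memory += memory


GIB = 2 ** 30
manager = DeviceManager([Device(1024, 8 * GIB), Device(1024, 8 * GIB)])
first = manager.acquire(threads=512, memory=6 * GIB)   # fits on device 0
second = manager.acquire(threads=256, memory=1 * GIB)  # shares device 0
third = manager.acquire(threads=256, memory=4 * GIB)   # device 0 is short on memory
fourth = manager.acquire(threads=2048, memory=1)       # no device has enough threads
```

Here two queries share device 0, the third lands on device 1 because device 0 has only 1 GiB left, and the fourth is refused until resources are released.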
Uber plans to add new features, including a distributed design of AresDB for improved scalability and reduced operational costs, developer support and tools, query engine optimization, and an expanded feature set.
AresDB is open sourced under the Apache License and is already in use.