Microservices enabled by Kubernetes container orchestration software could change how next-generation AI applications are built.
IBM today unveiled an analytics platform based on an in-memory database and a microservices architecture enabled by Kubernetes container orchestration software that promises to transform how next-generation AI applications are built, deployed and consumed.
Capable of ingesting data at a rate of one million records per second, the IBM Cloud Private for Data platform is intended to enable analytics infused with AI capabilities to surface insights in real time, says Rob Thomas, general manager for IBM Analytics.
“We can ingest 250 billion records a day,” says Thomas.
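The two throughput figures above can be compared with some quick arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes a constant, sustained rate over a full 24-hour day. The article does not say how the per-second and per-day figures relate (for example, whether one is per node and the other per cluster), so the code only shows what each number implies on its own.

```python
# Back-of-the-envelope conversion between per-second and per-day ingest rates.
# Assumption: the rate is sustained evenly across a full 24-hour day.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def records_per_day(records_per_second: int) -> int:
    """Daily volume implied by a sustained per-second ingest rate."""
    return records_per_second * SECONDS_PER_DAY


def records_per_second_needed(daily_records: int) -> float:
    """Average per-second rate needed to reach a daily volume."""
    return daily_records / SECONDS_PER_DAY


if __name__ == "__main__":
    # One million records per second, sustained for a day.
    print(records_per_day(1_000_000))  # 86400000000

    # Average rate implied by 250 billion records a day.
    print(round(records_per_second_needed(250_000_000_000)))  # 2893519
```

Under these assumptions, one million records per second works out to 86.4 billion records a day, while 250 billion records a day implies an average of roughly 2.9 million records per second.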
The IBM Cloud Private for Data platform makes extensive use of Docker containers both to capture data in a standard format and to expose analytics as a set of easier-to-consume microservices.
Data is fed into the IBM Cloud, where an instance of Kubernetes provides a container orchestration capability that feeds data into a Db2 Event Store in-memory database. Thomas says the IBM Cloud Private for Data platform essentially provides a framework for ingesting and analyzing data in real time using an event-driven architecture. That architecture is based on an instance of the Apache Spark in-memory computing framework and the Apache Parquet data format, and it informs AI models constructed using various types of machine learning and deep learning algorithms.
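The event-driven flow described above (events arrive, are buffered in memory in a column-oriented layout, and are handed to analytics as they accumulate) can be sketched in a few lines. This is a toy, standard-library illustration of the general pattern, not IBM's implementation and not Spark or Parquet code. Every name in it (`ColumnarBuffer`, `EventPipeline`, `mean_temperature`, the `temperature` field) is hypothetical, and it assumes all events carry the same fields.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, float]


class ColumnarBuffer:
    """In-memory batch that stores values column by column.

    Loosely mirrors the idea of a columnar layout; it is not Parquet.
    Assumes every event has the same set of fields.
    """

    def __init__(self) -> None:
        self.columns: Dict[str, List[float]] = defaultdict(list)
        self.rows = 0

    def append(self, event: Event) -> None:
        for name, value in event.items():
            self.columns[name].append(value)
        self.rows += 1


class EventPipeline:
    """Buffers incoming events and invokes a callback once a batch is full."""

    def __init__(self, batch_size: int,
                 on_batch: Callable[[ColumnarBuffer], None]) -> None:
        self.batch_size = batch_size
        self.on_batch = on_batch
        self.buffer = ColumnarBuffer()

    def ingest(self, event: Event) -> None:
        self.buffer.append(event)
        if self.buffer.rows >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        # Hand off the current batch (if any) and start a fresh buffer.
        if self.buffer.rows:
            full, self.buffer = self.buffer, ColumnarBuffer()
            self.on_batch(full)


def mean_temperature(batch: ColumnarBuffer) -> float:
    """Stand-in for an analytics or model-scoring step on one column."""
    temps = batch.columns["temperature"]
    return sum(temps) / len(temps)


if __name__ == "__main__":
    averages: List[float] = []
    pipeline = EventPipeline(
        batch_size=2,
        on_batch=lambda batch: averages.append(mean_temperature(batch)),
    )
    for reading in (20.0, 22.0, 30.0):
        pipeline.ingest({"sensor_id": 1.0, "temperature": reading})
    pipeline.flush()  # process the final, partially filled batch
    print(averages)  # [21.0, 30.0]
```

In a real deployment the callback would typically hand each batch to a distributed engine and a durable store rather than a local function; the sketch only shows how ingestion events can drive analysis without a separate polling step.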
IBM then makes available an additional suite of complementary tools and services, including IBM Data Science Experience, Information Analyzer, Information Governance Catalog, DataStage, Db2 relational databases, and Db2 Warehouse.
Making data scientists’ lives easier
The IBM Cloud Private for Data platform is initially planned to be available in the second quarter of this year. IBM plans to take advantage of the portable nature of Kubernetes to make the platform available in the future on other public and private clouds. In fact, it is one of the first large-scale analytics frameworks to be deployed on a microservices architecture enabled by Kubernetes.
To make it simpler for data science teams to navigate that data, IBM has focused on creating a graphical user experience designed to foster collaboration between teams of data scientists. In addition, IBM today announced the formation of the Data Science Elite Team, a free consultancy dedicated to solving real-world data science problems as they apply to AI applications. The team has 30 members today, and IBM says it is committed to expanding it to more than 200 specialists.
In general, Thomas says AI models in the future will be able to more easily adjust responses to events as data is ingested in real time. In addition, organizations will be able to more easily apply different AI models as events occur and as business circumstances warrant.
The biggest immediate issue organizations face in reaching that state of AI nirvana is basic data management. Algorithms require access to massive amounts of data, which is then used to train the AI model. Without some mechanism for exposing algorithms to large quantities of continually refreshed data, their ability to learn is hampered. Docker containers provide a mechanism for capturing data that can then be programmatically ingested using a standard set of application programming interfaces (APIs).
Of course, ingesting the data is only the first step. The heavy lifting then shifts to processing and analyzing the data and ultimately storing it. Because of the immensity of that task, IBM and its rivals are betting that most AI models fueled by advanced analytics will be deployed on public clouds, which reduce the cost of both processing and storing all that data.