Real-time Analytics News for the Week Ending December 2

In this week’s real-time analytics news: Amazon Web Services and its partners made many announcements at the annual re:Invent conference.

Keeping pace with news and developments in the real-time analytics market can be a daunting task. Fortunately, we have you covered with a summary of the items our staff comes across each week. And if you prefer it in your inbox, sign up here!

Amazon Web Services (AWS) held its annual re:Invent conference this week. One report said the company made “at least 192 announcements” at the event. Some of the most relevant ones for those working with real-time data, applications, and systems include:

Four new zero-ETL integrations that enable users to connect and analyze data without building and managing complex extract, transform, and load (ETL) data pipelines. They include new Amazon Aurora PostgreSQL, Amazon DynamoDB, and Amazon Relational Database Service (Amazon RDS) for MySQL integrations with Amazon Redshift, making it easier to connect and analyze transactional data from multiple relational and non-relational databases in Amazon Redshift. Customers can also now use Amazon OpenSearch Service to perform full-text and vector searches on DynamoDB data in near real time.

Five new capabilities within Amazon SageMaker to help accelerate the building, training, and deployment of large language models and other foundation models. The announcements include a new capability that enhances SageMaker for scaling models by accelerating model training time. Another new SageMaker capability optimizes managed ML infrastructure operations by reducing deployment costs and latency of models. AWS also introduced a new SageMaker Clarify capability that makes it easier to select the right model based on parameters that support the responsible use of AI. To help customers apply these models across organizations, AWS also introduced a new no-code capability in SageMaker Canvas that makes it faster and easier for customers to prepare data using natural-language instructions.

Three new serverless innovations across its database and analytics portfolio that make it faster and easier for customers to scale their data infrastructure. One announcement introduces Amazon Aurora Limitless Database, a new capability that automatically scales beyond the write limits of a single Amazon Aurora database. Another is Amazon ElastiCache Serverless, which helps customers create highly available caches in under a minute and instantly scales vertically and horizontally to support customers’ most demanding applications. AWS also released a new Amazon Redshift Serverless capability that uses AI to predict workloads and automatically scale and optimize resources.

An expanded strategic collaboration with NVIDIA to deliver the advanced infrastructure, software, and services to power customers’ generative AI efforts. The companies will bring NVIDIA’s newest multi-node systems featuring next-generation GPUs, CPUs, and AI software together with AWS Nitro System advanced virtualization and security, Elastic Fabric Adapter (EFA) interconnect, and UltraCluster scalability.

In addition to these AWS announcements, many of the company’s partners and other vendors used the conference to introduce other solutions and offerings, including:

Armory announced a new unified declarative deployment capability for AWS Lambda. This streamlines deployment workflows by enabling the configuration of Lambda deployments through the same interface used for Kubernetes.

Couchbase announced a new Capella columnar service on Amazon Web Services (AWS), enabling organizations to harness real-time analytics to build adaptive applications. Capella columnar is a new service that introduces a columnar store and data integration into the Capella Database-as-a-Service (DBaaS), thereby allowing for real-time data analysis on the same platform as operational workloads.

Cribl announced that Cribl Edge, its scalable edge-based data collection system, is now available in AWS Marketplace as an Amazon Elastic Kubernetes Service (Amazon EKS) add-on. As part of Cribl’s relationship with AWS, developers can seamlessly share Amazon EKS data between security and operations teams, optimize observability data collection, and route data to multiple destinations.

Dremio announced AI-powered data discovery capabilities that accelerate and simplify data contextualization and description for analytics. Additionally, with this release, Dremio makes it simple for companies to adopt Apache Iceberg through one-click command ingestion into Iceberg tables. Dremio can seamlessly convert raw data (in JSON, CSV, and Parquet formats) from data lakes, relational databases, data warehouses, and NoSQL databases into Apache Iceberg in the cloud and on-premises.

Fivetran announced support for Delta Lake on Amazon Simple Storage Service (Amazon S3), further broadening its support for Amazon S3 as a data lake destination. This means Fivetran customers can land data in Amazon S3 and easily access their Delta Lake tables.

IBM announced that it has been working with AWS on the general availability of Amazon Relational Database Service (Amazon RDS) for Db2, a fully managed cloud offering designed to make it easier for database customers to manage data for AI workloads across hybrid cloud environments. For customers moving to AWS, Amazon RDS for Db2 can help them migrate their existing, self-managed Db2 databases to the cloud.

Immuta announced a new native integration with Amazon S3 Access Grants that will play a key role in S3 security. With this integration, Access Grants shifts how data security teams approach governance and significantly decreases the manual effort required for data teams to access the unstructured data sitting in storage layers that is needed to power AI models.

Informatica announced three new integrations with Amazon Web Services (AWS). First, with Amazon Bedrock now generally available, Informatica has developed deeper integrations with the generative AI service. Second, Informatica earned AWS certification for IDMC integrations with AWS HealthLake, a HIPAA-eligible service that provides Fast Healthcare Interoperability Resource (FHIR) APIs. And third, Informatica has been named a Launch Partner for Amazon S3 Access Grants, a new Amazon Simple Storage Service (Amazon S3) access control feature that helps customers manage Amazon S3 permissions for their data lakes at scale. 

Matillion announced the addition of generative AI functionality to its flagship Data Productivity Cloud using Amazon Bedrock. Specifically, Matillion adds generative AI integration with a prompt component supporting Amazon Bedrock to enable users to operationalize the use of LLMs inside the data pipeline to address intelligent data integration tasks, including data enrichment, data quality, and data classifications.

MongoDB announced plans to integrate MongoDB Atlas Vector Search with Amazon Bedrock to enable organizations to build next-generation applications on Amazon Web Services (AWS). MongoDB Atlas Vector Search uses an organization’s operational data to simplify bringing generative AI and semantic search capabilities into applications for highly engaging and customized end-user experiences.

Precisely announced that its Data Integrity Suite has achieved the Amazon Redshift Service Ready designation. Customers can seamlessly replicate data from on-premises systems to Amazon Web Services (AWS) in near real-time while ensuring that data is accurate, consistent, and contextualized for analytics and other use cases.

Qlik announced an integration with Amazon Web Services (AWS) to help customers embrace and scale the power of Large Language Models (LLMs) and generative AI. With its integration with Amazon Bedrock, Qlik Cloud users can now easily leverage natural language to create new AI-driven insights on AWS with trusted and governed LLMs from providers such as AI21 Labs, Anthropic, Cohere, and Meta. Amazon Bedrock is a fully managed service that makes foundation models (FMs) from leading AI companies accessible via an API to build and scale generative AI applications.

Qumulo announced Global Namespace, a software solution that scales to exabytes anywhere unstructured data is needed, and seamless integration with AWS-based Qumulo file storage. With these new capabilities, Qumulo gives infrastructure owners more choices to store and manage their file data in AWS without compromising enterprise features.

Salesforce has expanded its partnership with AWS, deepening product integrations across data and artificial intelligence (AI) and, for the first time, offering select Salesforce products on the AWS Marketplace. Salesforce will expand its use of AWS, including compute, storage, data, and AI technologies through Hyperforce to further enhance popular services like Salesforce Data Cloud.

Solo.io announced it has achieved the Amazon Elastic Kubernetes Service (Amazon EKS) Ready designation from Amazon Web Services (AWS). This specialization recognizes that Gloo Products by Solo.io are technically validated by AWS Partner Solutions Architects for architectural soundness and adherence to AWS’s best practices for Amazon EKS and Amazon EKS Anywhere.

Starburst announced new features in Starburst Galaxy to help customers simplify development on the data lake by unifying data ingestion, data governance, and data sharing on a single platform. Starburst has added support for near real-time analytics with streaming ingestion, automated data governance, automated data maintenance, universal data sharing with built-in observability, and self-service analytics powered by AI.

Teradata announced that Teradata AI Unlimited is now available in private preview on Amazon Web Services (AWS) through AWS Marketplace. Teradata AI Unlimited is Teradata’s serverless artificial intelligence and machine learning (AI/ML) engine designed to let users explore new use cases on demand, using data at scale.

VAST Data unveiled its newest software release, version 5.0, extending the capabilities of the VAST Data Platform with Amazon Web Services (AWS) to enable cloud cost savings. As such, enterprises can now consolidate all of their structured and unstructured data management into a unified data platform that can be deployed across private clouds and all AWS regions.

Real-time analytics news in brief

Hewlett Packard Enterprise (HPE) announced a series of AI-native and hybrid cloud offerings for machine learning development, data analytics, AI-optimized file storage, AI tuning and inferencing, and professional services. The solutions are all delivered based on an open, full-stack AI-native architecture that incorporates a curated mix of software and infrastructure designed specifically to accelerate the AI lifecycle.

In other HPE news, the company announced an expanded strategic collaboration with NVIDIA to build an enterprise computing solution for generative AI (GenAI). The co-engineered, pre-configured AI tuning and inferencing solution enables enterprises to quickly customize foundation models using private data and deploy production applications anywhere, from edge to cloud.

Accenture unveiled a comprehensive set of new services designed to help companies customize and scale the value of generative AI. The new set of Accenture gen AI services includes a proprietary gen AI model “switchboard,” customization techniques, model managed services, and specialized training programs.

Airbyte announced availability of certified connectors for MongoDB, MySQL, and PostgreSQL databases, enabling datasets of unlimited size to be moved to any of Airbyte’s 68 supported destinations that include major cloud platforms (Amazon Web Services, Azure, Google), Databricks, Snowflake, and vector databases (Chroma, Milvus, Pinecone, Qdrant, Weaviate), which then can be accessed by AI models.

Altair announced updates to Altair RapidMiner, its data analytics and AI platform. Updates include auto-clustering, expanded SAS, Python, and R coding capabilities, and more. The updates also include advanced tools for integrating LLMs into business applications, as well as expanded AutoML and No-Code development features to make the solution easier to use.

Astronomer announced a new set of Apache Airflow integrations to accelerate LLMOps (large language model operations) and support AI use cases. Modern, data-first organizations are now able to connect to the most widely used LLM services and vector databases with integrations across the AI ecosystem, including OpenAI, Cohere, pgvector, Pinecone, OpenSearch, and Weaviate.

Datadobi announced the latest release of its StorageMAP platform. StorageMAP 6.6 adds enhancements that provide customers with a solution for analyzing, securing, and taking automated actions on all types of unstructured data at a massive scale. Specifically, StorageMAP 6.6 enhancements include StorageMAP support for object storage and richer file copy and file movement functionality.

Fluree unveiled its latest version, a new JSON-LD database, which is now in public preview with extensive cloud management support. The new version focuses on JSON-LD to enable composable, decentralized data management and offers a knowledge graph database with built-in policy, trust, and interoperability.

FusionAuth announced enhanced performance and scalability for webhook signing and search APIs. These improvements eliminate barriers for large-size customers by delivering frictionless authentication and user management for any application at any scale.

Hitachi Vantara announced Pentaho+, an integrated platform from the Pentaho software business designed to help organizations connect, enrich, and transform operations with refined, reliable data necessary for AI and Generative AI (GenAI) accuracy. Automating the work of complex data management with powerful self-service and cloud-agnostic solutions, Pentaho+ helps improve data quality by allowing organizations to effectively oversee data from inception to deployment.

IBM announced that it has been working with Amazon Web Services (AWS) on the general availability of Amazon Relational Database Service (Amazon RDS) for Db2, a fully managed cloud offering to manage data for AI workloads across hybrid cloud environments. Amazon RDS for Db2 customers now have the option to modernize on-premises, on AWS, or to deploy a hybrid cloud architecture to optimize AI workloads.

KX announced the general availability of KDB.AI Server Edition, a highly performant, scalable vector database for time-orientated generative AI and contextual search. Deployable in a single container via Docker, the KDB.AI Server offers a smooth setup for various environments, including cloud, on-premises, and hybrid systems.

NVIDIA announced a generative AI microservice that lets enterprises connect custom LLMs to enterprise data to deliver highly accurate responses for their AI applications. NVIDIA NeMo Retriever, a new offering in the NVIDIA NeMo family of frameworks and tools, helps organizations enhance their generative AI applications with enterprise-grade retrieval-augmented generation (RAG) capabilities.

Orion Governance announced that it has released new capabilities in its Enterprise Information Intelligence Graph (EIIG). The advanced features offer enterprise clients automated near real-time detection of metadata changes and instantaneous notifications. Specifically, with the incorporation of AI and machine learning, EIIG facilitates the automatic capture of comprehensive changes, encompassing database system reloading, table additions and removals, alterations or nullifications of column data types, modifications in view definitions, and adjustments in ETL transformations, report definitions or lines of code.

Voltron Data introduced Theseus, a state-of-the-art distributed execution engine built to solve today’s data processing challenges at a scale beyond the capabilities of CPU-based analytics systems like Apache Spark. Theseus is available to enterprises and government agencies as well as through partners – HPE is the first partner to embed Theseus as its accelerated data processing engine as part of HPE Ezmeral Unified Analytics Software.

Partnerships, collaborations, and more

Confluent announced that Confluent is now available on the SAP Store. As such, Confluent integrates with SAP Datasphere and delivers a secure, governed solution for accessing SAP data as fully managed data streams for customers.

Intel announced a new collaboration with Databricks to merge Intel Granulate’s autonomous, continuous optimization solutions with Databricks’ Data Intelligence Platform under the Databricks Partner Program. The collaboration intends to enhance performance, reduce costs, and increase efficiency across data management operations.

SQream announced that SQream is available on Oracle Cloud Marketplace and can be deployed on Oracle Cloud Infrastructure (OCI). SQream splits large tasks into smaller processes, distributing operations between multiple NVIDIA GPU and CPU cores and enhancing data analytics experiences for Oracle customers across industries.

If your company has real-time analytics news, send your announcements to [email protected].

Salvatore Salamone

About Salvatore Salamone

Salvatore Salamone is a physicist by training who has been writing about science and information technology for more than 30 years. During that time, he has been a senior or executive editor at many industry-leading publications including High Technology, Network World, Byte Magazine, Data Communications, LAN Times, InternetWeek, Bio-IT World, and Lightwave, The Journal of Fiber Optics. He also is the author of three business technology books.
