Talend’s new bulk data uploader helps enterprise users by making it easier to move and analyze warehoused data.
Moving data into the cloud is hard enough, but it turns out that moving data around and between cloud services is emerging as just as big a headache. To address that challenge, Talend has added a bulk data uploader for Microsoft Azure SQL Data Warehouse to the Talend cloud integration platform.
Available as part of the Talend Cloud Summer ’18 release, the bulk data uploader enables organizations to upload massive amounts of data stored in either the Microsoft Azure Blob Storage or Azure Data Lake Store cloud services into the Microsoft data warehouse cloud service.
The goal is to make it simpler for IT professionals or end users to query massive amounts of data stored in the Microsoft data warehouse, says Vincent Lam, head of cloud product marketing at Talend.
“All they need to do is push an icon,” says Lam.
At the recent Microsoft Build 2018 conference, Microsoft demonstrated how a compute-optimized Gen2 tier of its cloud service, leveraging caching enabled by flash storage, could be employed to run complex queries on 150 billion rows of data within seconds. To support that level of real-time analytics, the Talend bulk loader makes it possible to commit a global transaction against all the data at once rather than having to run the same query against every row. The bulk loader also makes it possible to cancel a committed transaction in the Azure SQL Data Warehouse.
Because storage is comparatively inexpensive, more IT organizations are shifting data warehouse applications into public clouds. But once that data arrives in those clouds, there’s still a need for tools to move data between cloud services as well as to optimize queries, says Lam.
Naturally, these same data management and query challenges exist on other public clouds. The Talend cloud integration platform already provides support for Amazon Web Services (AWS) and Google Cloud Platform. Lam notes IT organizations should expect to see Talend make similar capabilities available for Amazon Redshift as well as for other platforms running on other clouds.
Not every data warehouse is going to be automatically shifted into the cloud. There is a range of security and compliance concerns that will result in many data warehouses continuing to be deployed in on-premises IT environments. But as more data moves into the cloud, it has become apparent that IT organizations will be faced with a much wider range of data management challenges. Regardless of where data is stored, the expectation is that it will be available to analyze in real time. IT organizations should also expect to be streaming more data than ever into data warehouses so that analytics can be run simultaneously against internal and external data sources.
While public clouds may be more flexible and simpler to access than on-premises IT environments, it’s also easy to underestimate the complexity of managing data in the cloud, which is likely to be several orders of magnitude greater than anyone imagined a data warehouse running on-premises would ever be.