Digital transformation is more than an aspirational phrase. It is a high-level, strategic initiative for many organizations today, one with C-level sponsorship and visibility. Digital transformation aims to leverage emerging digital technologies in data and analytics to streamline customer and partner interactions across such diverse domains as sales, support, development, and supply-chain logistics.
Often, such transformations proceed in three phases: first, improving internal business processes; next, improving external interactions; and finally, introducing new business models that enable the company to monetize its assets in new ways. It is difficult to begin in the reverse order and introduce a new business model without first improving business processes, because that first phase forces companies to take a hard look at their IT systems and ensure that different systems can interoperate and talk to each other through capabilities such as application programming interfaces (APIs). The later phases depend on that foundation.
APIs have been with us for decades, but in the last several years, they have been used in increasingly powerful ways. APIs expose the data or functionality of a specific system and make it available to other systems. APIs exist within an ecosystem, or economy, and in the last ten years entire businesses have been built on top of them.
Consider Uber. We think of the company as having introduced a brand-new business model, and it certainly has. But what really sets Uber apart is that APIs have enabled the company to grow far more quickly than it otherwise could have. Uber can focus on its core business, connecting drivers with riders, while APIs handle the “peripheral” activities: tracking cars, processing payments, integrating map data into the phone, sending SMS messages to tell passengers that their driver has arrived, and sending transactional emails. Uber did not have to build its own mapping technology; it uses Google Maps on Android and MapKit on iOS. Similarly, Uber integrates with Braintree for all billing-related activities.
Instead of simply consuming APIs, Uber now also exposes its own. Today, the company works with a network of travel and hospitality partners, which integrate their applications with Uber’s via APIs. Once passengers arrive at their destination airport, airlines can use their apps to offer them an Uber. In this way, Uber has become a platform, which extends the company’s business model in a very important way. According to Fortune magazine, the top five companies of 2018 by market capitalization were all platform companies: Apple, Google, Amazon, Microsoft, and Facebook. Just a decade earlier, none of the top five (Exxon, General Electric, Microsoft, AT&T, and Procter & Gamble) were. In 2008, Microsoft was still simply a software company, not yet a platform company.
Two key API trends to consider: RESTful APIs and microservices
In the 2010s, RESTful APIs became the de facto standard for web services. REST is not a detailed specification but a paradigm, a way of working. With RESTful APIs, the many potential operations are reduced to just a handful of HTTP methods, such as GET, POST, and PUT. Rather than dealing with the complexity of SOAP-based APIs, developers can work with simple HTTP requests. In addition to being inherently simple, RESTful APIs are extremely lightweight, so they can be deployed easily and flexibly. Protocols have been built on top of RESTful APIs, such as OData, which is critical to Microsoft products like SharePoint, and OAuth and SAML for authorization and authentication. And now we are beginning to see newer technologies emerge around REST, such as GraphQL and the OpenAPI specification (formerly known as Swagger).
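To make that simplicity concrete, here is a minimal sketch of what RESTful calls look like in practice, using Python’s requests library; the endpoint URL and fields are hypothetical:

```python
import requests

BASE = "https://api.example.com/v1"  # hypothetical endpoint

# GET retrieves a resource; the URL identifies it, with no SOAP envelope needed
resp = requests.get(f"{BASE}/customers/42")
resp.raise_for_status()
customer = resp.json()  # RESTful APIs typically exchange lightweight JSON

# POST creates a resource; PUT would replace one at a known URL
resp = requests.post(f"{BASE}/customers", json={"name": "Ada Lovelace"})
print(resp.status_code, resp.json())
```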
In a microservices architecture, applications are built by composing and integrating microservices, which interface, integrate, and communicate with each other using RESTful APIs. In this way, one can think of microservices architecture as “service-oriented architecture (SOA) done right.” SOA was popular at the start of the millennium as a way to introduce efficiency by reducing data-integration complexity, and the combination of SOA services and the servers they ran on became the standard way to build large applications. But because monolithic application servers hosted the services, scalability was an issue. Microservices, in contrast, are extremely lightweight and independent.
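As an illustration of just how lightweight an individual microservice can be, the following sketch exposes a single resource over REST. It assumes Flask as the HTTP framework and an invented in-memory data store; any equivalent framework would do.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# In-memory stand-in for this service's own data store (an assumption)
DRIVERS = {1: {"name": "Alice", "available": True}}

@app.route("/drivers/<int:driver_id>")
def get_driver(driver_id):
    # Each microservice owns one narrow capability, exposed over HTTP
    driver = DRIVERS.get(driver_id)
    if driver is None:
        return jsonify(error="not found"), 404
    return jsonify(driver)

if __name__ == "__main__":
    app.run(port=5000)  # other services integrate via plain RESTful calls
```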
However, when companies adopt a microservices architecture, DevOps teams potentially have to deploy, manage, and secure hundreds or even thousands of microservices. This has driven the rise of container technologies such as Docker and orchestration platforms such as Kubernetes, which simplify and automate much of the management process.
Picking up where container capabilities leave off, API management tools address the exposure of APIs to the external world. They enable companies to securely publish select APIs, keep others internal, and determine exactly which groups or individuals are authorized to access the data. Such tools were less necessary in the SOA era, but they are becoming essential in microservices architectures.
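A core job of an API management tool is deciding which consumers may call which published APIs. The sketch below imitates that kind of policy check in a few lines of Python; the POLICY table, group names, and API keys are all invented for illustration.

```python
# Hypothetical policy: which consumer groups may call which published APIs
POLICY = {
    "/menu": {"partners", "internal"},   # published externally
    "/payroll": {"internal"},            # kept internal only
}

API_KEYS = {"key-abc": "partners", "key-xyz": "internal"}  # invented keys

def authorize(path: str, api_key: str) -> bool:
    """Return True if the caller's group is allowed to access this API."""
    group = API_KEYS.get(api_key)
    return group is not None and group in POLICY.get(path, set())

assert authorize("/menu", "key-abc")         # a partner may read the menu
assert not authorize("/payroll", "key-abc")  # but not internal payroll data
```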
Data virtualization in the API ecosystem
Not all data APIs can be real-time data APIs. Traditional data integration techniques used in most digital transformation efforts rely on the physical movement of data from one place to another, where it can then be accessed by an API. Because that movement typically happens in scheduled batches, the data is rarely available in real time. Such techniques also often cannot support modern data types, such as streaming data and data from social media feeds.
Data virtualization (DV) is a modern data integration approach that provides real-time views of many different kinds of data without moving it to a new location. This technology is critical for real-time data APIs, as it enables organizations to seamlessly expose their integrated, curated data assets and data services as RESTful APIs that external entities can access in real time. These could be straightforward internal or external data sets, or they could take more sophisticated forms, such as open-government initiatives to share information with development partners. Data virtualization is what enables phone apps to provide real-time data, such as tracking information. With real-time data and APIs, developers are limited only by their imaginations.
DV can support the API ecosystem in myriad ways, but it typically follows
three basic patterns:
1) Data virtualization as a service provider: Suppose we have a common microservices architecture like the one described above, in which an organization has several microservices and exposes them, internally and externally, using an API management tool.
Typically, microservices are not the only source of information an organization might want to expose. In this pattern, DV is deployed above the organization’s disparate data sources and provides views of the combined data to an API gateway, which delivers the data to consumers. In parallel with the data virtualization layer, the microservices also deliver their data to the API gateway. The API gateway controls what is exposed, how different individuals and groups can access it, and how it is secured.
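In rough outline, the first pattern looks like the sketch below: the gateway consults a routing table that sends some paths to the data virtualization layer and others to individual microservices. All upstream URLs are hypothetical.

```python
import requests

# Hypothetical upstreams: the DV layer and the microservices sit side by side
ROUTES = {
    "/customers": "http://dv-layer.internal:9090",    # combined data views
    "/orders":    "http://orders-svc.internal:8001",  # an individual microservice
}

def gateway(path: str):
    """Forward a request to whichever upstream owns this path."""
    upstream = ROUTES.get(path)
    if upstream is None:
        return {"error": "unknown API"}, 404
    resp = requests.get(upstream + path)  # access control would be enforced here
    return resp.json(), resp.status_code
```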
2) Data virtualization as the integration layer for microservices: In this pattern, the DV layer is established above all data sources, including the microservices, and provides views of the combined data to the API gateway, as in the first pattern.
Here, the DV layer integrates the microservices, treating them as data sources and combining views of their data with views from databases and other applications, such as SAP ECC. The advantage of this pattern is that it keeps the microservices very lightweight. They are no longer responsible for domains such as security, auditing, and logging, as those tasks can now be performed by the data virtualization layer itself.
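In miniature, the second pattern might look like the following sketch, in which the DV layer treats a microservice’s REST endpoint and a relational database as two equivalent sources and serves one combined view. The database file, table, service URL, and join key are all assumptions.

```python
import sqlite3
import requests

def combined_customer_view(customer_id: int) -> dict:
    """Join a database row with a microservice response into one view."""
    # Source 1: a relational database (a stand-in for Oracle, SAP ECC, etc.;
    # the crm.db file and customers table are assumed to exist)
    db = sqlite3.connect("crm.db")
    row = db.execute(
        "SELECT name, region FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()

    # Source 2: a microservice, queried over REST like any other source
    orders = requests.get(
        f"http://orders-svc.internal:8001/orders?customer={customer_id}"
    ).json()

    # The consumer sees one integrated view; no data was physically moved
    return {"name": row[0], "region": row[1], "orders": orders}
```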
3) Data virtualization as a data services layer for the microservices layer: In the final pattern, DV is established below the microservices layer, acting as its data services layer. This abstracts away many of the complexities surrounding how the microservices get their data, including details such as its location or the required interface. Companies do not have to build a JDBC stack into the microservices themselves just so they can access data from an Oracle or SQL Server database, and they do not have to embed SQL queries directly into their microservices.
The importance of the data virtualization layer
Administrators do not need to worry about where the data comes from; all of that is handled by the data virtualization layer, which can also handle more complex requests, such as aggregations, transformations, and other types of derived results. The arrangement is reminiscent of Uber once again: Uber focused on what it had to do and left the “periphery” to APIs. Similarly, in this pattern, microservices developers can focus on what each microservice is supposed to do, without worrying about how the microservices will get their data; the data virtualization layer takes care of that with a simple RESTful API call. This is a powerful pattern because it captures one of the core benefits of the API ecosystem: removing complexity and enabling developers to focus on what they are building.
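Under this third pattern, a microservice’s data access shrinks to a single REST call against the DV layer, as in the sketch below; the DV endpoint, view name, and parameter are hypothetical.

```python
import requests

DV_URL = "http://dv-layer.internal:9090"  # hypothetical DV endpoint

def monthly_sales(region: str) -> list[dict]:
    """Fetch an aggregated view; no JDBC stack, no embedded SQL."""
    # The DV layer resolves where the data lives and how to aggregate it
    resp = requests.get(f"{DV_URL}/views/sales_by_month", params={"region": region})
    resp.raise_for_status()
    return resp.json()
```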
There is an extension to this pattern as well. So far, we have covered fundamental data APIs, which are “read-only,” but there is also a write-back version of this pattern. In the write-back version, data virtualization supports separate APIs for writing and reading, following the command query responsibility segregation (CQRS) pattern, which is considered a best practice in microservices architecture development.
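A minimal sketch of the write-back variant follows, with reads and writes split into separate endpoints in the CQRS style. It again assumes Flask, and the routes and in-memory store are invented for illustration.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

ORDERS = {}  # in-memory stand-in for the underlying systems of record

@app.route("/api/read/orders/<int:order_id>")      # query side: reads only
def read_order(order_id):
    # In a DV deployment, this would be served from an integrated view
    order = ORDERS.get(order_id)
    return (jsonify(order), 200) if order else (jsonify(error="not found"), 404)

@app.route("/api/write/orders", methods=["POST"])  # command side: writes only
def create_order():
    # In a DV deployment, this would write back to the system of record
    order_id = len(ORDERS) + 1
    ORDERS[order_id] = request.get_json()
    return jsonify(id=order_id), 201
```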
Real-world example: A fast-food restaurant chain
Consider the case of a fast-food restaurant chain that chose to be
anonymous. This chain has restaurants across the United States and Canada and
had built a smart-phone app that enables customers from all locations to order
food for pickup or take out. Using the app, customers can find the closest restaurant,
see a menu, and even call up nutritional information for their choices. And all
this information is retrieved through RESTful APIs served up by a data
virtualization platform.
Menu information is stored in databases, Excel spreadsheets, and other sources; the DV layer combines these sources and publishes the result as an API. The resulting data is location-specific, as menus change by geography. The queries and the resulting output are fairly complex, but all of that is hidden from the developers who write the apps. All they have to do is call a RESTful API with the appropriate location, usually a restaurant identifier, and all of that information becomes available in the app. The app is now used by millions of users, and the data is delivered without noticeable latency.
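From the app developer’s side, the whole interaction reduces to something like the call below; the endpoint URL and restaurant identifier are illustrative only.

```python
import requests

# Hypothetical endpoint published by the data virtualization platform
resp = requests.get(
    "https://api.example-chain.com/v1/menu",
    params={"restaurant_id": "TX-0417"},  # the location determines the menu
)
menu = resp.json()  # items, prices, and nutritional info, already combined
```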
Data virtualization for digital transformation
If real-time data APIs underpin digital transformation, data virtualization underpins real-time data APIs. Data virtualization offers a reliable, flexible way to support real-time data APIs, no matter how complex the infrastructure. Companies pursuing digital transformation should take a serious look at data virtualization: it might be the missing link that makes data readily available and frees up scarce developer resources.