How an online retailer of stock media turned to a system to manage Big Data and enable real-time analytics on hardware, network traffic, and transactions.
Name of Organization: Shutterstock
Industry: Digital media
Location: New York, NY USA
Business Opportunity or Challenge Encountered:
If you’ve ever visited a media website, such as an online magazine, your eyes were likely drawn to the photo accompanying the article. The odds are that the author or publisher did not take the photo; more likely, it was purchased through an online photo agency that ensures the photo’s rights are properly secured and the original creator is properly compensated. The web is a very visual place, yet in its Wild West atmosphere, oversight services are needed to ensure that creative works are not abused or illegally copied.
“Shutterstock is a two-sided marketplace where creative folks, videographers, photographers, and people generating music can take the great content they produce, upload it to us, so we are able to provide that to customers who are using it for creative works or advertisement or other types of media they need to produce,” Chris Fischer, vice president of technology operations at Shutterstock, explained in a video.
For one of the leading online content providers, this means having an always-available system to serve customers and their readers across the globe. Shutterstock offers a searchable library of more than 47 million photos, vector graphics, videos, and other media, available for immediate purchase and download.
The challenge was keeping these services readily available at a moment’s notice. Uptime can make or break a business that exists on the Internet. The company’s administrators needed to sift through the 20,000 data points collected per second and monitor site performance, bandwidth, user counts, and other metrics in real time. Shutterstock had no system that supported real-time data analysis at the volume required to make the best use of its data; the company only received alerts after the fact.
“We have a pretty unique challenge,” says Fischer. “We’re collecting tens of thousands of metrics every second. When you’re collecting that volume of data it’s very tough to be able to select or query that dataset.”
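At tens of thousands of metric points per second, querying raw points directly becomes impractical; a common mitigation is to roll points up into per-second summary buckets so later queries scan summaries instead of every sample. The sketch below illustrates that idea only; the metric names and data are hypothetical, and the source does not describe Shutterstock’s actual aggregation scheme.

```python
from collections import defaultdict

def aggregate_points(points):
    """Roll raw (metric, timestamp, value) samples up into per-second
    buckets holding count/sum/min/max, so queries over a time range
    touch one row per second instead of every raw sample."""
    buckets = defaultdict(
        lambda: {"count": 0, "sum": 0.0, "min": float("inf"), "max": float("-inf")}
    )
    for metric, ts, value in points:
        b = buckets[(metric, int(ts))]
        b["count"] += 1
        b["sum"] += value
        b["min"] = min(b["min"], value)
        b["max"] = max(b["max"], value)
    return dict(buckets)

# Example: three CPU-utilization samples landing in the same second
points = [("cpu.util", 100, 40.0), ("cpu.util", 100, 60.0), ("cpu.util", 100, 50.0)]
summary = aggregate_points(points)[("cpu.util", 100)]
# summary["count"] == 3, summary["sum"] == 150.0
```

The same pre-aggregation pattern is what makes dashboards over high-volume streams responsive: the expensive per-point work happens once at ingest time rather than on every query.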
How This Business Opportunity or Challenge Was Met:
As Shutterstock’s business and traffic volume grew, it recognized that its cobbled-together systems were no longer effective. The company began shifting to a software-defined network to increase the volume of data traffic moving between its three data centers.
The company implemented a database system designed not only to manage big data processing and analytics, but also to serve as a real-time monitoring platform. The solution, MemSQL, runs in conjunction with Shutterstock’s existing infrastructure across three data centers to monitor several thousand nodes. The database environment also aggregates daily visitor counts to fuel algorithms that help Shutterstock improve its search functionality and build new online tools.
Currently, Shutterstock has a rack of 16 MemSQL nodes, and each node has 256 GB of memory, notes Alex Woodie in EnterpriseTech. The data flow provides real-time updates on the enterprise’s CPU, disk, and RAM utilization, as well as “inbound and outbound network traffic; concurrent user count and failed authorization attempts; pictures uploaded and downloaded; API utilization; and credit card transaction rates and revenue per minute,” Woodie writes. Data is maintained for a month to provide performance benchmarks.
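A minimal sketch of what a metrics table with a one-month retention window might look like. The schema, names, and data here are assumptions for illustration, not Shutterstock’s actual design, and SQLite stands in for the database so the sketch runs anywhere; against a real MemSQL cluster the same SQL would be issued over its MySQL-compatible protocol.

```python
import sqlite3

# In-memory SQLite stands in for the real cluster so this is runnable.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE metrics (
        ts    INTEGER NOT NULL,   -- unix epoch seconds
        host  TEXT    NOT NULL,
        name  TEXT    NOT NULL,   -- e.g. 'cpu.util', 'net.in_bytes'
        value REAL    NOT NULL
    )""")
conn.execute("CREATE INDEX idx_metrics_name_ts ON metrics (name, ts)")

rows = [(100, "web-01", "cpu.util", 42.5),
        (100, "web-02", "cpu.util", 61.0),
        (2_700_000, "web-01", "cpu.util", 55.0)]
conn.executemany("INSERT INTO metrics VALUES (?, ?, ?, ?)", rows)

# One-month retention: periodically delete rows older than ~30 days.
now = 2_700_000
cutoff = now - 30 * 24 * 3600
conn.execute("DELETE FROM metrics WHERE ts < ?", (cutoff,))
remaining = conn.execute("SELECT COUNT(*) FROM metrics").fetchone()[0]
# remaining == 1: only the recent sample survives the retention sweep
```

Indexing on `(name, ts)` is the conventional choice for this workload, since dashboard queries typically ask for one metric over a time range.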
Measurable/Quantifiable and “Soft” Benefits:
As a result of the Big Data-handling capabilities, Shutterstock is now able to track real-time performance through an in-depth visual dashboard. “We’re able to query those datasets and make intelligent decisions very quickly on very big volumes of data,” says Fischer. “It’s becoming a real competitive differentiator to be able to make sense of the volume of data that we have—to run more tests faster, and ultimately determine what our customers are doing on our site.”
The new environment also provided Shutterstock with “tons more insight into lots of real time metrics,” Fischer continues. “Immediate is just something we couldn’t do before. For us, it has changed the way we think about our operational data. Our real-time details really brought our team closer to what’s happening on our site right now, versus what has happened.”
As a result of the implementation, Shutterstock is now able to monitor thousands of servers and can view trends and anomalies with real-time time-series visualizations. With a real-time monitoring system in place, the company says it also saves “thousands of dollars per year” by preventing outages and downtime.
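The article doesn’t describe how anomalies are actually flagged; one common technique for spotting them in a metrics stream is a rolling z-score, which marks points that deviate sharply from the recent baseline. A minimal sketch, with illustrative data and a hypothetical threshold:

```python
from statistics import mean, stdev

def flag_anomalies(series, window=5, threshold=3.0):
    """Return indices of points lying more than `threshold` standard
    deviations from the mean of the preceding `window` points."""
    flagged = []
    for i in range(window, len(series)):
        recent = series[i - window:i]
        mu, sigma = mean(recent), stdev(recent)
        if sigma > 0 and abs(series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Steady request rate with one sudden spike at index 8
rate = [100, 101, 99, 100, 102, 100, 101, 99, 500, 100]
spikes = flag_anomalies(rate)
# spikes == [8]
```

Flagging against a short trailing window, rather than a global average, is what lets this kind of check run in near real time as new points arrive.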
(Sources: MemSQL, BizTech, EnterpriseTech)