Microsoft’s device management program lets the company monitor any device with an IP address, supporting remote monitoring, predictive maintenance, and active alerts.
Name of Organization: Microsoft
Location: Redmond, WA USA
Business Opportunity or Challenge Encountered:
As one of the largest tech companies and employers on the globe, Microsoft has a lot of devices under its roofs. With more than 800 buildings across 190 countries/regions, the company needed a way to assess the health of more than 25,000 monitoring devices in real time. Such devices include security cameras, panic buttons, alarms, access control systems, and sensors – designed to assure physical security and productivity, while protecting company assets.
As detailed in a recent case study published by the company, its approach to managing its smart devices was “more reactive and less holistic. We’d find out about device failures after they occurred, and didn’t have good metrics on device uptime.”
The company’s IT managers needed a platform and telemetry for real-time strategic, forward-thinking, and risk-focused decision-making. “This is especially true with life-safety devices, where time is of the essence and failure requires immediate repairs.” The challenge was that there was no “comprehensive platform for monitoring and maintaining devices, and doing lifecycle management on them.” The platform Microsoft was using “often inaccurately reported that a device was working normally, when it had actually stopped working.”
Microsoft Global Security and Microsoft IT needed to take better advantage of the Internet of Things: to monitor the health of their devices by applying data analytics to performance and likelihood of failure, to manage risk proactively and perform predictive maintenance, and to boost device uptime. The need was urgent, as Microsoft expected the number of workplace devices to grow to 40,000 in the next three years.
How This Business Opportunity or Challenge Was Met:
To enable real-time device management and monitoring, Microsoft teams assembled a solution from the company’s own technology, including Microsoft System Center 2012 R2 Operations Manager, to monitor its far-flung network of IoT devices. The system was designed to monitor servers and switches but was heavily customized to monitor IoT devices. The setup also required about 5 TB of storage for four regions.
Within System Center, the team employs Operations Manager, Configuration Manager, and Service Manager, hosted in Microsoft Azure IaaS (infrastructure as a service), which handles communication and data transit. Monitoring infrastructure devices such as access control systems, cameras, and video recorders helps administrators with reporting, gathering business intelligence, and controlling overall costs.
With today’s smart devices, almost all security devices at Microsoft can report on their health. System Center was programmed to gather and track this data, and to raise alarms that help administrators detect early device failure. The solution represents one of the first deployments with a ticketing system to track and monitor maintenance issues from security devices end to end, the Microsoft case study reports.
“We see device health, active alerts in buildings, device description, location, model numbers, performance dashboards, and other information. We can also create and view rules, device groups, and health and availability monitors, and create related reports.”
The system also stores and forwards incidents, active alerts, and service requests. It is used to track the daily work of teams, and the company’s support team uses it as part of its ticketing system. The development team also plans to enable the self-healing of devices: “when a device experiences a particular interaction of factors, it will take specified actions to fix the problem.”
Benefits From This Initiative:
The Microsoft team reports that, by collecting vast amounts of data from its IoT security devices, it can perform remote device monitoring and preventative/predictive maintenance throughout device lifecycles, provision and manage device security, and use the data to accelerate insights and improve time to results.
Microsoft teams can now monitor any device with an IP address, the case study reports. “We also quickly identified at least five cameras that were down, and fixed them. We pooled our large inventory of 25,000 devices with details like device manufacturer, model number, and product description. We tested several devices in all device categories—like recording engines, security cameras, card readers, and intrusion panels—to show the telemetry and reporting that are possible in the next phase. And we’re tweaking device configurations to filter out noise. Team members have begun using the system and creating tickets to track issues.”
The team plans to expand its IoT capabilities to get even more data from the devices. “We’ll also add processes, do deep analysis, and make strategic decisions like how we use more analytics with Microsoft Power BI, and define what we want to see in our dashboards/reports,” Microsoft stated. One focus will be “how we use the dashboards to give our leadership visibility into the holistic environment, showing them what increases our ticket-response time, and how to help our team track priorities.”
The team is also exploring adding more automation. “For example, if a device hasn’t been pinged in a specified amount of time, we’ll automatically send a restart command to the switch. There won’t even be a ticket for a technician to work on until that happens. We won’t have a technician see if a camera’s offline or have someone restart it. One instance like this could save us about 20 minutes because we won’t need to sign in to a machine or check the camera.”
The system provides the Microsoft teams with data on active alerts in buildings and on the health state of video recorders and other devices. The teams also receive summaries and performance dashboards, and trend analysis helps them avoid outages by surfacing performance trends. They have also learned how many early failure indicators led them to revise maintenance schedules or reroute technicians in response to critical potential outages.
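The case study does not say how the trend analysis is computed. As one possible illustration, the Python sketch below fits a least-squares line to evenly spaced readings of a single device metric and estimates how many more readings remain before the trend reaches a failure threshold. The function names, the choice of a linear fit, and the idea of a fixed threshold are all assumptions made for this example.

```python
from typing import Optional, Sequence


def linear_slope(values: Sequence[float]) -> float:
    """Least-squares slope of evenly spaced readings (change per sample)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def samples_until_threshold(
    values: Sequence[float], threshold: float
) -> Optional[float]:
    """Estimate how many further samples until the trend reaches the threshold.

    Returns 0.0 if the latest reading is already at or above the threshold,
    and None if there is no data or the metric is not rising.
    """
    if not values:
        return None
    latest = values[-1]
    if latest >= threshold:
        return 0.0
    slope = linear_slope(values)
    if slope <= 0:
        return None
    return (threshold - latest) / slope
```

A small estimate for a device could be used to move its maintenance visit earlier, while `None` means the readings show no upward trend toward the threshold.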