Real-time analytics of granular information about Summit’s second-to-second operations are used to optimize the system’s performance.
Summit, the world’s fastest supercomputer, has been outfitted with real-time streaming analytics, courtesy of HPC consultancy firm Providentia Worldwide.
Speaking at the 2019 HPC User Forum, the team at Oak Ridge National Laboratory (ORNL) said they are now able to track metrics in real time, with robust visualizations of the power usage of the computer’s individual components and of the temperatures inside each node.
From this, Summit is able to recognize any surges in temperature or usage and act immediately. The team will also be able to optimize job scheduling by finding the coolest area for a job in real time.
“There’s no more looking at databases for data. There’s no more waiting until tomorrow to look at the data. It’s basically real-time data. What you see right now is what’s happening right now,” said Arno Kolster, co-founder of Providentia Worldwide. “The largest supercomputer in the world is now being micromanaged by microservices – a cloud thing.”
Speaking to HPCwire, Kolster said that phase two of Summit’s analytics infrastructure could bring even more benefits. If it were to use predictive analytics, the system could automatically schedule jobs to cooler areas, based on historical and real-time metrics.
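To make the idea concrete, here is a minimal sketch of how a scheduler might combine historical and real-time temperatures to choose a cooler area. It is an illustration only, not a description of the system built by ORNL or Providentia Worldwide: the function name `pick_coolest_area`, the `weight` parameter, the area labels, and the readings are all hypothetical.

```python
from statistics import mean


def pick_coolest_area(history, latest, weight=0.5):
    """Return the area with the lowest blended temperature score.

    history: dict mapping area name -> list of past readings (deg C)
    latest:  dict mapping area name -> most recent reading (deg C)
    weight:  share given to the latest reading (hypothetical tuning knob)
    """
    scores = {}
    for area, reading in latest.items():
        past = history.get(area)
        # Fall back to the live reading when an area has no history yet.
        baseline = mean(past) if past else reading
        scores[area] = weight * reading + (1 - weight) * baseline
    # The lowest blended score is treated as the coolest candidate.
    return min(scores, key=scores.get)


# Made-up readings for three hypothetical areas of the machine room.
history = {"row-a": [30.0, 32.0], "row-b": [28.0, 28.0], "row-c": [35.0, 33.0]}
latest = {"row-a": 29.0, "row-b": 31.0, "row-c": 34.0}
print(pick_coolest_area(history, latest))
```

Setting `weight=1.0` reduces this to a purely reactive choice based on the latest reading alone, while lower values lean more heavily on each area's historical behavior.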
“You hear about predictive analytics, prescriptive analytics – the basic problem right now is that everyone’s reacting to things instead of being proactive about it. And so we’ve always been more, you know, ‘You’ve got the machinery, you’ve got the computers, you have the analytics – let’s be more proactive about how things are working,’” said Kolster.
IBM, Nvidia, and Mellanox won a $325 million contract in 2014 to build Summit, which was completed in 2018. It was recorded as the fastest publicly ranked supercomputer in the world in November 2018, surpassing 150 petaflops. As of June 2019, it was still the world’s fastest supercomputer. The ORNL team is working on an additional supercomputer, named Frontier, with completion anticipated in 2021.