From Chip to Cloud: Understanding the Bottlenecks in Scaling AI Data Centers

When scaling AI, rather than looking solely at the component level, organizations need to undertake system-level emulations that reflect the operating environment of an AI data center to optimize performance.

Written By
Marie Hattar
Mar 18, 2026

Investment in AI infrastructure shows no sign of abating, with Meta among the latest to announce new hyperscale plans. As agentic AI adoption grows, data centers will come under even greater strain. Simply adding more capacity no longer suffices; providers need to scale faster and more efficiently. Bigger models, faster inference, and more efficient training all place new demands on wafers, chips, boards, servers, racks, data centers, and edge deployments. And when one layer is optimized, performance pressure shifts to another.

Given this complexity and interconnectedness, building, orchestrating, and scaling AI infrastructure is no easy feat. As infrastructure expands, rather than looking solely at the component level, providers need to undertake system-level emulations that reflect the operating environment of an AI data center to understand where pressure points will occur and shift, and to optimize performance accordingly.

Pressure Points

AI infrastructure spans multiple layers, including pre-silicon design, wafer fabrication, chip integration, board assembly, server configuration, rack deployment, data center operations, and edge distribution. Issues at any layer, such as latency spikes, throughput limits, thermal constraints, or synchronization failures, can impact the entire system.

Anticipating where these issues will occur requires visibility across the stack to see how components and layers interact when pushed beyond established limits at AI scale. Below are some examples of how pressure shifts.

Chip to Board Transition

Chiplet architectures, where multiple semiconductor dies are integrated into a single package, push the limits of bandwidth and signal integrity. At the chip level, validation focuses on die-to-die interconnects such as Universal Chiplet Interconnect Express (UCIe), high-speed I/O measurements, and memory. However, improving chip performance creates pressure points at the board level.

AI stresses boards with higher frequencies, greater crosstalk risk, and tighter latency margins. Testing must detect electrical interference and noise. Once chip-level constraints are resolved, the focus shifts to maintaining signal quality across the high-speed connections linking GPUs, memory, and other components.
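
To make that budgeting concrete, here is a minimal sketch of a board-level channel loss check. All figures — per-inch trace loss, connector and via losses, and the receiver budget — are illustrative assumptions, not values from the article:

```python
# Hypothetical board-level channel insertion-loss budget check.
# Figures below are illustrative assumptions only.

def channel_loss_db(trace_inches, loss_db_per_inch, connector_db, via_count, via_db):
    """Total insertion loss of a board channel in dB."""
    return trace_inches * loss_db_per_inch + connector_db + via_count * via_db

# Example: 10-inch trace at an assumed ~1.1 dB/inch at the Nyquist
# frequency, one connector pair, four vias.
loss = channel_loss_db(trace_inches=10, loss_db_per_inch=1.1,
                       connector_db=1.5, via_count=4, via_db=0.3)
BUDGET_DB = 16.0  # assumed loss the receiver can equalize away
print(f"channel loss: {loss:.1f} dB, margin: {BUDGET_DB - loss:.1f} dB")
```

In practice these numbers come from channel simulation and measured S-parameters, but the structure of the check — sum the losses, compare against the receiver's budget — is the same.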

Board to Server

Once boards maintain signal integrity at scale, pressure shifts to how they function within servers. Optimizing individual boards doesn’t guarantee performance when they are integrated into complete systems. Protocol and interconnect analysis validates timing and throughput for AI workloads.

Power and thermal characterization ensures components can support AI traffic loads without slowdowns. A board that performs well in isolation may struggle when combined with other boards, storage, and networking components. Therefore, operators need to verify that components are ready to handle current and future compute standards before deployment.

Server to Rack Transition

Optimizing servers increases compute density by having GPUs, CPUs, memory, and network connections work as a system. Server-level validation ensures the components can handle AI workloads; however, pressure then shifts to the rack level.

Latency and coordination become critical for the rack to scale. With thousands of GPUs synchronizing during AI model training, one slow connection can throttle the entire rack. Operators need to emulate these workloads against switches and interconnects to ensure they can support the demand, and then determine whether rack-level networking is up to the task.
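
The effect of one slow link can be sketched with a back-of-the-envelope model of a synchronous ring all-reduce, where collective time is gated by the slowest per-GPU link. Link speeds and payload size here are illustrative assumptions:

```python
# Sketch: in a synchronous ring all-reduce, the collective completes
# only as fast as the slowest link. Figures are illustrative.

def allreduce_time_s(payload_gb, link_gbps):
    """Ring all-reduce lower bound, gated by the slowest link.
    payload_gb: gradient payload per GPU in gigabytes
    link_gbps:  per-GPU link speeds in gigabits per second
    """
    n = len(link_gbps)
    slowest = min(link_gbps)
    # 2*(n-1)/n traffic factor for a ring all-reduce; 8 bits per byte
    return payload_gb * 8 * 2 * (n - 1) / n / slowest

healthy = [400] * 8              # eight GPUs, all links at 400 Gb/s
degraded = [400] * 7 + [100]     # one link degraded to 100 Gb/s

print(allreduce_time_s(1.0, healthy))   # → 0.035 (seconds)
print(allreduce_time_s(1.0, degraded))  # → 0.14  (seconds)
```

A single degraded link quadruples the collective time for every GPU in the ring, which is why workload emulation against switches and interconnects matters before deployment.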

Rack to Data Center

As AI workloads scale, what works at the rack level may fail at the facility level. Racks optimized for link speeds such as 800 Gigabit and 1.6 Terabit Ethernet can handle heavy server-to-server traffic, but aggregating hundreds of racks creates new constraints. Challenges include power distribution, thermal management, and optical transport between racks. Multi-terabit backbone links connecting racks must maintain reliability and performance. A single fiber issue or optical transport limitation can create bottlenecks that ripple across the entire facility. Data center operators must validate the end-to-end network performance under AI workload conditions before deployment. Once rack constraints are overcome, the next step is ensuring the optical infrastructure and network perform as needed.
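
One such end-to-end check can be sketched as a simple oversubscription calculation, comparing aggregate rack traffic against assumed backbone capacity. All figures here are hypothetical:

```python
# Sketch: does an assumed optical backbone absorb the aggregate
# east-west traffic of the racks it connects? Figures hypothetical.

def oversubscription(racks, per_rack_tbps, backbone_links, per_link_tbps):
    """Ratio of offered rack traffic to backbone capacity."""
    offered = racks * per_rack_tbps
    capacity = backbone_links * per_link_tbps
    return offered / capacity

ratio = oversubscription(racks=200, per_rack_tbps=1.6,
                         backbone_links=128, per_link_tbps=1.6)
print(f"oversubscription: {ratio:.2f}:1")  # >1:1 means the backbone is the bottleneck
```

Real validation adds traffic patterns, failure scenarios, and optical-layer margins, but the basic question — whether aggregation exceeds backbone capacity — is the same.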

Data Center to Edge

When AI inference moves to edge deployments, performance becomes a network engineering challenge. Unlike data centers, which are optimized for controlled conditions, the edge must account for wireless connectivity, spectrum conditions, and unpredictable latency. Edge applications require low-latency responses, but wireless networks introduce variables such as signal strength fluctuations, handoffs between cell towers, and unpredictable network congestion. A model that performs flawlessly in the data center may struggle over wireless networks. Therefore, carriers and operators must validate performance under real-world radio conditions before rolling out edge services to ensure that wireless infrastructure can deliver the required reliability and speed.
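
That validation step can be sketched as a Monte Carlo check of an edge latency budget under an assumed heavy-tailed wireless delay model. The baseline delay, jitter distribution, inference time, and budget are all illustrative assumptions:

```python
# Sketch: Monte Carlo check of an edge inference latency budget under
# variable wireless delay. Distribution parameters are assumptions.

import random

def p99_latency_ms(inference_ms, trials=100_000, seed=42):
    """Estimate the 99th-percentile end-to-end latency."""
    random.seed(seed)
    samples = []
    for _ in range(trials):
        # assumed radio model: ~20 ms baseline plus heavy-tailed jitter
        radio_ms = 20 + random.expovariate(1 / 15)
        samples.append(radio_ms + inference_ms)
    samples.sort()
    return samples[int(trials * 0.99)]

budget_ms = 100
p99 = p99_latency_ms(inference_ms=30)
print(f"p99 = {p99:.1f} ms, budget {'met' if p99 <= budget_ms else 'missed'}")
```

The point of the sketch: a deployment whose average latency fits comfortably inside the budget can still miss it at the 99th percentile once wireless jitter is modeled, which is why averages alone don't qualify an edge rollout.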

Managing Shifting Pressure Points

Managing pressure points requires adopting a system-level perspective that reveals how improvements at one layer create new challenges at another. Organizations that optimize components in isolation are simply moving bottlenecks around. Scaling successfully requires understanding the entire stack to account for upstream and downstream impacts before making changes.

For example, when rack networking is optimized, can the optical backbone handle the bandwidth? The objective is to anticipate where pressure points will occur and prepare accordingly.

System-Level Visibility

Pressure points are unavoidable as constraints emerge and cascade across the stack as AI infrastructure scales. The key is to simulate system-level AI traffic to identify where these will occur. Understanding how relieving pressure at one layer creates constraints at another is the only way to scale effectively to meet the AI era’s burgeoning demands. Companies that adopt this approach gain a significant advantage by avoiding costly redesigns, speeding time-to-deployment, and maintaining performance.

Marie Hattar

Marie Hattar is SVP at Keysight Technologies. She has more than 20 years of leadership experience spanning the security, routing, switching, telecom, and mobility markets. Before Keysight Technologies, Marie was CMO at Ixia and at Check Point Software Technologies. Prior to that, she was Vice President at Cisco, where she led the company’s enterprise networking and security portfolio and helped drive the company’s leadership in networking. Marie also worked at Nortel Networks, Alteon WebSystems, and Shasta Networks in senior marketing and CTO positions. Marie received a master’s degree in Business Administration in Marketing from York University and a Bachelor’s degree in Electrical Engineering from the University of Toronto.
