A deep technology conference for processor and system architects from industry and academia has become a key forum for the trillion-dollar data center computing market.
At Hot Chips 2024 next week, senior NVIDIA engineers will present the latest advancements powering the NVIDIA Blackwell platform, plus research on liquid cooling for data centers and AI agents for chip design.
They’ll share how:
- NVIDIA Blackwell brings together multiple chips, systems, and NVIDIA CUDA software to power AI across use cases, industries, and countries.
- NVIDIA GB200 NVL72, a multi-node, liquid-cooled, rack-scale solution that connects 72 Blackwell GPUs and 36 Grace CPUs, raises the bar for AI system design.
- NVLink interconnect technology provides all-to-all GPU communication, enabling record-high throughput and low-latency inference for generative AI.
- The NVIDIA Quasar Quantization System pushes the boundaries of physics to speed up AI computing.
- NVIDIA researchers are building AI models that help build processors for AI.
An NVIDIA Blackwell talk, taking place Monday, Aug. 26, will also spotlight new architectural details and examples of generative AI models running on Blackwell silicon.
It’s preceded by three tutorials on Sunday, Aug. 25, that will cover how hybrid liquid-cooling solutions can help data centers transition to more energy-efficient infrastructure and how AI models, including large language model (LLM)-powered agents, can help engineers design the next generation of processors.
Together, these presentations showcase the ways NVIDIA engineers are innovating across every area of data center computing and design to deliver unprecedented performance, efficiency, and optimization.
Be Prepared for Blackwell
NVIDIA Blackwell is the ultimate full-stack computing challenge. It comprises multiple NVIDIA chips, including the Blackwell GPU, Grace CPU, BlueField data processing unit, ConnectX network interface card, NVLink Switch, Spectrum Ethernet switch, and Quantum InfiniBand switch.
Ajay Tirumala and Raymond Wong, directors of architecture at NVIDIA, will provide a first look at the platform and explain how these technologies work together to deliver a new standard for AI and accelerated computing performance while advancing energy efficiency.
The multi-node NVIDIA GB200 NVL72 solution is a perfect example. LLM inference requires low-latency, high-throughput token generation. GB200 NVL72 acts as a unified system to deliver up to 30x faster inference for LLM workloads, unlocking the ability to run trillion-parameter models in real time.
Tirumala and Wong will also discuss how the NVIDIA Quasar Quantization System, which brings together algorithmic innovations, NVIDIA software libraries and tools, and Blackwell’s second-generation Transformer Engine, supports high accuracy on low-precision models, highlighting examples using LLMs and visual generative AI.
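To give a rough sense of what running a model at low precision involves, here is a minimal sketch of textbook symmetric 8-bit quantization applied to a handful of weights. It is a generic illustration only, not NVIDIA's method: the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, and nothing here reflects how the Quasar Quantization System or the Transformer Engine actually work.

```python
# Generic symmetric per-tensor quantization (illustrative assumption,
# not the Quasar Quantization System).

def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integers and the scale."""
    return [q * scale for q in quantized]


# Hypothetical sample weights, chosen only for demonstration.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per value is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("quantized:", q)
print("scale:", scale)
print("max round-trip error:", max_error)
```

The trade-off the sketch exposes is the one the talk addresses: fewer bits per value means less memory and bandwidth, at the cost of a rounding error that has to be kept small enough to preserve model accuracy.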
Keeping Data Centers Cool
The traditional hum of air-cooled data centers may become a relic of the past as researchers develop more environmentally friendly and sustainable solutions that use hybrid cooling, a combination of air and liquid cooling.
Liquid-cooling techniques move heat away from systems more efficiently than air, making it easier for computing systems to stay cool while processing large workloads. The equipment for liquid cooling also takes up less space and consumes less power than air-cooling systems, allowing data centers to add more server racks, and therefore more computing power, to their facilities.
Ali Heydari, director of data center cooling and infrastructure at NVIDIA, will present several designs for hybrid-cooled data centers.
Some designs retrofit existing air-cooled data centers with liquid-cooling units, offering a quick and easy way to add liquid-cooling capabilities to existing racks. Other designs require the installation of piping for direct-to-chip liquid cooling using cooling distribution units, or entirely submerge servers in immersion cooling tanks. Although these options demand a larger upfront investment, they lead to substantial savings in both energy consumption and operational costs.
Heydari will also share his team’s work as part of COOLERCHIPS, a U.S. Department of Energy program to develop advanced data center cooling technologies. As part of the project, the team is using the NVIDIA Omniverse platform to create physics-informed digital twins that will help them model energy consumption and cooling efficiency to optimize their data center designs.
AI Agents Chip In for Processor Design
Semiconductor design is a mammoth challenge at microscopic scale. Engineers developing cutting-edge processors work to fit as much computing power as possible onto a piece of silicon a few inches across, testing the limits of what’s physically possible.
AI models support their work by improving design quality and productivity, boosting the efficiency of manual processes, and automating time-consuming tasks. The models include prediction and optimization tools that help engineers rapidly analyze and improve designs, as well as LLMs that can assist engineers with answering questions, generating code, and debugging design problems.
Mark Ren, director of design automation research at NVIDIA, will give an overview of these models and their uses in a tutorial. In a second session, he’ll focus on agent-based AI systems for chip design.
AI agents powered by LLMs can be directed to complete tasks autonomously, unlocking broad applications across industries. In microprocessor design, NVIDIA researchers are developing agent-based systems that can reason and take action using customized circuit design tools, interact with experienced designers, and learn from a database of human and agent experiences.
NVIDIA experts aren’t just building this technology; they’re using it. Ren will share examples of how engineers can use AI agents for timing report analysis, cell cluster optimization processes, and code generation. The cell cluster optimization work recently won the best paper award at the first IEEE International Workshop on LLM-Aided Design.