
13 May 2024

Industry Insights

Zuckerberg sounds the AI Energy Alarm, we answer

Key Highlights

  • On the Dwarkesh Podcast, Zuckerberg points to an imminent energy crunch in AI. We reckon we’ve got the solution.
  • Our Sustainable AI Factories cut power usage by nearly 50%. It's about smart ideas, not just big power.
  • Committed to Sustainability: We're reducing environmental impacts with every innovation, cutting CO2 emissions by 48%—because every bit counts.

As we delve deeper into the AI revolution, the demand for power in data centres, especially those driving AI computations, is skyrocketing. A few weeks ago on the Dwarkesh Podcast, Mark Zuckerberg raised a crucial challenge facing the AI industry: the looming energy crisis. He pointed out that despite the easing of GPU production constraints, the real bottleneck isn't capital but energy, and predicted that building the infrastructure to meet these demands is more a matter of when than if. As he put it: "We would probably build out bigger clusters than we currently can if we could get the energy to do it."

This sets the stage for a critical dialogue on how the tech industry plans to address these looming bottlenecks. It's no secret within the industry that the energy consumption required to power AI is substantial, with AI workloads today consuming an astonishing 29.3 terawatt-hours annually. Moreover, only 5% of global Internet Data Centers (IDCs) can accommodate rack densities greater than 50kW, which poses significant limitations as we scale AI applications.
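To put that annual figure in perspective, a quick back-of-envelope conversion (using only the 29.3 TWh estimate above) shows the continuous power draw it implies:

```python
# Convert the article's 29.3 TWh/year AI energy figure into an
# implied average continuous power draw.
ANNUAL_ENERGY_TWH = 29.3      # from the article
HOURS_PER_YEAR = 365 * 24     # 8760 hours

# TWh -> Wh, divide by hours, then W -> GW
avg_power_gw = ANNUAL_ENERGY_TWH * 1e12 / (HOURS_PER_YEAR * 1e9)
print(f"Implied continuous draw: {avg_power_gw:.2f} GW")  # ~3.34 GW
```

That is several gigawatts running around the clock, which is why Zuckerberg frames the problem in terms of power plants rather than capital.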

“ Energy not compute will be the #1 bottleneck to AI progress. ”

We've taken that challenge head-on, effectively halving the power bill while pushing the boundaries of AI capabilities. This isn't just any solution; it’s a sustainable AI factory that runs on the logic of doing more with less—much less.

Here's why we’re leading the charge:

SMC’s revolutionary GPU cloud services leverage our own immersion cooling technology that cuts power usage by nearly 50% compared to traditional air-cooled systems running in other clouds. Each Sustainable AI Factory deployed across SMC AZs hosts up to 768 H100 GPUs, networked for seamless integration into existing data centre infrastructure. While others are planning, we’re deploying hyperscale solutions that redefine efficiency. It’s like packing a rocket engine’s power into a compact car—watch it soar.
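A rough sketch of what a 768-GPU factory draws, and how cooling overhead scales with it. The GPU count is from the article; the H100 SXM board power (~700 W) and the PUE values are illustrative assumptions, not SMC figures, and a PUE-only comparison understates the article's ~50% saving, which also covers effects such as removing server fans:

```python
# Illustrative facility-power estimate for a 768-GPU Sustainable AI Factory.
# Assumptions (not from the article): H100 SXM board power of 700 W,
# air-cooled facility PUE ~1.5, immersion-cooled facility PUE ~1.05.
GPU_COUNT = 768          # per factory, from the article
GPU_TDP_W = 700          # H100 SXM board power (assumed)
PUE_AIR = 1.5            # assumed traditional air-cooled facility
PUE_IMMERSION = 1.05     # assumed immersion-cooled facility

it_load_kw = GPU_COUNT * GPU_TDP_W / 1000   # GPU IT load only
air_kw = it_load_kw * PUE_AIR               # total facility power, air
immersion_kw = it_load_kw * PUE_IMMERSION   # total facility power, immersion

print(f"GPU IT load:      {it_load_kw:.0f} kW")
print(f"Air-cooled total: {air_kw:.0f} kW")
print(f"Immersion total:  {immersion_kw:.0f} kW")
```

Even under these conservative assumptions, the cooling overhead alone differs by hundreds of kilowatts per factory.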

Every component counts when optimising energy efficiency. By implementing direct DC power delivery to our GPU nodes, we eliminate unnecessary conversion losses and the need for extensive cabling. This not only cuts operational costs but also significantly reduces e-waste.
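The intuition behind direct DC delivery is that conversion losses compound multiplicatively across every stage between the grid and the GPU. The stage efficiencies below are illustrative assumptions, not SMC measurements:

```python
# Sketch of why fewer power-conversion stages help: each stage loses a
# few percent, and the losses compound multiplicatively.
def chain_efficiency(stage_efficiencies):
    """Overall efficiency of a series of power-conversion stages."""
    eff = 1.0
    for stage in stage_efficiencies:
        eff *= stage
    return eff

# Assumed stage efficiencies (illustrative, not SMC figures):
traditional = chain_efficiency([0.96, 0.94, 0.95])  # UPS, server PSU, VRM
direct_dc = chain_efficiency([0.97, 0.95])          # central rectifier, VRM

print(f"Traditional chain delivers: {traditional:.1%}")
print(f"Direct DC chain delivers:   {direct_dc:.1%}")
```

Dropping even one mid-90s-efficiency stage recovers several percent of total facility power before any cooling gains are counted.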

Let’s also talk about the environment. We’re bypassing Zuckerberg’s concern about "building large new power plants" by redesigning data centres around entirely new infrastructure with a much smaller footprint, gaining both sustainability and efficiency. With every design and deployment, we prioritise reducing our environmental impact.

“ The HyperCube cuts CO2 emissions by up to 48%, embodying our commitment to sustainable technology development in an era where environmental considerations are no longer optional but essential. ”

We’ve not done this alone. Through strong global partnerships, we’re not just keeping up; we’re setting the pace by collaborating with NVIDIA and Dell Technologies to ensure that customers who build on SMC are doing so with validated architectures and enterprise-grade infrastructure, like H100 SXM GPUs running within Dell’s powerful XE9680 platform. This solution powers the most demanding AI workloads today, all running smoothly in our state-of-the-art immersion environment.

Proving the world's most energy-efficient AI infrastructure

With so much talk of sustainable computing these days, we’re all wary of greenwashing. This is why we’re super excited to be part of the MLCommons Power working group for the upcoming MLPerf Training benchmark run. For the first time, peer-reviewed power consumption data from running AI workloads will be visible for the world to see.

Check back in early June when the results are published to see how this all comes together.

So, what’s the point? While Zuckerberg’s points about energy constraints are spot on, we at SMC aren’t just nodding in agreement; we’re actively delivering solutions.


Author

Tim Rosenfield