
3 April 2025


What MLPerf Inference v5.0 means for our customers

We recently participated in the MLPerf Inference v5.0 benchmark, and the results are in: SMC by Firmus delivered consistently strong performance across a range of real-world AI workloads.

For those unfamiliar, MLPerf is a globally recognized benchmark suite from MLCommons that evaluates how infrastructure handles common AI tasks—think language models, computer vision, recommendation engines, and generative inference. It’s the standard the industry looks to when comparing performance across systems, hardware, and platforms.

We submitted an 8x NVIDIA H200 SXM system, powered by the liquid-everywhere Firmus AI Factory platform and backed by our custom-tuned infrastructure stack, including AI FactoryOS™. The goal: combine industry-best AI performance with the lowest AI token costs and ultra-high SLAs.

Here’s what that means in the real world:

  • LLMs at speed: Llama2-70B, one of the most talked-about benchmarks this round, ran at 34,264 tokens/sec offline on our system. That’s within the top tier of scores globally for H200-class GPUs.
  • No weak spots: From ResNet50 for image classification to Stable Diffusion XL for generative tasks, BERT for NLP, and DLRMv2 for recommendation, our platform delivered consistent, reliable throughput across the board.
  • Efficiency, built-in: Our liquid-everywhere AI Factory platform uses less energy while maintaining thermal and performance headroom. This is one of the key advantages of the AI Factory model—less wasted power and more usable compute.
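To put the headline Llama2-70B number in context, here is a back-of-the-envelope sketch of what the aggregate figure implies per GPU. The 34,264 tokens/sec offline result and the 8-GPU system size come from the article; the per-GPU split and the 1,000-token timing are illustrative arithmetic only, not separately measured results.

```python
# Back-of-the-envelope math on the Llama2-70B offline result cited above.
# Inputs are the article's figures; derived numbers are illustrative only.

AGGREGATE_TOKENS_PER_SEC = 34_264  # system-wide offline throughput (from the article)
NUM_GPUS = 8                       # 8x NVIDIA H200 SXM submission (from the article)

# Naive even split across GPUs (real per-GPU throughput varies with batching).
per_gpu = AGGREGATE_TOKENS_PER_SEC / NUM_GPUS
print(f"~{per_gpu:,.0f} tokens/sec per GPU")  # ~4,283 tokens/sec per GPU

# Time for the full system to emit 1,000 tokens at that aggregate rate
# (offline scenario; ignores per-request latency and scheduling effects).
ms_per_1k_tokens = 1_000 / AGGREGATE_TOKENS_PER_SEC * 1_000
print(f"~{ms_per_1k_tokens:.1f} ms per 1,000 tokens, system-wide")  # ~29.2 ms
```

This is only a sanity-check calculation; MLPerf offline scores are measured end-to-end under the benchmark's own rules, not reconstructed this way.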

Why does this matter to customers?

Because the models you’re running, whether fine-tuned LLMs, multimodal workflows, or real-time inference pipelines, need infrastructure that performs at scale, not just on paper. MLPerf helps validate that. It shows that SMC by Firmus isn’t just keeping up; we’re building the kind of platform that future workloads will require.

Just last week, analyst firm SemiAnalysis named SMC by Firmus one of the top-performing GPU cloud platforms globally, ranking us alongside AWS and above Google Cloud in their first-ever ClusterMAX™ Ratings. Not bad timing. It’s a strong, independent signal that SMC delivers real throughput, real efficiency, and real-world readiness for the next era of AI infrastructure.

Every MLPerf round helps us refine that mission—and deliver it to our customers with confidence.

We’d also like to thank our partner, NVIDIA, for including SMC in their MLPerf results release. Their ongoing collaboration and recognition continue to power our mission forward.

Read the full report here: https://mlcommons.org/benchmarks/inference-datacenter/

Author

Team SMC