Nvidia Launches Nemotron 3 Ultra: 550B Parameters, Open-Weight AI

iEXExchanger
Nvidia Launches Nemotron 3 Ultra: 550B Parameters, Open-Weight AI

Nvidia unveiled Nemotron 3 Ultra at Computex 2026 — a 500–550B open-weight model for advanced reasoning. The company claims 5x faster inference and 30% lower cost than rivals, available from June 4.

At the Computex 2026 keynote in Taipei, Nvidia CEO Jensen Huang unveiled Nemotron 3 Ultra — a 500–550 billion parameter open-weight AI model that the company calls the most capable US-built open model to date. It will be freely available for download starting June 4.

What Is Nemotron 3 Ultra

Nemotron 3 Ultra is an open-weight large language model: after its release, any developer or business can download it, self-host it, or access it via API. No proprietary licensing, no dependence on external cloud providers.

The model is built for advanced reasoning and planning, with a particular focus on agentic workflows — scenarios where AI autonomously breaks tasks into steps, executes them sequentially, and iterates without constant human oversight. This is precisely what enterprise AI deployments increasingly require.

Starting June 4, 2026, the model is available on Hugging Face, ModelScope, OpenRouter, and as an NVIDIA NIM microservice at build.nvidia.com.

Key Specifications

  • Scale: 500–550 billion parameters
  • Speed: Nvidia claims 5x faster than the best competing open-weight models
  • Cost: Approximately 30% cheaper per inference than rivals
  • Throughput: Over 300 tokens per second on pre-release endpoints
  • Artificial Analysis Intelligence Index: 48 points

Why This Matters

The launch of Nemotron 3 Ultra signals that Nvidia is no longer just a GPU maker. The company is positioning itself as a full-stack AI platform: chips, software stack, infrastructure — and now its own frontier models.

For developers and enterprises, this means powerful AI without monthly API subscriptions or the risk of price hikes and service shutdowns. Teams can take the model and deploy it wherever they need.

According to Artificial Analysis, Nemotron 3 Ultra is now the most capable open-weight model from a US company. It reportedly still trails leading Chinese open models on some benchmarks, keeping the US–China open-model race very much alive.

What's Next

The model releases on June 4, 2026. Nvidia also unveiled two other major products at Computex: Cosmos 3 (video generation and physical simulation) and the RTX Spark lineup — compact personal AI supercomputers designed for running large models locally.

Together, these announcements reinforce Nvidia's ambition to dominate the full spectrum of AI development — from training clusters to ready-to-use models and personal AI devices. Investors took notice, with Nvidia shares hitting all-time highs during the Computex keynote.

Questions and answers

Frequently asked questions about this article

What is Nemotron 3 Ultra?

It's Nvidia's open-weight large language model with 500–550 billion parameters, announced at Computex 2026. It's free to download or use via API starting June 4, 2026.

How does Nemotron 3 Ultra compare to other open models?

Nvidia claims it runs 5x faster and ~30% cheaper per inference than the best competing open-weight models. It sets a new record for open-weight models built by US companies.

Where can I download Nemotron 3 Ultra?

Starting June 4, 2026, the model is available on Hugging Face, ModelScope, OpenRouter, and as an NVIDIA NIM microservice at build.nvidia.com.

Can Nemotron 3 Ultra be run locally?

Yes, as an open-weight model it can be self-hosted on your own hardware without subscriptions or external API dependencies. Running 550B parameters does require substantial compute infrastructure.

Does Nemotron 3 Ultra lag behind Chinese AI models?

By some benchmarks, yes — it reportedly trails top Chinese open models. However, it's currently the best open-weight model from any US-based company, marking a major milestone for the US AI ecosystem.