Research & Infrastructure

The infrastructure that makes frontier AI possible: Hugging Face, NVIDIA, BAIR, and the tool chains behind the models.

ModelExpress: Distributing Model Artifacts at the Speed of Light

Every byte moved has a cost. As model checkpoints grow to hundreds of gigabytes or even a terabyte, that cost adds up quickly. To make things even worse, moving these model weights around the cluster is extremely common. For instance, a cold start may pull weights from remote storage into GPU memory; autoscaling and rolling updates must populate each new replica; and RL post-training continuously…

Elizabeth Goodman·2 days ago

ModelExpress: Distributing Model Artifacts at the Speed of Light

Debugging Ray Tracing Applications Using NVIDIA OptiX Toolkit

Start Customizing NVIDIA Nemotron 3 Nano with Prime Intellect Lab in Minutes

Bringing Nunchaku 4-bit Diffusion Inference to Diffusers

Make Long-Running NVIDIA TensorRT Engine Builds Observable and Cancelable in Python or C++

The State of Simulation for Physical AI: An Overview

NVIDIA Vera CPU: Olympus Cores Built for Maximum Single-Thread Performance in Agentic AI

Inside NVIDIA Rubin GPU Architecture: Powering the Era of Agentic AI

Setting a World Record for MoE Pre-Training on NVIDIA GB300 NVL72

Grabette: an open system to record robot-manipulation data

Introducing Cosmos 3 Edge

NVIDIA NVLink: The Scale-Up Network for AI Factories

Integrate NVIDIA Omniverse RTX Sensor Simulation Into Existing Apps

Fine-tune video and image models at scale with NVIDIA NeMo Automodel and 🤗 Diffusers

Q&A: How Capcom Brought Path Tracing to RE ENGINE Across PRAGMATA and Resident Evil Requiem

Integrating Context-Aware Video AI Agents Into Enterprise Workflows

NVIDIA Nemotron 3 Embed Ranks #1 Overall on RTEB, Advancing Agentic Retrieval

Scaling Agentic AI Factories Through Extreme Co-Design with NVIDIA BlueField

Newer Models, Same Advantage

Security incident disclosure — July 2026

Build a Multi-Camera 3D Tracking Application with NVIDIA DeepStream 9.1 Skills

Develop Lightweight USD Runtimes Faster with AI Agents

Building Faster Cryptography with Carryless Multiplication in NVIDIA CUDA 13.3

What building Shippy taught us about building agents

Model Routing Is Simple. Until It Isn’t.

Introducing Real World VoiceEQ: Measuring the human quality of voice AI

Welcome Inkling by Thinking Machines

Lessons From the Leaderboard: What 5,000+ Kagglers Taught Us About Improving AI Reasoning

How to Run an Autoresearch Workflow with RL Agent Skills and NVIDIA NeMo

Post-Train NVIDIA Cosmos 3 in One Day Using Agent Skills

NVIDIA Ising Decoding Cuts Color Code Logical Error Rates by Over 300X

Extreme Event Likelihoods with Guided Generative Models

How to Evaluate General-Purpose Robot Policies for Real-World Deployment

Reducing High-Bandwidth Memory Bottlenecks in JAX-Based LLM Training with Host Offloading

Kernel Fusion in NVIDIA CUDA: Optimizing Memory Traffic and Launch Overhead

AI Model Co-Design: Hardware-Friendly LLM Design

Accelerating End-to-End Co-Folding Performance with NVIDIA BioNeMo Agent Toolkit

Profiling in PyTorch (Part 3): Attention is all you profile

Synthetic Data Generation for Financial AI Research with NVIDIA NeMo

A Practical Guide to GPU-Initiated Communication for Molecular Dynamics at Scale

Data for Agents

Running Low-Latency Analytical Workloads with GPU-Accelerated Presto on NVIDIA GB200 NVL72

Create a LangChain Deep Agents Harness Profile for NVIDIA Nemotron 3 Ultra to Improve Performance

Native-speed vLLM transformers modeling backend

From Hugging Face to Amazon SageMaker Studio in one click

Develop Humanoid Robot Policies End-to-End with NVIDIA Isaac GR00T

Building an Analysis AI Agent for Industrial Alarm Management with NVIDIA Nemotron

Maximize Spectral Efficiency with AI-Native RAN and NVIDIA AI Aerial

Hugging Face Models on Foundry Managed Compute

NVIDIA Vera CPU Boosts AI Factory Throughput to Accelerate Agentic Workloads