NVIDIA DGX Station for Windows — GB300 Grace Blackwell Ultra, 20 Petaflops FP4, 748GB Memory, Trillion-Parameter AI Agents on the Desktop

NVIDIA has announced DGX Station for Windows, a deskside AI supercomputer designed to run frontier AI models of up to 1 trillion parameters locally, directly within the Windows ecosystem. Announced at NVIDIA GTC Taipei and developed in collaboration with Microsoft, DGX Station for Windows is built on the NVIDIA GB300 Grace Blackwell Ultra Desktop Superchip and is expected to be available from ASUS, Dell Technologies, GIGABYTE, HP, MSI, and Supermicro in Q4 2026.

The system targets enterprise developers, researchers, engineers, designers, and data scientists who need frontier-class AI compute — historically only available in data centers running Linux — connected directly to the Windows applications and workflows they already use. DGX Station for Windows can run hundreds of agents simultaneously, supports pretraining and fine-tuning of large models, and scales workloads seamlessly to GB300 systems in the data center or cloud.

Key capabilities include up to 20 petaflops of FP4 AI performance, up to 748GB of coherent memory, 800Gb/s networking via ConnectX-8 SuperNIC, support for Windows security primitives and NVIDIA OpenShell, and optional pairing with an NVIDIA RTX PRO 6000 Blackwell Workstation GPU for physical AI workflows combining frontier compute with ray-traced visualization.

GB300 Grace Blackwell Ultra — The Superchip Inside

DGX Station for Windows is powered by the NVIDIA GB300 Grace Blackwell Ultra Desktop Superchip, which connects an NVIDIA Blackwell Ultra GPU to a 72-core NVIDIA Grace CPU via the NVLink-C2C chip-to-chip interconnect. The unified memory pool reaches up to 748GB of coherent memory that both the CPU and GPU can address directly, without explicit copies between separate CPU and GPU memory, which is what lets the system load and run trillion-parameter AI models locally.
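As a rough check on the trillion-parameter claim, the weight footprint of a model can be estimated from its parameter count and the bytes used per parameter. The sketch below is illustrative only: it assumes decimal gigabytes, counts weights alone (no KV cache, activations, or runtime overhead), and the function names are hypothetical rather than part of any NVIDIA tool.

```python
# Bytes per parameter for common weight precisions (standard bit widths).
BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "FP4": 0.5}

COHERENT_MEMORY_GB = 748  # "up to" figure from the announcement
ONE_TRILLION = 1e12


def weight_footprint_gb(num_params: float, precision: str) -> float:
    """Return the weight-only footprint in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def fits(num_params: float, precision: str, memory_gb: float) -> bool:
    """True if the weights alone fit in the given memory budget."""
    return weight_footprint_gb(num_params, precision) <= memory_gb


for precision in BYTES_PER_PARAM:
    size = weight_footprint_gb(ONE_TRILLION, precision)
    verdict = "fits" if fits(ONE_TRILLION, precision, COHERENT_MEMORY_GB) else "does not fit"
    print(f"{precision}: {size:.0f} GB of weights, {verdict} in {COHERENT_MEMORY_GB} GB")
```

Under these assumptions, a 1-trillion-parameter model needs about 500GB at FP4, which fits within 748GB, while FP8 and FP16 weights alone would not.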

AI compute reaches up to 20 petaflops of FP4 performance, which NVIDIA positions as sufficient for pretraining, fine-tuning, large-scale inference, and multi-agent deployment on a single deskside unit.

The system also integrates the NVIDIA ConnectX-8 SuperNIC, supporting network speeds of up to 800Gb/s. This enables fast data ingestion from enterprise storage and allows multiple DGX Station units to be connected for even larger distributed workloads.
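To put the 800Gb/s figure in context, a back-of-the-envelope calculation converts it to a transfer time for a given payload. This is a sketch that assumes decimal units and treats the `efficiency` factor as a hypothetical placeholder for protocol and storage overhead; real throughput depends on the storage system and network configuration.

```python
def transfer_seconds(payload_gb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Seconds to move payload_gb gigabytes over a link rated at link_gbps gigabits/s.

    efficiency is a hypothetical fraction (0 < efficiency <= 1) of line rate
    that is actually achieved end to end.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    payload_gigabits = payload_gb * 8  # 8 bits per byte
    return payload_gigabits / (link_gbps * efficiency)


# Time to fill the full 748GB memory pool at line rate.
print(f"{transfer_seconds(748, 800):.2f} s at line rate")

# Same payload with an assumed 80% effective throughput.
print(f"{transfer_seconds(748, 800, efficiency=0.8):.2f} s at 80% efficiency")
```

At full line rate, a payload the size of the entire 748GB memory pool would take roughly 7.5 seconds to arrive.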

AI Workflows DGX Station for Windows Supports

The system is designed to handle the full range of enterprise AI workloads, all within the Windows environment:

AI Agents — Build and run multiple frontier agents in parallel, connected directly to enterprise Windows applications and workflows. Hundreds of agents can execute simultaneously on a single DGX Station.

AI Development — Pretrain, fine-tune, and iterate on large AI models within Windows, with access to Linux AI toolchains via Windows Subsystem for Linux (WSL).

Data Science — Ingest large datasets directly into up to 748GB of coherent memory, removing data movement bottlenecks across data preparation, machine learning, and analytics pipelines.

AI Inference — Run high-throughput inference on AI models, including models up to 1 trillion parameters.

Physical AI — Pair the GB300 Superchip with an additional NVIDIA RTX PRO 6000 Blackwell Workstation GPU to combine frontier AI compute with ray-traced visualization and simulation in a single deskside unit, for agents that operate across virtual-to-physical environments.

DGX Station for Windows can function as a dedicated AI supercomputer for a single developer or as a shared local compute node for entire teams, with workloads scaling to GB300 data center systems or the cloud.

NVIDIA OpenShell — Secure Agent Runtime on Windows

Autonomous agents need a runtime that governs how they act, use tools, and interact with other system components. DGX Station for Windows supports NVIDIA OpenShell, an open-source, secure-by-design agent runtime built on the new Windows security and containment primitives from Microsoft.

OpenShell creates an individual, isolated sandbox for each agent and separates application-layer operations from infrastructure-layer policy enforcement. Security and privacy policies are applied at the system level — outside the agent’s reach — rather than relying on behavioral system prompts that agents could potentially bypass. The goal is to enforce constraints on the environment the agent runs in, preventing credential leaks or private data exposure.
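The separation between what an agent asks for and what the environment permits can be illustrated with a small sketch. The code below is not OpenShell's API; `SandboxPolicy`, `Sandbox`, and the tool names are hypothetical, and a real runtime would enforce policy with operating system containment rather than in-process Python. It only shows the idea that the policy lives outside the agent and is checked on every tool call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


class PolicyViolation(Exception):
    """Raised when an agent requests something its sandbox policy forbids."""


@dataclass(frozen=True)  # frozen: the policy cannot be mutated after creation
class SandboxPolicy:
    allowed_tools: FrozenSet[str]


class Sandbox:
    """Hypothetical per-agent sandbox: one instance is created for each agent."""

    def __init__(self, agent_id: str, policy: SandboxPolicy,
                 tools: Dict[str, Callable[..., str]]) -> None:
        self.agent_id = agent_id
        self._policy = policy
        self._tools = tools

    def call_tool(self, name: str, *args: str) -> str:
        # Enforcement happens here, regardless of what the agent's prompt says.
        if name not in self._policy.allowed_tools:
            raise PolicyViolation(f"{self.agent_id} may not use {name!r}")
        return self._tools[name](*args)


# Hypothetical tools available on the host.
TOOLS: Dict[str, Callable[..., str]] = {
    "search_docs": lambda query: f"results for {query}",
    "read_credentials": lambda: "secret-token",
}

sandbox = Sandbox("agent-1", SandboxPolicy(frozenset({"search_docs"})), TOOLS)
print(sandbox.call_tool("search_docs", "quarterly report"))

try:
    sandbox.call_tool("read_credentials")
except PolicyViolation as err:
    print(f"blocked: {err}")
```

Because the check sits in the sandbox rather than in the agent's instructions, a prompt that tells the agent to ignore its rules does not change which tools it can reach.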

For enterprise IT teams, this means agents deploy and operate within the same managed Windows environment, governed through familiar Microsoft security, compliance, and fleet management tools. Linux workloads receive the same manageability support through Windows Subsystem for Linux.

Enterprise IT and Fleet Management

One of the design priorities for DGX Station for Windows is integration with existing enterprise IT infrastructure. Organizations running Windows environments can manage DGX Station deployments using the same tools they already use for fleet management, deployment, and system updates — without building separate Linux-based infrastructure for AI workloads.

The system is positioned as both a dedicated workstation for individual developers and a shared local compute node for teams, making it applicable to engineering groups, research labs, design studios, and data science teams within the same organization.

Availability

OEM Partner | Availability
ASUS | Q4 2026
Dell Technologies | Q4 2026
GIGABYTE | Q4 2026
HP | Q4 2026
MSI | Q4 2026
Supermicro | Q4 2026

DGX Station for Windows extends the NVIDIA and Microsoft collaboration that also covers NVIDIA RTX Spark, the superchip for slim Windows laptops and compact desktops targeting personal AI agents, creative workloads, and gaming.

FAQ / Common Questions

What is NVIDIA DGX Station for Windows?
It is a deskside AI supercomputer designed for enterprise developers, researchers, and data scientists. Built on the NVIDIA GB300 Grace Blackwell Ultra Desktop Superchip, it brings data-center-class AI compute into the Windows environment, capable of running AI models up to 1 trillion parameters locally.

What are the key specs of DGX Station for Windows?
The system delivers up to 20 petaflops of FP4 AI performance, up to 748GB of coherent unified memory, a 72-core NVIDIA Grace CPU, a Blackwell Ultra GPU, and ConnectX-8 SuperNIC networking at up to 800Gb/s.

What is NVIDIA OpenShell and why does it matter for enterprises?
OpenShell is an open-source secure runtime for autonomous agents. It uses new Windows security and containment primitives to create isolated sandboxes for each agent and enforces security policies at the system level rather than relying on behavioral prompts. This allows enterprises to deploy agents within their existing Windows compliance and fleet management frameworks.

When will DGX Station for Windows be available?
It is expected from ASUS, Dell Technologies, GIGABYTE, HP, MSI, and Supermicro in Q4 2026.

Can DGX Station for Windows run existing Linux AI toolchains?
Yes. Access to Linux AI toolchains is available through Windows Subsystem for Linux, allowing developers to use Python-based frameworks, model training libraries, and other Linux-native tools within the Windows environment.

How does DGX Station for Windows relate to NVIDIA RTX Spark?
The two products form the ends of NVIDIA and Microsoft’s joint agent platform for Windows. RTX Spark targets slim laptops and compact desktops for personal agents and creative work. DGX Station for Windows targets enterprise deskside deployments requiring frontier-class AI compute and multi-agent infrastructure.


Note: Details above are based on NVIDIA’s announcement at GTC Taipei 2026 and are subject to change. Final feature availability, rollout timing, and supported configurations may vary. Verify against NVIDIA’s and the respective manufacturers’ official channels before relying on any specific detail.

Disclaimer: This post summarizes an NVIDIA product announcement for informational purposes. It is not affiliated with or endorsed by NVIDIA, Microsoft, or any manufacturer mentioned.


NVIDIA RTX Spark Superchip Unveiled — 1 Petaflop AI, 128GB Unified Memory, Windows-Native Agents, Blackwell GPU + Grace CPU in One Chip

NVIDIA has unveiled RTX Spark, a new superchip designed to bring personal AI agents, creative workloads, and gaming to slim Windows laptops and compact desktop PCs. The announcement was made at NVIDIA GTC Taipei, alongside a collaboration with Microsoft to deliver a native Windows platform for on-device agents. RTX Spark-powered devices from ASUS, Dell, HP, Lenovo, Microsoft Surface, and MSI are expected to arrive in fall 2026, with models from Acer and GIGABYTE following.

RTX Spark pairs an NVIDIA Blackwell RTX GPU (6,144 CUDA cores and fifth-generation Tensor Cores) with a 20-core NVIDIA Grace CPU, connected via the NVLink-C2C chip-to-chip interconnect. The superchip delivers 1 petaflop of AI compute and supports up to 128GB of unified memory. MediaTek collaborated with NVIDIA on the custom CPU design, contributing to power efficiency and connectivity.

The chip targets three simultaneous use cases: running 120B-parameter LLMs locally with 1 million token context, handling creative workflows including 12K 4:2:2 video editing and 90GB+ 3D scene rendering, and playing AAA games at 1440p at over 100 frames per second with ray tracing, DLSS, and Reflex.

Blackwell GPU + Grace CPU — The RTX Spark Architecture

RTX Spark is built around two interconnected components on a single package. The GPU side carries the NVIDIA Blackwell RTX architecture with 6,144 CUDA cores, fifth-generation Tensor Cores with FP4 precision, and a new Blackwell video decoder capable of handling 12K 4:2:2 content. The CPU side is a 20-core NVIDIA Grace processor, co-designed with MediaTek for efficiency and connectivity in thin-and-light form factors.

The two dies communicate via NVLink-C2C, NVIDIA’s chip-to-chip interconnect, which enables a single unified memory pool of up to 128GB accessible by both the CPU and GPU simultaneously. This unified memory architecture is what allows RTX Spark to run frontier-class language models locally — models that would otherwise require GPU memory and system RAM to be managed separately.
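Whether a 120B-parameter model with a 1 million token context fits in 128GB depends on both weight precision and the model's attention layout. The sketch below works through one hypothetical configuration; the layer count, KV head count, head dimension, and FP8 cache precision are assumptions chosen for illustration, not figures from NVIDIA, and the estimate ignores activations and runtime overhead.

```python
UNIFIED_MEMORY_GB = 128  # "up to" figure from the announcement


def weights_gb(num_params: float, bytes_per_param: float) -> float:
    """Weight-only footprint in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


def kv_cache_gb(tokens: int, layers: int, kv_heads: int,
                head_dim: int, bytes_per_value: float) -> float:
    """KV cache size for full attention with grouped KV heads."""
    # Factor of 2: one key vector and one value vector per layer per token.
    bytes_per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
    return tokens * bytes_per_token / 1e9


# Hypothetical model layout (assumed values, for illustration only).
weights = weights_gb(120e9, bytes_per_param=0.5)           # FP4 weights
cache = kv_cache_gb(1_000_000, layers=36, kv_heads=8,
                    head_dim=64, bytes_per_value=1.0)       # FP8 cache
total = weights + cache

print(f"weights: {weights:.1f} GB, KV cache: {cache:.1f} GB, total: {total:.1f} GB")
print("fits" if total <= UNIFIED_MEMORY_GB else "does not fit")
```

With these assumed values the weights take 60GB and the cache about 37GB, which fits. A model with more layers or wider KV heads could exceed the budget at the same context length.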

The full NVIDIA AI and graphics stack ships with RTX Spark: CUDA, TensorRT, OptiX, DLSS, Reflex, and G-SYNC are all supported.

Windows-Native Agents — NVIDIA OpenShell and Microsoft Security Primitives

NVIDIA and Microsoft are partnering to bring a secure, on-device agent platform to Windows. The collaboration centers on two components.

New Windows security primitives provide identity, containment, policy, and end-to-end security for agents running natively on the device. These primitives are being built into Windows and are designed to let agents execute tasks across applications, run code, and handle files while remaining under user control.

NVIDIA OpenShell is a runtime layer that adds policy controls on top of the Windows primitives. It lets users define what agents can and cannot access, routes queries to local or cloud models based on privacy policies, and can strip or mask personal information before any query is sent to a cloud model.
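The routing and masking behavior described above can be sketched in a few lines. This is an illustration, not OpenShell's actual interface: `route_query`, `mask_personal_info`, and the two regular expressions are hypothetical, and real personal-information detection is considerably more involved than pattern matching on email addresses and US-style phone numbers.

```python
import re
from typing import Tuple

# Simplified, hypothetical patterns for two kinds of personal information.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def mask_personal_info(text: str) -> str:
    """Replace matched email addresses and phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def route_query(query: str, private: bool) -> Tuple[str, str]:
    """Return (destination, text_to_send) for a query.

    Queries marked private stay on the local model unchanged; everything
    else goes to the cloud with personal information masked first.
    """
    if private:
        return ("local", query)
    return ("cloud", mask_personal_info(query))


print(route_query("Summarize my medical notes", private=True))
print(route_query("Email jane@example.com about 555-123-4567", private=False))
```

The key design point is ordering: masking runs before the query leaves the device, so the cloud model never receives the original text.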

Agent developers OpenClaw and Hermes Agent (from Nous Research) are among the first to adopt OpenShell and the Microsoft security primitives in their Windows apps. From the Windows taskbar, users will be able to invoke agents that can execute tasks inside applications, run cross-app workflows, generate images and video, write code, and search local files semantically.

Microsoft CEO Satya Nadella described the goal as delivering “unmetered intelligence to every home and every desk with Windows.”

Creative Capabilities — Adobe Rearchitects Premiere and Photoshop for RTX Spark

Adobe is rebuilding Photoshop and Premiere specifically for RTX Spark, targeting up to 2x faster AI, editing, coloring, and effects performance compared with existing workflows.

Adobe Premiere is getting a new video pipeline that uses RTX Spark’s unified memory, Blackwell GPU, and TensorRT software stack. The reworked pipeline targets real-time performance for editing and color correction, GPU-accelerated AI effects, and more efficient rendering of complex timelines. Adobe Substance 3D Painter and Stager will also run natively on RTX Spark.

Adobe Photoshop’s next-generation engine is being optimized for GPU-accelerated compositing, live filters, high dynamic range workflows, and natural brushing. The engine is built to use TensorRT. Both Premiere and Photoshop will also integrate with Windows agents, allowing creators to offload tasks to an on-device AI assistant from within the apps.

Firefly-powered Generative Fill in Photoshop and Generative Extend in Premiere are among the tools that will see direct performance gains from RTX Spark. Updates are expected to roll out alongside RTX Spark device availability in fall 2026.

Other software partners include Blackmagic Design, Blender (with DLSS 4.5 Ray Reconstruction coming to version 5.3), ComfyUI (which gains 4K AI video generation via RTX Video with 4x Frame Generation), OTOY Octane, CapCut, and llama.cpp for optimized local model inference.

Gaming on RTX Spark — DLSS 4.5, Ray Tracing, G-SYNC

For gaming, RTX Spark supports AAA titles at 1440p and over 100 frames per second with ray tracing, DLSS, and Reflex. RTX technology is active in over 1,000 games and applications, and over 100 Windows software providers are embracing the platform.

New RTX capabilities coming with RTX Spark include DLSS 4.5 Ray Reconstruction, which uses a second-generation transformer model and is coming to Blender 5.3 and dozens of games. RTX Video with 4x Frame Generation is coming to ComfyUI.

Game developers embracing the platform include KRAFTON, NetEase (NARAKA: BLADEPOINT), Remedy Entertainment, Riot Games, and Xbox. NetEase noted that RTX Spark enables its titles to run as intended on ultrathin, high-performance laptops.

Device Form Factors — Slim Laptops and Compact Desktops

RTX Spark laptops are engineered to be as slim as 14mm and as light as three pounds, available in 14- to 16-inch sizes. The chassis uses precision-machined aluminum. Displays are color-accurate tandem OLED panels with NVIDIA G-SYNC, targeting both creative color work and gaming visuals. All-day battery life is a stated design goal for the laptop line.

Compact RTX Spark desktop PCs are also in development, positioned for agentic AI workloads, creative production, gaming, and everyday productivity in a small-footprint chassis.

Named devices and OEM commitments:

  • Dell XPS 16 Creator Edition — RTX Spark with large unified memory, designed for creators
  • HP OmniBook — described as one of the thinnest RTX Spark laptops
  • Microsoft Surface Laptop Ultra — targeting creators, developers, and engineers
  • Additional designs from ASUS, Lenovo, MSI, with Acer and GIGABYTE following

NVIDIA DGX Station for Windows will extend the Blackwell architecture to enterprise developers who need a deskside AI supercomputer for running agents at scale.

Rollout Timing — What’s Live When

Phase | When | Scope
Announcement | GTC Taipei 2026 (now) | RTX Spark superchip unveiled
Windows agent developer details | Microsoft Build, June 2–3, 2026 | Security primitives, OpenShell for developers
RTX Spark devices available | Fall 2026 | Laptops and compact desktops from ASUS, Dell, HP, Lenovo, Microsoft Surface, MSI
Acer and GIGABYTE models | After fall 2026 | Additional OEM devices to follow
Adobe app updates | Alongside fall 2026 RTX Spark availability | Premiere, Photoshop, Substance 3D updates

FAQ / Common Questions

What is NVIDIA RTX Spark?
RTX Spark is a superchip that combines an NVIDIA Blackwell RTX GPU and a 20-core NVIDIA Grace CPU on a single package, connected via NVLink-C2C. It is designed for Windows laptops and compact desktops, targeting AI agent execution, creative workloads, and gaming in thin, portable form factors.

How much AI compute and memory does RTX Spark offer?
The superchip delivers 1 petaflop of AI compute and supports up to 128GB of unified memory shared between the GPU and CPU. This unified memory pool allows it to run 120-billion-parameter language models locally with 1 million token context.

Which laptops and PCs will use RTX Spark?
RTX Spark-powered devices are confirmed from ASUS, Dell (XPS 16 Creator Edition), HP (OmniBook), Lenovo, Microsoft Surface (Surface Laptop Ultra), and MSI for fall 2026. Acer and GIGABYTE will follow with additional models.

What is NVIDIA OpenShell?
OpenShell is a runtime for on-device agents that works alongside new Windows security primitives from Microsoft. It lets users set policies for what agents can access, routes queries to local or cloud models based on privacy preferences, and masks personal information before sending queries externally.

Will Adobe apps like Photoshop and Premiere work differently on RTX Spark?
Adobe is rebuilding both apps specifically for RTX Spark. The new engines use TensorRT, the Blackwell GPU, and unified memory to target up to 2x faster AI and graphics performance. Updates are expected to roll out when RTX Spark devices ship in fall 2026.

When will RTX Spark devices be available?
Laptops and compact desktops powered by RTX Spark are expected to be available from ASUS, Dell, HP, Lenovo, Microsoft Surface, and MSI starting fall 2026, with Acer and GIGABYTE models to follow.


Note: Details above are based on NVIDIA’s announcement at GTC Taipei 2026 and are subject to change. Final feature availability, rollout timing, and supported devices may vary by region. Verify against NVIDIA’s and the respective manufacturers’ official channels before relying on any specific detail.

Disclaimer: This post summarizes an NVIDIA product announcement for informational purposes. It is not affiliated with or endorsed by NVIDIA, Microsoft, Adobe, or any device manufacturer mentioned.

NVIDIA’s Vera CPU for AI Agents — 1.8x Faster Than x86, 88 Olympus Cores, Adopted by Anthropic, OpenAI, Oracle Cloud, Dell, HPE, and More

NVIDIA has unveiled Vera, its first CPU built specifically for AI agents. Now in full production, Vera is a new class of processor designed to handle the CPU-side workloads that modern AI factories generate — agentic task execution, reinforcement learning, code compilation, Python and Java runtimes, and data processing pipelines. The announcement was made at NVIDIA GTC Taipei.

Key capabilities include 1.8x faster task completion compared with x86 CPUs, a custom Olympus CPU core engineered for AI factory workloads, 88 cores with Spatial Multithreading, and up to 1.2TB/s of LPDDR5X memory bandwidth. Vera also serves as the host CPU for NVIDIA’s Vera Rubin GPU platforms via second-generation NVLink-C2C, delivering up to 1.8TB/s of coherent CPU-to-GPU bandwidth.

NVIDIA positions Vera as the successor to its Grace CPU line, which has shipped nearly 2.5 million units to date. The shift in AI factory economics — from cores per dollar to tokens per dollar — is driving the need for CPUs that can complete orchestration, tool use, and sandbox execution faster and at greater concurrency.

Olympus Core — NVIDIA’s Custom CPU Architecture

At the heart of Vera is Olympus, a custom CPU core NVIDIA engineered specifically for the workloads that sit on the critical path of AI agent execution. These include Python runtimes, sandboxed code execution, orchestration logic, and analytics pipelines — the steps that happen between GPU kernel calls and determine how quickly agents can complete tasks.

Vera features 88 Olympus cores paired with Spatial Multithreading, a technique for processing more instructions across large numbers of concurrent environments, queries, and data processing tasks simultaneously. The LPDDR5X memory subsystem delivers up to 1.2TB/s of bandwidth, reducing the time agents spend waiting on CPU-bound steps and keeping accelerators active.
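How much a faster CPU shortens an agent task depends on how much of that task is CPU-bound. Amdahl's law gives a simple estimate. In the sketch below, the 1.8x factor is NVIDIA's stated figure, treated here as a speedup of the CPU-bound portion only (an assumption), and the CPU-bound fractions are hypothetical values chosen for illustration.

```python
def end_to_end_speedup(cpu_fraction: float, cpu_speedup: float) -> float:
    """Amdahl's law: overall speedup when only the CPU-bound part gets faster.

    cpu_fraction is the share of task time spent in CPU-bound steps (0 to 1);
    cpu_speedup is how much faster those steps run on the new CPU.
    """
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cpu_speedup)


CPU_SPEEDUP = 1.8  # NVIDIA's stated figure versus x86

# Hypothetical shares of agent task time spent on CPU-bound work.
for fraction in (0.25, 0.50, 0.75, 1.00):
    overall = end_to_end_speedup(fraction, CPU_SPEEDUP)
    print(f"CPU-bound share {fraction:.0%}: overall speedup {overall:.2f}x")
```

The estimate approaches the full 1.8x only when nearly all of the task time is CPU-bound; for GPU-dominated workloads the end-to-end gain is smaller.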

According to benchmarks from Phoronix, Vera delivered the fastest overall performance across agentic workloads — including code compilation, Python, Java, and database processing — compared with competing processors tested.

Vera in the AI Factory — From Standalone Servers to GPU-Coupled Systems

Vera is designed to run across the entire AI factory stack, not just in one configuration. It powers three distinct system types:

  • Standalone Vera CPU servers — standard CPU-only configurations for data processing, orchestration, and agentic AI workloads, offered by Dell Technologies, HPE, Lenovo, and Supermicro as an alternative to x86
  • NVIDIA Vera Rubin systems — Vera serves as the host CPU tightly coupled to Rubin GPUs via second-generation NVLink-C2C, providing up to 1.8TB/s of coherent bandwidth between processor and GPU
  • NVIDIA Vera BlueField-4 STX — integrates Vera with high-performance networking, storage acceleration, and in-silicon security for AI-native storage platforms

Vera also extends NVIDIA Confidential Computing at rack scale, protecting agentic workloads end-to-end across the data center.

Deployment Plans — AI Labs, Hyperscalers, and NYSE

A broad set of customers is evaluating or planning to adopt Vera for production workloads.

Anthropic, the company behind Claude, is evaluating Vera for CPU-intensive agentic workloads. James Bradbury, head of compute at Anthropic, noted that scaling compute is an important accelerant for model growth and called Vera a promising part of the ecosystem for agentic workloads.

Oracle Cloud Infrastructure is planning to deploy Vera CPUs to support high-throughput reasoning and data processing across next-generation AI environments. Mahesh Thiagarajan, EVP of OCI, described it as the next frontier in hyperscale AI supercomputing.

NYSE is collaborating with Redpanda and HPE to use Vera CPUs to scale capacity and further optimize latency across its market infrastructure, which processes more than 1.1 trillion messages per day.

Other customers exploring or planning to deploy Vera include OpenAI, SpaceXAI, ByteDance, CoreWeave, Lambda, Nebius, Nscale, and Cloudflare, among others.

System Builders and Cloud Providers

Vera CPUs are available in two form factors: dense, liquid-cooled racks for large-scale agentic AI and reinforcement learning environments, and flexible two-socket air-cooled systems for enterprise, cloud, data processing, and AI factory deployments.

Infrastructure providers building Vera-based systems include Aivres, ASRock Rack, ASUS, Compal, Dell Technologies, Foxconn, GIGABYTE, HPE, Hyve Solutions, Inventec, Lenovo, MiTAC Computing, MSI, Pegatron, Quanta Cloud Technology (QCT), Supermicro, Wistron, and Wiwynn.

Cloud service providers planning to offer Vera CPU capacity include Akamai, ByteDance, Cloudflare, CoreWeave, Crusoe, Lambda, Nebius, Nscale, Oracle Cloud Infrastructure, Redpanda, Starburst, Together AI, and Vultr.

Rollout Timing — What’s Live When

Phase | When | Scope
Production | Now (announced at GTC Taipei 2026) | Vera CPU in full production
System availability | Fall 2026 | Vera-based servers from system builders and cloud partners

FAQ / Common Questions

What is NVIDIA Vera and what does it do?
Vera is NVIDIA’s first CPU designed specifically for AI agent workloads. It handles the CPU-intensive tasks in AI factories — orchestration, code execution, Python and Java runtimes, data processing — and is built to complete these steps 1.8x faster than x86 processors, keeping GPU accelerators busy and improving agent throughput.

What makes Vera different from NVIDIA’s Grace CPU?
Vera is built on Olympus, a new custom CPU core NVIDIA engineered from the ground up for AI agent execution. Grace focused on general high-performance computing in data centers; Vera targets the token-per-dollar economics of AI factories, with Spatial Multithreading and LPDDR5X memory bandwidth optimized for concurrent agent environments.

Which companies are planning to use NVIDIA Vera?
AI labs including Anthropic, OpenAI, and SpaceXAI are evaluating or planning to adopt Vera. ByteDance, CoreWeave, Lambda, Nebius, Nscale, and Oracle Cloud Infrastructure are also among the companies exploring or planning deployments. NYSE is collaborating with HPE and Redpanda to use Vera across its market infrastructure.

When will Vera-based servers be available?
Vera systems from system builders and cloud partners are expected to be available starting fall 2026.

What is Vera BlueField-4 STX?
It is a system configuration that integrates the Vera CPU with high-performance networking, storage acceleration, and in-silicon security, creating a secure-by-design AI-native data platform for storage workloads in AI factories.


Note: Details above are based on NVIDIA’s announcement at GTC Taipei 2026 and are subject to change. Final feature availability, rollout timing, and supported configurations may vary. Verify against NVIDIA’s official channels before relying on any specific detail.

Disclaimer: This post summarizes an NVIDIA product announcement for informational purposes. It is not affiliated with or endorsed by NVIDIA or any manufacturer mentioned.

