Crypto Games
  • Crypto Games
    • News
    • Presales
    • Airdrops & Giveaways
    • Tournaments & Events
    • Reviews
    • Guides
    • Editorials
  • Reviews
  • Others
    • Blockchains
      • News
    • Dapps
    • Press Release
  • Regular Games
Reading: Nvidia's New AI Model Turns 1 Photo Into a 60-Second Controllable World on a Single GPU
Share
Telegram News
Crypto Games Crypto Games
Font ResizerAa
  • Crypto Games
  • Reviews
  • Others
  • Regular Games
Search
  • Crypto Games
    • News
    • Presales
    • Airdrops & Giveaways
    • Tournaments & Events
    • Reviews
    • Guides
    • Editorials
  • Reviews
  • Others
    • Blockchains
    • Dapps
    • Press Release
  • Regular Games
Follow US
Copyright © 2026 CryptoGames.GG. All Rights Reserved.
Crypto Games > Blog > Artificial Intelligence (AI) > Nvidia’s New AI Model Turns 1 Photo Into a 60-Second Controllable World on a Single GPU
Artificial Intelligence (AI)

Nvidia’s New AI Model Turns 1 Photo Into a 60-Second Controllable World on a Single GPU

Staycalm4now By Staycalm4now - Owner Last updated: May 16, 2026 7 Min Read
We may include affiliate links in our content, meaning we could earn a commission—or receive blockchain-based assets—if you click a link and make a purchase or take a specific action. Additionally, we use generative AI to help draft and refine our posts for clarity and grammar. All content is fact-checked and reviewed by a human editor before publication.
Nvidia's New AI Model Turns 1 Photo Into a 60-Second Controllable World on a Single GPU
SHARE

Give SANA-WM one image and a camera path. Thirty-four seconds later, you have a minute-long, 720p video where the camera moves exactly where you told it to go. No multi-GPU cluster. No cloud rental. One consumer graphics card.

That’s the pitch from Nvidia‘s latest open-source release, and the benchmarks back it up.

What SANA-WM does

Ad image Ad image

SANA-WM is a 2.6 billion-parameter world model, a system that takes a single image, a text prompt and a six-degrees-of-freedom (6-DoF) camera trajectory, then synthesizes a photorealistic video that follows that trajectory frame by frame. The output is 720p, runs up to 60 seconds, and maintains camera precision that beats models five times its size.

This isn’t image-to-video generation where you type a prompt and hope for the best. You’re defining the camera path through a 3D-consistent scene. The model tracks rotation, translation and movement with metric-scale accuracy across the full minute.

NVIDIA just unleashed SANA-WM and it’s an absolute MONSTER for the future of open source AI!

A blazing-fast 2.6B-parameter open-source world model that doesn’t just generate video… it creates controllable, physics-rich, high-fidelity worlds on demand.
Why this is insanely… pic.twitter.com/fQ4IOHGhEK

— Brian Roemmele (@BrianRoemmele) May 16, 2026

The single-GPU trick

Most competing world models require eight GPUs just for inference. LingBot-World uses 14 billion parameters across two models on eight GPUs and produces 0.6 videos per hour. HY-WorldPlay needs eight GPUs for 1.1 videos per hour at 480p.

SANA-WM generates 24.1 videos per hour on a single GPU at 720p. With the full refinement pipeline running on eight H100s, throughput hits 22.0 videos per hour, a 36x advantage over LingBot-World at comparable visual quality scores.

The distilled inference variant, running on a single RTX 5090 with NVFP4 quantization, denoises a complete 60-second 720p clip in 34 seconds. Total memory footprint for the full pipeline: 74.7 GB, which fits inside an H100’s 80 GB budget.

How it handles minute-long video without melting

Standard attention in video diffusion models scales quadratically with sequence length. A 60-second video at 720p means 961 latent frames. On a single GPU, standard softmax attention simply runs out of memory.

SANA-WM solves this with a hybrid architecture. Fifteen of its 20 transformer blocks use frame-wise Gated DeltaNet (GDN), a recurrent mechanism that maintains a constant-size memory state regardless of how long the video gets. A decay gate forgets stale frames. A delta-rule correction updates only the difference between what the model predicts and what it needs, not the entire state.

The remaining five blocks use traditional softmax attention, placed at regular intervals to anchor long-range spatial consistency where the recurrent approach alone falls short.

To prevent the gradient instability that killed earlier attempts, the team developed a key-scaling formula (1/√(D·S)) that keeps the transition matrix bounded. Without it, training crashes with NaN errors within the first 16 steps.

Camera control that tracks two timescales

Controlling a camera through a minute-long scene requires precision at two different temporal rates. SANA-WM uses a dual-branch system to handle both.

The coarse branch operates at the latent-frame rate, applying Unified Camera Positional Encoding (UCPE) to capture the global trajectory structure across the full sequence.

The fine branch addresses a compression problem: each latent token represents eight raw video frames, each with its own camera pose. The fine branch computes pixel-wise Plücker ray maps from all eight frames, packs them into a 48-channel tensor, and injects this data after each self-attention output. This recovers the intra-stride camera motion that the coarse branch can’t see.

In ablation tests, UCPE alone achieved a Camera Motion Consistency (CamMC) score of 0.2453. Adding Plücker mixing dropped it to 0.2047, the best among all compared methods.

A second-stage refiner cleans up drift

Raw output from the first stage is spatiotemporally consistent but can develop structural artifacts over long sequences. A second-stage refiner, built on the 17 billion-parameter LTX-2 model with rank-384 LoRA adapters, corrects these issues in just three denoising steps.

The impact is measurable. On hard camera trajectories, visual quality degradation from the first 10 seconds to the last 10 seconds (ΔIQ) dropped from 3.09 to 0.31 after refinement. On simple trajectories, it went from 3.79 to 1.17.

Training took 18.5 days on 64 H100s

The entire training pipeline used 212,975 video clips drawn from seven public and synthetic sources, with metric-scale 6-DoF pose annotations generated by a modified version of the VIPE camera-pose annotation engine.

Training proceeded in four progressive stages over approximately 15 days, preceded by a 3.5-day VAE adaptation step. The process started with short five-second clips to establish the GDN architecture, then scaled to full 60-second sequences with camera control, and finished with distillation to reduce inference to four denoising steps.

Custom fused Triton kernels for GDN scan and gate operations contributed 1.5x to 2x throughput gains throughout training.

Benchmark results against the field

On Nvidia’s purpose-built 60-second benchmark (80 scenes across four categories with simple and hard trajectory splits), SANA-WM with the refiner achieved rotation errors of 4.50° and 8.34° on simple and hard splits respectively, versus 10.47° and 18.99° for LingBot-World. VBench visual quality scores hit 80.62 and 81.89, matching LingBot-World’s 81.82 and 81.89 while outputting at 720p instead of 480p on a single GPU instead of eight.

The model has acknowledged limitations. There’s no explicit 3D scene memory, which means it can drift in dynamic scenes or unusual viewpoints. The authors suggest using the fast stage-one model to search through trajectory options, then selectively running the refiner on promising results.

Open source under Apache 2.0

SANA-WM is available through the NVlabs/Sana GitHub repository with Apache 2.0 licensing for the code. Individual dataset and model weight licenses vary. The repo also hosts SANA, SANA-1.5, SANA-Sprint, and SANA-Video.

Three inference variants ship with the release: a bidirectional generator for highest-quality offline synthesis, a chunk-causal autoregressive generator for sequential streaming, and the distilled autoregressive variant that hits the 34-second mark on a single RTX 5090.

You Might Also Like

My Lovely Planet Unveils $120K Pink Diamond Contest

Shard Legends: Clan Wars Mining Season Opens with SLCW Token Rewards

EVE Frontier: Play Free Until December 9

Grand Arena Preseason Launches With Training, Leaderboards and Path to the 1M Dollar Prize Pool

Axie Classic Competitive Season 14 launches with new challenges.

TAGGED:AllNVIDIASANA-WM
Share This Article
Facebook X Whatsapp Whatsapp Reddit Telegram Copy Link Print
Share
By Staycalm4now
Owner
Follow:
George Tsagkarakis, known as Staycalm4now is a professional author in the crypto gaming industry since early 2018. He has experienced all the growth of Blockchain Gaming and helped multiple projects achieve their goals and established a player base. He is the co-founder of egamers.io and now the Founder and owner of CryptoGames.gg He is also the COO of MyStage, an AI x Crypto Startup.
Previous Article Bored Club Hosts Multiple Dinners For BAYC & MAYC Holders Across Canada Bored Club Hosts Multiple Dinners For BAYC & MAYC Holders Across Canada
Next Article Myth Legends Sets May 22 Google Play launch, Confirmed For CROSS Gamechain Myth Legends Sets May 22 Google Play launch, Confirmed For CROSS Gamechain
Leave a Comment
Subscribe
Login
Notify of
Please login to comment
0 Comments
Oldest
Newest Most Voted
FacebookLike
XFollow
YoutubeSubscribe
TiktokFollow
TelegramFollow

Stay Updated

Join our telegram Channel and stay in the loop with the most important news.
Latest News
Gigaverse launches GIGATHON #1 hackathon for web3 developers
Gigaverse launches GIGATHON #1 hackathon for web3 developers
June 16, 2026
Immortal Rising 2 launches pet system update with new companions
Immortal Rising 2 launches pet system update with new companions
June 16, 2026
A hooded person reaching for a Download Now button as black tentacles with binary code emerge from the screen
Grand Theft Auto 6 hype fuels data theft risks, warns NordVPN
June 16, 2026
Toncoin rebrands to Gram as TON token switch launches today
Toncoin rebrands to Gram as TON token switch launches today
June 16, 2026
Fishing Frenzy shuts down as developer Uncharted closes operations
Fishing Frenzy shuts down as developer Uncharted closes operations
June 15, 2026
A penguin with a backpack looks toward a glowing ice tower in a snowy fantasy world with banners saying Farewell and Explore
Pudgy Party Ends Development To Prioritize Pudgy World Project
June 15, 2026
LOL Land launches Fun House Club feature with PvP leaderboard
LOL Land launches Fun House Club Feature With PvP leaderboard
June 12, 2026
NAKA Merge 2048 launches with play-to-earn and free-to-play modes
NAKA Merge 2048 Launches: New Dual-Mode Puzzle Game on Polygon
June 12, 2026

You Might Also Like

Ethereum Price Prediction 2026: Can ETH Maintain Its Edge as New Web3 Ecosystems Emerge?
CryptoNews

Ethereum Price Prediction 2026: Can ETH Maintain Its Edge as New Web3 Ecosystems Emerge?

9 Min Read
BountyKinds Season 8 Head Start Event: Hall Raid Now Live
Crypto GamesNews

BountyKinds Season 8 Head Start Event: Hall Raid Now Live

4 Min Read
"Mythical Games Launches Innovative USDC Feature: Revolutionizing In-Game Transactions"
Crypto GamesNews

Mythical Games to Launch Pulse Market

3 Min Read
Xion Blockchain Launches $720K eSports Extravaganza
Crypto GamesNewsTournaments & Events

Xion Blockchain Initiates $720K Gaming Tournament

3 Min Read

Always Stay Up to Date

Subscribe to our newsletter to get our newest articles instantly!
[mc4wp_form]
Crypto Games GG Logo. Crypto Games GG Logo.

CryptoGames.GG is a Crypto Games List and News Portal.

We share valuable information about Play To Earn Games and Other Web3 Projects.

While CryptoGames.GG uses AI to produce and draft content; every piece of information is fact-checked by a human, reviewed, and edited as needed.

News

  • Crypto Games
    • News
    • Presales
    • Airdrops & Giveaways
    • Tournaments & Events
    • Reviews
    • Guides
    • Editorials
  • Reviews
  • Others
    • Blockchains
      • News
    • Dapps
    • Press Release
  • Regular Games

The Boring Stuff

  • About Us
  • RSS Feeds
  • Contact
  • Disclaimer
  • Terms and Conditions
  • Privacy Policy
  • Review Process Statement

Join Our New Telegram Group

Discover the most importa news, from presales to giveaways and game updates.
Join Now
2026 CryptoGames.GG All Rights Reserved
wpDiscuz
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?