Rent an RTX 5090: the only consumer card with 32 GB and native FP4. 32 GB GDDR7 at 1.79 TB/s, 21,760 Blackwell cores — fits Llama-2 13B FP16 single-card with a 64-request KV cache, runs Flux batch-4 at ~5.8 it/s, generates native 720p Hunyuan video without offload latency. Spun up in under 90 seconds, billed per minute, paid in BTC, USDT/USDC or CLORE. The consumer ceiling.
5090s eat 13B FP16 inference and 4K-tile diffusion for breakfast. Pick the template, rent for the hours you need, walk away with a checkpoint.
Only consumer GPU that fits 13B FP16 single-card with proper KV cache headroom. 1.79 TB/s GDDR7 and Blackwell FP4 paths give roughly 1.4× a 4090 on Flux production. The card that lets indie devs ship workloads they used to need an A6000 for.
SDXL, Flux, and HunyuanVideo at native resolution. Blender Cycles with OptiX turning out 4K frames at 22 s/frame on a single card.
vLLM and TensorRT-LLM containers ship pre-tuned for Blackwell. 32 GB VRAM means 13B FP16 models on a single GPU — no tensor-parallel headache.
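Once the container is up, the vLLM server speaks the OpenAI-compatible HTTP API on your instance's public endpoint. A minimal stdlib-only client sketch — the endpoint address and port are hypothetical placeholders you'd replace with the values from your rental dashboard:

```python
import json
import urllib.request


def build_chat_payload(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Assemble an OpenAI-compatible /v1/chat/completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(endpoint: str, payload: dict) -> dict:
    """POST the payload to the vLLM server and return the parsed JSON reply."""
    req = urllib.request.Request(
        f"{endpoint}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    payload = build_chat_payload("meta-llama/Llama-2-13b-chat-hf", "Hello!")
    # Hypothetical public endpoint from your dashboard:
    # print(chat("http://203.0.113.7:8000", payload))
```

The same payload works against TensorRT-LLM behind an OpenAI-compatible frontend, so switching runtimes doesn't change your client code.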
Specs vs. what you've probably been renting. All numbers from Nvidia's reference spec sheets; pricing is the lowest on-demand rate live in the marketplace right now.
// prices are spot-market lows · refreshed every 60 s
Every server is priced by its host. These are the live floors across the marketplace — you'll see hundreds of variants once you're in.
No sales call. No quota request. No three-week procurement. The first four steps are all you need.
Filter the marketplace by RTX 5090, country, GPU count, reliability score, network speed.
Choose a Docker image — PyTorch, vLLM, ComfyUI, Blender — or paste your own.
You get a public endpoint, an SSH key, and Jupyter on port 8888 in under 90 s.
Per-minute billing rounds to the second. Stop the instance and the meter stops with it.
If you need >24 GB on a consumer card, yes — the 5090's 32 GB GDDR7 fits Llama-2 13B FP16 in single-card memory and runs ~1.4× transformer training throughput vs the 4090. For workloads of 24 GB or less, the 4090 is still the better $/throughput pick.
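The single-card fit is back-of-envelope arithmetic. A rough estimator, assuming Llama-2-13B-style dimensions (40 layers, hidden size 5120) — illustrative numbers, not a guarantee for your exact model and runtime:

```python
def weights_gb(params_b: float, bytes_per_param: float = 2.0) -> float:
    """Weight memory in GB for a model of `params_b` billion parameters (FP16 = 2 bytes)."""
    return params_b * 1e9 * bytes_per_param / 1e9


def kv_bytes_per_token(layers: int, hidden: int, bytes_per_value: float = 2.0) -> float:
    """KV-cache bytes per token: one K and one V hidden-size vector per layer."""
    return 2 * layers * hidden * bytes_per_value


vram_gb = 32.0
w = weights_gb(13)                           # 26.0 GB of FP16 weights
headroom = (vram_gb - w) * 1e9               # ~6 GB left for the KV cache
kv_fp16 = kv_bytes_per_token(40, 5120)       # ~0.82 MB per token at FP16
kv_fp4 = kv_bytes_per_token(40, 5120, 0.5)   # FP4 KV: a quarter of that
print(f"{w:.0f} GB weights; "
      f"{headroom / kv_fp16:,.0f} KV tokens at FP16 vs "
      f"{headroom / kv_fp4:,.0f} at FP4")
```

The FP4 figures are why the Blackwell KV path matters: the same 6 GB of headroom holds roughly four times as many cached tokens, which is what makes batching dozens of concurrent requests on one card plausible.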
Consumer cards on CLORE.AI cover most hobby and indie workflows: Stable Diffusion 1.5 and SDXL, ComfyUI/Automatic1111, Flux.1, LoRA and QLoRA fine-tuning of 7B-13B LLMs, Whisper transcription, video transcoding, Blender Cycles, and game-server hosting. Anything that fits in 8-32 GB VRAM and runs in Docker runs here. You get full root SSH plus a Jupyter template if you want one.
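LoRA fine-tuning of 7B-13B models is comfortable in this VRAM class because only the low-rank adapters train. A rough count of trainable parameters, assuming hypothetical rank-16 adapters on the q and v projections of a Llama-2-13B-style model (40 layers, hidden size 5120):

```python
def lora_trainable_params(layers: int, d_in: int, d_out: int,
                          rank: int, matrices_per_layer: int) -> int:
    """LoRA adds A (d_in x r) and B (r x d_out) per adapted weight matrix."""
    per_matrix = rank * (d_in + d_out)
    return layers * matrices_per_layer * per_matrix


base = 13_000_000_000
trainable = lora_trainable_params(layers=40, d_in=5120, d_out=5120,
                                  rank=16, matrices_per_layer=2)
print(f"{trainable:,} trainable params "
      f"({100 * trainable / base:.3f}% of the base model)")
```

Roughly 13 million trainable parameters against 13 billion frozen ones, which is why the optimizer state fits alongside the quantized base weights in QLoRA setups.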
Cold-start lands in roughly 60-90 seconds for a typical Docker image: server allocation, container pull, GPU passthrough, SSH up. Pre-cached templates (PyTorch, ComfyUI, vLLM, Ollama) are faster because the image is already on the host. Once running you pay per minute, so a 10-minute experiment costs ten minutes of rental, not an hour.
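Per-minute billing is easy to reason about. A quick sketch, using a hypothetical $0.60/hour rate for illustration:

```python
def rental_cost(rate_per_hour: float, minutes: float) -> float:
    """Per-minute billing: you pay for the minutes used, no hourly rounding."""
    return rate_per_hour * minutes / 60


# A 10-minute experiment costs a tenth of the hourly rate.
print(f"10 min at $0.60/h: ${rental_cost(0.60, 10):.2f}")   # $0.10
print(f"Hourly-rounded equivalent: $0.60")
```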
On-demand is a fixed per-hour price the host sets; the rental cannot be revoked while you have funds. Spot is auction-style: you bid, the highest bidder runs, and a higher bidder can preempt you. Spot is typically 30-50% cheaper. CLORE.AI charges 2.5% on spot and 10% on on-demand, split 50/50 with the host.
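The fee structure above in code form — a sketch of what a settled rental costs, using the stated 2.5% spot and 10% on-demand rates, with the fee split 50/50 between the platform and the host:

```python
def marketplace_fee(amount: float, spot: bool) -> float:
    """Fee on a settled rental: 2.5% for spot, 10% for on-demand."""
    rate = 0.025 if spot else 0.10
    return amount * rate


def fee_split(amount: float, spot: bool) -> tuple[float, float]:
    """The fee is split 50/50 between the platform and the host."""
    fee = marketplace_fee(amount, spot)
    return fee / 2, fee / 2


print(marketplace_fee(100, spot=True))    # 2.5 on a $100 spot rental
print(marketplace_fee(100, spot=False))   # 10.0 on a $100 on-demand rental
```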
Spot prices on CLORE.AI usually beat RunPod community pricing because there is no centralized markup; you rent directly from the host with a 2.5% spot fee. Vast.ai is the closest comparison, and on consumer cards CLORE.AI is generally within a few cents per hour. Hold CLORE in your wallet for Proof of Holding and you stack up to 50% off the marketplace fee.
Yes. Point at any registry: Docker Hub, GHCR, Quay, or your own private registry. Then set env vars, port forwards, and your SSH public key in the rent dialog. Templates on the platform are just preset configs; nothing is locked down. You get full root inside the container with GPU passthrough.
32 GB GDDR7 at 1.79 TB/s — the only consumer card that fits Llama-2 13B FP16 single-card with native FP4 throughput.
32 GB fits 13B FP16 weights + 64-request KV cache; FP4 KV is a Blackwell-only optimization.
~1.4× a 4090 on Flux production thanks to GDDR7 bandwidth and FP4 tensor paths.
32 GB removes offload latency — single-card generative video at production cadence.
Side-by-side specs across the consumer tier. Click any row to see that GPU.
Step-by-step guides verified on CLORE.AI hardware. Pick a workload, copy the docker image, ship in minutes.
Per-minute payouts in BTC, USDT, USDC or CLORE. No listing fee, no contracts, withdraw any time.
Hosts around the world are accepting workloads right now. Sign up, top up your wallet, and the next hour is yours.