Rent an RTX 5080 for Blackwell FP4 inference on a 16 GB consumer card. 16 GB GDDR7 at 960 GB/s — Flux at FP4 doubles Ada throughput, a 13B LLM at INT8 serves at ~1,800 tok/s, Wan2.1 video gen runs with offload. Spun up in under 90 seconds, billed per-minute, paid in BTC, USDT/USDC or CLORE. New silicon, new tensor paths, real efficiency wins on production inference.
The 5080 brings Blackwell tensor cores down to consumer pricing — 16 GB GDDR7, FP4 throughput, and bandwidth-rich silicon in a 360 W envelope. Best for 16 GB-class production diffusion and quantized inference.
First consumer GPU with native FP4 tensor cores. Flux at FP4 roughly doubles 4080-class throughput, GDDR7 lifts 13B serving by ~1.3× over a 4080 on the same quant. The efficiency king for 7B/13B inference where 16 GB is enough but you want the 2026 silicon advantage.
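A hedged sketch of what 4-bit Flux looks like in practice, via diffusers and bitsandbytes. Note that bitsandbytes NF4 is a software 4-bit path that fits FLUX.1-dev into 16 GB (Blackwell's hardware NVFP4 kernels currently ship via TensorRT instead); the step count and prompt are illustrative.

```python
import torch
from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

# Quantize the large DiT transformer to 4-bit so FLUX.1-dev fits in 16 GB.
quant = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",          # software 4-bit; hardware NVFP4 needs TensorRT
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev", subfolder="transformer",
    quantization_config=quant, torch_dtype=torch.bfloat16,
)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer, torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()         # spill text encoders to CPU between stages

image = pipe("a red fox in fresh snow, golden hour",
             num_inference_steps=28, guidance_scale=3.5).images[0]
image.save("fox.png")
```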
SD 1.5, SDXL, ComfyUI workflows. Blender Cycles with OptiX delivers solid 1080p–4K renders at hobbyist-friendly cost.
vLLM and TGI containers run 7B models at FP16 and 13B models at INT8/AWQ with comfortable batch sizes. The cheapest path to production-grade open-source inference.
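A minimal vLLM sketch sized for the 16 GB card, assuming a 7B instruct model at FP16 (the model id is an example; swap in an AWQ checkpoint for 13B):

```python
from vllm import LLM, SamplingParams

# A 7B model at FP16 is ~14 GB of weights; leave headroom for the KV cache.
llm = LLM(
    model="mistralai/Mistral-7B-Instruct-v0.3",  # example id; any 7B works
    gpu_memory_utilization=0.90,
    max_model_len=8192,
)
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Explain GDDR7 bandwidth in one paragraph."], params)
print(outputs[0].outputs[0].text)
```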
For jobs where 16 GB still fits but you want Blackwell's energy efficiency and FP4 throughput. Specs from Nvidia's reference sheet.
// prices are spot-market lows · refreshed every 60 s
Every server is priced by its host. These are the live floors across the marketplace — you'll see hundreds of variants once you're in.
No sales call. No quota request. No three-week procurement. The first four steps are all you need.
Filter the marketplace by RTX 5080, country, GPU count, reliability score, network speed.
Choose a Docker image — PyTorch, vLLM, ComfyUI, Blender — or paste your own.
You get a public endpoint, an SSH key, and Jupyter on port 8888 in under 90 s.
Per-minute billing rounds to the second. Stop the instance and the meter stops with it.
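Once you're in over SSH, a quick sanity check confirms the card (a sketch assuming the PyTorch template, or any image with PyTorch installed):

```python
import torch

assert torch.cuda.is_available(), "no GPU visible; check the container's GPU passthrough"
print(torch.cuda.get_device_name(0))              # expect an RTX 5080
props = torch.cuda.get_device_properties(0)
print(f"{props.total_memory / 1e9:.1f} GB VRAM, "
      f"compute capability {props.major}.{props.minor}")
```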
The 5080 has 16 GB GDDR7 vs the 4090’s 24 GB GDDR6X — less VRAM but newer Blackwell tensor cores with FP4 support. For 16 GB-class workloads (SDXL, 7B fine-tune, 13B INT8) the 5080 wins on energy and FP4 throughput. For 24 GB workloads (34B, 70B INT4 across 2 cards) the 4090 still wins.
Consumer cards on CLORE.AI cover most hobby and indie workflows: Stable Diffusion 1.5 and SDXL, ComfyUI/Automatic1111, Flux.1, LoRA and QLoRA fine-tuning of 7B-13B LLMs, Whisper transcription, video transcoding, Blender Cycles, and game-server hosting. Anything that fits in 8-32 GB VRAM and runs in Docker runs here. You get full root SSH plus a Jupyter template if you want one.
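As one concrete example from that list, a hedged QLoRA setup with transformers + peft (model id, rank, and target modules are illustrative, not a tuned recipe):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit base weights keep a 13B model well inside 16 GB during fine-tuning.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-hf",        # example id; any 7B-13B causal LM
    quantization_config=bnb, device_map="auto",
)
lora = LoraConfig(r=16, lora_alpha=32,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()      # typically <1% of weights train
```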
Cold-start lands in roughly 60-90 seconds for a typical Docker image: server allocation, container pull, GPU passthrough, SSH up. Pre-cached templates (PyTorch, ComfyUI, vLLM, Ollama) are faster because the image is already on the host. Once running you pay per minute, so a 10-minute experiment costs ten minutes of rental, not an hour.
On-demand is a fixed per-hour price the host sets; the rental cannot be revoked while you have funds. Spot is auction-style: you bid, the highest bidder runs, and a higher bidder can preempt you. Spot is typically 30-50% cheaper. CLORE.AI charges 2.5% on spot and 10% on on-demand, split 50/50 with the host.
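The math is simple enough to sanity-check yourself; a toy calculation using the numbers above (the $0.30/h listing price is hypothetical):

```python
def rental_cost(price_per_hour: float, minutes: float) -> float:
    """Per-minute billing: pay only for the minutes you actually run."""
    return price_per_hour * minutes / 60.0

on_demand = rental_cost(0.30, 10)          # hypothetical $0.30/h listing
spot = rental_cost(0.30 * (1 - 0.40), 10)  # spot is typically 30-50% cheaper
print(f"10 min on-demand: ${on_demand:.3f}")  # $0.050
print(f"10 min spot:      ${spot:.3f}")       # $0.030
```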
Spot prices on CLORE.AI usually beat RunPod community pricing because there is no centralized markup; you rent directly from the host with a 2.5% spot fee. Vast.ai is the closest comparison, and on consumer cards CLORE.AI is generally within a few cents per hour. Hold CLORE in your wallet for Proof of Holding and you stack up to 50% off the marketplace fee.
Yes. Point at any registry (Docker Hub, GHCR, Quay, or your private registry), then set env vars, port forwards, and your SSH public key in the rent dialog. Templates on the platform are just preset configs; nothing is locked down. You get full root inside the container with GPU passthrough.
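For intuition, the equivalent of what the host launches looks roughly like this with docker-py (image name, env var, and port mapping are placeholders; on CLORE.AI you set these in the rent dialog instead):

```python
import docker

client = docker.from_env()
container = client.containers.run(
    "ghcr.io/yourname/trainer:latest",     # placeholder image
    detach=True,
    environment={"HF_TOKEN": "<your token>"},
    ports={"8888/tcp": 8888},              # e.g. Jupyter
    device_requests=[docker.types.DeviceRequest(
        count=-1, capabilities=[["gpu"]],  # pass through all GPUs
    )],
)
print(container.id)
```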
16 GB GDDR7 with Blackwell tensor cores and FP4 support — the efficiency king for 7B/13B inference.
FP4 on Blackwell roughly doubles Flux throughput vs Ada FP8 at the same VRAM footprint. Read the guide →
GDDR7 bandwidth (960 GB/s) lifts 13B serving throughput ~1.3× vs a 4080 on the same quant. Read the guide →
Wan2.1 fits with offload on 16 GB; scale to 5090 for native 5-second clips without offload latency. Read the guide →
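A hedged sketch of that offload path via the diffusers Wan integration (the 1.3B variant is shown; exact class names track the diffusers release you install):

```python
import torch
from diffusers import AutoencoderKLWan, WanPipeline
from diffusers.utils import export_to_video

model_id = "Wan-AI/Wan2.1-T2V-1.3B-Diffusers"
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae",
                                       torch_dtype=torch.float32)
pipe = WanPipeline.from_pretrained(model_id, vae=vae,
                                   torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()   # the offload that keeps peak VRAM under 16 GB

frames = pipe(prompt="a cat surfing a turquoise wave", height=480, width=832,
              num_frames=81, guidance_scale=5.0).frames[0]
export_to_video(frames, "wave.mp4", fps=16)
```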
Side-by-side specs across the consumer tier. Click any row to see that GPU.
Step-by-step guides verified on CLORE.AI hardware. Pick a workload, copy the Docker image, ship in minutes.
Per-minute payouts in BTC, USDT, USDC or CLORE. No listing fee, no contracts, withdraw any time.
Hosts around the world are accepting workloads right now. Sign up, top up your wallet, and the next hour is yours.