# CLORE.AI — Full Content Index

> Single-file Markdown snapshot of clore.ai's public content. Generated for LLM ingestion. For the short index, see https://clore.ai/llms.txt

## Metadata

- **Generated:** 2026-05-04 (deterministic — re-running produces byte-identical output)
- **Source:** structured JSON under `scripts/seo/` (no HTML parsing)
- **URL:** https://clore.ai/llms-full.txt
- **Pages indexed:** 50
- **GPUs covered:** 24 models, 48 landing pages (rent + host)
- **License:** Content available for AI training and citation under the terms at https://clore.ai/terms-and-conditions

## Table of Contents

- Homepage (URL: https://clore.ai/)
- Marketplace overview (URL: https://clore.ai/marketplace)
- GPU landing pages (rent + host) — grouped by tier:
  - **Consumer**
    - RTX 3070 — rent: https://clore.ai/rent-3070.html, host: https://clore.ai/host-3070.html
    - RTX 3080 — rent: https://clore.ai/rent-3080.html, host: https://clore.ai/host-3080.html
    - RTX 3090 — rent: https://clore.ai/rent-3090.html, host: https://clore.ai/host-3090.html
    - RTX 4070 — rent: https://clore.ai/rent-4070.html, host: https://clore.ai/host-4070.html
    - RTX 4080 — rent: https://clore.ai/rent-4080.html, host: https://clore.ai/host-4080.html
    - RTX 4090 — rent: https://clore.ai/rent-4090.html, host: https://clore.ai/host-4090.html
    - RTX 5080 — rent: https://clore.ai/rent-5080.html, host: https://clore.ai/host-5080.html
    - RTX 5090 — rent: https://clore.ai/rent-5090.html, host: https://clore.ai/host-5090.html
    - RTX 4070 Ti — rent: https://clore.ai/rent-4070-ti.html, host: https://clore.ai/host-4070-ti.html
  - **Professional / workstation**
    - RTX A4000 — rent: https://clore.ai/rent-a4000.html, host: https://clore.ai/host-a4000.html
    - RTX A5000 — rent: https://clore.ai/rent-a5000.html, host: https://clore.ai/host-a5000.html
    - RTX A6000 — rent: https://clore.ai/rent-a6000.html, host: https://clore.ai/host-a6000.html
    - RTX 6000 Ada — rent: https://clore.ai/rent-6000-ada.html, host: https://clore.ai/host-6000-ada.html
    - A40 — rent: https://clore.ai/rent-a40.html, host: https://clore.ai/host-a40.html
  - **Inference**
    - NVIDIA L4 — rent: https://clore.ai/rent-l4.html, host: https://clore.ai/host-l4.html
    - NVIDIA L40S — rent: https://clore.ai/rent-l40s.html, host: https://clore.ai/host-l40s.html
    - Tesla T4 — rent: https://clore.ai/rent-t4.html, host: https://clore.ai/host-t4.html
    - A10 — rent: https://clore.ai/rent-a10.html, host: https://clore.ai/host-a10.html
  - **Datacenter / training**
    - Tesla V100 — rent: https://clore.ai/rent-v100.html, host: https://clore.ai/host-v100.html
    - A100 40GB — rent: https://clore.ai/rent-a100-40gb.html, host: https://clore.ai/host-a100-40gb.html
    - A100 80GB — rent: https://clore.ai/rent-a100-80gb.html, host: https://clore.ai/host-a100-80gb.html
    - H100 — rent: https://clore.ai/rent-h100.html, host: https://clore.ai/host-h100.html
    - H200 — rent: https://clore.ai/rent-h200.html, host: https://clore.ai/host-h200.html
    - B200 — rent: https://clore.ai/rent-b200.html, host: https://clore.ai/host-b200.html
- Tier comparison tables: consumer, pro, inference, datacenter
- Tier FAQs (rent + host): consumer, pro, inference, datacenter
- Per-GPU FAQs (24 entries)
- Payment methods
- Fees and pricing model
- Brand and network
- Skipped routes

---

## URL: https://clore.ai/

### CLORE.AI — Decentralized GPU Cloud Marketplace

> CLORE.AI is the world's largest decentralized GPU cloud marketplace. Rent GPU servers for AI training, ML inference, rendering, and compute workloads. Pay per minute in Bitcoin, CLORE tokens, USDT, or USDC with no minimum commitment.

CLORE.AI is a peer-to-peer marketplace connecting GPU owners with users who need compute power. Hosts list their servers; renters choose from thousands of GPUs at competitive prices. All workloads run inside Docker containers for isolation and portability. The platform supports both on-demand rentals (fixed price) and a spot market (auction-style bidding for lower prices).

**Network at a glance**

| Metric | Value |
| --- | --- |
| Founded | 2022 |
| Country | Czech Republic |
| GPUs available | 12,000+ |
| Servers online | 3,000+ |
| Countries covered | 50+ |
| Billing | Per-minute, no minimum commitment |
| Payment methods | BTC, CLORE, USDT, USDC |

**Key differentiators**

- Largest decentralized GPU network with 12,000+ cards across 50+ countries.
- Crypto-native payments — BTC, CLORE, USDT, USDC, all per-minute.
- Spot market with auction-style bidding for the lowest possible price.
- On-demand rentals with fixed host pricing for production workloads.
- Proof of Holding (PoH) — lock CLORE for marketplace fee discounts up to 50%.
- MFP (Maximum Fair Price) staking — daily emission rewards up to +200% of rental price.
- Auto-mining fallback for hosts when GPUs are idle.
- DAO governance for protocol direction and treasury allocation.

---

## URL: https://clore.ai/marketplace

### GPU Marketplace — On-demand and spot rentals

> Browse and rent GPU servers from 12,000+ GPUs across 3,000+ servers in 50+ countries. Filter by GPU model, VRAM, country, reliability score, and price. Pay per minute, in BTC, CLORE, USDT, or USDC.

The marketplace is the entry point for renters. Two markets coexist on the same listings:

- **On-demand** — fixed per-hour price set by the host. The rental cannot be revoked while the renter has funds. Marketplace fee 10% total, split 50/50 between host and renter.
- **Spot** — auction-style bidding. Highest bidder runs; a higher bidder can preempt. Spot is typically 30–50% cheaper than on-demand. Marketplace fee 2.5% total, split 50/50.

Every listing exposes the canonical filters renters care about — GPU model, VRAM, CUDA cores, country, reliability score, 30-day uptime, NVLink/MIG capability, and inbound network speed. Templates (PyTorch, ComfyUI, vLLM, Ollama) cover the common boot images; bring your own Docker image from any registry to run a custom stack with full root SSH and your own SSH key.
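
The fee split described for the two markets can be illustrated with a small calculation. This is a minimal sketch, not the platform's billing code: it assumes the fee is computed on the host's listed price, with half added to what the renter pays and half deducted from what the host receives. The source only states the totals (10% on-demand, 2.5% spot) and the 50/50 split, so that interpretation, the `settle` function, and the `Settlement` type are hypothetical.

```python
from dataclasses import dataclass

# Total marketplace fee per market, as stated on the marketplace page.
FEE_RATES = {"on_demand": 0.10, "spot": 0.025}


@dataclass(frozen=True)
class Settlement:
    renter_pays: float
    host_receives: float
    marketplace_fee: float


def settle(listed_price_per_hour: float, minutes: int, market: str) -> Settlement:
    """Split the marketplace fee 50/50 between renter and host.

    Assumption (not confirmed by the source): the fee base is the host's
    listed price, prorated per minute.
    """
    base = listed_price_per_hour * minutes / 60
    fee = base * FEE_RATES[market]
    half = fee / 2
    return Settlement(
        renter_pays=base + half,
        host_receives=base - half,
        marketplace_fee=fee,
    )


# Hypothetical listing at $1.00/hr, rented for one hour on each market.
on_demand = settle(1.00, 60, "on_demand")
spot = settle(1.00, 60, "spot")
print(on_demand)
print(spot)
```

Under these assumptions, the renter and the host each carry 5% of the listed price on-demand and 1.25% on spot.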

---

# Consumer GPUs

## URL: https://clore.ai/rent-3070.html

### Rent NVIDIA GeForce RTX 3070

> Rent an **RTX 3070** for the price of a coffee per hour. 8 GB GDDR6 is the right floor for first-time AI dev boxes — Stable Diffusion 1.5, SDXL at 768², Llama-3 8B INT4 chat, Whisper transcription, YOLOv8 inference. Spun up in under **90 seconds**, billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. Cheaper than a Colab Pro subscription and the GPU is yours for the whole minute.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 8 GB GDDR6 |
| TDP | 220 W |
| Memory bandwidth | 448 GB/s |
| CUDA cores | 5,888 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.10/hr |
| Spot (average) | $0.12/hr |
| On-demand (average) | $0.18/hr |

**Typical workloads:** Stable Diffusion 1.5, SDXL with optimization, small LLM inference (Llama-3 8B INT4), Whisper transcription, video transcoding, game servers

An 8 GB Ampere card that punches above its weight on quantized LLMs, SD 1.5 production, and lightweight inference pipelines.

**Workload spotlights:**

- **SDXL Turbo 1-step generation** — stack: ComfyUI + xformers + fp16, tiled VAE; metric: ~1.4 it/s @ 768², batch 2. 8 GB VRAM is tight for SDXL — stick to 768² with tiled VAE, or fall back to SD 1.5 for batch 4 at 512².
- **Llama-3 8B INT4 chat (Ollama)** — stack: Ollama + llama.cpp Q4_K_M; metric: ~55 tok/s single-stream. Quantized 8B weights consume ~5 GB VRAM — leaves headroom for 8K-context chats and a small embedding model.
- **Whisper-large v3 transcription** — stack: faster-whisper + CTranslate2 fp16; metric: ~12× realtime on 16 kHz audio. Streaming transcription jobs cost a few cents per hour of audio at 3070 spot prices.

**Why this card:** *8 GB at the absolute floor of pricing* — The cheapest entry to the AI dev box. Quantized 8B LLMs, SD 1.5 production, and YOLO inference run comfortably — and a full hour of compute costs less than a takeaway coffee.
Perfect first card for hobbyists, students, and anyone benchmarking before scaling up.

**Q: Is 8 GB VRAM enough for SDXL on a 3070?**
A: Yes — SDXL runs at 768² with optimizations like xformers, fp16, and tiled VAE. For full 1024² batch-2 you'll want a 3080 or 3090. Quantized 8B LLMs and SD 1.5 fit comfortably.

**Related GPUs:** [RTX 3080](https://clore.ai/rent-3080.html), [RTX 3090](https://clore.ai/rent-3090.html), [RTX 4080](https://clore.ai/rent-4080.html)

---

## URL: https://clore.ai/host-3070.html

### Host NVIDIA GeForce RTX 3070 on CLORE.AI

> List your **RTX 3070** on Clore.ai. Even an 8 GB card pays — hobbyists running SD 1.5, Ollama, and Whisper rent these all day for low-cost AI experiments. Net **~$91/month** per card before any MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Lock CLORE behind the server and earn up to **+200%** daily emission on top.

**Estimated earnings:** $91/month net at $0.10/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 8 GB GDDR6 |
| TDP | 220 W |
| Memory bandwidth | 448 GB/s |
| CUDA cores | 5,888 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.10/hr |
| Spot (average) renter price | $0.12/hr |
| On-demand (average) renter price | $0.18/hr |
| Estimated monthly net | $91/month |

**Why host this card:** *Old card, real demand still* — 8 GB Ampere is still the cheapest path into AI for new developers — and the rental volume is steady. A single 3070 net-pays around $91/month at typical spot, more on-demand. Good way to monetize a desk card before retiring it from your gaming rig.

**Related GPUs:** [RTX 3080](https://clore.ai/host-3080.html), [RTX 3090](https://clore.ai/host-3090.html), [RTX 4080](https://clore.ai/host-4080.html)

---

## URL: https://clore.ai/rent-3080.html

### Rent NVIDIA GeForce RTX 3080

> Rent an **RTX 3080** when 8 GB is too tight but 24 GB is overkill. 10 GB GDDR6X at 760 GB/s is the cheapest card that runs SDXL natively at 1024², Llama-3 8B FP16 inference, and 7B QLoRA fine-tunes. Spun up in under **90 seconds**, billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. The graduation card from SD 1.5 hobbyism into real Stable Diffusion XL production.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 10 GB GDDR6X |
| TDP | 320 W |
| Memory bandwidth | 760 GB/s |
| CUDA cores | 8,704 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.14/hr |
| Spot (average) | $0.18/hr |
| On-demand (average) | $0.26/hr |

**Typical workloads:** SDXL 1024² batch-1, Llama-3 8B FP16 inference, QLoRA on 7B, Stable Video Diffusion, Blender Cycles renders

10 GB GDDR6X and 760 GB/s bandwidth make the 3080 the entry point for full-resolution SDXL and 7B fine-tuning.

**Workload spotlights:**

- **SDXL 1024² production** — stack: Automatic1111 + xformers + fp16; metric: ~3.1 it/s @ 1024² batch 1. Tiled VAE pushes to batch 2 — the 3080 is the cheapest card that runs SDXL natively at full res.
- **Llama-3 8B FP16 inference** — stack: llama.cpp server, batch 1, 8K context; metric: ~70 tok/s. FP16 8B weights fit with KV cache headroom; switch to INT8 for 13B with offload.
- **XTTS v2 voice cloning** — stack: Coqui XTTS + fp16; metric: ~0.18 RTF (5.5× realtime). Real-time voice cloning pipeline with 6-second reference samples — production-grade for podcast tooling.

**Why this card:** *Cheapest native SDXL 1024 card* — 10 GB GDDR6X at 760 GB/s is exactly the spec where SDXL stops needing tiled VAE workarounds and starts running natively at 1024² batch-1.
Spot floor sits around $0.14/hr — the budget-conscious upgrade path from 3070-class hobbyist work into real diffusion pipelines.

**Q: Can a 3080 run SDXL at full 1024² resolution?**
A: Yes — 10 GB GDDR6X is enough for SDXL at 1024² batch-1, and with tiled VAE you can push to batch-2. For batch-4 production pipelines, step up to a 3090 or 4080 with 16+ GB.

**Related GPUs:** [RTX 3070](https://clore.ai/rent-3070.html), [RTX 3090](https://clore.ai/rent-3090.html), [RTX 4080](https://clore.ai/rent-4080.html)

---

## URL: https://clore.ai/host-3080.html

### Host NVIDIA GeForce RTX 3080 on CLORE.AI

> List your **RTX 3080** and let SDXL hobbyists rent it through the night. 10 GB is the sweet spot for diffusion workloads that have outgrown 8 GB cards — net around **$133/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Withdraw any time, no caps. Stake CLORE behind the server for up to **+200%** daily emission on top of rental.

**Estimated earnings:** $133/month net at $0.15/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 10 GB GDDR6X |
| TDP | 320 W |
| Memory bandwidth | 760 GB/s |
| CUDA cores | 8,704 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.15/hr |
| Spot (average) renter price | $0.18/hr |
| On-demand (average) renter price | $0.26/hr |
| Estimated monthly net | $133/month |

**Why host this card:** *The diffusion workhorse renters look for* — Renters specifically search for 10 GB SDXL-capable cards at the price floor — and the 3080 nails that bracket. Higher utilization than 3070-class because SDXL is the entry workload most users run for hours, not minutes. Steady rental fill at attractive on-demand premiums.

**Related GPUs:** [RTX 3070](https://clore.ai/host-3070.html), [RTX 3090](https://clore.ai/host-3090.html), [RTX 4080](https://clore.ai/host-4080.html)

---

## URL: https://clore.ai/rent-3090.html

### Rent NVIDIA GeForce RTX 3090

> Rent an **RTX 3090** for the cheapest 24 GB VRAM on the consumer tier. 24 GB GDDR6X at 936 GB/s — runs Flux production at 1024², 13B QLoRA fine-tuning with bitsandbytes, and dual-card 70B INT4 serving via ExLlamaV2 with tensor parallel. Spun up in under **90 seconds**, billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. The legacy value pick of 2026.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 24 GB GDDR6X |
| TDP | 350 W |
| Memory bandwidth | 936 GB/s |
| CUDA cores | 10,496 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.18/hr |
| Spot (average) | $0.22/hr |
| On-demand (average) | $0.30/hr |

**Typical workloads:** QLoRA on 13B–34B models, SDXL batch-4, Flux.1, ComfyUI production, 70B INT4 across 2 cards

24 GB on a consumer card unlocks 13B–34B QLoRA, Flux production, and dual-GPU 70B INT4 — the legacy value pick of 2026.

**Workload spotlights:**

- **Flux.1 dev 1024²** — stack: ComfyUI + fp8 dev checkpoint; metric: ~1.6 s/it @ 1024² batch 1. 24 GB lets you keep T5-XXL encoder resident; cuts cold-start latency vs swapping on a 16 GB card.
- **QLoRA fine-tune Llama-3 13B** — stack: PEFT + bitsandbytes 4-bit + Flash Attn 2; metric: ~1,900 tokens/s, ~14 GB peak VRAM. Standard 13B QLoRA fits with 4K context and gradient checkpointing — a complete fine-tune in a few hours of spot.
- **70B INT4 across 2× 3090** — stack: ExLlamaV2 + tensor parallel; metric: ~22 tok/s aggregated. Two 3090s cost less than a single 4090 and serve 70B INT4 with 32K context for solo developers.

**Why this card:** *Cheapest 24 GB VRAM you can rent* — Same VRAM ceiling as a 4090 at roughly 60% of the rental price.
No FP8 and slower bandwidth, but for budget-conscious 13B–34B QLoRA, Flux production, and dual-card 70B INT4 the 3090 is the value pick — and there is plenty of supply on the spot market.

**Q: Why pick a 3090 over a 4090?**
A: The 3090 has the same 24 GB VRAM as the 4090 at roughly 60% of the rental price. Slower memory bandwidth (936 vs 1,008 GB/s) and no FP8, but for budget-sensitive 24 GB workloads it's the value pick.

**Related GPUs:** [RTX 4080](https://clore.ai/rent-4080.html), [RTX 4090](https://clore.ai/rent-4090.html), [RTX A4000](https://clore.ai/rent-a4000.html)

---

## URL: https://clore.ai/host-3090.html

### Host NVIDIA GeForce RTX 3090 on CLORE.AI

> List your **RTX 3090** on Clore.ai. 24 GB consumer cards pull steady rental volume from solo AI developers running Flux, 13B fine-tunes, and dual-card 70B serving — net around **$165/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Withdraw any time, no caps. Stake CLORE for up to **+200%** daily emission on top of every rental hour.

**Estimated earnings:** $165/month net at $0.22/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 24 GB GDDR6X |
| TDP | 350 W |
| Memory bandwidth | 936 GB/s |
| CUDA cores | 10,496 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.22/hr |
| Spot (average) renter price | $0.22/hr |
| On-demand (average) renter price | $0.30/hr |
| Estimated monthly net | $165/month |

**Why host this card:** *24 GB at consumer pricing fills 24/7* — Solo developers and indie ML teams gravitate to the cheapest 24 GB rental on the marketplace, and the 3090 is exactly that. NVLink-pair listings command an extra premium for 70B INT4 serving. High-utilization card with predictable monthly net well above 3070/3080 tier.

**Related GPUs:** [RTX 4080](https://clore.ai/host-4080.html), [RTX 4090](https://clore.ai/host-4090.html), [RTX A4000](https://clore.ai/host-a4000.html)

---

## URL: https://clore.ai/rent-4070.html

### Rent NVIDIA GeForce RTX 4070

> Rent an **RTX 4070** when you want Ada efficiency on a budget. 12 GB GDDR6X at a 200 W envelope — runs SDXL 1024² with ControlNet, Llama-3 8B FP16 inference at ~80 tok/s, DreamBooth on SD 1.5, ComfyUI graphs with adapter stacks. Spun up in under **90 seconds**, billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. Cooler, quieter and newer than Ampere at a friendly hourly rate.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ada Lovelace |
| VRAM | 12 GB GDDR6X |
| TDP | 200 W |
| Memory bandwidth | 504 GB/s |
| CUDA cores | 5,888 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.16/hr |
| Spot (average) | $0.20/hr |
| On-demand (average) | $0.28/hr |

**Typical workloads:** SDXL 1024² batch-1, ComfyUI, Llama-3 8B FP16, ControlNet, Whisper, video transcode

12 GB Ada at a 200 W envelope — the efficient pick for SDXL, 8B inference, and ControlNet-heavy ComfyUI graphs.

**Workload spotlights:**

- **SDXL 1024² with ControlNet** — stack: ComfyUI + ControlNet + fp16; metric: ~2.6 it/s @ 1024² batch 1. 12 GB fits SDXL + a single ControlNet adapter; for stacked adapters move to 4070 Ti or 4080.
- **Llama-3 8B FP16 inference** — stack: llama.cpp server, 8K context; metric: ~80 tok/s single-stream. Ada cores edge out Ampere on transformer inference per watt — 200 W TDP keeps host costs low.
- **DreamBooth on SD 1.5** — stack: Diffusers + 8-bit Adam; metric: ~9 min for 1,500 steps, batch 2. Subject DreamBooth runs comfortably on 12 GB; SDXL DreamBooth wants a 16 GB card or tight memory tricks.

**Why this card:** *Ada Lovelace at 200 W envelope* — 12 GB GDDR6X with modern Ada cores at a fraction of the power draw of a 3090.
Runs SDXL 1024² + one ControlNet, 8B FP16 chat at ~80 tok/s, and SD 1.5 DreamBooth without breaking a sweat. The energy-conscious pick for daily AI development work.

**Q: Can a 4070 handle SDXL and 7B LLMs?**
A: Yes — 12 GB GDDR6X fits SDXL 1024² batch-1 and 7B Llama FP16 inference comfortably. Tighter than a 3090 but cheaper, modern Ada cores, and lower power (200 W vs 350 W). Step up to 4070 Ti or 4080 for batch-2 SDXL or 13B.

**Related GPUs:** [RTX 3070](https://clore.ai/rent-3070.html), [RTX 4070 Ti](https://clore.ai/rent-4070-ti.html), [RTX 4080](https://clore.ai/rent-4080.html)

---

## URL: https://clore.ai/host-4070.html

### Host NVIDIA GeForce RTX 4070 on CLORE.AI

> List your **RTX 4070** on Clore.ai. 12 GB Ada cards rent at premium $/W ratios because their 200 W envelope keeps your power bill down while clearing the full SDXL and 8B-LLM workload bracket. Net around **$155/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Withdraw any time, no caps, no minimum balance. Stake CLORE for up to **+200%** daily emission.

**Estimated earnings:** $155/month net at $0.22/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ada Lovelace |
| VRAM | 12 GB GDDR6X |
| TDP | 200 W |
| Memory bandwidth | 504 GB/s |
| CUDA cores | 5,888 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.22/hr |
| Spot (average) renter price | $0.20/hr |
| On-demand (average) renter price | $0.28/hr |
| Estimated monthly net | $155/month |

**Why host this card:** *Best $/watt in the consumer tier* — 200 W TDP at 12 GB Ada keeps your electric bill at floor while still booking the same workloads renters look for at the 10–12 GB bracket. Hosts in expensive-electricity regions consistently report better net margins on 4070 than on hotter Ampere cards of similar throughput.

**Related GPUs:** [RTX 3070](https://clore.ai/host-3070.html), [RTX 4070 Ti](https://clore.ai/host-4070-ti.html), [RTX 4080](https://clore.ai/host-4080.html)

---

## URL: https://clore.ai/rent-4080.html

### Rent NVIDIA GeForce RTX 4080

> Rent an **RTX 4080** for production diffusion and 7B/13B serving. 16 GB GDDR6X with 9,728 Ada cores — the cheapest card that runs SDXL batch-4 at ~6.5 it/s, serves Llama-3 8B FP16 via vLLM, and finishes 8B QLoRA fine-tunes in 2–3 hours of spot rental. Spun up in under **90 seconds**, billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. The 16 GB production tier.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ada Lovelace |
| VRAM | 16 GB GDDR6X |
| TDP | 320 W |
| Memory bandwidth | 716 GB/s |
| CUDA cores | 9,728 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.27/hr |
| Spot (average) | $0.32/hr |
| On-demand (average) | $0.42/hr |

**Typical workloads:** SDXL/SD3/Flux production, LoRA fine-tuning on 7B, 13B inference, Unreal Engine cinematics, AV1 video encode

16 GB Ada — the production pick for SDXL/Flux at scale, 7B fine-tunes, and 13B INT8 inference.

**Workload spotlights:**

- **vLLM serving Llama-3 8B FP16** — stack: vLLM + continuous batching; metric: ~1,400 tok/s aggregated, p50 35 ms. 16 GB fits 8B FP16 plus 16-request KV cache — the cheapest card to run a real serving stack.
- **SDXL batch-4 production** — stack: ComfyUI + xformers + fp16; metric: ~6.5 it/s @ 1024² batch 4. Batch-4 generation pipeline for client work — 16 GB clears all VAE/CLIP/UNet caches simultaneously.
- **Llama-3 8B QLoRA fine-tune** — stack: PEFT + 4-bit + Flash Attn 2; metric: ~3,100 tokens/s, ~12 GB peak. 8B fine-tunes complete in 2–3 hours of 4080 spot rental — fits 8K context with gradient checkpointing.
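
To put the "2–3 hours of 4080 spot rental" figure in dollar terms, the sketch below prorates the average spot rate from the pricing table per minute. It is an illustration under stated assumptions: `rental_cost` is a hypothetical helper, each started minute is assumed to be billed in full (the source only says "billed per-minute"), and the renter's share of the marketplace fee is not included.

```python
import math


def rental_cost(hourly_rate_usd: float, runtime_minutes: float) -> float:
    """Prorate an hourly rate per minute.

    Assumption: a started minute is billed as a whole minute.
    """
    billed_minutes = math.ceil(runtime_minutes)
    return hourly_rate_usd * billed_minutes / 60


SPOT_AVG_4080 = 0.32  # USD/hr, "Spot (average)" from the pricing table

low = rental_cost(SPOT_AVG_4080, 2 * 60)   # 2-hour fine-tune
high = rental_cost(SPOT_AVG_4080, 3 * 60)  # 3-hour fine-tune
print(f"${low:.2f} to ${high:.2f}")  # prints "$0.64 to $0.96"
```

Swapping in the on-demand average ($0.42/hr) gives the corresponding range for a rental that cannot be preempted.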

**Why this card:** *16 GB Ada — real serving stack territory* — 16 GB is the floor where vLLM with 16-request KV cache fits cleanly for 8B FP16 serving — the spec where hobby diffusion turns into real production batch work. ~70% of 4090 throughput at ~55% of the rental price, and FP8 inference paths supported.

**Q: When should I pick a 4080 over a 4090?**
A: Pick the 4080 when 16 GB is enough — SDXL batch-2, 7B fine-tuning, 13B INT8 inference. ~70% of 4090 throughput at ~55% of the rental price. Step up to 4090 for 24 GB and 70B INT4 work.

**Related GPUs:** [RTX 3090](https://clore.ai/rent-3090.html), [RTX 4090](https://clore.ai/rent-4090.html), [RTX 5090](https://clore.ai/rent-5090.html)

---

## URL: https://clore.ai/host-4080.html

### Host NVIDIA GeForce RTX 4080 on CLORE.AI

> List your **RTX 4080** and rent to teams running real diffusion production and 8B serving. 16 GB Ada is the cheapest card that fits a real vLLM stack — net around **$258/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Withdraw any time, no caps. Stake CLORE for up to **+200%** daily emission on top of rental.

**Estimated earnings:** $258/month net at $0.28/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ada Lovelace |
| VRAM | 16 GB GDDR6X |
| TDP | 320 W |
| Memory bandwidth | 716 GB/s |
| CUDA cores | 9,728 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.28/hr |
| Spot (average) renter price | $0.32/hr |
| On-demand (average) renter price | $0.42/hr |
| Estimated monthly net | $258/month |

**Why host this card:** *16 GB clears the vLLM threshold* — 16 GB GDDR6X is the inflection point where renters move from hobby projects to production serving stacks — and they pay accordingly.
On-demand floors run measurably above 4070 Ti tier because 4080 fits batch-4 SDXL pipelines and 8B FP16 vLLM with proper KV cache.

**Related GPUs:** [RTX 3090](https://clore.ai/host-3090.html), [RTX 4090](https://clore.ai/host-4090.html), [RTX 5090](https://clore.ai/host-5090.html)

---

## URL: https://clore.ai/rent-4090.html

### Rent NVIDIA GeForce RTX 4090

> Rent an **RTX 4090** — the most-rented consumer card on the network since 2023. 24 GB GDDR6X at 1,008 GB/s, 16,384 Ada cores. Production Flux batch-4, 8B FP16 vLLM serving at 2,200 tok/s, 34B QLoRA, single-card Hunyuan video, dual-card 70B INT4. Spun up in under **90 seconds**, billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. The reference consumer GPU for ComfyUI commercial work.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ada Lovelace |
| VRAM | 24 GB GDDR6X |
| TDP | 450 W |
| Memory bandwidth | 1,008 GB/s |
| CUDA cores | 16,384 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.31/hr |
| Spot (average) | $0.39/hr |
| On-demand (average) | $0.49/hr |

**Typical workloads:** Flux/SDXL production at 1024² batch-4, Llama-3 8B full fine-tunes, 34B QLoRA, Blender 4K, 70B INT4 across 2× cards

24 GB GDDR6X and 1 TB/s bandwidth — the canonical consumer card for Flux production, 34B QLoRA, and 70B INT4.

**Workload spotlights:**

- **Flux.1 dev 1024² batch 4** — stack: ComfyUI + fp8 dev + Flash Attn 2; metric: ~4.2 it/s @ 1024² batch 4. Production-ready Flux pipeline — 4090 is the de facto reference card for ComfyUI commercial work in 2026.
- **vLLM Llama-3 8B FP16 serving** — stack: vLLM 0.6+ + chunked prefill; metric: ~2,200 tok/s aggregated, 32 concurrent. Serves a small product/feature with one card; horizontal scale by adding cards behind a load balancer.
- **Hunyuan-Video 5-second clip** — stack: ComfyUI + fp8 + sequence parallelism; metric: ~7 min per 5 s @ 720p. 24 GB is the floor for Hunyuan; 1× 4090 generates short clips, multi-card scales linearly.

**Why this card:** *Most-rented consumer card on the network* — Battle-tested inventory across 34 regions, 312+ live listings on a typical day, spot floor near $0.31/hr. The de facto reference for ComfyUI commercial pipelines, 34B QLoRA, and dual-card 70B INT4. Whatever you want to run, somebody has already published the recipe for it on a 4090.

**Q: Can I run 70B models on a 4090?**
A: Yes — Llama-3 70B INT4 fits across two 4090s with tensor parallelism via vLLM or ExLlamaV2. For single-card 70B you'll want an H100 or H200. 13B and 34B fit comfortably on one 4090.

**Related GPUs:** [RTX 3090](https://clore.ai/rent-3090.html), [RTX 5090](https://clore.ai/rent-5090.html), [RTX A6000](https://clore.ai/rent-a6000.html)

---

## URL: https://clore.ai/host-4090.html

### Host NVIDIA GeForce RTX 4090 on CLORE.AI

> List your **RTX 4090** on Clore.ai. 24 GB Ada is the highest-utilization consumer GPU on the marketplace — net around **$312/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. The reference card every commercial Flux/SDXL pipeline targets, so on-demand fill stays dense around the clock. Stake CLORE for up to **+200%** daily emission.

**Estimated earnings:** $312/month net at $0.49/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ada Lovelace |
| VRAM | 24 GB GDDR6X |
| TDP | 450 W |
| Memory bandwidth | 1,008 GB/s |
| CUDA cores | 16,384 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.49/hr |
| Spot (average) renter price | $0.39/hr |
| On-demand (average) renter price | $0.49/hr |
| Estimated monthly net | $312/month |

**Why host this card:** *Highest-utilization consumer card on the platform* — Demand for 4090s does not slow down — commercial Flux pipelines, 34B QLoRA jobs, and 70B serving via dual-card tensor parallel all converge on this exact spec. Hosts consistently report the densest rental fill of any consumer-tier card, especially in NVLink pair listings.

**Related GPUs:** [RTX 3090](https://clore.ai/host-3090.html), [RTX 5090](https://clore.ai/host-5090.html), [RTX A6000](https://clore.ai/host-a6000.html)

---

## URL: https://clore.ai/rent-5080.html

### Rent NVIDIA GeForce RTX 5080

> Rent an **RTX 5080** for Blackwell FP4 inference on a 16 GB consumer card. 16 GB GDDR7 at 960 GB/s — Flux at FP4 doubles Ada throughput, Llama-3 13B INT8 serves at 1,800 tok/s, Wan2.1 video gen with offload. Spun up in under **90 seconds**, billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. New silicon, new tensor paths, real efficiency wins on production inference.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Blackwell |
| VRAM | 16 GB GDDR7 |
| TDP | 360 W |
| Memory bandwidth | 960 GB/s |
| CUDA cores | 10,752 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.28/hr |
| Spot (average) | $0.34/hr |
| On-demand (average) | $0.42/hr |

**Typical workloads:** Flux/SD3 production, 7B FP16 fine-tuning, 13B INT8 inference, Hunyuan/Wan2.1 video gen, FP4/FP8 quantized inference

16 GB GDDR7 with Blackwell tensor cores and FP4 support — the efficiency king for 7B/13B inference.

**Workload spotlights:**

- **Flux.1 dev FP4** — stack: ComfyUI + Blackwell FP4 path; metric: ~1.9 s/it @ 1024² batch 2. FP4 on Blackwell roughly doubles Flux throughput vs Ada FP8 at the same VRAM footprint.
- **Llama-3 13B INT8 serving** — stack: vLLM + GPTQ 8-bit; metric: ~1,800 tok/s aggregated, p50 28 ms. GDDR7 bandwidth (960 GB/s) lifts 13B serving throughput ~1.3× vs a 4080 on the same quant.
- **Wan2.1 video generation** — stack: ComfyUI + sequence parallel + fp8; metric: ~5 min per 4 s @ 720p. Wan2.1 fits with offload on 16 GB; scale to 5090 for native 5-second clips without offload latency.

**Why this card:** *Blackwell FP4 on a 16 GB consumer card* — First consumer GPU with native FP4 tensor cores. Flux at FP4 roughly doubles 4080-class throughput, GDDR7 lifts 13B serving by ~1.3× over a 4080 on the same quant. The efficiency king for 7B/13B inference where 16 GB is enough but you want the 2026 silicon advantage.

**Q: How does the 5080 compare to a 4090?**
A: The 5080 has 16 GB GDDR7 vs the 4090's 24 GB GDDR6X — less VRAM but newer Blackwell tensor cores with FP4 support. For 16 GB-class workloads (SDXL, 7B fine-tune, 13B INT8) the 5080 wins on energy and FP4 throughput. For 24 GB workloads (34B, 70B INT4 across 2 cards) the 4090 still wins.

**Related GPUs:** [RTX 4080](https://clore.ai/rent-4080.html), [RTX 4090](https://clore.ai/rent-4090.html), [RTX 5090](https://clore.ai/rent-5090.html)

---

## URL: https://clore.ai/host-5080.html

### Host NVIDIA GeForce RTX 5080 on CLORE.AI

> List your **RTX 5080** on Clore.ai. New Blackwell silicon at a 16 GB price bracket — renters chase the FP4 path for inference workloads, so demand for on-demand rentals runs hot. Net around **$235/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Stake CLORE for up to **+200%** daily emission on top of rental.

**Estimated earnings:** $235/month net at $0.38/hr average host listing price (before any MFP staking emission).
**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Blackwell |
| VRAM | 16 GB GDDR7 |
| TDP | 360 W |
| Memory bandwidth | 960 GB/s |
| CUDA cores | 10,752 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.38/hr |
| Spot (average) renter price | $0.34/hr |
| On-demand (average) renter price | $0.42/hr |
| Estimated monthly net | $235/month |

**Why host this card:** *Blackwell FP4 inventory, still scarce supply* — Consumer Blackwell supply is still tight in 2026 — listings get noticed. ML developers chase FP4 paths for production inference workloads they can't run efficiently on Ada, which keeps on-demand rates above 4080 levels even at the same VRAM bracket.

**Related GPUs:** [RTX 4080](https://clore.ai/host-4080.html), [RTX 4090](https://clore.ai/host-4090.html), [RTX 5090](https://clore.ai/host-5090.html)

---

## URL: https://clore.ai/rent-5090.html

### Rent NVIDIA GeForce RTX 5090

> Rent an **RTX 5090** for the only consumer card with 32 GB and native FP4. 32 GB GDDR7 at 1.79 TB/s, 21,760 Blackwell cores — fits Llama-3 13B FP16 single-card with 64-request KV cache, runs Flux batch-4 at ~5.8 it/s, generates native 720p Hunyuan video without offload latency. Spun up in under **90 seconds**, billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. The consumer ceiling.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Blackwell |
| VRAM | 32 GB GDDR7 |
| TDP | 575 W |
| Memory bandwidth | 1,792 GB/s |
| CUDA cores | 21,760 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.39/hr |
| Spot (average) | $0.49/hr |
| On-demand (average) | $0.62/hr |

**Typical workloads:** Flux/SD3 production, Llama-3 13B FP16 fine-tuning single-card, Hunyuan/Wan2.1 video gen, 34B QLoRA, FP4/FP8 quantized inference

32 GB GDDR7 at 1.79 TB/s — the only consumer card that fits Llama-3 13B FP16 single-card with native FP4 throughput.
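The single-card claim above rests on simple weight arithmetic, which the sketch below works through. It counts model weights only; KV cache, activations and runtime overhead all have to fit in whatever headroom remains. The function names are hypothetical, and 1 GB is taken as 10^9 bytes.

```python
def weight_footprint_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


def headroom_gb(vram_gb: float, params_billion: float, bits_per_param: int) -> float:
    """VRAM left after weights; KV cache and activations come out of this."""
    return vram_gb - weight_footprint_gb(params_billion, bits_per_param)


weights = weight_footprint_gb(13, 16)  # 13B parameters at FP16
spare = headroom_gb(32, 13, 16)        # 32 GB card

print(f"13B FP16 weights: {weights:.0f} GB, headroom on 32 GB: {spare:.0f} GB")
print(f"same model on a 24 GB card: {headroom_gb(24, 13, 16):.0f} GB headroom")
```

A negative headroom means the weights alone exceed the card's VRAM, which is why a 24 GB card needs quantization or offload for the same model.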
**Workload spotlights:**

- **Llama-3 13B FP16 single-card** — stack: vLLM + Blackwell FP4 KV cache; metric: ~3,200 tok/s aggregated, 64 concurrent. 32 GB fits 13B FP16 weights + 64-request KV cache; FP4 KV is a Blackwell-only optimization.
- **Flux.1 dev batch-4 FP4** — stack: ComfyUI + Blackwell FP4; metric: ~5.8 it/s @ 1024² batch 4. ~1.4× a 4090 on Flux production thanks to GDDR7 bandwidth and FP4 tensor paths.
- **Hunyuan-Video native 720p** — stack: ComfyUI + sequence parallel + fp8; metric: ~5 min per 5 s @ 720p, no offload. 32 GB removes offload latency — single-card generative video at production cadence.

**Why this card:** *32 GB on a consumer card, FP4 native* — Only consumer GPU that fits 13B FP16 single-card with proper KV cache headroom. 1.79 TB/s GDDR7 and Blackwell FP4 paths give roughly 1.4× a 4090 on Flux production. The card that lets indie devs ship workloads they used to need an A6000 for.

**Q: Is the 5090 worth the premium over a 4090?**
A: If you need >24 GB on a consumer card, yes. The 5090's 32 GB GDDR7 fits Llama-3 13B FP16 in single-card memory and delivers ~1.4× the transformer training throughput of a 4090. For workloads that fit in 24 GB or less, the 4090 is still the better $/throughput pick.

**Related GPUs:** [RTX 4090](https://clore.ai/rent-4090.html), [RTX A6000](https://clore.ai/rent-a6000.html), [RTX 6000 Ada](https://clore.ai/rent-6000-ada.html)

---

## URL: https://clore.ai/host-5090.html

### Host NVIDIA GeForce RTX 5090 on CLORE.AI

> List your **RTX 5090** on Clore.ai. 32 GB consumer Blackwell is the new top of the consumer tier — net around **$354/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Renters pay Blackwell premiums for FP4 paths and 32 GB headroom that no other consumer card matches. Stake CLORE for up to **+200%** daily emission.

**Estimated earnings:** $354/month net at $0.55/hr average host listing price (before any MFP staking emission).
**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Blackwell |
| VRAM | 32 GB GDDR7 |
| TDP | 575 W |
| Memory bandwidth | 1,792 GB/s |
| CUDA cores | 21,760 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.55/hr |
| Spot (average) renter price | $0.49/hr |
| On-demand (average) renter price | $0.62/hr |
| Estimated monthly net | $354/month |

**Why host this card:** *Consumer ceiling — pricing power follows* — Nothing else on the consumer tier offers 32 GB plus FP4, and that translates straight into pricing power on the marketplace. Indie ML teams repeatedly pay the premium for the only card that runs 13B FP16 single-card and native 720p Hunyuan without offload.

**Related GPUs:** [RTX 4090](https://clore.ai/host-4090.html), [RTX A6000](https://clore.ai/host-a6000.html), [RTX 6000 Ada](https://clore.ai/host-6000-ada.html)

---

## URL: https://clore.ai/rent-4070-ti.html

### Rent NVIDIA GeForce RTX 4070 Ti

> Rent an **RTX 4070 Ti** for the AI hobbyist sweet spot. 12 GB GDDR6X with 7,680 Ada cores — Flux production batches at ~1.1 s per image, 7B QLoRA fine-tunes finishing in under six hours, Llama-3 13B INT8 served via vLLM. Spun up in under **90 seconds**, billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. The card that turns a weekend into a finished fine-tune.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ada Lovelace |
| VRAM | 12 GB GDDR6X |
| TDP | 285 W |
| Memory bandwidth | 504 GB/s |
| CUDA cores | 7,680 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.20/hr |
| Spot (average) | $0.24/hr |
| On-demand (average) | $0.33/hr |

**Typical workloads:** SDXL/Flux production batch-2, 7B QLoRA fine-tuning, 13B INT8 inference, Stable Video Diffusion with offload, Blender Cycles

12 GB Ada with 7,680 cores — the sweet spot for hobbyist Flux production and 13B INT8 serving.
**Workload spotlights:**

- **Flux.1 schnell 1024²** — stack: ComfyUI + 4-step schnell + fp8; metric: ~1.1 s per image. The schnell variant is purpose-built for 4-step generation — ideal for batch image pipelines on a 12 GB card.
- **7B QLoRA fine-tune** — stack: Axolotl + 4-bit NF4 + Flash Attn 2; metric: ~2,300 tokens/s, ~10 GB peak. Mistral/Llama 7B QLoRA at 4K context; the 4070 Ti finishes a 50K-sample run in under 6 hours.
- **Stable Video Diffusion 14-frame** — stack: Diffusers + cpu offload; metric: ~95 s per 14-frame clip @ 576×1024. SVD with offload fits in 12 GB; longer 25-frame variants want a 4080 with 16 GB.

**Why this card:** *Hobbyist's Flux + QLoRA sweet spot* — Sits between the 4070 and 4080 on every benchmark while staying friendlier on price. 12 GB Ada handles Flux Schnell at ~1.1 s/image, 7B QLoRA finishes in under 6 hours, and 13B INT8 serves via vLLM. The card hobbyists pick when they're starting to ship real fine-tunes.

**Q: Is the 4070 Ti the right pick for AI hobbyists?**
A: Often yes — it sits right between the 4070 and 4080 in throughput at a friendlier price. 12 GB Ada VRAM runs Flux/SDXL production, 7B QLoRA, and 13B INT8 inference. If you need 16 GB, go with the 4080; if budget is tight, stick with the 4070.

**Related GPUs:** [RTX 4070](https://clore.ai/rent-4070.html), [RTX 4080](https://clore.ai/rent-4080.html), [RTX 3090](https://clore.ai/rent-3090.html)

---

## URL: https://clore.ai/host-4070-ti.html

### Host NVIDIA GeForce RTX 4070 Ti on CLORE.AI

> List your **RTX 4070 Ti** on Clore.ai. 12 GB Ada with 7,680 cores books at noticeable premiums in its bracket because it clears 7B QLoRA and 13B INT8 inference cleanly — workloads renters pay extra to finish overnight. Net around **$185/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Withdraw any time. Stake CLORE for up to **+200%** daily emission.

**Estimated earnings:** $185/month net at $0.27/hr average host listing price (before any MFP staking emission).
**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ada Lovelace |
| VRAM | 12 GB GDDR6X |
| TDP | 285 W |
| Memory bandwidth | 504 GB/s |
| CUDA cores | 7,680 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.27/hr |
| Spot (average) renter price | $0.24/hr |
| On-demand (average) renter price | $0.33/hr |
| Estimated monthly net | $185/month |

**Why host this card:** *+25% throughput, sub-300 W envelope* — 7,680 Ada cores at 285 W gives roughly 25% more throughput than a plain 4070 at the same VRAM bracket — and that lifts your on-demand pricing materially. Strong utilization for hosts who already run a mix of consumer Ada cards in the same chassis.

**Related GPUs:** [RTX 4070](https://clore.ai/host-4070.html), [RTX 4080](https://clore.ai/host-4080.html), [RTX 3090](https://clore.ai/host-3090.html)

---

# Professional / workstation GPUs

## URL: https://clore.ai/rent-a4000.html

### Rent NVIDIA RTX A4000

> Rent an **RTX A4000** for ECC-protected production work at the lowest pro-tier rates. 16 GB GDDR6 ECC, single-slot 140 W form factor, ISV-certified for V-Ray, SolidWorks, and academic ML pipelines. Background SDXL inference, 24/7 YOLOv8 detection, Blender Cycles renders — all under bit-flip-safe memory. Billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. Same professional driver branch your studio already validates against.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 16 GB GDDR6 ECC |
| TDP | 140 W |
| Memory bandwidth | 448 GB/s |
| CUDA cores | 6,144 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.13/hr |
| Spot (average) | $0.18/hr |
| On-demand (average) | $0.28/hr |

**Typical workloads:** SolidWorks, Rhino + V-Ray, DaVinci Resolve, batch SDXL (4× per node), background ML inference, ECC-required research

16 GB ECC at 140 W single-slot — the workstation pick for ECC-required CAD, V-Ray, and academic ML.
**Workload spotlights:**

- **Blender Cycles + V-Ray RTX** — stack: Blender 4.x OptiX + V-Ray 6; metric: ~1.0× RTX 3070 with ECC integrity. Single-slot 140 W form factor lets hosts pack 4× A4000 in one node — multi-tenant render farms scale linearly.
- **Background SDXL inference** — stack: Automatic1111 + xformers + fp16; metric: ~2.4 it/s @ 1024² batch 1. ECC catches memory bit-flips during 24/7 batch jobs — critical for unattended pipelines processing thousands of images.
- **YOLOv8 object detection** — stack: Ultralytics YOLOv8 + TensorRT; metric: ~210 FPS @ 640², batch 8. Industrial CV pipeline — ECC + ISV certification is mandatory for production manufacturing-floor deployments.

**Why this card:** *ECC + ISV certs at the pro floor* — Single-slot 140 W form factor lets hosts pack four A4000s per workstation, so supply is plentiful and prices stay friendly. Same ISV certifications as A5000/A6000 at a fraction of the rental — the entry point when ECC matters but workloads fit in 16 GB.

**Q: Why pick an A4000 over a 3070 with similar VRAM?**
A: ECC memory and ISV certification — required for production CAD, V-Ray, and academic ML where bit-flip integrity matters. Single-slot 140 W form factor lets hosts pack 4× A4000 in one workstation chassis. Quieter and lower-power than consumer Ampere.

**Related GPUs:** [RTX A5000](https://clore.ai/rent-a5000.html), [RTX 3090](https://clore.ai/rent-3090.html), [RTX A6000](https://clore.ai/rent-a6000.html)

---

## URL: https://clore.ai/host-a4000.html

### Host NVIDIA RTX A4000 on CLORE.AI

> List your **RTX A4000** on Clore.ai. ECC-class hardware commands a premium even at 16 GB — academic researchers, V-Ray studios, and 24/7 inference operators pay extra for ISV-certified silicon. Net around **$148/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Withdraw any time, no caps. Stake CLORE for up to **+200%** daily emission on top.
**Estimated earnings:** $148/month net at $0.24/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 16 GB GDDR6 ECC |
| TDP | 140 W |
| Memory bandwidth | 448 GB/s |
| CUDA cores | 6,144 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.24/hr |
| Spot (average) renter price | $0.18/hr |
| On-demand (average) renter price | $0.28/hr |
| Estimated monthly net | $148/month |

**Why host this card:** *Single-slot 140 W — pack four per chassis* — The single-slot, 140 W form factor lets you put four A4000s in one workstation node and bill them as four independent listings. Best per-chassis revenue density in the pro tier — and the ECC + ISV branding keeps utilization high among research and CAD renters.

**Related GPUs:** [RTX A5000](https://clore.ai/host-a5000.html), [RTX 3090](https://clore.ai/host-3090.html), [RTX A6000](https://clore.ai/host-a6000.html)

---

## URL: https://clore.ai/rent-a5000.html

### Rent NVIDIA RTX A5000

> Rent an **RTX A5000** for studio-grade 24 GB ECC compute. The datacenter-validated workstation card with server thermals, NVLink in pairs, and ISV certification for production V-Ray, Octane, and Citrix virtual workstation pools. 13B fine-tunes on bit-flip-safe weights, 12-hour batch renders without active-fan failure risk. Billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. Same professional driver branch your render farm already certifies against.
**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 24 GB GDDR6 ECC |
| TDP | 230 W |
| Memory bandwidth | 768 GB/s |
| CUDA cores | 8,192 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.22/hr |
| Spot (average) | $0.28/hr |
| On-demand (average) | $0.42/hr |

**Typical workloads:** Production V-Ray and Octane, 13B fine-tuning under ECC, multi-tenant Citrix workstations, ISV-certified studio pipelines

24 GB ECC + NVLink pair option — datacenter-validated alternative to the 3090 for production studios.

**Workload spotlights:**

- **Production V-Ray RTX rendering** — stack: V-Ray 6 + OptiX denoiser; metric: ~1.05× RTX 3080 with ECC stability. ISV-certified for production VFX studios — ECC and datacenter thermals matter for 12-hour batch renders.
- **Llama-3 13B fine-tune (ECC)** — stack: PEFT QLoRA + Flash Attn 2; metric: ~1,700 tokens/s, ECC-clean weights. Academic research and regulated ML pipelines mandate ECC — A5000 keeps the same 24 GB envelope as a 3090 with bit-flip safety.
- **Multi-tenant Citrix workstation** — stack: NVIDIA RTX vWS + Citrix VAD; metric: up to 4 simultaneous users at 1080p. Virtual workstation pools — GRID/vWS license + ECC make A5000 the de-facto remote-CAD hardware in 2026.

**Why this card:** *Studio-grade 24 GB with ECC + NVLink* — ECC-protected 24 GB and NVLink-pair option make the A5000 the production-stable cousin of a 3090. ISV certification for V-Ray and Citrix vWS, server thermals for 12-hour batch renders, and a professional driver branch your existing pipeline already validates against.

**Q: When does A5000 beat a 4090 for studios?**
A: When you need datacenter validation and ECC. ISV certification, NVLink in pairs, and server-grade thermals matter for production V-Ray pipelines, virtual workstations, and academic research. Throughput is lower than a 4090, but the card is production-stable.
**Related GPUs:** [RTX A4000](https://clore.ai/rent-a4000.html), [RTX A6000](https://clore.ai/rent-a6000.html), [RTX 4090](https://clore.ai/rent-4090.html)

---

## URL: https://clore.ai/host-a5000.html

### Host NVIDIA RTX A5000 on CLORE.AI

> List your **RTX A5000** on Clore.ai. 24 GB ECC + NVLink + ISV certification command real on-demand premiums from VFX studios and ML research labs. Net around **$234/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. NVLink-pair listings book at meaningful premiums for shared 48 GB pool work. Stake CLORE for up to **+200%** daily emission.

**Estimated earnings:** $234/month net at $0.38/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 24 GB GDDR6 ECC |
| TDP | 230 W |
| Memory bandwidth | 768 GB/s |
| CUDA cores | 8,192 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.38/hr |
| Spot (average) renter price | $0.28/hr |
| On-demand (average) renter price | $0.42/hr |
| Estimated monthly net | $234/month |

**Why host this card:** *NVLink pair = 48 GB shared pool listing* — Two NVLink-bridged A5000s expose a 48 GB shared memory pool — and renters pay a premium for that exact spec when running 70B QLoRA or large V-Ray scenes. List the pair as a single listing and capture the bandwidth-bound studio market that won't run on consumer cards.

**Related GPUs:** [RTX A4000](https://clore.ai/host-a4000.html), [RTX A6000](https://clore.ai/host-a6000.html), [RTX 4090](https://clore.ai/host-4090.html)

---

## URL: https://clore.ai/rent-a6000.html

### Rent NVIDIA RTX A6000

> Rent an **RTX A6000** for 48 GB ECC at workstation pricing. The default pick when 24 GB runs out — 34B FP16 inference single-card at ~1,000 tok/s, Hunyuan-Video 720p production keeping T5-XXL resident, NVLink-paired 96 GB unified pool for 70B QLoRA via FSDP.
Billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. Studio-friendly pricing for production work that doesn't need full HBM datacenter bandwidth.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 48 GB GDDR6 ECC |
| TDP | 300 W |
| Memory bandwidth | 768 GB/s |
| CUDA cores | 10,752 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.42/hr |
| Spot (average) | $0.52/hr |
| On-demand (average) | $0.78/hr |

**Typical workloads:** Full-precision 13B–34B inference single-card, Unreal Engine 8K cinematics, ANSYS/COMSOL CFD, production Blender exceeding 24 GB, NVLink pair → 96 GB

48 GB ECC at 300 W — the workstation default for 34B inference, 8K VFX, and Blender scenes that exceed 24 GB.

**Workload spotlights:**

- **34B FP16 inference single-card** — stack: vLLM + Flash Attn 2; metric: ~1,000 tok/s aggregated, 16 concurrent. 48 GB fits 34B FP16 weights plus KV cache for moderate concurrency — no offload, no model splitting.
- **Hunyuan-Video 720p production** — stack: ComfyUI + sequence parallel + fp8; metric: ~6 min per 5 s @ 720p. 48 GB lets Hunyuan keep T5-XXL + transformer + VAE all resident — lower latency than 24 GB cards with offload.
- **FSDP fine-tune on 70B (NVLink pair)** — stack: Accelerate + FSDP + 8-bit Adam; metric: ~580 tokens/s across 2 cards (96 GB pool). NVLink pair gives 96 GB unified pool — fits Llama-3 70B QLoRA without offloading optimizer state.

**Why this card:** *48 GB ECC at half of A100 pricing* — When you need more than 24 GB but the workload is not bandwidth-bound, the A6000 is the spec to pick. Runs 34B FP16 single-card, 8K Unreal cinematics, ANSYS CFD, and Blender scenes that exhaust 24 GB — at roughly half the rental price of an A100 80GB.

**Q: When do I need 48 GB instead of 24 GB?**
A: For 34B FP16 single-card inference, QLoRA on 70B with FSDP across an NVLink pair, Unreal cinematics at 8K, and Blender scenes that exhaust 24 GB.
The default pick when you need >24 GB but aren't paying H100 rates.

**Related GPUs:** [RTX A5000](https://clore.ai/rent-a5000.html), [RTX 6000 Ada](https://clore.ai/rent-6000-ada.html), [A100 80GB](https://clore.ai/rent-a100-80gb.html)

---

## URL: https://clore.ai/host-a6000.html

### Host NVIDIA RTX A6000 on CLORE.AI

> List your **RTX A6000** on Clore.ai. 48 GB ECC commands real money — VFX studios, ANSYS shops, and 34B fine-tuning teams hold these for days at a time. Net around **$425/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. NVLink-pair listings clear premium rates for 96 GB unified pool work. Stake CLORE for up to **+200%** daily emission.

**Estimated earnings:** $425/month net at $0.69/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 48 GB GDDR6 ECC |
| TDP | 300 W |
| Memory bandwidth | 768 GB/s |
| CUDA cores | 10,752 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.69/hr |
| Spot (average) renter price | $0.52/hr |
| On-demand (average) renter price | $0.78/hr |
| Estimated monthly net | $425/month |

**Why host this card:** *48 GB workstation cards rent in days* — Renters who need 48 GB are running multi-day workloads — 34B fine-tunes, Unreal cinematics, ANSYS simulations. Average rental duration on A6000 listings runs measurably longer than consumer-tier cards, which means fewer cold-starts, less idle time, and steadier monthly net.

**Related GPUs:** [RTX A5000](https://clore.ai/host-a5000.html), [RTX 6000 Ada](https://clore.ai/host-6000-ada.html), [A100 80GB](https://clore.ai/host-a100-80gb.html)

---

## URL: https://clore.ai/rent-6000-ada.html

### Rent NVIDIA RTX 6000 Ada Generation

> Rent NVIDIA RTX 6000 Ada Generation on CLORE.AI.
Suited to production text-to-3D, high-fidelity VFX, 34B-class fine-tuning, large-batch SDXL exceeding 4090 VRAM, ray-traced rendering on 3rd-gen RT cores, and FP8 inference.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ada Lovelace |
| VRAM | 48 GB GDDR6 ECC |
| TDP | 300 W |
| Memory bandwidth | 960 GB/s |
| CUDA cores | 18,176 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.55/hr |
| Spot (average) | $0.69/hr |
| On-demand (average) | $1.10/hr |

**Typical workloads:** Production text-to-3D, high-fidelity VFX, 34B-class fine-tuning, large-batch SDXL exceeding 4090 VRAM, ray-traced rendering on 3rd-gen RT cores, FP8 inference

**Q: How does the RTX 6000 Ada compare to the A6000?**
A: Roughly 2× transformer throughput at the same 300 W envelope and 48 GB ECC. Ada's FP8-capable 4th-gen tensor cores and 3rd-gen RT cores make it the upgrade for studios already running A6000-class workloads who need more headroom without moving to H100 pricing.

**Related GPUs:** [RTX A6000](https://clore.ai/rent-a6000.html), [RTX 5090](https://clore.ai/rent-5090.html), [A100 80GB](https://clore.ai/rent-a100-80gb.html)

---

## URL: https://clore.ai/host-6000-ada.html

### Host NVIDIA RTX 6000 Ada Generation on CLORE.AI

> List your NVIDIA RTX 6000 Ada Generation on Clore.ai. Estimated $586/month net at average host listing prices.

**Estimated earnings:** $586/month net at $0.95/hr average host listing price (before any MFP staking emission).
**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ada Lovelace |
| VRAM | 48 GB GDDR6 ECC |
| TDP | 300 W |
| Memory bandwidth | 960 GB/s |
| CUDA cores | 18,176 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.95/hr |
| Spot (average) renter price | $0.69/hr |
| On-demand (average) renter price | $1.10/hr |
| Estimated monthly net | $586/month |

**Related GPUs:** [RTX A6000](https://clore.ai/host-a6000.html), [RTX 5090](https://clore.ai/host-5090.html), [A100 80GB](https://clore.ai/host-a100-80gb.html)

---

## URL: https://clore.ai/rent-a40.html

### Rent NVIDIA A40

> Rent an **NVIDIA A40** for 48 GB ECC in proper datacenter form factor. Passive-cooled, NVENC-equipped, ISV-certified — the rack-friendly workstation-class card built for 24/7 batch renders, 34B FP16 inference, Omniverse pipelines, and DreamBooth on SDXL with bit-flip-safe weights. Billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. Server-grade passive thermals for unattended overnight jobs that simply can't tolerate fan failure.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 48 GB GDDR6 ECC |
| TDP | 300 W |
| Memory bandwidth | 696 GB/s |
| CUDA cores | 10,752 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.32/hr |
| Spot (average) | $0.42/hr |
| On-demand (average) | $0.55/hr |

**Typical workloads:** Production rendering (V-Ray, Octane, Cycles), 34B FP16 inference, ECC-required research, virtual workstation pools, Omniverse

48 GB ECC in datacenter form factor — the rack-friendly cousin of the A6000 for render farms and Omniverse.

**Workload spotlights:**

- **Octane / V-Ray batch rendering** — stack: Octane 2024 + RTX path; metric: ~0.95× A6000 with passive DC cooling. Passive cooling and full server-rack form factor — ideal for 24/7 batch render fleets without active fan failure.
- **34B FP16 inference** — stack: vLLM + Flash Attn 2; metric: ~880 tok/s aggregated. Lower bandwidth than A6000 (696 vs 768 GB/s) but otherwise an identical compute envelope, and cheaper thanks to hyperscaler-style availability.
- **DreamBooth on SDXL** — stack: Diffusers + 8-bit Adam + Flash Attn 2; metric: ~22 min for 2,000 steps, batch 2. ECC + 48 GB makes A40 the safe choice for client-deliverable subject DreamBooth runs.

**Why this card:** *48 GB ECC in datacenter form factor* — Passive cooling, server-rack chassis, no fan-failure surface area — the A40 is the A6000 spec built for 24/7 unattended batch jobs. Render farms, Omniverse pipelines, and SaaS inference tenants prefer this form factor when uptime matters more than NVLink topology.

**Q: When should I pick A40 over A6000?**
A: When ECC + datacenter form factor matter and your workload doesn't need NVLink. The A40 is server-rack-friendly (passive cooling, NVENC); the A6000 is a workstation card. Both have 48 GB ECC and similar bandwidth. The A40 is more available in DC fleets, the A6000 in studios.

**Related GPUs:** [RTX A6000](https://clore.ai/rent-a6000.html), [RTX 6000 Ada](https://clore.ai/rent-6000-ada.html), [A100 40GB](https://clore.ai/rent-a100-40gb.html)

---

## URL: https://clore.ai/host-a40.html

### Host NVIDIA A40 on CLORE.AI

> List your **NVIDIA A40** on Clore.ai. Datacenter-form-factor 48 GB ECC books premium rates from rack operators and Omniverse studios — passive cooling handles 24/7 batch loads where active-fan A6000s would cycle. Net around **$310/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Withdraw any time, no caps, no minimum balance. Stake CLORE for up to **+200%** daily emission.

**Estimated earnings:** $310/month net at $0.50/hr average host listing price (before any MFP staking emission).
**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 48 GB GDDR6 ECC |
| TDP | 300 W |
| Memory bandwidth | 696 GB/s |
| CUDA cores | 10,752 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.50/hr |
| Spot (average) renter price | $0.42/hr |
| On-demand (average) renter price | $0.55/hr |
| Estimated monthly net | $310/month |

**Why host this card:** *Passive cooling = 24/7 fleet uptime* — No fan to fail, full datacenter chassis, NVENC for video transcode jobs alongside ML. A40 fleets run at higher utilization than equivalent active-cooled cards because hosts can leave them booked round-the-clock without thermal headroom concerns. Strong fit for multi-tenant rack operators.

**Related GPUs:** [RTX A6000](https://clore.ai/host-a6000.html), [RTX 6000 Ada](https://clore.ai/host-6000-ada.html), [A100 40GB](https://clore.ai/host-a100-40gb.html)

---

# Inference GPUs

## URL: https://clore.ai/rent-l4.html

### Rent NVIDIA L4

> Rent an **NVIDIA L4** for modern Ada-class inference with FP8 and AV1 NVENC. 24 GB Ada at 72 W passive — vLLM serving Llama-3 8B INT8 at ~950 tok/s with 16-request concurrency, three simultaneous 4K60 AV1 transcode streams, Florence-2 captioning at 28 images per second. Continuous-batching headroom prior-gen inference cards never had. Billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. The streaming and embedding card of choice for 2026 datacenter ops.
**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ada Lovelace |
| VRAM | 24 GB GDDR6 |
| TDP | 72 W |
| Memory bandwidth | 300 GB/s |
| CUDA cores | 7,424 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.22/hr |
| Spot (average) | $0.32/hr |
| On-demand (average) | $0.48/hr |

**Typical workloads:** vLLM serving 7B–13B INT8, Whisper-large transcription pipelines, AV1 video transcode, ML platforms billed per-request, low-power inference at 72 W

24 GB Ada at 72 W passive — the modern replacement for T4 with FP8, AV1 NVENC, and continuous-batching headroom.

**Workload spotlights:**

- **vLLM serving Llama-3 8B INT8** — stack: vLLM + GPTQ 8-bit + chunked prefill; metric: ~950 tok/s aggregated, p50 45 ms. 24 GB Ada is the cheapest stable card for production 8B serving at 16 or more concurrent requests.
- **AV1 NVENC video transcode** — stack: FFmpeg + AV1 NVENC; metric: ~3 simultaneous 4K60 AV1 streams. L4's AV1 encoder and 72 W envelope make it the streaming transcode card of choice in 2026.
- **Florence-2 captioning at scale** — stack: Transformers + fp16; metric: ~28 images/s @ 768². Vision-language captioning for stock-photo libraries — 24 GB fits Florence-2 large at batch 16.

**Why this card:** *T4 replacement with FP8 + AV1 NVENC* — 24 GB Ada at 72 W passive — same form factor economics as a T4 but with FP8 tensor paths, AV1 NVENC, and enough VRAM to run vLLM with proper concurrency. The card datacenter operators standardized on for production inference and streaming transcode in 2026.

**Q: Why pick L4 over a 4090 for inference?**
A: The L4 is power-efficient (72 W vs 450 W), passively cooled, and designed for 24/7 multi-tenant inference. It is datacenter-validated for serving stacks like vLLM and Triton. The 4090 is faster per-card; the L4 is cheaper per-request at scale.
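The per-request comparison above comes down to rental rate divided by sustained throughput. As a minimal sketch, the helper below converts the L4 figures quoted on this page into a cost per million tokens. The function name is hypothetical, and the calculation assumes the card stays saturated at the aggregated throughput for the whole billed period, which real traffic rarely achieves.

```python
def cost_per_million_tokens(rate_per_hour: float, tokens_per_second: float) -> float:
    """USD per one million generated tokens at a sustained aggregate throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return rate_per_hour / tokens_per_hour * 1_000_000


# NVIDIA L4 figures from this page: ~950 tok/s aggregated on Llama-3 8B INT8.
spot = cost_per_million_tokens(0.32, 950)       # spot (average) rate
on_demand = cost_per_million_tokens(0.48, 950)  # on-demand (average) rate

print(f"spot: ${spot:.3f} per 1M tokens, on-demand: ${on_demand:.3f} per 1M tokens")
```

The same function can be applied to any card on this page by swapping in its rate and measured throughput for the same model and quantization.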
**Related GPUs:** [Tesla T4](https://clore.ai/rent-t4.html), [NVIDIA L40S](https://clore.ai/rent-l40s.html), [A100 40GB](https://clore.ai/rent-a100-40gb.html)

---

## URL: https://clore.ai/host-l4.html

### Host NVIDIA L4 on CLORE.AI

> List your **NVIDIA L4** on Clore.ai. 24 GB Ada at 72 W passive is the inference workhorse datacenter ops have standardized on for 2026 production endpoints. ML platforms, embedding services, and AV1 transcoders rent these by the month, not the hour. Net around **$260/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Stake CLORE for up to **+200%** daily emission on top.

**Estimated earnings:** $260/month net at $0.42/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ada Lovelace |
| VRAM | 24 GB GDDR6 |
| TDP | 72 W |
| Memory bandwidth | 300 GB/s |
| CUDA cores | 7,424 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.42/hr |
| Spot (average) renter price | $0.32/hr |
| On-demand (average) renter price | $0.48/hr |
| Estimated monthly net | $260/month |

**Why host this card:** *AV1 NVENC + FP8 in 72 W package* — Only inference card that combines AV1 hardware encode with FP8 tensor cores at 72 W passive. Streaming platforms and ML pipelines book L4s by the month — long rentals, low cold-start churn, predictable margin. The natural successor card for T4 fleet operators.

**Related GPUs:** [Tesla T4](https://clore.ai/host-t4.html), [NVIDIA L40S](https://clore.ai/host-l40s.html), [A100 40GB](https://clore.ai/host-a100-40gb.html)

---

## URL: https://clore.ai/rent-l40s.html

### Rent NVIDIA L40S

> Rent an **NVIDIA L40S** for production-grade FP8 70B inference on Ada silicon.
48 GB ECC + Ada FP8 tensor cores — Llama-3 70B FP8 single-card serving at ~720 tok/s with 16-request KV, Hunyuan-Video FP8 at 4.5 min per 5-second clip, LLaVA-1.6 multimodal serving with vision encoder resident. Billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. The default 70B serving target when datacenter HBM supply is tight.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ada Lovelace |
| VRAM | 48 GB GDDR6 ECC |
| TDP | 350 W |
| Memory bandwidth | 864 GB/s |
| CUDA cores | 18,176 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.65/hr |
| Spot (average) | $0.78/hr |
| On-demand (average) | $1.20/hr |

**Typical workloads:** Llama-3 70B FP8 single-card inference, generative video with FP8 quantization, multi-tenant inference clusters, ray-traced rendering on 3rd-gen RT cores, default pick when H100 supply is tight

48 GB ECC + Ada FP8 + 350 W — the production substitute for H100 inference when supply is tight.

**Workload spotlights:**

- **Llama-3 70B FP8 single-card serving** — stack: vLLM + TensorRT-LLM FP8; metric: ~720 tok/s aggregated, 16 concurrent. FP8 quant + 48 GB fits 70B with room for KV cache — typically 40–60% of the price of an H100 for inference workloads.
- **Hunyuan-Video FP8 production** — stack: ComfyUI + Ada FP8 + sequence parallel; metric: ~4.5 min per 5 s @ 720p. FP8 path nearly doubles Hunyuan throughput vs A6000 at the same VRAM — production-grade gen-video card.
- **LLaVA-1.6 multimodal serving** — stack: vLLM + fp16; metric: ~24 images/s + 800 tok/s text. Vision-language SaaS pipeline — 48 GB holds vision encoder + Llama backbone + 16-batch KV simultaneously.

**Why this card:** *70B FP8 serving at sub-H100 rates* — FP8 quant fits Llama-3 70B in 48 GB with KV cache room — and the L40S delivers it at typically 40–60% of an H100's rental price. The pragmatic pick for production inference teams who don't need HBM3 bandwidth or NVLink fabric, just consistent token throughput.
**Q: Is L40S a substitute for H100?**
A: For inference, often yes — FP8 throughput on Llama-3 70B is competitive at a fraction of the rental price. For training, the H100's HBM3 bandwidth and NVLink fabric still win. Pick L40S for serving, H100 for pretraining.

**Related GPUs:** [A100 80GB](https://clore.ai/rent-a100-80gb.html), [NVIDIA L4](https://clore.ai/rent-l4.html), [H100](https://clore.ai/rent-h100.html)

---

## URL: https://clore.ai/host-l40s.html

### Host NVIDIA L40S on CLORE.AI

> List your **NVIDIA L40S** on Clore.ai. 48 GB Ada FP8 silicon books at sustained on-demand premiums from API platforms running 70B inference — the workload that pays month-after-month, not just during training cycles. Net around **$647/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Stake CLORE for up to **+200%** daily emission.

**Estimated earnings:** $647/month net at $1.05/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ada Lovelace |
| VRAM | 48 GB GDDR6 ECC |
| TDP | 350 W |
| Memory bandwidth | 864 GB/s |
| CUDA cores | 18,176 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $1.05/hr |
| Spot (average) renter price | $0.78/hr |
| On-demand (average) renter price | $1.20/hr |
| Estimated monthly net | $647/month |

**Why host this card:** *Production 70B FP8 host with real demand* — Inference platforms have settled on L40S as the cheaper-than-H100 70B serving target, and that demand runs 24/7 on long contracts, not in burst-and-die cycles. Renters consistently book L40S inventory weeks in advance once it shows up on the marketplace.
**Related GPUs:** [A100 80GB](https://clore.ai/host-a100-80gb.html), [NVIDIA L4](https://clore.ai/host-l4.html), [H100](https://clore.ai/host-h100.html)

---

## URL: https://clore.ai/rent-t4.html

### Rent NVIDIA Tesla T4

> Rent a **Tesla T4** for the lowest-cost inference per request in 2026. 16 GB GDDR6 at 70 W passive — Whisper.cpp transcription at 7× realtime, YOLOv8n bulk detection at 340 FPS, BGE-large embedding generation at ~1,100 docs/s. Spot floor sits near $0.08/hr, so high-volume batch jobs cost cents per hour of audio or thousands of images. Billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Turing |
| VRAM | 16 GB GDDR6 |
| TDP | 70 W |
| Memory bandwidth | 320 GB/s |
| CUDA cores | 2,560 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.08/hr |
| Spot (average) | $0.10/hr |
| On-demand (average) | $0.16/hr |

**Typical workloads:** Whisper.cpp transcription, ResNet/YOLO classification, MoE-7B INT8 routing, high-volume embedding generation, 70 W passively cooled

16 GB Turing at 70 W passive — still the cheapest path to high-volume Whisper, YOLO, and embeddings in 2026.

**Workload spotlights:**

- **Whisper.cpp transcription pipeline** — stack: whisper.cpp INT8 + CUDA; metric: ~7× realtime, large-v3 model. 70 W passive form factor and rock-bottom rental price — transcription cost dips below $0.02 per audio hour.
- **YOLOv8n bulk detection** — stack: Ultralytics + TensorRT INT8; metric: ~340 FPS @ 640², batch 16. Massive throughput per dollar on classification/detection — the standard hyperscaler hardware for low-cost CV.
- **Embedding generation (BGE-large)** — stack: sentence-transformers + ONNX; metric: ~1,100 docs/s @ 512 tokens. RAG indexing at scale — T4 is the canonical GPU for batch embedding pipelines in 2026.
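The "below $0.02 per audio hour" figure in the Whisper spotlight follows from dividing the hourly rental rate by the realtime speed-up. A small check using the T4 rates from the pricing table above, assuming the ~7× realtime figure holds end to end and ignoring model load time and idle minutes (the function name is illustrative):

```python
def cost_per_audio_hour(rate_usd_per_hr: float, realtime_factor: float) -> float:
    """At N x realtime, one GPU-hour transcribes N hours of audio."""
    return rate_usd_per_hr / realtime_factor


REALTIME = 7.0  # ~7x realtime, large-v3, as quoted above
rates = {"spot low": 0.08, "spot average": 0.10, "on-demand average": 0.16}

costs = {label: cost_per_audio_hour(rate, REALTIME) for label, rate in rates.items()}
for label, cost in costs.items():
    print(f"{label:>18}: ${cost:.4f} per audio hour")
```

On these numbers the sub-$0.02 figure holds at both spot rates; at the on-demand average the result comes out slightly above $0.02.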
**Why this card:** *Cents-per-hour Whisper, YOLO, and embeddings* — 2018-era silicon, but $/inference still beats every modern card on classification, transcription, and embeddings. The 70 W passive form factor keeps host costs at the floor, and that floor passes straight to renters as the cheapest GPU on the marketplace for high-volume batch work.

**Q: Is the T4 still worth renting in 2026?**
A: Yes. For low-cost, high-volume inference (Whisper, ResNet, YOLOv8, embeddings) the T4 still wins on $/inference. It is 2018-era silicon, but its 70 W passively-cooled form factor keeps host costs low and rental prices ultra-competitive.

**Related GPUs:** [NVIDIA L4](https://clore.ai/rent-l4.html), [RTX A4000](https://clore.ai/rent-a4000.html), [NVIDIA L40S](https://clore.ai/rent-l40s.html)

---

## URL: https://clore.ai/host-t4.html

### Host NVIDIA Tesla T4 on CLORE.AI

> List your **Tesla T4** on Clore.ai. 70 W passive cards keep your power bill near zero while booking steady volume from RAG pipelines, transcription services, and CV batch operators — the workloads that don't need fancy silicon, just predictable throughput. Net around **$135/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Stake CLORE for up to **+200%** daily emission.

**Estimated earnings:** $135/month net at $0.22/hr average host listing price (before any MFP staking emission).
**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Turing |
| VRAM | 16 GB GDDR6 |
| TDP | 70 W |
| Memory bandwidth | 320 GB/s |
| CUDA cores | 2,560 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.22/hr |
| Spot (average) renter price | $0.10/hr |
| On-demand (average) renter price | $0.16/hr |
| Estimated monthly net | $135/month |

**Why host this card:** *70 W passive — almost zero power overhead* — T4 fleets are nearly free to keep online — 70 W passive means no dedicated cooling, marginal electricity draw, and stack densities up to 8 cards per chassis. Even at floor rental rates the operating margin stays healthy because there's almost no cost basis to recover.

**Related GPUs:** [NVIDIA L4](https://clore.ai/host-l4.html), [RTX A4000](https://clore.ai/host-a4000.html), [NVIDIA L40S](https://clore.ai/host-l40s.html)

---

## URL: https://clore.ai/rent-a10.html

### Rent NVIDIA A10

> Rent an **NVIDIA A10** as the AWS/GCP-equivalent inference card. 24 GB Ampere at 150 W with MIG — vLLM Llama-3 8B FP16 at ~1,050 tok/s, four-instance MIG partitioning for SaaS multi-tenancy, DeepStream YOLO + NVENC pipelines on a single card. Same Triton/vLLM configs that run on hyperscaler reference setups, deployed identically. Billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 24 GB GDDR6 |
| TDP | 150 W |
| Memory bandwidth | 600 GB/s |
| CUDA cores | 9,216 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.16/hr |
| Spot (average) | $0.22/hr |
| On-demand (average) | $0.28/hr |

**Typical workloads:** vLLM 7B-13B serving, MIG up to 4 instances, AWS/GCP-equivalent inference, video transcode (NVENC), edge ML platforms

24 GB Ampere at 150 W with MIG — the AWS/GCP-equivalent inference card for hyperscaler-reference deployments.
**Workload spotlights:**

- **vLLM Llama-3 8B FP16 serving** — stack: vLLM + continuous batching; metric: ~1,050 tok/s aggregated, p50 38 ms. Reference card on AWS g5 / GCP G2 — deploys with the same Triton/vLLM configs as production hyperscaler stacks.
- **MIG 4-instance multi-tenant inference** — stack: MIG + Triton Inference Server; metric: 4× 6 GB instances, 1 model each. Hard memory isolation for SaaS multi-tenant inference — each tenant gets a dedicated GPU slice.
- **YOLOv8 detection + NVENC encode** — stack: DeepStream + YOLOv8 + NVENC; metric: ~280 FPS detection + 1080p60 H.264 encode. Edge ML video analytics pipeline — inference + encode on a single 150 W card.

**Why this card:** *AWS g5 / GCP G2 reference, decoded* — Same silicon as the AWS g5 and GCP G2 inference instance types — meaning your existing Triton, vLLM, and TensorRT configs run unchanged. Drop-in target for hyperscaler-trained reference deployments, with MIG hard isolation for multi-tenant SaaS APIs at proper p99 budgets.

**Q: Why pick A10 over L4 for inference?**
A: A10 has more raw FP16 throughput (124 TFLOPS vs 121) and the same 24 GB VRAM. L4 is much more power-efficient (72 W vs 150 W) and Ada-class with FP8. A10 is the AWS/GCP standard — better choice if you need MIG or are matching a hyperscaler reference setup.

**Related GPUs:** [NVIDIA L4](https://clore.ai/rent-l4.html), [Tesla T4](https://clore.ai/rent-t4.html), [NVIDIA L40S](https://clore.ai/rent-l40s.html)

---

## URL: https://clore.ai/host-a10.html

### Host NVIDIA A10 on CLORE.AI

> List your **NVIDIA A10** on Clore.ai. The hyperscaler-reference inference card books steady on-demand from teams running production APIs that already validated on AWS g5 / GCP G2 — they want the exact same silicon, just cheaper. Net around **$155/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Stake CLORE for up to **+200%** daily emission.
**Estimated earnings:** $155/month net at $0.24/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 24 GB GDDR6 |
| TDP | 150 W |
| Memory bandwidth | 600 GB/s |
| CUDA cores | 9,216 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.24/hr |
| Spot (average) renter price | $0.22/hr |
| On-demand (average) renter price | $0.28/hr |
| Estimated monthly net | $155/month |

**Why host this card:** *Hyperscaler-clone silicon at marketplace pricing* — API platforms with reference deployments validated on AWS g5 / GCP G2 won't substitute random GPUs — they want the exact A10. List one and you capture that direct hyperscaler-substitute demand, with MIG-partitioned multi-tenant inference filling the rest of the card's capacity.

**Related GPUs:** [NVIDIA L4](https://clore.ai/host-l4.html), [Tesla T4](https://clore.ai/host-t4.html), [NVIDIA L40S](https://clore.ai/host-l40s.html)

---

# Datacenter / training GPUs

## URL: https://clore.ai/rent-v100.html

### Rent NVIDIA Tesla V100

> Rent a **Tesla V100** for legacy training and FP32 scientific compute. 32 GB HBM2 at 900 GB/s — pre-Hopper Transformers pipelines run unchanged, FP32 CFD and molecular dynamics workloads sustain ~14 TFLOPS, batch Whisper transcription at 9× realtime. Retired hyperscaler silicon, re-listed at attractive rates by hosts who picked up bulk inventory. Billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**.
**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Volta |
| VRAM | 32 GB HBM2 |
| TDP | 300 W |
| Memory bandwidth | 900 GB/s |
| CUDA cores | 5,120 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.28/hr |
| Spot (average) | $0.36/hr |
| On-demand (average) | $0.62/hr |

**Typical workloads:** Legacy training pipelines, FP32 scientific compute (CFD, MD), pre-Hopper framework support, retired hyperscaler hardware re-listed at attractive rates

32 GB HBM2 Volta — retired hyperscaler silicon priced for legacy training and FP32 scientific compute.

**Workload spotlights:**

- **Legacy HuggingFace training** — stack: Transformers + fp16 + DDP; metric: ~0.6× A100 40GB on BERT-large. Pre-Hopper Transformers pipelines run unchanged — V100 is the cheapest card with HBM and NVLink support.
- **FP32 scientific simulation** — stack: PyTorch FP32 + cuFFT / cuSPARSE; metric: ~14 TFLOPS sustained FP32. CFD / molecular dynamics workloads that depend on FP32 — V100 is the cheapest HBM card with full FP32.
- **Whisper-large transcription batch** — stack: faster-whisper + CTranslate2 fp16; metric: ~9× realtime, batch 16. 32 GB HBM2 fits large-v3 + big batches — attractive for batch transcription where latency is not critical.

**Why this card:** *Cheapest HBM + NVLink card listed* — The only sub-$0.30 spot listing with HBM memory and NVLink support — meaningful when bandwidth-bound legacy code paths or FP32 scientific simulations need server-grade interconnect without paying A100 rates. Plenty of supply from hyperscalers retiring 2018-era inventory in 2026.

**Q: Should I still pick V100 over A100 in 2026?**
A: Only for legacy code paths or budget-constrained FP32 scientific workloads. For transformer training, the A100 40GB is faster, has TF32, and isn't much more expensive. Pick V100 when the price gap matters more than throughput.
**Related GPUs:** [A100 40GB](https://clore.ai/rent-a100-40gb.html), [A100 80GB](https://clore.ai/rent-a100-80gb.html), [RTX 4090](https://clore.ai/rent-4090.html)

---

## URL: https://clore.ai/host-v100.html

### Host NVIDIA Tesla V100 on CLORE.AI

> List your **Tesla V100** on Clore.ai. Retired hyperscaler inventory still books steady demand from legacy training pipelines, FP32 CFD shops, and budget-constrained academic ML labs that don't need Hopper-class throughput. Net around **$339/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Withdraw any time, no caps. Stake CLORE for up to **+200%** daily emission on top of every rental hour.

**Estimated earnings:** $339/month net at $0.55/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Volta |
| VRAM | 32 GB HBM2 |
| TDP | 300 W |
| Memory bandwidth | 900 GB/s |
| CUDA cores | 5,120 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.55/hr |
| Spot (average) renter price | $0.36/hr |
| On-demand (average) renter price | $0.62/hr |
| Estimated monthly net | $339/month |

**Why host this card:** *HBM + NVLink at recovered-inventory pricing* — Hosts who picked up bulk V100 inventory from retired hyperscaler fleets list at rates below any other HBM card — and renters with legacy code paths or FP32 simulations target that exact price floor. Steady fill from a niche-but-real market that's not going to disappear in 2026.

**Related GPUs:** [A100 40GB](https://clore.ai/host-a100-40gb.html), [A100 80GB](https://clore.ai/host-a100-80gb.html), [RTX 4090](https://clore.ai/host-4090.html)

---

## URL: https://clore.ai/rent-a100-40gb.html

### Rent NVIDIA A100 40GB

> Rent an **A100 40GB** as the canonical 7B pretraining and 13B–34B fine-tuning workhorse. 40 GB HBM2e at 1,555 GB/s, NVLink 3rd-gen, MIG up to 7 instances per card.
8-GPU NVSwitch pods sustain ~27K tok/s aggregate for Mistral-7B-class pretraining runs. The exact training silicon FSDP, DeepSpeed ZeRO-3, and Megatron-LM were originally tuned against. Billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 40 GB HBM2e |
| TDP | 400 W |
| Memory bandwidth | 1,555 GB/s |
| CUDA cores | 6,912 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.78/hr |
| Spot (average) | $0.89/hr |
| On-demand (average) | $1.20/hr |

**Typical workloads:** Pretrain 7B from scratch, fine-tune 13B–34B with FSDP, vLLM Llama-3 8B FP16, MIG up to 7 instances per card, NVLink 3rd-gen

40 GB HBM2e + 1.55 TB/s + NVLink — the canonical 7B pretraining and 13B–34B fine-tuning workhorse.

**Workload spotlights:**

- **Pretrain Mistral-7B from scratch** — stack: DeepSpeed ZeRO-3 + Flash Attn 2 + bf16; metric: ~3,400 tokens/s/GPU, 8K context. 8× A100 40GB node hits ~27K tok/s aggregate — a Mistral-7B-class pretraining run completes in weeks of spot.
- **Llama-3 13B FSDP fine-tune** — stack: Accelerate + FSDP + bf16 + Flash Attn 2; metric: ~1,100 tokens/s/GPU on 8× node. Standard 13B SFT pipeline — 40 GB fits FSDP-sharded weights + activations at 4K context.
- **MIG 7-way inference partition** — stack: MIG + Triton + TensorRT; metric: 7× ~5 GB instances, mixed models. Hard isolation for multi-tenant ML platforms — each MIG slice gets dedicated SMs and HBM.

**Why this card:** *The reference 7B pretraining card* — FSDP, DeepSpeed ZeRO-3, and Megatron-LM were tuned against this exact silicon. 40 GB HBM2e + 1.55 TB/s + NVLink — the spec on which the public 7B-class pretraining recipes were originally validated, still the cheapest HBM + NVLink path with full MIG support in 2026.

**Q: When should I pick the 40 GB over the 80 GB A100?**
A: When you're pretraining 7B from scratch or fine-tuning 13B with offload — 40 GB is plenty.
Step up to 80 GB for 34B+ pretraining, 70B fine-tuning, or LongRoPE / 128k-context work whose KV cache exhausts the smaller card's memory.

**Related GPUs:** [A100 80GB](https://clore.ai/rent-a100-80gb.html), [Tesla V100](https://clore.ai/rent-v100.html), [H100](https://clore.ai/rent-h100.html)

---

## URL: https://clore.ai/host-a100-40gb.html

### Host NVIDIA A100 40GB on CLORE.AI

> List your **A100 40GB** on Clore.ai. The canonical training silicon — ML teams pretraining 7B from scratch and fine-tuning 13B with FSDP book A100 nodes by the week, not the hour. Net around **$610/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. NVSwitch pod listings command material premiums. Stake CLORE for up to **+200%** daily emission.

**Estimated earnings:** $610/month net at $0.99/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 40 GB HBM2e |
| TDP | 400 W |
| Memory bandwidth | 1,555 GB/s |
| CUDA cores | 6,912 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $0.99/hr |
| Spot (average) renter price | $0.89/hr |
| On-demand (average) renter price | $1.20/hr |
| Estimated monthly net | $610/month |

**Why host this card:** *The canonical training silicon, week-long rentals* — ML teams running 7B pretraining or 13B FSDP fine-tunes hold the pod for the whole run, from multiple days to multiple weeks. That translates to fewer cold-starts and predictable monthly net per card. NVSwitch-fabric pod listings book at the highest premiums in the datacenter tier.

**Related GPUs:** [A100 80GB](https://clore.ai/host-a100-80gb.html), [Tesla V100](https://clore.ai/host-v100.html), [H100](https://clore.ai/host-h100.html)

---

## URL: https://clore.ai/rent-a100-80gb.html

### Rent NVIDIA A100 80GB

> Rent an **A100 80GB** for the cheapest HBM + NVLink path in 2026.
80 GB HBM2e at 1,935 GB/s — pretrain 7B–13B from scratch on 8-GPU nodes at ~3,800 tok/s per card, fine-tune 70B with FSDP across 16 cards, serve 70B FP16 across two cards via tensor parallel. Standard FSDP + DeepSpeed pipelines run unmodified on this silicon. Billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 80 GB HBM2e |
| TDP | 400 W |
| Memory bandwidth | 1,935 GB/s |
| CUDA cores | 6,912 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $0.92/hr |
| Spot (average) | $1.10/hr |
| On-demand (average) | $1.40/hr |

**Typical workloads:** Pretrain 7B–13B from scratch on 8× node, fine-tune 70B FSDP across 16 cards, 70B FP16 inference 2-card, MIG 7×, the canonical 2022–2024 training silicon

80 GB HBM2e + 1.93 TB/s — the 2022–2024 training silicon and still the cheapest HBM + NVLink path in 2026.

**Workload spotlights:**

- **Pretrain Llama-7B from scratch** — stack: DeepSpeed ZeRO-3 + Flash Attn 2; metric: ~3,800 tokens/s/GPU on 8× node. 8× A100 80GB pod is the de facto reference for 7B-13B pretraining, at ~50% of the rental price of an H100 node.
- **Llama-3 70B FSDP fine-tune** — stack: FSDP full-shard + bf16 + Flash Attn 2; metric: ~520 tokens/s/GPU on 16× node. 70B SFT across 16 cards (2 nodes via NVLink + IB) — standard reference for 70B fine-tunes in 2026.
- **70B FP16 inference 2-card** — stack: vLLM + tensor parallel; metric: ~480 tok/s aggregated, 16 concurrent. Two 80 GB A100s fit 70B FP16 + 16-request KV cache — the cheapest FP16 70B serving setup with NVLink.

**Why this card:** *Cheapest HBM + NVLink in 2026* — When you need 80 GB of HBM and NVLink fabric for 70B FSDP fine-tuning or two-card 70B FP16 serving, A100 80GB is typically 50–60% of the rental price of an H100, with the same FSDP + DeepSpeed pipelines unchanged. The default training card for budget-conscious ML teams.
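A back-of-the-envelope check of the two-card 70B FP16 claim: weights take roughly parameters × bytes per parameter, and whatever is left over is the budget for KV cache and runtime overhead. The sketch assumes a round 70e9 parameter count and decimal gigabytes, and it ignores activation and framework overhead, so the headroom it prints should be read as an upper bound.

```python
def weight_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight footprint in decimal GB."""
    return params * bytes_per_param / 1e9


PARAMS_70B = 70e9  # assumption: round 70B parameter count
FP16_BYTES = 2     # bytes per parameter in FP16
CARD_GB = 80       # A100 80GB
CARDS = 2          # tensor-parallel pair, as in the spotlight above

weights = weight_gb(PARAMS_70B, FP16_BYTES)
headroom = CARD_GB * CARDS - weights

print(f"weights:  {weights:.0f} GB")
print(f"headroom: {headroom:.0f} GB across {CARDS} cards for KV cache and overhead")
```

With these inputs the weights come to 140 GB, leaving about 20 GB across the pair, which is the space the 16-request KV cache has to fit into.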
**Q: Is A100 80GB still relevant against H100?**
A: Yes. It's typically 50–60% of H100 rental price with 80 GB HBM2e and supports the same FSDP + DeepSpeed pipelines. For training without FP8 / TransformerEngine, A100 80GB remains the cheapest way to get HBM and NVLink in 2026.

**Related GPUs:** [A100 40GB](https://clore.ai/rent-a100-40gb.html), [H100](https://clore.ai/rent-h100.html), [H200](https://clore.ai/rent-h200.html)

---

## URL: https://clore.ai/host-a100-80gb.html

### Host NVIDIA A100 80GB on CLORE.AI

> List your **A100 80GB** on Clore.ai. The 2022–2024 training silicon still draws strong on-demand bookings because the 70B FSDP fine-tuning workload remains canonical, and not every team has frontier-card budget. Net around **$795/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. NVSwitch-fabric pod listings command top premiums. Stake CLORE for up to **+200%** daily emission.

**Estimated earnings:** $795/month net at $1.29/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Ampere |
| VRAM | 80 GB HBM2e |
| TDP | 400 W |
| Memory bandwidth | 1,935 GB/s |
| CUDA cores | 6,912 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $1.29/hr |
| Spot (average) renter price | $1.10/hr |
| On-demand (average) renter price | $1.40/hr |
| Estimated monthly net | $795/month |

**Why host this card:** *70B FSDP fine-tunes book by the week* — 70B fine-tuning runs across 16 A100 80GBs hold for days at minimum, and that's the workload with the most consistent funding right now. Hosts with proper NVSwitch + InfiniBand topology consistently see full pod rentals booked weeks in advance. Top revenue density in the datacenter tier.
**Related GPUs:** [A100 40GB](https://clore.ai/host-a100-40gb.html), [H100](https://clore.ai/host-h100.html), [H200](https://clore.ai/host-h200.html)

---

## URL: https://clore.ai/rent-h100.html

### Rent NVIDIA H100 80GB

> Rent an **H100** for foundation-model pretraining and FP8 inference. 80 GB HBM3 at 3,350 GB/s, Hopper Transformer Engine, NVLink 4th-gen at 900 GB/s. Pretraining Llama-3 70B-class models hits ~7,200 tok/s/GPU on 8-GPU nodes via FSDP + FP8, and single-card 70B FP8 serving reaches ~1,800 tok/s with a 32-request KV cache. The industry-default training silicon. Billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Hopper |
| VRAM | 80 GB HBM3 |
| TDP | 700 W |
| Memory bandwidth | 3,350 GB/s |
| CUDA cores | 14,592 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $1.89/hr |
| Spot (average) | $2.10/hr |
| On-demand (average) | $2.40/hr |

**Typical workloads:** Pretrain 7B–70B on 8-GPU nodes, 70B FP8 single-card serving with KV-cache headroom, 128k contexts native, 4th-gen NVLink

80 GB HBM3 + 3.35 TB/s + Hopper FP8 — the gold standard for pretraining, FP8 fine-tuning, and 70B+ serving.

**Workload spotlights:**

- **Pretrain Llama-3 70B (FSDP + FP8)** — stack: TransformerEngine FP8 + FSDP + Flash Attn 3; metric: ~7,200 tokens/s/GPU on 8× node. FP8 path nearly doubles A100 80GB pretraining throughput — the industry-default 70B pretraining card.
- **Llama-3 70B FP8 single-card serving** — stack: vLLM + TensorRT-LLM FP8 + chunked prefill; metric: ~1,800 tok/s aggregated, 32 concurrent. FP8 quant fits 70B + KV cache on a single 80 GB H100 — the canonical serving config.
- **DeepSeek-Coder fine-tune** — stack: TransformerEngine + FSDP + bf16; metric: ~6,500 tokens/s/GPU on 8× node. Code-LLM fine-tuning at 16K context — H100 NVLink + Flash Attn 3 keep gradient sync from being the bottleneck.
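To relate the per-GPU pretraining throughput to spend, the sketch below converts the quoted ~7,200 tokens/s/GPU on an 8-GPU node into a rough cost per billion training tokens at the H100 spot-average rate from the pricing table. It assumes perfect scaling across the 8 GPUs and no downtime, checkpointing, or data-loading stalls, so real runs will cost more; the function name is illustrative.

```python
def usd_per_billion_tokens(tok_s_per_gpu: float, gpus: int, rate_usd_per_gpu_hr: float) -> float:
    """Node cost per 1e9 tokens, assuming linear scaling and zero idle time."""
    node_tokens_per_hr = tok_s_per_gpu * gpus * 3600
    node_usd_per_hr = rate_usd_per_gpu_hr * gpus
    return node_usd_per_hr / node_tokens_per_hr * 1e9


# Quoted above: ~7,200 tokens/s/GPU on an 8x node, $2.10/hr spot average per GPU.
cost = usd_per_billion_tokens(7200, 8, 2.10)
node_tokens_per_s = 7200 * 8

print(f"node throughput: {node_tokens_per_s:,} tokens/s")
print(f"approx. cost:    ${cost:.2f} per billion tokens")
```

Because both cost and throughput scale with the GPU count, the per-token cost in this idealized model does not depend on node size; only wall-clock time does.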
**Why this card:** *FP8 + Transformer Engine for foundation models* — Hopper Transformer Engine roughly doubles A100 80GB pretraining throughput at the same memory footprint — that's why every published 70B+ pretraining recipe in 2025 targets H100. NVLink 4th-gen at 900 GB/s keeps gradient sync from being the bottleneck on 8-GPU pods.

**Q: PCIe or SXM H100 — which should I rent?**
A: SXM is faster (700 W, NVLink 900 GB/s) and is what you want for distributed training. PCIe (350 W, NVLink Bridge optional) is fine for single-card inference and easier to host. CLORE.AI lists both — filter by NVLink speed in the marketplace.

**Related GPUs:** [A100 80GB](https://clore.ai/rent-a100-80gb.html), [H200](https://clore.ai/rent-h200.html), [B200](https://clore.ai/rent-b200.html)

---

## URL: https://clore.ai/host-h100.html

### Host NVIDIA H100 80GB on CLORE.AI

> List your **H100** on Clore.ai. The industry-default training silicon — foundation-model teams hold H100 pods for weeks at a time, paying datacenter on-demand rates that dwarf consumer-tier earnings. Net around **$1,479/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. NVSwitch + InfiniBand topology drives top premiums. Stake CLORE for up to **+200%** daily emission.

**Estimated earnings:** $1,479/month net at $2.40/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Hopper |
| VRAM | 80 GB HBM3 |
| TDP | 700 W |
| Memory bandwidth | 3,350 GB/s |
| CUDA cores | 14,592 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $2.40/hr |
| Spot (average) renter price | $2.10/hr |
| On-demand (average) renter price | $2.40/hr |
| Estimated monthly net | $1,479/month |

**Why host this card:** *Foundation-model pods rent for multi-week runs* — 70B-class pretraining runs hold the pod from start to finish — measured in weeks, not hours.
Hosts with proper 8-GPU NVSwitch fabric and InfiniBand interconnect book multi-week contracts with foundation-model labs. Highest absolute monthly net per card on the marketplace.

**Related GPUs:** [A100 80GB](https://clore.ai/host-a100-80gb.html), [H200](https://clore.ai/host-h200.html), [B200](https://clore.ai/host-b200.html)

---

## URL: https://clore.ai/rent-h200.html

### Rent NVIDIA H200 141GB

> Rent an **H200** when memory capacity or bandwidth is the bottleneck. 141 GB HBM3e at 4,800 GB/s — fits Llama-3 405B INT4 across 4 cards instead of 8, runs 70B FP16 single-card with 64-request KV cache, serves DeepSeek-V3 at native 1M-token contexts without paginating to host memory. The serving and long-context card of 2026. Billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Hopper |
| VRAM | 141 GB HBM3e |
| TDP | 700 W |
| Memory bandwidth | 4,800 GB/s |
| CUDA cores | 16,896 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $2.40/hr |
| Spot (average) | $2.65/hr |
| On-demand (average) | $2.95/hr |

**Typical workloads:** Llama-3 405B INT4 across 4 cards, 1M-token native contexts, 70B FP16 serving without paginating, memory-bandwidth-bound workloads

141 GB HBM3e at 4.8 TB/s — same compute as H100, but the memory upgrade transforms 70B+ serving and 1M-token contexts.

**Workload spotlights:**

- **Llama-3 405B INT4 across 4 cards** — stack: vLLM + GPTQ 4-bit + tensor parallel; metric: ~620 tok/s aggregated, 16 concurrent. 141 GB per card fits 405B INT4 in 4 cards instead of 8 H100s — cuts node count in half for the same model.
- **Llama-3 70B FP16 single-card** — stack: vLLM + Flash Attn 3; metric: ~2,100 tok/s aggregated, 64 concurrent. 141 GB fits 70B FP16 + 64-request KV cache — no quantization, no model splitting.
- **DeepSeek-V3 1M-token serving** — stack: vLLM + paged attention + FP8; metric: ~720 tok/s @ 1M-token contexts.
  4.8 TB/s HBM3e bandwidth and 141 GB enable native 1M-token contexts without offload — the H200's signature workload.

**Why this card:** *1M-token contexts native, no offload* — 141 GB HBM3e at 4.8 TB/s eliminates KV-cache offload for everything up to 1M-token contexts and fits 405B INT4 in 4 cards instead of 8. Same compute as H100, but the memory delta transforms 70B+ serving and long-context inference economics. The card for bandwidth-bound workloads.

**Q: When does the H200 beat the H100?**
A: Whenever memory bandwidth or VRAM is the bottleneck. 141 GB HBM3e at 4.8 TB/s eliminates KV-cache offload, fits 405B INT4 across 4 cards instead of 8, and runs 1M-token contexts natively. Same compute as H100, but the memory upgrade is significant for serving.

**Related GPUs:** [H100](https://clore.ai/rent-h100.html), [A100 80GB](https://clore.ai/rent-a100-80gb.html), [B200](https://clore.ai/rent-b200.html)

---

## URL: https://clore.ai/host-h200.html

### Host NVIDIA H200 141GB on CLORE.AI

> List your **H200** on Clore.ai. 141 GB HBM3e is the spec long-context inference and 405B serving teams chase, and they pay accordingly. Net around **$1,818/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. Multi-card NVLink-Switch pods clear the highest premiums on the platform. Stake CLORE for up to **+200%** daily emission.

**Estimated earnings:** $1,818/month net at $2.95/hr average host listing price (before any MFP staking emission).
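The monthly net figures on the host pages can be compared against the listed hourly rate. The sketch below computes what fraction of a fully booked 30-day month (720 hours at the average host listing price) each quoted monthly net represents. The 30-day, fully booked baseline is an assumption made here for illustration; the pages do not state how the estimates are derived, so the ratio may reflect occupancy, fees, or both.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, fully booked

# (card, average host listing price in USD/hr, quoted monthly net in USD),
# copied from the host pages in this index.
quoted = [
    ("L40S", 1.05, 647),
    ("A100 80GB", 1.29, 795),
    ("H100", 2.40, 1479),
    ("H200", 2.95, 1818),
]


def implied_fraction(rate_usd_per_hr: float, monthly_net_usd: float) -> float:
    """Quoted monthly net as a share of gross revenue at full occupancy."""
    return monthly_net_usd / (rate_usd_per_hr * HOURS_PER_MONTH)


fractions = {card: implied_fraction(rate, net) for card, rate, net in quoted}
for card, frac in fractions.items():
    print(f"{card:>10}: {frac:.1%} of gross at full occupancy")
```

For these four cards the ratio comes out at roughly 86% in each case, which gives a quick way to sanity-check an estimate for a different listing price.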
**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Hopper |
| VRAM | 141 GB HBM3e |
| TDP | 700 W |
| Memory bandwidth | 4,800 GB/s |
| CUDA cores | 16,896 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $2.95/hr |
| Spot (average) renter price | $2.65/hr |
| On-demand (average) renter price | $2.95/hr |
| Estimated monthly net | $1,818/month |

**Why host this card:** *405B serving with half the node count* — Inference teams running Llama-3 405B INT4 cut their node count in half on H200 vs H100, and they're willing to pay the per-card premium because the total cost still drops. The same applies to native 1M-token contexts. Demand currently outstrips supply on the H200 listings.

**Related GPUs:** [H100](https://clore.ai/host-h100.html), [A100 80GB](https://clore.ai/host-a100-80gb.html), [B200](https://clore.ai/host-b200.html)

---

## URL: https://clore.ai/rent-b200.html

### Rent NVIDIA B200

> Rent a **B200** at the new training silicon ceiling. 192 GB HBM3e at 8,000 GB/s, Blackwell FP4 tensor cores, 5th-gen NVLink at 1.8 TB/s. Pretraining 405B dense models hits ~14,000 tok/s/GPU on 8-GPU pods, DeepSeek-V3 671B MoE serves at ~3,800 tok/s aggregated, Llama-3 70B FP4 at ~5,200 tok/s with 128-request concurrency. Frontier-lab default. Billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**.

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Blackwell |
| VRAM | 192 GB HBM3e |
| TDP | 1000 W |
| Memory bandwidth | 8,000 GB/s |
| CUDA cores | 33,792 |

**Pricing (per hour, USD):**

| Type | Rate |
| --- | --- |
| Spot (low) | $3.40/hr |
| Spot (average) | $3.80/hr |
| On-demand (average) | $4.20/hr |

**Typical workloads:** Pretrain 405B dense models on 8-GPU pods, serve 671B MoE in production, 256-GPU clusters with NVLink-Switch, 5th-gen NVLink at 1.8 TB/s

192 GB HBM3e + 8 TB/s + Blackwell FP4 — the 2026 frontier card for 405B pretraining and 671B MoE serving.
**Workload spotlights:**

- **Pretrain 405B dense (8-GPU pod)** — stack: TransformerEngine FP8/FP4 + 5th-gen NVLink + FSDP; metric: ~14,000 tokens/s/GPU on 8× node. 5th-gen NVLink at 1.8 TB/s and FP4 tensor cores roughly double H100 pretraining throughput — frontier-lab default.
- **DeepSeek-V3 671B MoE serving** — stack: vLLM + expert parallelism + FP8; metric: ~3,800 tok/s aggregated on 8× node. 192 GB per card holds 671B MoE active weights with NVLink-Switch fabric — the production card for trillion-class MoE.
- **Llama-3 70B FP4 serving** — stack: vLLM + Blackwell FP4 + chunked prefill; metric: ~5,200 tok/s aggregated, 128 concurrent. Native FP4 on Blackwell roughly doubles H100 FP8 70B serving throughput at much higher concurrency.

**Why this card:** *192 GB HBM3e — new training ceiling* — 192 GB per card and 8 TB/s HBM3e fit 671B MoE active weights with NVLink-Switch fabric — the only silicon where trillion-parameter-class models serve in production today. 5th-gen NVLink at 1.8 TB/s and FP4 tensor cores roughly double H100 pretraining throughput.

**Q: Is B200 available outside hyperscaler waitlists?**
A: Yes — CLORE.AI hosts have started listing B200 nodes as supply ramps in 2026. Availability varies by region; filter by 'B200' in the marketplace and check live spot floor. NVLink-Switch fabric available on multi-node pods.

**Related GPUs:** [H200](https://clore.ai/rent-h200.html), [H100](https://clore.ai/rent-h100.html), [A100 80GB](https://clore.ai/rent-a100-80gb.html)

---

## URL: https://clore.ai/host-b200.html

### Host NVIDIA B200 on CLORE.AI

> List your **B200** on Clore.ai. Frontier silicon supply is still constrained in 2026 — listings get noticed by the labs that previously couldn't get past hyperscaler waitlists. Net around **$2,589/month** per card before MFP staking, paid per-minute in **BTC**, **USDT**, **USDC** or **CLORE**. 8-GPU NVLink-Switch pod listings clear the highest sustained premiums on the platform.
Stake CLORE for up to **+200%** daily emission.

**Estimated earnings:** $2,589/month net at $4.20/hr average host listing price (before any MFP staking emission).

**Specifications:**

| Spec | Value |
| --- | --- |
| Architecture | Blackwell |
| VRAM | 192 GB HBM3e |
| TDP | 1000 W |
| Memory bandwidth | 8,000 GB/s |
| CUDA cores | 33,792 |

**Earnings reference (per hour, USD):**

| Metric | Value |
| --- | --- |
| Host list (average) | $4.20/hr |
| Spot (average) renter price | $3.80/hr |
| On-demand (average) renter price | $4.20/hr |
| Estimated monthly net | $2,589/month |

**Why host this card:** *Past the hyperscaler quota waitlist* — Frontier labs still face hyperscaler quota gates on B200 — so any properly-configured marketplace listing gets attention immediately. 192 GB HBM3e + NVLink-Switch fabric pods pretrain 405B dense and serve 671B MoE; that's a workload class with serious funding behind it in 2026.

**Related GPUs:** [H200](https://clore.ai/host-h200.html), [H100](https://clore.ai/host-h100.html), [A100 80GB](https://clore.ai/host-a100-80gb.html)

---

## Professional / workstation tier — copywriting reference

### Rent (renter perspective)

> Production-grade silicon for studios and engineering teams. **ECC memory**, ISV-certified drivers, NVLink-ready in pairs, and the same professional driver branch your render farm and CAD seats already validate against — billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**.

**Why rent this tier:**

- **ISV-certified drivers** — Every host runs the NVIDIA professional driver branch — the same one certified for V-Ray, Octane, DaVinci Resolve, ANSYS, COMSOL, SolidWorks and Rhino. Your pipeline doesn't notice it left the studio.
- **ECC memory + NVLink** — Workstation cards ship with full **ECC** on-by-default and pair-wise **NVLink** for shared VRAM and bandwidth-bound CAE. Filter the marketplace for NVLink-pinned 2× listings when you need a single contiguous memory pool.
- **Datacenter-validated hosts** — Most pro-tier listings come from datacenter operators and post-production studios — the kind of host that publishes uptime, runs proper cooling, and can pin a card for the length of a render. Reliability scores and 30-day uptime are visible on every listing. **Workload fits:** - **VFX & physically-based rendering** (Octane / V-Ray / Redshift, ISV-certified) — OctaneRender 2024 with RTX-accelerated denoising, V-Ray GPU production renders, Redshift IPR sessions on a card that's been validated against the same drivers your studio uses on-prem. Pair via NVLink when a single scene won't fit in one card's VRAM. - **CAD / CAE / simulation** (ANSYS, COMSOL, SolidWorks Visualize) — Run ANSYS Fluent, COMSOL Multiphysics, or SolidWorks Visualize against ECC-protected memory — the prerequisite for any FEA or CFD result you'd put in front of a regulator. Pro driver branch means you keep the QA matrix you already certify. - **Studio fine-tuning & mid-scale training** (Llama-3 SFT, Stable Diffusion XL LoRA) — Full fine-tune of Llama-3 8B or 13B QLoRA on a single 48 GB workstation card, SDXL LoRA training with Kohya, or a 4-card NVLink pod for distributed runs under FSDP / DeepSpeed ZeRO-2. ECC keeps long-running gradients honest. **Workflow:** 1. **Filter for pro silicon** — Filter the marketplace by ECC, NVLink, professional-driver branch, and country. Pin reliability score and 30-day uptime — workstation hosts publish both. 2. **Pick a certified image** — Boot a Docker image pre-flighted for the pro stack — V-Ray, Octane, Resolve, or your studio's own internal pipeline image from a private registry. 3. **Connect over SSH or RDP** — Full root SSH on every server. Pro hosts also expose a `:8888` Jupyter and standard remote-desktop ports for the artist seats that need them. 4. **Stop, snapshot, resume** — Per-minute billing rounds to the second. Stop between shots, snapshot the container, resume on the same host or a different one tomorrow. 
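The filter step in the workflow above amounts to a predicate over listing attributes followed by a sort on price. The sketch below shows that idea on made-up data. The `Listing` record, its field names, and the thresholds are all assumptions for illustration; they do not describe the marketplace's actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    """Hypothetical listing record; field names are illustrative only."""
    gpu: str
    ecc: bool
    nvlink: bool
    reliability: float    # 0.0 to 1.0, higher is better
    uptime_30d: float     # percent over the last 30 days
    price_per_hr: float   # USD


def filter_pro(listings, min_reliability=0.95, min_uptime=99.0, need_nvlink=False):
    """Return ECC listings that meet the thresholds, cheapest first."""
    keep = [
        item for item in listings
        if item.ecc
        and item.reliability >= min_reliability
        and item.uptime_30d >= min_uptime
        and (item.nvlink or not need_nvlink)
    ]
    return sorted(keep, key=lambda item: item.price_per_hr)


# Made-up sample data, not real marketplace listings.
SAMPLE = [
    Listing("RTX A6000", True, True, 0.99, 99.7, 0.45),
    Listing("RTX A5000", True, False, 0.97, 99.2, 0.24),
    Listing("RTX A6000", True, True, 0.90, 97.0, 0.40),
]

if __name__ == "__main__":
    for item in filter_pro(SAMPLE, need_nvlink=True):
        print(item.gpu, item.price_per_hr)
```

Sorting by price only after the reliability and uptime thresholds are applied keeps a cheap but flaky host from floating to the top of the shortlist.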
**Production-grade compute. Without the three-week PO.** ISV-certified, ECC-protected, NVLink-pairable workstation silicon — billed per minute, paid in BTC, USDT, USDC or CLORE. Sign up, filter for the certifications you need, and the next render is live. ### Host (host perspective) > List your workstation rack on Clore.ai. **ECC**-class hardware commands a premium on the marketplace — VFX studios, CAD shops and ISV-bound enterprises pay on-demand for ISV-certified silicon. Per-minute payouts in **BTC**, **USDT**, **USDC** or **CLORE**, withdrawable any time, plus up to **+200%** daily emission via MFP staking. **Why host this tier:** - **Pro silicon = pro rates** — Studios and engineering teams pay on-demand premiums to get an ISV-certified card with **ECC** and NVLink. List spot, on-demand, or both — fees are **2.5% spot** / **10% on-demand**, reducible up to 50% via PoH. - **Per-minute payouts, no caps** — Earnings credit to your wallet balance every minute the rental runs. Withdraw to **BTC**, **USDT**, **USDC** or **CLORE** as often as you want — no daily caps, no minimum balance, no listing fee. - **From one rack to a tier-3 facility** — A four-card workstation under a desk, or a full multi-tenant Citrix-grade rack — same console, same fees, same flow. Bulk onboarding via API for fleets up to **192 servers** per account by default. **Earnings paths:** - **Plug-and-list** (ISV-certified card, no CLORE required) — List your workstation card or pair, accept rentals from studios and CAE shops paying on-demand for ECC + ISV drivers. Get paid per minute, withdraw any time. - **+ Hold CLORE for fee discount** (Up to −50% on your half of the fee) — Hold CLORE in your account wallet — no lock, no contract, no penalty. Your share of the marketplace fee drops linearly up to 50% off, capped at 2,000,000 CLORE held. - **+ Stake MFP for daily emission** (Up to +200% of rental, paid in CLORE) — Lock CLORE behind the server's quality score. 
The network pays a daily emission on top of your rental — up to **+200%** of your rental price, paid in CLORE, with no penalty if you skip it. **Workflow:** 1. **Install Clore Hosting Software** — Boot the Clore Linux image on the host (USB or PXE). Pair the box with your account using your initialization token. 2. **Pin the pro driver branch** — Configure SSH, Docker, and per-card settings under the NVIDIA professional driver branch. Flip to `public` after auto-attestation passes. 3. **Price for the pro market** — List on-demand for studios and CAE shops; spot for batch jobs. Adjust live — pro-tier on-demand floors run well above consumer rates. 4. **Lock MFP for daily emission** — Stake CLORE behind each server's quality score. 24 h warm-up, then up to **+200%** rental price as a daily reward. Skip it any time, no lockup. **Pro silicon, paid per minute.** ISV-certified workstation hardware earns more on this marketplace than anywhere else on the decentralized cloud. List your rack, pick spot or on-demand, and start collecting per-minute payouts in BTC, USDT, USDC or CLORE. --- ## Inference tier — copywriting reference ### Rent (renter perspective) > Inference economics, not benchmark theatre. **FP8**/**INT8** tensor cores, **MIG** partitions, low-watt edge variants, and the throughput-per-dollar to make **$/M-tokens** beat any hyperscaler quote — billed per-minute, paid in **BTC**, **USDT/USDC** or **CLORE**. **Why rent this tier:** - **$/M-tokens that wins on the spreadsheet** — Run vLLM with continuous batching, FP8 quantization and PagedAttention on cards built for inference, not for marketing slides. Real cost-per-million-tokens beats consumer-tier rentals by 2–3× at the same p99 latency. - **MIG partitions for multi-tenant serving** — Slice a single card into up to **7 MIG instances** and run independent inference workloads with isolated memory and SM partitions. One card, many endpoints, predictable tail latency. 
- **Watt-efficient by design** — Edge-class inference cards run at 70–150 W with no compromise on FP8 throughput. Lower draw, lower heat, more rack-level density — and a power bill that doesn't eat your margin. **Workload fits:** - **LLM serving (vLLM / TGI / TensorRT-LLM)** (OpenAI-compatible API, continuous batching) — Serve Llama-3 70B FP8, Mistral Large, or fine-tuned 7B–13B variants via vLLM or TensorRT-LLM. Continuous batching, PagedAttention, and an OpenAI-compatible endpoint out of the box. Scale horizontally; pay only for the minutes the GPU is hot. - **Batch & embedding workloads** (Triton Inference Server, INT8) — Whisper transcription, BGE/E5 embedding generation, ResNet/YOLO classification, MoE-7B INT8 routing — Triton with dynamic batching keeps the SMs saturated and the per-request cost on the floor. - **Multi-tenant API platforms** (MIG-partitioned, isolated tenants) — Carve one card into multiple MIG instances and run isolated tenants — each with its own memory partition, SM allocation, and p99 budget. Ideal for inference-as-a-service and internal model gateways. **Workflow:** 1. **Filter for inference silicon** — Filter by FP8 support, MIG capability, watt envelope, and country. Pin reliability score, 30-day uptime, and inbound network speed for your token-streaming endpoints. 2. **Boot a serving image** — Pull a vLLM, TGI, TensorRT-LLM, or Triton container — or paste your own from a private registry. Pre-built images expose an OpenAI-compatible endpoint on first boot. 3. **Wire up the endpoint** — Public IP and port forwards land in under 90 s. Hook your gateway, run a load test against p99, and read tokens-per-second straight off the vLLM metrics endpoint. 4. **Scale by the minute** — Per-minute billing rounds to the second. Spin up replicas during traffic spikes, drain them when load drops — the meter stops with the instance. 
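Once tokens per second has been read off the serving metrics (step 3 above), cost per million tokens follows from the hourly rate by simple arithmetic. The sketch below shows that conversion. The function name and the `fee_rate` parameter are assumptions; in particular, whether a marketplace fee is added on top of the quoted rate is not confirmed here, so it defaults to zero. The inputs in the example are illustrative, not measurements.

```python
def usd_per_million_tokens(price_per_hr: float,
                           tokens_per_sec: float,
                           fee_rate: float = 0.0) -> float:
    """Convert an hourly GPU price and sustained throughput to $/M tokens.

    fee_rate is an optional surcharge fraction (assumption: applied on top
    of the hourly price). Throughput should be the sustained aggregate
    output rate measured under realistic load.
    """
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    tokens_per_hr = tokens_per_sec * 3600
    return price_per_hr * (1.0 + fee_rate) / tokens_per_hr * 1_000_000


if __name__ == "__main__":
    # Illustrative inputs: $0.65/hr and 2,000 output tokens/s.
    cost = usd_per_million_tokens(0.65, 2000)
    print(f"${cost:.4f} per million output tokens")
```

Because throughput sits in the denominator, the result is very sensitive to batch shape and prompt length, so the number is only meaningful when the throughput comes from a load test on representative traffic.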
**Your $/M-tokens is one image pull away.** FP8 tensor cores, MIG partitions, vLLM and TensorRT-LLM ready out of the box — billed per minute, paid in BTC, USDT, USDC or CLORE. Pull the image, hit your endpoint, watch the cost-per-token drop. ### Host (host perspective) > List your inference fleet on Clore.ai. ML platform teams and API providers rent inference-class silicon by the minute for production endpoints — pay-as-you-go, OpenAI-compatible, MIG-partitioned. Per-minute payouts in **BTC**, **USDT**, **USDC** or **CLORE**, plus up to **+200%** daily emission via MFP. **Why host this tier:** - **Inference workloads pay 24/7** — Inference is the always-on workload — not the burst-and-die training run. API platforms pay on-demand premiums for FP8-capable, MIG-ready silicon they can hold for months. Fees are **2.5% spot** / **10% on-demand**, reducible up to 50% via PoH. - **MIG-friendly, multi-tenant by design** — Pro inference renters carve a single card into up to 7 MIG partitions and serve multiple tenants. Higher utilization on the renter side, more billable minutes on yours, fewer cold-starts in between. - **Per-minute payouts, no caps** — Earnings credit to your wallet every minute the rental runs. Withdraw to **BTC**, **USDT**, **USDC** or **CLORE** as often as you want — no daily caps, no listing fee, no minimum balance. **Earnings paths:** - **Plug-and-list** (FP8-capable card, no CLORE required) — List your inference card, accept rentals from API platforms paying on-demand for FP8 + MIG. Get paid per minute, withdraw any time, in any of four currencies. - **+ Hold CLORE for fee discount** (Up to −50% on your half of the fee) — Hold CLORE in your account wallet — no lock, no contract, no penalty. Your share of the marketplace fee drops linearly up to 50% off, capped at 2,000,000 CLORE held. - **+ Stake MFP for daily emission** (Up to +200% of rental, paid in CLORE) — Lock CLORE behind the server's quality score. 
The network pays a daily emission on top of your rental — up to **+200%** of your rental price, paid in CLORE. Skip it whenever, no penalty. **Workflow:** 1. **Install Clore Hosting Software** — Boot the Clore Linux image (USB or PXE). Pair the host with your account using your initialization token. 2. **Configure for serving** — Configure SSH, Docker, and MIG-partitioning support on each card. Flip to `public` when the auto-attestation clears. 3. **Price for the API market** — List on-demand for production endpoints, spot for eval and batch. Adjust live — inference-tier on-demand demand runs flat 24/7, not just during training cycles. 4. **Lock MFP for daily emission** — Stake CLORE behind the server's quality score. 24 h warm-up, then up to **+200%** rental price as a daily emission reward, paid in CLORE. **Inference is the always-on workload.** API platforms rent inference silicon by the month, not the hour. List your fleet, pick on-demand for the steady money, and collect per-minute payouts in BTC, USDT, USDC or CLORE. --- ## Datacenter / training tier — copywriting reference ### Rent (renter perspective) > Foundation-model training silicon. **HBM3/HBM3e** bandwidth, fourth-gen **NVLink**, NVSwitch fabric, **Transformer Engine** with FP8 — the same silicon used to pretrain Llama-3 70B and 405B. Available per-minute on the marketplace, paid in **BTC**, **USDT/USDC** or **CLORE**. **Why rent this tier:** - **HBM bandwidth, not just FLOPS** — Training throughput is bandwidth-bound past a certain scale. Datacenter cards ship with **HBM3** or **HBM3e** at 3,000+ GB/s, keeping tensor cores fed during attention and FlashAttention-2 kernels. - **NVLink + NVSwitch for 8-GPU pods** — Fourth-gen NVLink at **900 GB/s** per card, NVSwitch fabric for non-blocking all-to-all across 8 GPUs in a single chassis. The substrate FSDP and DeepSpeed ZeRO-3 were designed for. 
- **Transformer Engine FP8** — Hopper and Blackwell ship with **TransformerEngine** for mixed FP8/FP16 training — half the memory footprint, double the throughput, no convergence loss on the well-known recipes. **Workload fits:** - **Foundation-model pretraining** (Llama-3 70B / 405B, Megatron-LM, FSDP) — Pretrain 70B-class transformers from scratch in 8-GPU NVSwitch pods, scale to multi-node with NCCL-tuned interconnect. FSDP, DeepSpeed ZeRO-3, Megatron-LM tensor-parallel — the full pretraining stack runs unmodified on these hosts. - **Long-context fine-tuning** (FlashAttention-2, 128k+ context) — Fine-tune at 128k–1M context windows with FlashAttention-2 and ring-attention. HBM3e capacity (141 GB on H200, 192 GB on B200) eliminates the activation-checkpointing overhead that dominates 80 GB cards. - **FP8 training with TransformerEngine** (BF16 master weights, FP8 GEMMs) — Run TransformerEngine in mixed FP8/BF16 mode for 2× throughput vs pure BF16 at the same convergence trajectory. Hopper and Blackwell ship the silicon; the recipe is in the public PyTorch nightly. **Workflow:** 1. **Filter for the right interconnect** — Filter by HBM generation, NVLink version, NVSwitch availability, and inbound bandwidth. Multi-node runs need both intra-node NVSwitch and inter-node InfiniBand or 100 Gb Ethernet. 2. **Pull a training image** — Boot a PyTorch + NCCL + Megatron-LM container, or DeepSpeed, or your team's own pretraining image from a private registry. NVIDIA NGC images run unmodified. 3. **Bring up FSDP / ZeRO-3** — Pod boots in under 90 s with NVSwitch fabric pre-configured. Run NCCL all-reduce benchmarks before the first step — the listing publishes its real interconnect topology. 4. **Stop on checkpoint, resume on schedule** — Per-minute billing rounds to the second. Stop after a checkpoint commits, resume on the same node tomorrow. Spot bidders pay less; on-demand bidders never get pre-empted. 
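For the NCCL all-reduce check mentioned in step 3, it helps to turn a measured time into a bandwidth figure that can be compared against the interconnect's nominal rate. The sketch below uses the convention from NVIDIA's nccl-tests, where all-reduce bus bandwidth is the algorithm bandwidth scaled by 2(n-1)/n. The function name is hypothetical, and the timing in the example is an illustrative input rather than a measurement; real timings would come from a benchmark such as `all_reduce_perf`.

```python
def allreduce_bandwidth_gbps(size_bytes: float, seconds: float, n_ranks: int):
    """Return (algorithm bandwidth, bus bandwidth) in GB/s for an all-reduce.

    algbw = bytes / time
    busbw = algbw * 2 * (n - 1) / n   (nccl-tests convention for all-reduce)
    """
    if seconds <= 0 or n_ranks < 2:
        raise ValueError("seconds must be > 0 and n_ranks must be >= 2")
    algbw = size_bytes / seconds / 1e9
    busbw = algbw * 2 * (n_ranks - 1) / n_ranks
    return algbw, busbw


if __name__ == "__main__":
    # Illustrative input: a 1 GB all-reduce across 8 GPUs taking 10 ms.
    algbw, busbw = allreduce_bandwidth_gbps(1e9, 0.010, 8)
    print(f"algbw={algbw:.1f} GB/s, busbw={busbw:.1f} GB/s")
```

Bus bandwidth is the figure to compare against the per-GPU link rate, because it normalises for rank count; a result far below the expected rate suggests the pod is falling back to a slower path.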
**Pretraining starts at $0/quota-request.** HBM3e, NVSwitch, fourth-gen NVLink, TransformerEngine FP8 — booked per-minute, paid in BTC, USDT, USDC or CLORE. No three-quarter procurement cycle, no PO, no capacity gate. ### Host (host perspective) > List your datacenter on Clore.ai. ML engineers training foundation models book HBM3-class silicon by the minute for pretraining runs and ablations — at price points your hyperscaler MSAs won't match. Per-minute payouts in **BTC**, **USDT**, **USDC** or **CLORE**, plus up to **+200%** daily emission via MFP staking. **Why host this tier:** - **Training silicon commands real money** — Foundation-model teams pay on-demand premiums to hold HBM3-class silicon for the length of a pretraining run. Fees are **2.5% spot** / **10% on-demand**, reducible up to 50% via PoH — the host take on a single H100 or B200 dwarfs anything in the consumer tier. - **Pod-friendly, NVSwitch-aware** — Datacenter renters book 8-GPU NVSwitch pods, not single cards. The marketplace exposes interconnect topology, so a host with proper NVLink fabric and InfiniBand wins listings against price-only competitors. - **Per-minute payouts, no caps** — Earnings credit to your wallet every minute the run is hot. Withdraw to **BTC**, **USDT**, **USDC** or **CLORE** as often as you want — no daily caps, no listing fee. Onboard up to **192 servers** per account via API. **Earnings paths:** - **Plug-and-list** (HBM3-class card, no CLORE required) — List your datacenter pod, accept rentals from foundation-model teams paying on-demand for NVSwitch + NVLink fabric. Get paid per minute, withdraw any time, in any of four currencies. - **+ Hold CLORE for fee discount** (Up to −50% on your half of the fee) — Hold CLORE in your account wallet — no lock, no contract, no penalty. Your share of the marketplace fee drops linearly up to 50% off, capped at 2,000,000 CLORE held. 
- **+ Stake MFP for daily emission** (Up to +200% of rental, paid in CLORE) — Lock CLORE behind each server's quality score. The network pays a daily emission on top of your rental — up to **+200%** of your rental price, paid in CLORE. No penalty if you skip it. **Workflow:** 1. **Install Clore Hosting Software** — Boot the Clore Linux image on each node (USB or PXE). Pair the pod with your account using your initialization token. API onboarding for fleets. 2. **Publish the topology** — Configure SSH, Docker, NVLink topology, and inter-node interconnect. Flip to `public` after auto-attestation — datacenter renters filter on real bandwidth, not nameplate spec. 3. **Price for pretraining** — List on-demand for foundation-model runs that hold the pod for weeks. Spot for ablations and short-duration eval jobs. Datacenter on-demand floors run an order of magnitude above consumer. 4. **Lock MFP for daily emission** — Stake CLORE behind each pod's quality score. 24 h warm-up, then up to **+200%** rental price as a daily emission reward, paid in CLORE. **Foundation models pay datacenter rates.** ML engineers hold HBM3 pods for weeks at a time, not 90 seconds. List your datacenter, pick on-demand for the long runs, and start collecting per-minute payouts in BTC, USDT, USDC or CLORE. 
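The fee rules described in this section (2.5% spot, 10% on-demand, split 50/50, with the host's half reduced linearly by up to 50% at 2,000,000 CLORE held) can be sketched as a small calculation. The function names are hypothetical, and two points are assumptions rather than documented behaviour: that the discount ramps linearly from zero holdings, and that the host's fee is deducted from the list price. MFP emission is not modelled.

```python
MARKET_FEE = {"spot": 0.025, "on_demand": 0.10}  # total marketplace fee
POH_CAP_CLORE = 2_000_000                         # holdings at which discount maxes out
POH_MAX_DISCOUNT = 0.5                            # up to 50% off the host's share


def host_fee_rate(market: str, clore_held: float = 0.0) -> float:
    """Host's effective fee fraction after the Proof of Holding discount."""
    host_share = MARKET_FEE[market] / 2  # fee is split 50/50 with the renter
    held = min(max(clore_held, 0.0), POH_CAP_CLORE)
    # Assumption: discount grows linearly from 0 at zero holdings to the cap.
    discount = POH_MAX_DISCOUNT * held / POH_CAP_CLORE
    return host_share * (1.0 - discount)


def host_net_per_hr(list_price: float, market: str, clore_held: float = 0.0) -> float:
    """Hourly amount the host keeps, assuming the fee comes off the list price."""
    return list_price * (1.0 - host_fee_rate(market, clore_held))


if __name__ == "__main__":
    for held in (0, 1_000_000, 2_000_000):
        net = host_net_per_hr(4.20, "on_demand", held)
        print(f"{held:>9} CLORE held: ${net:.4f}/hr")
```

Running it for a few holding levels shows how small the absolute change is on a single card, which is useful context when weighing the discount against the cost of acquiring the tokens.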
---

## Comparison — Consumer tier

| GPU | VRAM | CUDA cores | FP16 TFLOPS (tensor, dense) | Mem BW (GB/s) | Spot $/hr | SDXL 1024² it/s | Llama-3 8B tok/s |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RTX 3070 | 8 GB GDDR6 | 5,888 | ~80 | 448 | $0.10 | ~1.4 | ~50 |
| RTX 3080 | 10 GB GDDR6X | 8,704 | ~119 | 760 | $0.14 | ~2.0 | ~85 |
| RTX 3090 | 24 GB GDDR6X | 10,496 | ~142 | 936 | $0.18 | ~3.0 | ~110 |
| RTX 4070 | 12 GB GDDR6X | 5,888 | ~117 | 504 | $0.16 | ~2.5 | ~60 |
| RTX 4080 | 16 GB GDDR6X | 9,728 | ~195 | 716 | $0.27 | ~4.5 | ~95 |
| RTX 4090 | 24 GB GDDR6X | 16,384 | ~330 | 1,008 | $0.31 | ~7.5 | ~125 |
| RTX 5080 | 16 GB GDDR7 | 10,752 | ~225 | 960 | $0.28 | ~5.5 | ~115 |
| RTX 5090 | 32 GB GDDR7 | 21,760 | ~419 | 1,792 | $0.39 | ~10.0 | ~180 |
| RTX 4070 Ti | 12 GB GDDR6X | 7,680 | ~160 | 504 | $0.20 | ~3.2 | ~75 |

Tier-primary card: **RTX 4090** — best $/throughput pick within Consumer.

---

## Comparison — Professional / workstation tier

| GPU | VRAM (ECC) | TDP (W) | NVLink | ISV cert. | V-Ray CUDA score | Spot $/hr |
| --- | --- | --- | --- | --- | --- | --- |
| RTX A4000 | 16 GB GDDR6 | 140 | — | yes | ~1,180 | $0.13 |
| RTX A5000 | 24 GB GDDR6 | 230 | 112 GB/s | yes | ~1,700 | $0.22 |
| RTX A6000 | 48 GB GDDR6 | 300 | 112 GB/s | yes | ~2,300 | $0.42 |
| RTX 6000 Ada | 48 GB GDDR6 | 300 | — | yes | ~4,000 | $0.55 |
| A40 | 48 GB GDDR6 | 300 | 112 GB/s | yes | ~2,150 | $0.32 |

Tier-primary card: **RTX A6000** — best $/throughput pick within Professional / workstation.
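One way to read a comparison table like the one above is to divide a benchmark column by the spot price. The sketch below does that for the professional-tier V-Ray scores; the helper name `score_per_dollar` is hypothetical. A single-metric ratio ignores VRAM capacity, NVLink, and availability, so it is only one input and will not necessarily reproduce the tier-primary pick.

```python
# (name, V-Ray CUDA score, spot USD/hr) copied from the professional-tier table above.
PRO_TIER = [
    ("RTX A4000", 1180, 0.13),
    ("RTX A5000", 1700, 0.22),
    ("RTX A6000", 2300, 0.42),
    ("RTX 6000 Ada", 4000, 0.55),
    ("A40", 2150, 0.32),
]


def score_per_dollar(rows):
    """Rank rows by benchmark score divided by hourly spot price, best first."""
    ranked = [(name, score / price) for name, score, price in rows]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, ratio in score_per_dollar(PRO_TIER):
        print(f"{name:<14} {ratio:8.0f} score per $/hr")
```

The same function works on the consumer table by swapping in the SDXL or Llama-3 column, which makes it easy to see how the ranking shifts with the workload.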
---

## Comparison — Inference tier

| GPU | VRAM | TDP (W) | FP8 | MIG | Perf/W (FP16 TFLOPS/W) | Llama-3 8B tok/$ (vLLM INT8) | Spot $/hr |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Tesla T4 | 16 GB GDDR6 | 70 | — | — | ~1.16 | ~9,800 | $0.08 |
| NVIDIA L4 | 24 GB GDDR6 | 72 | yes | — | ~1.68 | ~14,000 | $0.22 |
| NVIDIA L40S | 48 GB GDDR6 | 350 | yes | — | ~1.05 | ~20,000 | $0.65 |
| A10 | 24 GB GDDR6 | 150 | — | — | ~0.83 | ~12,500 | $0.16 |

Tier-primary card: **NVIDIA L40S** — best $/throughput pick within Inference.

---

## Comparison — Datacenter / training tier

| GPU | HBM | Mem BW (GB/s) | FP8 TFLOPS | BF16 TFLOPS | NVLink BW | Transformer Engine | Spot $/hr |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Tesla V100 | 32 GB HBM2 | 900 | — | — | 300 GB/s | — | $0.28 |
| A100 40GB | 40 GB HBM2e | 1,555 | — | 312 | 600 GB/s | — | $0.78 |
| A100 80GB | 80 GB HBM2e | 1,935 | — | 312 | 600 GB/s | — | $0.92 |
| H100 | 80 GB HBM3 | 3,350 | 1,979 | 989 | 900 GB/s | yes | $1.89 |
| H200 | 141 GB HBM3e | 4,800 | 1,979 | 989 | 900 GB/s | yes | $2.40 |
| B200 | 192 GB HBM3e | 8,000 | ~4,500 | ~2,250 | 1,800 GB/s | yes | $3.40 |

Tier-primary card: **H100** — best $/throughput pick within Datacenter / training.

---

## FAQ — Consumer tier

### Rent FAQ

**Q: What can I actually run on a consumer GPU on CLORE?**
A: Consumer cards on CLORE.AI cover most hobby and indie workflows: Stable Diffusion 1.5 and SDXL, ComfyUI/Automatic1111, Flux.1, LoRA and QLoRA fine-tuning of 7B-13B LLMs, Whisper transcription, video transcoding, Blender Cycles, and game-server hosting. Anything that fits in 8-32 GB VRAM and runs in Docker runs here. You get full root SSH plus a Jupyter template if you want one.

**Q: How fast does a rented server actually boot?**
A: Cold-start lands in roughly 60-90 seconds for a typical Docker image: server allocation, container pull, GPU passthrough, SSH up.
Pre-cached templates (PyTorch, ComfyUI, vLLM, Ollama) are faster because the image is already on the host. Once running you pay per minute, so a 10-minute experiment costs ten minutes of rental, not an hour. **Q: Spot vs on-demand - what's the difference?** A: On-demand is a fixed per-hour price the host sets; the rental cannot be revoked while you have funds. Spot is auction-style: you bid, the highest bidder runs, and a higher bidder can preempt you. Spot is typically 30-50% cheaper. CLORE.AI charges 2.5% on spot and 10% on on-demand, split 50/50 with the host. **Q: Is CLORE.AI cheaper than RunPod or Vast.ai?** A: Spot prices on CLORE.AI usually beat RunPod community pricing because there is no centralized markup; you rent directly from the host with a 2.5% spot fee. Vast.ai is the closest comparison, and on consumer cards CLORE.AI is generally within a few cents per hour. Hold CLORE in your wallet for Proof of Holding and you stack up to 50% off the marketplace fee. **Q: Can I bring my own Docker image and SSH key?** A: Yes. Point at any registry - Docker Hub, GHCR, Quay, your private registry - then set env vars, port forwards, and your SSH public key in the rent dialog. Templates on the platform are just preset configs; nothing is locked down. You get full root inside the container with GPU passthrough. **Q: Do I have to hold CLORE tokens to rent?** A: No. Renters pay in BTC or CLORE; USDT and USDC rental payments are on the roadmap. Holding CLORE in a tracked wallet (Proof of Holding) is optional and gives up to 50% off marketplace fees at 2,000,000 CLORE held - a discount, not a paywall. New users can rent in BTC on day one. ### Host FAQ **Q: Can I host one card or do I need a full rig?** A: One card is fine. Most CLORE.AI hosts run between 1 and 16 GPUs out of home offices, gaming rigs, or small-shop racks. 
The onboarding flow is the same for a single 4090 in a desktop tower as it is for a 16-card chassis - install the host agent, register the wallet, list the GPU. No datacenter required. **Q: How do I get paid - crypto only?** A: Earnings credit your unified Clore wallet balance. Withdraw any time in BTC, USDT, USDC (Ethereum ERC-20), or CLORE. There are no withdrawal limits beyond network confirmation fees, and no minimum payout threshold. Renters pay in CLORE or BTC today; stablecoin rental payments are on the roadmap so your incoming mix will broaden. **Q: Is hosting on CLORE more profitable than mining?** A: For modern consumer cards (3080 and up), yes - rental revenue typically clears 2-4x what the same GPU mines on Ethash-class algorithms in 2026. When the marketplace is quiet, the host agent automatically falls back to mining, so the card is always doing something. Net result: you capture peak rental income and never sit fully idle. **Q: What hardware do I need beyond the GPU itself?** A: A modern x86 host (4+ CPU cores, 16-32 GB RAM per GPU, NVMe SSD), a stable internet line (100 Mbps symmetric is plenty for most workloads), and a PSU sized for the card's TDP plus 30% headroom. The host agent runs on Ubuntu 22.04 or 24.04. No GPU-direct networking required for consumer-tier listings. **Q: What's the MFP staking bonus and do I need it?** A: MFP Lock is optional. You stake CLORE behind a server's quality score; in return the network pays a daily emission reward up to +200% of the server's rental price and reduces the extra non-CLORE hoster fee to 0%. Skipping it forfeits the bonus but carries no penalty. On a consumer card earning $300/month, MFP can stack meaningful CLORE on top of fiat-equivalent rental income. **Q: Can renters mess up my machine?** A: No. Workloads run inside isolated Docker containers with a sandboxed root - the renter sees a Linux box that looks like theirs, but the host OS, your kernel, and your other GPUs are untouchable. 
The host agent enforces resource limits and tears the container down on rental end. Twelve thousand GPUs run this way on the network today.

---

## FAQ — Professional / workstation tier

### Rent FAQ

**Q: Why pick an ECC pro card over a consumer 4090 or 5090?**
A: ECC memory detects and corrects single-bit errors in flight - mandatory for production CAD pipelines, V-Ray and Octane farms, regulated medical or financial ML, and any research where bit-flip integrity affects results. Pro cards (A4000/A5000/A6000/RTX 6000 Ada/A40) also carry ISV certifications consumer cards do not. If your client SLA references ECC or ISV validation, a consumer 4090 does not qualify.

**Q: Are these cards ISV-certified for V-Ray, Octane, and SolidWorks?**
A: The NVIDIA RTX A-series and RTX 6000 Ada carry full ISV certifications: V-Ray, Octane, SolidWorks, Rhino, DaVinci Resolve, ANSYS, COMSOL, and the Adobe Creative Cloud chain. Consumer GeForce cards (4090/5090) are not on those lists. If your renderer's support matrix excludes GeForce, you need a pro card - which is exactly what CLORE.AI lists in this tier.

**Q: Can I run multi-GPU NVLink workloads on pro cards?**
A: Yes - the A5000, A6000, and A40 expose NVLink in pairs (no Switch fabric), giving 112 GB/s peer bandwidth and pooled memory across two cards (48 GB on an A5000 pair, 96 GB on an A6000 or A40 pair). Filter by 'NVLink' in the marketplace to find listings. The RTX 6000 Ada has no NVLink connector; multi-card jobs on it communicate over PCIe, for example under FSDP.

**Q: How do these compare to A100 and H100 for studios?**
A: Pro cards (A6000 / RTX 6000 Ada / A40) give you 48 GB ECC at one-quarter to one-third the rental price of an A100 80GB and one-fifth of an H100. You give up HBM bandwidth and FP8 tensor cores, but for production rendering, virtual workstations, and 13B-34B inference under ECC the pro tier hits the price-performance sweet spot.
**Q: Are pro GPUs quieter than 4090s in shared studio environments?** A: The cards themselves run cooler and quieter at lower TDP - A4000 is single-slot 140W, A5000 is dual-slot 230W, A6000 is 300W with a blower-style cooler designed for rack airflow. CLORE.AI is a remote rental platform, so the noise question only applies to your own studio if you're hosting; pro cards are explicitly the quieter pick there. **Q: Can I rent for production rendering pipelines with deadlines?** A: Yes. Use on-demand listings (10% fee, fixed price, cannot be preempted) for deadline work - the rental holds as long as your balance covers it. Per-minute billing means you pay only for actual frame time, not provisioned hours. Filter by host reliability score and 30-day uptime to find studios with track records that match production SLA requirements. ### Host FAQ **Q: Does my workstation qualify as a CLORE host?** A: If it has one or more NVIDIA GPUs from Pascal (10-series) onward and runs Ubuntu 22.04 or 24.04, yes. Pro tier (A4000/A5000/A6000/RTX 6000 Ada/A40) lists fine on home tower workstations and on rack-mount studio servers - the host agent treats them identically. Workstations under desks earn well because pro cards are typically idle outside business hours. **Q: Can I list 4x A4000 in a single chassis?** A: Yes - the A4000 is purpose-built for it (single-slot, 140W blower). A workstation chassis with 1500W PSU and adequate intake fits 4x A4000 cleanly, and the host agent lists each GPU as an independently rentable instance or as a linked node. Multi-GPU servers also unlock a renter pool for FSDP and tensor-parallel work, which lifts utilization. **Q: How does ECC affect rental price?** A: ECC cards command 20-40% rental price premiums over equivalent-VRAM consumer cards because a meaningful slice of demand (CAD studios, regulated ML, academic research) cannot use non-ECC silicon. 
The pro tier also clears utilization more reliably during business hours when studio renters are active. ECC is a moat that consumer cards do not have.

**Q: Do studios prefer A6000 over 4090 when both are available?**
A: For ECC and ISV-certified pipelines, yes - the 4090 is excluded from many studio software support matrices. For pure throughput-per-dollar inference work, studios still pick 4090. As a host you decide which tier you're competing in: a 4090 listing optimizes for AI hobbyists, an A6000 listing optimizes for studio renters who pay more and rent in longer blocks.

**Q: Is there a quiet or passive cooling option for hosting?**
A: The A40, L40S, and Tesla-class cards are passive (datacenter rack airflow). The RTX A4000 single-slot blower is the quietest active card in the tier. RTX 6000 Ada and A6000 use blower coolers - quieter than triple-fan consumer 4090s at full load but still audible. For office or home-studio listings the A4000 is the noise winner.

**Q: Can I keep using my workstation when it's not rented?**
A: Yes - the host agent releases the GPU back to the host OS within seconds of a rental ending and can be configured to pause listings during work hours. Many studios list their workstation farm 18:00-09:00 local time and on weekends, then reclaim the cards for daytime work. CLORE.AI charges nothing for paused listings.

---

## FAQ — Inference tier

### Rent FAQ

**Q: What's the rough cost-per-million-tokens for Llama-3 70B FP8 on L40S?**
A: An L40S serving Llama-3 70B FP8 with vLLM and continuous batching pushes roughly 3,000-4,500 output tokens/second at batch saturation. At a $0.78/hr spot rate, that lands near $0.05-$0.07 per million output tokens before the 2.5% spot fee. Holding CLORE for Proof of Holding (PoH) cuts the fee by up to half; reserved spot floors land you closer to $0.04/M. Numbers vary with prompt length and batch shape - benchmark on your traffic.

**Q: Can I run vLLM with continuous batching on these GPUs?**
A: Yes.
The inference tier (T4, L4, L40S, A10) is exactly what vLLM's PagedAttention and continuous batching are tuned for. L40S handles 70B FP8 single-card with KV-cache headroom; A10 and L4 serve 7B-13B at high throughput; T4 covers Whisper, embeddings, and 7B INT8. Pull the official vLLM Docker image, point it at your model, expose port 8000.

**Q: Does the GPU support FP8 and INT8 inference?**
A: L40S has Ada fourth-generation tensor cores with FP8 support - the same tensor-core generation as the Hopper-based H100, at a fraction of the rental price. L4 also supports FP8. T4 and A10 are pre-FP8 but have INT8 (INT8 tensor-core support arrived with Turing on the T4 and carries into Ampere on the A10) and excel at quantized 7B-13B serving. Pick L40S when FP8 throughput matters; pick A10 or T4 when $/request matters more.

**Q: What's typical p99 latency for 7B-class models on this tier?**
A: On A10 or L4 with vLLM and batch-1, time-to-first-token for a 7B FP16 model lands around 80-150 ms; p99 inter-token latency is 25-40 ms. L40S with FP8 cuts both roughly in half. T4 doubles them. Real numbers depend on prompt length and concurrent batch size - low-batch interactive serving is fastest, high-batch saturation maximizes throughput.

**Q: Can I MIG-partition this card for multi-tenant serving?**
A: MIG (Multi-Instance GPU) is supported on A100, A30, and H100/H200 - not on L4, L40S, T4, or A10. For software-level multi-tenancy on the inference tier, run multiple model replicas inside a single Docker container or use container-level resource limits. If you need hardware-isolated MIG slices, rent A100 40GB and partition into up to 7 instances.

**Q: Why pick L4 vs T4 vs L40S for serving - they all sound similar?**
A: T4 (16 GB, 70 W, Turing) is the cheapest cost-per-request option for Whisper, ResNet, YOLO, and embeddings. L4 (24 GB, 72 W, Ada with FP8) is the modern T4 successor - same power envelope, FP8 support, fits 7B-13B FP16 single-card. L40S (48 GB, 350 W, Ada) is the inference flagship for 70B FP8 single-card and large-batch serving.
Match VRAM and FP8 need to model size.

### Host FAQ

**Q: What's typical utilization for inference GPUs on the marketplace?**
A: Inference-tier cards (T4, L4, L40S, A10) clear 65-80% utilization on average, well above the 60% network-wide floor, because ML platforms running 24/7 serving traffic do not behave like bursty training jobs. L40S leads the tier because FP8 supply is tight in 2026; T4 stays high through sheer cost-per-request demand. Expect steadier monthly revenue than consumer cards.

**Q: How does A10 or L4 ROI compare to T4 for hosts?**
A: T4 is cheap to acquire (often under $400 used in 2026) and earns roughly $135/mo at 70% utilization - payback in 3 months. A10 and L4 cost more upfront but earn $155-$260/mo on Ada-class throughput and FP8 demand. L4 wins on power-efficiency at 72 W. ROI ranking by months-to-break-even varies with second-hand prices in your region.

**Q: Do I need to provision NVENC for video transcoding renters?**
A: All four inference cards (T4, L4, L40S, A10) ship with NVIDIA video encoders - L4 has dedicated AV1 NVENC, T4 has H.264/HEVC. NVENC is enabled by default in the host agent; transcoding renters (FFmpeg pipelines, OBS, Twitch infrastructure) find your listing via the standard marketplace filters. No extra setup.

**Q: What's the power draw versus revenue picture for L40S?**
A: L40S pulls 350 W at full load and earns roughly $647/month gross at typical utilization. At $0.10/kWh that's about $25/mo in electricity, leaving $620+ net before any MFP staking bonus. The card is profitable in any region with electricity under $0.30/kWh. Datacenter operators with sub-$0.05/kWh rates capture the spread.

**Q: Can I MIG-partition my A10 or A100 for multi-tenant hosting?**
A: A10 does not support MIG. A100 (40GB and 80GB) supports up to 7 MIG instances per card - enable it via nvidia-smi mig commands and the host agent will list each slice as a separately rentable GPU.
Multi-tenant MIG hosting is the standard cloud-style pattern: more concurrent renters, smaller VRAM per slice, higher aggregate revenue.

**Q: How does inference uptime affect my host reputation score?**
A: Reputation scoring weighs 30-day uptime, rental completion rate, and renter ratings. Inference workloads run for days or weeks per rental, so a single power blip or reboot can dent uptime metrics more than a brief consumer-tier outage would. Hosts with 99%+ uptime and clean reboot histories surface higher in marketplace rankings and clear utilization faster.

---

## FAQ — Datacenter / training tier

### Rent FAQ

**Q: Can I pretrain a 70B model from scratch on this GPU?**
A: Single-card, no - 70B pretraining needs an 8-GPU node minimum. CLORE.AI lists 8x H100, 8x H200, and 8x B200 pods with NVLink fabric for exactly this. A100 80GB pods run 70B FSDP training but at lower throughput than Hopper-class. For multi-week training, contact host operators for reserved-instance terms - listed in the marketplace under 'Reserved'.

**Q: What's the FP8 throughput here vs A100 80GB?**
A: A100 80GB has no FP8 - peak is BF16/TF32. H100 introduces FP8 with TransformerEngine and roughly 4x the BF16 training throughput at 2x the rental price - so ~2x perf-per-dollar on FP8-eligible workloads. H200 matches H100 compute but adds 141 GB HBM3e. B200 doubles H100 FP8 again with 192 GB HBM3e. Pick by VRAM and bandwidth ceiling, not just sticker FLOPS.

**Q: Does this support NVLink-Switch / NVSwitch fabric?**
A: 8-GPU H100 SXM, H200 SXM, and B200 nodes ship with NVSwitch fabric - 900 GB/s peer bandwidth on H100/H200, 1.8 TB/s 5th-gen NVLink on B200. PCIe variants (H100 PCIe, A100 PCIe) have NVLink Bridge in pairs only. Multi-node fabric (NVLink-Switch across racks) is available on B200 hyperscale pods - filter by 'NVSwitch' in the marketplace.

**Q: Are 8-GPU pods available for FSDP and DeepSpeed training?**
A: Yes.
Multi-GPU listings expose all cards in a single rental as a coherent node with NVSwitch (where present), shared NVMe scratch, and InfiniBand or 100 GbE fabric for multi-node training. The standard PyTorch torchrun, DeepSpeed, and Megatron-LM launchers run unmodified. Filter the marketplace by GPU count to find 8x A100, 8x H100, 8x H200 nodes.

**Q: What's the HBM bandwidth comparison vs the predecessor?**
A: V100 (HBM2, 900 GB/s) -> A100 40GB (HBM2e, 1,555 GB/s) -> A100 80GB (HBM2e, 1,935 GB/s) -> H100 (HBM3, 3,350 GB/s) -> H200 (HBM3e, 4,800 GB/s) -> B200 (HBM3e, 8,000 GB/s). Each generation roughly doubles bandwidth or VRAM; KV-cache-bound serving and bandwidth-bound training scale almost linearly with this number.

**Q: Are reserved instances available for multi-week training runs?**
A: Yes. On-demand listings (10% fee, fixed price) cannot be preempted as long as your balance covers the run, so a multi-week rental is essentially a reservation. For longer commits, contact the host directly through the marketplace - many operators offer 30/60/90-day discounted blocks. Per-minute billing still applies; you pay only for actual GPU-hours used.

### Host FAQ

**Q: What infrastructure do I need to host H100, H200, or B200?**
A: Datacenter-class power (208/415V three-phase recommended for 8-GPU SXM pods at 5.6 kW for H100/H200 and 8 kW for B200), liquid cooling for B200 SXM, redundant 100 Gbps networking, and a host that can saturate it. PCIe variants run on dual-socket x86 servers with 1500-2000W PSUs. Most CLORE.AI datacenter hosts deploy these in colo cages, not home racks.

**Q: Are there enterprise SLAs and fleet contracts available?**
A: CLORE.AI is a marketplace - SLAs are negotiated host-to-renter through on-demand reserved listings, not centrally enforced. Operators running 16+ datacenter GPUs commonly offer 99.9% uptime contracts directly to enterprise renters.
Talk to the platform team for fleet onboarding tooling and bulk-listing API access if you're deploying 32+ cards.

**Q: How does the +200% MFP bonus scale on a $1,000+/month card?**
A: MFP daily emission reward is up to +200% of the server's rental price, paid in CLORE on top of fiat-equivalent rental income. On an H100 earning $1,479/month, the maximum MFP bonus tracks rental revenue directly - so the absolute bonus on datacenter cards dwarfs consumer-tier numbers. Stake CLORE behind your fleet at the same ratio across all servers; the bonus also zeroes the non-CLORE hoster fee.

**Q: Do training renters expect 99.9% uptime?**
A: Yes. Multi-week pretraining runs on 8-GPU pods are economically catastrophic to interrupt - a checkpoint loss on a 7-day H100 run costs the renter low-five-figures in lost compute. Datacenter hosts with 99.9%+ verified 30-day uptime command premium pricing and clear fleet-scale demand. Reliability score on your listing is the primary signal training renters filter by.

**Q: Is power and cooling the binding constraint?**
A: For B200 SXM, yes - 1,000 W TDP per GPU, 8 kW per node, liquid-cooled rear-door heat exchangers or direct-to-chip loops are effectively required. H100 and H200 air-cool in well-designed colo (700 W TDP each), but rack density lands 2-4 GPUs per U at most. A100 (400 W) and earlier are easier - any datacenter with 30 kW/rack handles them comfortably.

**Q: Can I list a multi-node NVLink-fabric cluster?**
A: Yes - the host agent supports multi-node listings with a unified rental boundary. Two 8x H100 nodes connected via NVLink-Switch fabric list as a single 16-GPU pod that the renter rents as one unit. B200 hyperscale pods (256+ GPUs over NVLink-Switch) require coordination with the platform team for onboarding; smaller clusters self-list through the standard host agent flow.

---

## Per-GPU FAQs

> One uniqueness-driver question per GPU — pulled verbatim from the master-data record.

**Q: RTX 3070 — Is 8 GB VRAM enough for SDXL on a 3070?**
A: Yes — SDXL runs at 768² with optimizations like xformers, fp16, and tiled VAE. For full 1024² batch-2 you'll want a 3080 or 3090. Quantized 8B LLMs and SD 1.5 fit comfortably.

**Q: RTX 3080 — Can a 3080 run SDXL at full 1024² resolution?**
A: Yes — 10 GB GDDR6X is enough for SDXL at 1024² batch-1, and with tiled VAE you can push to batch-2. For batch-4 production pipelines, step up to a 3090 or 4080 with 16+ GB.

**Q: RTX 3090 — Why pick a 3090 over a 4090?**
A: The 3090 has the same 24 GB VRAM as the 4090 at roughly 60% of the rental price. Slower memory bandwidth (936 vs 1,008 GB/s) and no FP8, but for budget-sensitive 24 GB workloads it's the value pick.

**Q: RTX 4070 — Can a 4070 handle SDXL and 7B LLMs?**
A: Yes — 12 GB GDDR6X fits SDXL 1024² batch-1 and 7B Llama FP16 inference comfortably. Tighter than a 3090 but cheaper, modern Ada cores, and lower power (200 W vs 350 W). Step up to 4070 Ti or 4080 for batch-2 SDXL or 13B.

**Q: RTX 4080 — When should I pick a 4080 over a 4090?**
A: Pick the 4080 when 16 GB is enough — SDXL batch-2, 7B fine-tuning, 13B INT8 inference. ~70% of 4090 throughput at ~55% of the rental price. Step up to 4090 for 24 GB and 70B INT4 work.

**Q: RTX 4090 — Can I run 70B models on a 4090?**
A: Yes — Llama-3 70B INT4 fits across two 4090s with tensor parallelism via vLLM or ExLlamaV2. For single-card 70B you'll want an H100 or H200. 13B and 34B fit comfortably on one 4090.

**Q: RTX 5080 — How does the 5080 compare to a 4090?**
A: The 5080 has 16 GB GDDR7 vs the 4090’s 24 GB GDDR6X — less VRAM but newer Blackwell tensor cores with FP4 support. For 16 GB-class workloads (SDXL, 7B fine-tune, 13B INT8) the 5080 wins on energy and FP4 throughput. For 24 GB workloads (34B, 70B INT4 across 2 cards) the 4090 still wins.

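
The VRAM-fit claims in the answers above (for example, 70B INT4 split across two 24 GB cards) come down to weight-size arithmetic. The following is a minimal sketch, assuming weights dominate memory and ignoring KV cache, activations, and framework overhead. The function names and the bytes-per-parameter table are illustrative and are not part of any CLORE.AI tooling.

```python
import math

# Approximate bytes per parameter for common weight formats.
# Assumption: quantization metadata (scales, zero-points) is ignored.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int8": 1.0, "int4": 0.5}


def weight_gb(params_billion: float, fmt: str) -> float:
    """Weight footprint in GB (10^9 bytes) for a model of the given size."""
    return params_billion * BYTES_PER_PARAM[fmt]


def min_cards(params_billion: float, fmt: str, vram_gb: float) -> int:
    """Smallest card count whose combined VRAM holds the weights alone."""
    return math.ceil(weight_gb(params_billion, fmt) / vram_gb)


print(weight_gb(70, "int4"))      # weights for a 70B model at INT4
print(min_cards(70, "int4", 24))  # 24 GB cards needed for those weights
print(min_cards(13, "fp16", 32))  # 32 GB cards needed for 13B at FP16
```

Treat the result as a lower bound: real deployments need headroom on top of the weights for KV cache and activations.
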
**Q: RTX 5090 — Is the 5090 worth the premium over a 4090?**
A: If you need >24 GB on a consumer card, yes — the 5090's 32 GB GDDR7 fits Llama-3 13B FP16 in single-card memory and runs ~1.4× transformer training throughput vs 4090. For 24 GB-or-less workloads, the 4090 is still the better $/throughput pick.

**Q: RTX 4070 Ti — Is the 4070 Ti the right pick for AI hobbyists?**
A: Often yes — it sits right between the 4070 and 4080 in throughput at a friendlier price. 12 GB Ada VRAM runs Flux/SDXL production, 7B QLoRA, and 13B INT8 inference. If you need 16 GB go 4080; if budget-tight stick with 4070.

**Q: RTX A4000 — Why pick an A4000 over a 3070 with similar VRAM?**
A: ECC memory and ISV certification — required for production CAD, V-Ray, and academic ML where bit-flip integrity matters. Single-slot 140 W form factor lets hosts pack 4× A4000 in one workstation chassis. Quieter and lower-power than consumer Ampere.

**Q: RTX A5000 — When does A5000 beat a 4090 for studios?**
A: When you need datacenter validation and ECC. ISV certification, NVLink in pairs, and server-grade thermals — important for production V-Ray pipelines, virtual workstations, and academic research. Lower throughput than 4090 but production-stable.

**Q: RTX A6000 — When do I need 48 GB instead of 24 GB?**
A: For 34B FP16 single-card inference, full-precision LoRA on 70B with FSDP across 2 cards, Unreal cinematics at 8K, and Blender scenes that exhaust 24 GB. The default pick when you need >24 GB but aren't paying H100 rates.

**Q: RTX 6000 Ada — How does the RTX 6000 Ada compare to the A6000?**
A: Roughly 2× transformer throughput at the same 300 W envelope and 48 GB ECC. Ada's FP8 tensor cores and 4th-gen RT make it the upgrade for studios already running A6000-class workloads who need more headroom without moving to H100 pricing.

**Q: A40 — When should I pick A40 over A6000?**
A: When ECC + datacenter form factor matter and your workload doesn’t need NVLink.
A40 is server-rack-friendly (passive cooling, NVENC), A6000 is a workstation card. Same 48 GB ECC, similar bandwidth. A40 is more available in DC fleets, A6000 in studios.

**Q: NVIDIA L4 — Why pick L4 over a 4090 for inference?**
A: Power-efficient (72 W vs 450 W), passively cooled, designed for 24/7 multi-tenant inference. Datacenter-validated for serving stacks like vLLM and Triton. The 4090 is faster per-card; the L4 is cheaper per-request at scale.

**Q: NVIDIA L40S — Is L40S a substitute for H100?**
A: For inference, often yes — FP8 throughput on Llama-3 70B is competitive at a fraction of the rental price. For training, the H100's HBM3 bandwidth and NVLink fabric still win. Pick L40S for serving, H100 for pretraining.

**Q: Tesla T4 — Is the T4 still worth renting in 2026?**
A: Yes — for low-cost, high-volume inference (Whisper, ResNet, YOLOv8, embeddings) the T4 still wins on $/inference. It is 2018-era Turing silicon, but its 70 W passively-cooled form factor keeps host costs low and rental prices ultra-competitive.

**Q: A10 — Why pick A10 over L4 for inference?**
A: A10 has more raw FP16 throughput (124 TFLOPS vs 121) and the same 24 GB VRAM. L4 is much more power-efficient (72 W vs 150 W) and Ada-class with FP8. A10 is the AWS/GCP standard, so it is the better choice if you are matching a hyperscaler reference setup; it does not support MIG.

**Q: Tesla V100 — Should I still pick V100 over A100 in 2026?**
A: Only for legacy code paths or budget-constrained FP32 scientific workloads. For transformer training, the A100 40GB is faster, has TF32, and isn't much more expensive. Pick V100 when the price gap matters more than throughput.

**Q: A100 40GB — When should I pick the 40 GB over the 80 GB A100?**
A: When you're pretraining 7B from scratch or fine-tuning 13B with offload — 40 GB is plenty. Step up to 80 GB for 34B+ pretraining, 70B fine-tuning, or LongRoPE / 128k-context work whose KV cache exhausts the smaller card's memory.

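
The KV-cache pressure mentioned in the A100 answer above can be estimated per sequence from the model shape. This is a rough sketch, assuming a standard transformer with grouped-query attention and an FP16 cache. The example shape (80 layers, 8 KV heads, head dimension 128) is an assumed Llama-3-70B-like configuration, not a figure from this page, and `kv_cache_bytes` is a hypothetical helper name.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Bytes of KV cache for one sequence: keys plus values across all layers."""
    # Factor 2 covers the separate key and value tensors per layer.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem


# Assumed 70B-class shape at a 128k-token context, FP16 cache entries.
one_seq = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128,
                         seq_len=128 * 1024)
print(f"{one_seq / 2**30:.1f} GiB for a single 128k-token sequence")
```

Under these assumptions a single 128k-token sequence needs about 40 GiB of cache before any weights are loaded, which is why long-context work pushes toward the 80 GB card.
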
**Q: A100 80GB — Is A100 80GB still relevant against H100?**
A: Yes — it's typically 50–60% of H100 rental price with 80 GB HBM2e and supports the same FSDP + DeepSpeed pipelines. For training without FP8 / TransformerEngine, A100 80GB remains the cheapest way to get HBM and NVLink in 2026.

**Q: H100 — PCIe or SXM H100 — which should I rent?**
A: SXM is faster (700 W, NVLink 900 GB/s) and is what you want for distributed training. PCIe (350 W, NVLink Bridge optional) is fine for single-card inference and easier to host. CLORE.AI lists both — filter by NVLink speed in the marketplace.

**Q: H200 — When does the H200 beat the H100?**
A: Whenever memory bandwidth or VRAM is the bottleneck — 141 GB HBM3e at 4.8 TB/s eliminates KV-cache offload, fits 405B INT4 across 4 cards instead of 8, and runs 1M-token contexts native. Same compute as H100, but the memory upgrade is significant for serving.

**Q: B200 — Is B200 available outside hyperscaler waitlists?**
A: Yes — CLORE.AI hosts have started listing B200 nodes as supply ramps in 2026. Availability varies by region; filter by 'B200' in the marketplace and check live spot floor. NVLink-Switch fabric available on multi-node pods.

---

## Payment methods

Renters pay in BTC or CLORE today; USDT and USDC rental payments are on the roadmap. Hosts withdraw in BTC, USDT, USDC (ERC-20), or CLORE.

## Fees and pricing model

Spot 2.5% total / On-demand 10% total; both split 50/50 between renter and host. PoH discount up to 50% on marketplace fees at 2,000,000 CLORE held. MFP Lock: optional, daily emission reward up to +200% of the rental price.

## Brand and network

Founded 2022, 12,000+ GPUs, 3,000+ servers, 50+ countries, per-minute billing, no minimum commitment.

---

## Skipped routes

> Routes intentionally excluded from this snapshot. They are SPA shells served by the React app at request time and have no static, citable copy on disk. The marketplace overview above is the canonical write-up for `/marketplace`.

- `/token` — SPA route — no on-disk content; rendered client-side from React state.
- `/dao` — SPA route — no on-disk content; rendered client-side from React state.
- `/wallets` — SPA route — no on-disk content; rendered client-side from React state.
- `/clorent` — SPA route — no on-disk content; rendered client-side from React state.
- `/support` — SPA route — no on-disk content; rendered client-side from React state.
- `/terms-and-conditions` — SPA route — no on-disk content; rendered client-side from React state.
- `/privacy-policy` — SPA route — no on-disk content; rendered client-side from React state.
- `/api-docs` — SPA route — no on-disk content; rendered client-side from React state.
- `/mfp-calculator` — SPA route — no on-disk content; rendered client-side from React state.
- `/mining-calculator` — SPA route — no on-disk content; rendered client-side from React state.
- `/check-gpu-ban` — SPA route — no on-disk content; rendered client-side from React state.
- `/bare-metal` — SPA route — no on-disk content; rendered client-side from React state.
- `/roadmap` — SPA route — no on-disk content; rendered client-side from React state.

---

## Footer

- Website: https://clore.ai
- Documentation: https://docs.clore.ai
- Blog: https://blog.clore.ai
- Statistics: https://opendata.clore.ai
- Twitter/X: https://x.com/clore_ai
- Telegram: https://t.me/clorechat
- Discord: https://discord.gg/clore-ai
- YouTube: https://www.youtube.com/@cloreai

**Host documentation:**

- [Becoming a host](https://docs.clore.ai/for-hosts/installing-clore-hosting)
- [MFP — Maximum Fair Price](https://docs.clore.ai/for-hosts/mfp-lock-a-complete-breakdown-of-mechanics)
- [Marketplace fees](https://docs.clore.ai/for-hosts/host-fees)
- [All docs](https://docs.clore.ai/)

Generated by `scripts/seo/generate-llms-full.js` — re-run via `npm run seo:llms-full`.