Methodology

How we scoreyour machine.

Everything below runs in your browser. Nothing about your hardware is sent to a server. We modeled the math after the openly published approach behind canirun.ai, then adapted it for diffusion workloads, where the bottleneck is VRAM + memory bandwidth instead of KV-cache size.

1 · Hardware detection

We use three browser APIs to fingerprint your machine, and never transmit the result anywhere.

// WebGL — works everywhere, sometimes obfuscated
const gl = canvas.getContext("webgl2");
const dbg = gl.getExtension("WEBGL_debug_renderer_info");
const renderer = gl.getParameter(dbg.UNMASKED_RENDERER_WEBGL);
//   → "ANGLE (NVIDIA, NVIDIA GeForce RTX 4090 Direct3D11 vs_5_0 ps_5_0...)"

// WebGPU — richer, when the browser supports it
const adapter = await navigator.gpu.requestAdapter();
const info = adapter.info;
//   → { vendor: "nvidia", architecture: "ada-lovelace", device: "..." }

// Navigator — RAM (rounded), CPU cores
const ramGB = navigator.deviceMemory;
const cores = navigator.hardwareConcurrency;

Renderer strings are matched against a database of 75 GPUs (NVIDIA RTX 50/40/30/20/GTX 10, AMD RDNA 2/3/4, Intel Arc, and 15 Apple silicon variants). Each entry has hand-curated VRAM and memory bandwidth from official datasheets.

Caveat: Safari and several privacy extensions strip the renderer string. If we can’t identify your GPU, you can pick it manually — that’s why the override dropdowns exist.

2 · VRAM math

For every model + quantization combination, we estimate the actual VRAM the model takes during inference:

VRAM(GB) = params × bits ÷ 8 ÷ 1024³ + 0.5 GB runtime + 10% KV/safety

The 0.5 GB constant covers CUDA/Metal context plus the inference engine itself; the 10% multiplier covers the KV cache, intermediate activations, and a safety buffer for resolution and batch growth. For diffusion models we add it on top of the base weight VRAM because the U-Net / DiT works on full-resolution feature maps.

We track three quantization tiers per model:

FP16Native precision, max quality, max VRAM.
FP8~½ the VRAM, near-identical quality on most models.
Q4GGUF 4-bit. ~¼ the VRAM. Slight softness/text-rendering penalty.

3 · Fit classification

The best quant for your GPU is the largest one that still fits. “Fits” depends on whether you’re on a discrete GPU or Apple unified memory.

function fits(vramUsed, gpu) {
  // Apple Silicon shares memory with the OS — leave 25% headroom
  const cap = gpu.kind === "apple-silicon"
    ? gpu.unifiedRAM * 0.75
    : gpu.vram;
  const ratio = vramUsed / cap;

  if (gpu.kind === "apple-silicon") {
    return ratio <= 0.70 ? "fits"
         : ratio <= 1.00 ? "tight"
         : "no-fit";
  }
  return ratio <= 0.85 ? "fits"
       : ratio <= 1.10 ? "tight"
       : "no-fit";
}

Discrete GPUs get a tighter window because dedicated VRAM has no competition from the OS. Apple chips need more headroom because the operating system, browser, and apps share the same pool.

4 · Scoring (0–100)

Every model gets a composite 0–100 score against your hardware. The weights are:

Speed

55%

Bandwidth-bound throughput at the chosen quant

Memory headroom

35%

More room = stable at high res / long videos

Quality bonus

10%

Logarithmic bonus for larger model parameter counts

score = 0.55·speed + 0.35·headroom + 0.10·quality
≪ ×0.65 if “tight fit” ≫

Tight-fit models get a 35% penalty because in practice they OOM at higher resolutions, longer videos, or with control adapters attached. Models that don’t fit at all are capped at 18.

5 · Speed estimation

Diffusion is bandwidth-bound: each step reads the full model weights from VRAM. So:

steps/s ≈ bandwidth(GB/s) ÷ model_VRAM(GB) × efficiency

Efficiency captures driver and runtime overhead:

NVIDIA discrete: 0.70
Apple Silicon (MPS / MLX): 0.65
AMD discrete (ROCm/DirectML): 0.55
Intel Arc: 0.45

Numbers don’t replicate exact tok/s benchmarks — they’re a relative-comparison proxy. Trust the ranking, not the absolute numbers.

6 · Models we track

We follow ComfyUI’s officially supported model list. As of today: 21 image, 13 video, and 4 editing models — from SD 1.5 all the way to FLUX.2 dev and Wan 2.2 A14B.

VRAM figures are sourced from official model cards, the ComfyUI blog, and community benchmarks. Quantized numbers reflect city96 GGUF builds and Kijai FP8 builds, both of which the ComfyUI ecosystem standardizes around.

7 · Privacy

All detection and scoring happens client-side. No analytics, no telemetry, no API calls. You can verify by opening DevTools → Network and watching: nothing goes out.

Now you know how it works.

Score my machine