How it works

Your machine is the whole stack.

There is no server tier. LU Labs detects what you already run, loads a model into your memory, and generates — start to finish — on-device.

Auto-detect your backends

LU Labs scans for the 12 runtimes it supports — Ollama, LM Studio, vLLM, llama.cpp and more — and lists what you can run.
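One common way to detect local runtimes is to probe their default loopback ports. The sketch below illustrates that idea only; it is not taken from the LU Labs source. The port numbers are the usual defaults for each runtime (users can change them), and `is_listening` and `detect_backends` are hypothetical helper names.

```python
import socket

# Usual default local ports for these runtimes (an assumption; each is configurable).
DEFAULT_PORTS = {
    "Ollama": 11434,
    "LM Studio": 1234,
    "vLLM": 8000,
    "llama.cpp server": 8080,
}


def is_listening(port, host="127.0.0.1", timeout=0.25):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections and timeouts alike.
        return False


def detect_backends(ports=DEFAULT_PORTS):
    """Return the names of runtimes whose port is open on the loopback interface."""
    return [name for name, port in ports.items() if is_listening(port)]


if __name__ == "__main__":
    print(detect_backends())
```

An open port only shows that something is listening there. A real detector would likely follow up with a runtime-specific request to confirm what answered.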

Load a model into memory

Pick a model; it loads into your own RAM or VRAM. A green dot means it is live and ready, and your prompts are never uploaded anywhere.

Generate — on-device

Every token, image and frame is produced by your hardware. The result never round-trips through a cloud you do not control.
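To make "on-device" concrete, here is a minimal sketch of a generation request that never leaves the loopback interface and carries no API key. It assumes an Ollama server on its default port; the model name `llama3` is a placeholder, and `build_request`, `is_loopback` and `generate` are hypothetical helpers, not LU Labs code.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Ollama's default local endpoint (assumes the default port is unchanged).
OLLAMA_URL = "http://127.0.0.1:11434/api/generate"


def build_request(prompt, model="llama3"):
    """Build a non-streaming generate request; note there is no Authorization header."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_loopback(url):
    """Return True if the URL points at this machine's loopback interface."""
    return urlparse(url).hostname in ("127.0.0.1", "localhost", "::1")


def generate(prompt, model="llama3"):
    """Send the prompt to the local server and return the generated text."""
    req = build_request(prompt, model)
    if not is_loopback(req.full_url):
        raise ValueError("refusing to send a prompt off-device")
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)["response"]
```

Calling `generate("Hello")` needs a running local server with the named model already pulled. The loopback check is a cheap guard against a misconfigured URL sending prompts elsewhere.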

What leaves your machine?

The honest answer, side by side.

LU Labs

Typical cloud AI

ON-DEVICE

100% local inference

The model runs where your data already lives. Air-gap it if you want.

NO TELEMETRY

Zero phone-home

No analytics SDK, no background reporting, no silent uploads of your data.

NO KEYS

No API keys to leak

Local mode needs no cloud credentials, so there are no keys to steal.

OPEN SOURCE

Auditable code

Read exactly what the app does. The community can and does.

AUTO-CLEANUP

Processes die on close

The ComfyUI process is killed when you quit — no lingering daemons.
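The cleanup pattern described here can be sketched with the standard library: start the child process, then register an exit hook that terminates it. This illustrates the general technique rather than the app's actual code, and `launch_managed` is a hypothetical name.

```python
import atexit
import subprocess
import sys


def launch_managed(cmd):
    """Start a child process and register a hook that stops it at interpreter exit."""
    proc = subprocess.Popen(cmd)

    def cleanup():
        if proc.poll() is None:  # child is still running
            proc.terminate()     # ask politely first
            try:
                proc.wait(timeout=5)
            except subprocess.TimeoutExpired:
                proc.kill()      # force-stop if it ignores the request
                proc.wait()

    atexit.register(cleanup)
    return proc, cleanup


if __name__ == "__main__":
    # Stand-in for a long-running helper such as an image-generation server.
    child, _ = launch_managed([sys.executable, "-c", "import time; time.sleep(60)"])
    print("child started:", child.pid)
```

`atexit` hooks run on a normal interpreter exit, but not if the parent is killed outright. A production app would probably add signal handling or an OS-level mechanism to cover that case.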

OFFLINE

Works with no internet

Once a model is on disk, pull the Ethernet cable. It still runs.

LU Labs is open source — verify every claim on this page yourself.