The current model is ~58 MB — by necessity, not by choice
The on-device driving model (driving_vision + driving_policy) totals around 58 MB on disk. That is not the natural size for a task this complex — it is the size comma was forced to hit to run at 20 Hz on the Snapdragon 845 MAX DSP. The model was engineered down to fit the chip, not sized up to match the task.
The real bottleneck is compute throughput, not just RAM
Driving requires inference at 20 Hz — one full forward pass every 50 milliseconds. The Snapdragon DSP has a fixed compute budget, measured in TOPS (tera-operations per second). A model 10–100x larger simply would not finish inside that window; the control loop would drop to 2–3 Hz, which is useless for real-time control. More RAM alone would not solve that.
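The budget math above can be sketched in a few lines. All of the numbers here are illustrative assumptions — comma has not published per-pass FLOP counts or sustained TOPS figures — but the structure of the argument is just arithmetic:

```python
# Rough latency-budget check: does a model of a given size finish a
# forward pass within the 50 ms window that 20 Hz control requires?
# The FLOP counts and TOPS figure below are assumptions, not comma specs.

def forward_pass_ms(gflops_per_pass: float, effective_tops: float) -> float:
    """Time for one forward pass, assuming inference is compute-bound."""
    ops = gflops_per_pass * 1e9
    ops_per_sec = effective_tops * 1e12
    return ops / ops_per_sec * 1e3

BUDGET_MS = 1000 / 20  # 20 Hz -> 50 ms per pass

# Hypothetical: a compact model vs. one ~50x larger, on a DSP that
# sustains ~5 effective TOPS (an assumed, not measured, figure).
small = forward_pass_ms(gflops_per_pass=10, effective_tops=5)   # 2 ms
big = forward_pass_ms(gflops_per_pass=500, effective_tops=5)    # 100 ms

for name, ms in [("small", small), ("big", big)]:
    verdict = "fits" if ms <= BUDGET_MS else "misses"
    print(f"{name}: {ms:.0f} ms per pass -> {verdict} the 50 ms budget")
```

Under these assumed numbers the larger model takes 100 ms per pass, twice the budget — and the gap only widens as the model grows, which is why more memory alone cannot fix it.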
Limited field of view
Even though comma 3, 3X, and 4 all have a 360-degree camera system, the on-device model can only process a narrow slice of that view at inference time. Processing more cameras at higher resolution multiplies the compute requirements dramatically — well beyond what the current chip can handle.
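For a convolutional vision stack, per-frame compute grows roughly linearly with input pixels, so adding cameras and resolution multiplies cost. A back-of-envelope sketch, with a baseline crop size chosen purely for illustration:

```python
# Relative compute cost as cameras and resolution scale up.
# The baseline (one 512x256 crop) is an assumed reference, not the
# actual input size of comma's driving model.

def relative_cost(num_cameras: int, width: int, height: int,
                  base_cameras: int = 1, base_width: int = 512,
                  base_height: int = 256) -> float:
    """Cost relative to a single low-res camera baseline (assumed)."""
    pixels = num_cameras * width * height
    base = base_cameras * base_width * base_height
    return pixels / base

print(relative_cost(1, 512, 256))   # 1.0  (baseline crop)
print(relative_cost(6, 1024, 512))  # 24.0 (hypothetical full surround view)
```

Going from one narrow crop to a hypothetical six-camera view at double the resolution is a 24x jump in per-frame work — the "multiplies the compute requirements dramatically" claim, made concrete.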
No training architecture (until now)
Previously, comma had no internal infrastructure to build new, larger models from scratch. They could refine and retrain existing architectures, but there was no pipeline to scale up dramatically.
The comma devices run on a Snapdragon 845 MAX — a powerful mobile chip with a custom cooling system that comma engineered to prevent thermal throttling on a hot windshield. It is very good at what it does. But a model that processes wider camera inputs and runs a fundamentally larger architecture would need orders of magnitude more compute throughput than the DSP can sustain at real-time driving speeds. Interestingly, comma has already added placeholder files named big_driving_policy.onnx and big_driving_vision.onnx to the openpilot repository — empty stubs signaling that a bigger model is being planned in the codebase.
Piece 1: tinygrad's training infrastructure
tinygrad — the open-source deep learning framework George Hotz runs alongside comma.ai — recently completed the infrastructure needed to train large models end-to-end. That means comma can now build bigger, better-architected models with significantly more data, rather than being constrained by earlier tooling.
Piece 2: GPU support over USB4
The second piece is also recent: tinygrad enabled external GPU support for comma devices over USB4. The comma 3X and comma 4 both have a USB4 port, and that port can now be used to connect an eGPU — giving openpilot access to dramatically more compute than the on-device chip can provide.
Together, these two developments form the complete stack: a way to build a bigger model, and a way to run it. Neither alone was enough. With both in place, comma can start targeting something their hardware was never able to do before.
Wider field of view
The camera hardware already sees 360 degrees. A larger model running on a GPU could actually process and use more of that panoramic view — seeing more of the road, adjacent lanes, and hazards that the current compact model simply cannot fit in its context.
More reliable driving
More parameters trained on more data generally means more robust behavior across edge cases. A bigger model can better handle unusual road conditions, merges, intersections, and scenarios that are currently underrepresented in the smaller model.
Higher-rate steering commands
The lead comma engineer mentioned this specifically: a bigger model could output steering commands at a much higher rate. That translates directly into smoother, more precise turns — one of the most noticeable improvements drivers would feel behind the wheel.
USB4 is for loading, not running
A common misconception is that running inference over USB4 would be too slow. That is not how the setup works. USB4 carries the model weights to the GPU once, at startup; the per-frame camera inputs that follow are tiny by comparison. Once the weights are loaded, inference runs against the GPU's own VRAM, whose internal bandwidth is massively higher than any external connection.
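A quick sanity check of what USB4 actually has to carry. The model size and camera dimensions are hypothetical; the 40 Gbit/s figure is USB4's nominal signaling rate, ignoring protocol overhead:

```python
# What crosses the USB4 link: weights once at startup, then small
# per-frame inputs. Model size and camera dimensions are assumptions.

USB4_GBYTES_PER_SEC = 40 / 8  # 40 Gbit/s nominal -> 5 GB/s, no overhead

def load_time_s(model_gb: float) -> float:
    """One-time cost to copy the weights to GPU VRAM over USB4."""
    return model_gb / USB4_GBYTES_PER_SEC

def per_frame_mb(num_cameras: int, width: int, height: int,
                 bytes_per_pixel: int = 3) -> float:
    """Input payload per forward pass (assumed uncompressed RGB)."""
    return num_cameras * width * height * bytes_per_pixel / 1e6

print(f"{load_time_s(5.8):.1f} s to load a hypothetical 5.8 GB model")
print(f"{per_frame_mb(6, 1024, 512):.1f} MB of camera input per frame")
```

Under these assumptions, a 100x-larger model loads in about a second, and even six uncompressed camera streams at 20 Hz would use well under 5% of the link — the external connection never sits in the inference hot path.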
Why mobile chips cannot compete
Mobile SoCs like the Snapdragon 845 have limited memory bandwidth, tight thermal envelopes, and are fundamentally designed for power efficiency, not throughput. A discrete GPU with its own dedicated VRAM has 10x–20x more memory bandwidth and can sustain workloads that would immediately throttle or overflow a phone chip.
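The memory-bandwidth gap can be turned into a frame-rate ceiling. If inference is bandwidth-bound, every forward pass has to stream the weights from memory at least once, which caps how many passes per second are possible. The bandwidth figures below are ballpark class numbers (phone-class LPDDR4X vs. discrete GDDR6), not measurements of any specific device:

```python
# Upper bound on inference rate when limited by streaming the weights
# from memory once per forward pass. All figures are assumptions.

def max_hz(model_gb: float, bandwidth_gb_s: float) -> float:
    """Best-case passes/sec for a bandwidth-bound model."""
    return bandwidth_gb_s / model_gb

MODEL_GB = 5.8  # hypothetical 100x model in half precision

mobile_lpddr = max_hz(MODEL_GB, 30)    # ~30 GB/s phone-class LPDDR4X
discrete_vram = max_hz(MODEL_GB, 640)  # ~640 GB/s discrete-GPU GDDR6

print(f"mobile SoC ceiling:   {mobile_lpddr:.0f} Hz")
print(f"discrete GPU ceiling: {discrete_vram:.0f} Hz")
```

Even before thermals enter the picture, the mobile chip's ceiling for a model this size sits around 5 Hz — below the 20 Hz a driving loop needs — while the discrete GPU clears it with two orders of magnitude to spare on the compute side as well.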
The 9070 XT and Tesla HW5
The AMD RX 9070 XT has been benchmarked comparably to Tesla's HW5 chip on certain inference workloads — a remarkable comparison given that HW5 is a purpose-built autonomous driving accelerator. According to comma's lead engineer on an X (Twitter) voice chat, this is a GPU they are excited about for running a larger openpilot model.
This is not vaporware. The USB4 GPU support in tinygrad is real and recently shipped. The training infrastructure is real and being used — comma's openpilot 0.11 world model is the first product of that investment. And the hardware port has been on every comma device since the 3X.
What does not exist yet is the bigger inference model itself, and the consumer eGPU product to run it on. Those are still in development. But all the pieces are present for the first time, and the excitement from the team — including the direct comments from the lead engineer — is not theoretical.
For comma users, this matters most for one reason: your device's cameras already see the full road. Soon, the model might actually be able to use all of it.