Designing low-latency brain-computer interfaces

Lessons from cutting end-to-end decode latency to under ten milliseconds for responsive, closed-loop brain-computer interfaces.

BCI

February 9, 2021

In a closed loop, latency is felt. A brain-computer interface that takes a quarter of a second to respond feels broken, no matter how accurate it is. Closed-loop systems live and die on the delay between intention and feedback. This post walks through where latency hides in a BCI pipeline and how Qusp drives end-to-end decode time under ten milliseconds.

Why Latency Dominates BCI Feel

The brain expects feedback to follow action almost immediately. When a neurofeedback display or a control signal lags, the loop breaks: users overcorrect, learning slows, and the illusion of direct control disappears. Accuracy gets the headlines, but latency is what users actually feel on every trial.

"In a closed loop, ten milliseconds is the difference between control and frustration." — Qusp BCI Team

Where the Milliseconds Go

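Acquisition buffering: Large sample buffers add delay before a single value even reaches the decoder: a chunk of N samples at sampling rate fs holds its oldest sample for roughly N/fs seconds. Smaller chunks trade throughput for speed.
Filtering and windowing: Every filter and analysis window has a latency cost that has to be budgeted, not assumed away. A linear-phase FIR filter with N taps, for example, delays the signal by (N - 1)/2 samples.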
Decode computation: The model itself must run in well under the loop budget, which shapes how complex it can be.
Output and rendering: The last mile — sending the command and drawing the feedback — is latency too, and it's easy to forget.
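
The four stages above can be written down as a simple latency budget. The following is a minimal sketch, not Qusp's tooling: the names Stage, chunk_latency_ms, linear_phase_fir_delay_ms, total_latency_ms, and over_budget are hypothetical, and the only formulas used are chunk duration (samples divided by sampling rate) and the group delay of a linear-phase FIR filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One pipeline stage and its latency contribution in milliseconds."""
    name: str
    latency_ms: float


def chunk_latency_ms(chunk_samples: int, sample_rate_hz: float) -> float:
    """Duration of one acquisition chunk: an upper bound on how long the
    oldest sample in the chunk waits before the decoder can see it."""
    return 1000.0 * chunk_samples / sample_rate_hz


def linear_phase_fir_delay_ms(num_taps: int, sample_rate_hz: float) -> float:
    """Group delay of a linear-phase FIR filter: (num_taps - 1) / 2 samples."""
    return 1000.0 * (num_taps - 1) / 2.0 / sample_rate_hz


def total_latency_ms(stages: list[Stage]) -> float:
    """Sum the per-stage contributions into one end-to-end estimate."""
    return sum(stage.latency_ms for stage in stages)


def over_budget(stages: list[Stage], budget_ms: float) -> bool:
    """True when the summed stages exceed the loop budget."""
    return total_latency_ms(stages) > budget_ms
```

A budget like this is a planning aid for deciding where to cut. Decode and rendering entries would have to come from timing the real system, since they cannot be derived from a formula.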

Engineering for Speed

Cutting latency means attacking every stage at once: stream small sample chunks, use filters with known short delays, keep the decoder lean, and render feedback on a fast path. Qusp processes EEG as a continuous low-latency stream rather than in big blocks, so a sample can move from electrode to decision in single-digit milliseconds.
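
Processing a continuous stream in small chunks usually requires filters that carry their state from one chunk to the next, so that chunked output matches what a single large block would produce. The sketch below illustrates the idea with a one-pole low-pass filter in plain Python. It is an assumption-laden illustration, not Qusp's implementation: OnePoleLowPass, iter_chunks, and the alpha parameter are hypothetical names, and a real EEG pipeline would likely use a different filter design.

```python
from typing import Iterator, Sequence


class OnePoleLowPass:
    """Causal one-pole low-pass filter that keeps its state between chunks."""

    def __init__(self, alpha: float) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self._state = 0.0  # last output, carried across process() calls

    def process(self, chunk: Sequence[float]) -> list[float]:
        """Filter one chunk, continuing from where the previous chunk ended."""
        out = []
        for x in chunk:
            self._state += self.alpha * (x - self._state)
            out.append(self._state)
        return out


def iter_chunks(samples: Sequence[float], chunk_size: int) -> Iterator[Sequence[float]]:
    """Yield consecutive chunks of at most chunk_size samples."""
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]
```

Because the filter state persists, shrinking the chunk size changes how soon each output is available without changing the output values themselves.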

It also means measuring honestly. End-to-end latency — electrode to feedback — is the only number that matters, and it has to be measured on the real system under real load, not estimated from component specs.
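
For the software portion of the loop, a small timing harness is one way to start measuring. The sketch below uses Python's time.perf_counter_ns and reports the median, 99th percentile, and maximum. The function names (percentile, time_iterations, summarize) are hypothetical. A software timer like this cannot see amplifier or display delays, so it covers only part of the electrode-to-feedback path; the rest needs an external measurement that this sketch does not attempt.

```python
import math
import time
from typing import Callable


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list, with pct in 0..100."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= pct <= 100.0:
        raise ValueError("pct must be between 0 and 100")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def time_iterations(step: Callable[[], None], iterations: int) -> list[float]:
    """Run step repeatedly and return each call's duration in milliseconds."""
    durations_ms = []
    for _ in range(iterations):
        start_ns = time.perf_counter_ns()
        step()
        durations_ms.append((time.perf_counter_ns() - start_ns) / 1e6)
    return durations_ms


def summarize(durations_ms: list[float]) -> dict[str, float]:
    """Report the median, 99th percentile, and maximum of the durations."""
    return {
        "median_ms": percentile(durations_ms, 50),
        "p99_ms": percentile(durations_ms, 99),
        "max_ms": max(durations_ms),
    }
```

Here step would be one full pass through the software pipeline (read a chunk, filter, decode, issue the command), run while the machine carries its normal load.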

Latency vs. Accuracy

There's always a tension: longer windows and bigger models can be more accurate but slower. The art of closed-loop BCI is finding the point where the system is fast enough to feel direct and accurate enough to be useful. That point is different for every paradigm, and the only way to find it is to tune latency and accuracy together.

A responsive system that's slightly less accurate often beats a sluggish one that's marginally better, because users adapt to a fast loop far more readily than to a laggy one.
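
One simple way to frame the trade-off is to treat latency as a hard budget and pick the most accurate configuration that fits inside it. The sketch below does exactly that and nothing more. Candidate and pick_operating_point are hypothetical names, the latency and accuracy values are assumed to come from your own measurements, and a hard cutoff is only one of several reasonable ways to tune the two together.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    """One pipeline configuration with its measured latency and accuracy."""
    label: str
    latency_ms: float  # measured end to end on the real system
    accuracy: float    # measured fraction of correct decodes, 0..1


def pick_operating_point(
    candidates: list[Candidate], latency_budget_ms: float
) -> Optional[Candidate]:
    """Return the most accurate candidate within the latency budget.

    Ties on accuracy go to the faster candidate. Returns None when
    nothing fits the budget.
    """
    within = [c for c in candidates if c.latency_ms <= latency_budget_ms]
    if not within:
        return None
    return max(within, key=lambda c: (c.accuracy, -c.latency_ms))
```

Since the right balance differs by paradigm, the budget passed in here is itself something to revisit per application rather than a fixed constant.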

Final Thoughts

Low-latency BCI isn't one optimization — it's a discipline applied across the whole pipeline. Stream small, filter carefully, decode lean, render fast, and measure end to end. Get the loop under ten milliseconds and the interface stops feeling like software and starts feeling like control.
