Dr. Tim Mahoney (right) and Dr. Dean Freestone, the CEO and the CTO of Fluent BCI. Image courtesy of Fluent BCI.
How Fluent BCI is building the brain's third hemisphere, an AI-enabled device for the voiceless
Nebius AI Discovery Awards landed in London this week. Among the winners: Fluent BCI, a small team of Australian neuroscientists betting against Elon Musk’s Neuralink and most of the brain-computer interface industry. Backed by advances in genAI and unique empirical data, they’re planning to stage a revolution in neuromedicine. Their first step: restoring speech. The ultimate goal is even braver.

Sixty years ago, one of the brightest Cambridge PhD students started getting clumsy. He struggled to bend his legs and tie his shoes, and soon could no longer walk. Diagnosed with amyotrophic lateral sclerosis (ALS), he was given only a few years to live before the disease robbed him of the ability to breathe. Instead, he went on to live a remarkable life, start a family, and build an extraordinary scientific career, all while unable to speak and almost completely paralyzed for most of the next 55 years.
This is, of course, the story of the legendary Stephen Hawking, whose scientific and personal achievements were made possible by a unique communication device controlled through the only muscle he could still twitch—in his right cheek. Hundreds of thousands of ALS patients have never been so fortunate. Many lose every remaining muscle and with it any ability to communicate.
Only in recent years has a more radical solution emerged: reading the intention to speak directly from the brain. Whenever we want to say something, the motor cortex generates signals that are meant to move the speech muscles. Even if those signals never reach their destination, the underlying neural activity can still be detected. A major issue is that brain signals are so weak and noisy that many researchers consider outside-the-skull devices impractical. Most companies, including Elon Musk’s Neuralink, focus on invasive implants placed directly in the cortex, on top of it, or in its blood vessels.
Australian startup Fluent BCI argues that this is no longer necessary. As its co-founder and CEO Tim Mahoney proved in his PhD thesis, an implant could be placed under the scalp but outside the skull. With some LLM magic they now expect to restore intended speech with up to 96% accuracy and almost at the speed of thought.
This week, Fluent BCI was one of the winners at the AI Discovery Awards 2026, the annual flagship competition organized by Nebius. We talked with Tim about what makes Fluent BCI so confident in its bet, how far along the team is, and what the InScribe device becomes if all their dreams come true.
What was the moment, or the experience, that made you decide to spend a big part of your life working on brain–computer interfaces?
I always say I have a foot in two camps. One: ever since high school I’ve wanted to work in this technology. Learning about the body, that electrical signals control everything, and that we can sense those signals, it just made sense to me that you could reroute them to anything. You could control things with your mind, or have things stimulate you and change your experience of life. That idea has always fascinated me, and that’s what started me on the early journey.
The other camp: just as I was about to start my PhD, my sister was diagnosed with multiple sclerosis. It can manifest in many ways as it progresses — it can take away motor control, the ability to walk, or the ability to communicate. We did a lot of work with other people with multiple sclerosis, ran surveys to understand how brain–computer interfaces could best serve them, and the answer was always communication. That’s what focused me toward Fluent, which is a speech brain–computer interface.

How does Fluent BCI work?
There are lots of parts that come together. We have the brain data — we call this the neural interface. Our insertable device — InScribe — records brain activity, does a lot of pre-processing, and extracts features, which we pass to a machine learning model that predicts which phoneme a person is trying to say at any point in time. There are about 40 phonemes in English so at any moment you have probabilities across those 40 options.
The problem is that when you record from outside the skull, the signal is very attenuated. So we’ll get errors, and we’re fully aware of that. What we bring in to solve that is an LLM powered by relevant context. Instead of someone saying a random thing at any time, what we talk about is usually related to what we’ve been talking about before, or where we are, or the salient objects around us, or who we’re talking to. We capture that context with different senses, audio and visual, and also an understanding of a person’s digital history.
The system would know I’m in this interview with you; it would have a small dossier on each of you, the interactions we’ve had over email, anything it could find about you. So the system would know that the chance of me saying “My grandma makes great apple pie,” which is completely unrelated, is low. Instead it can constrain the possible decoded phonemes down to the most likely outcomes. That makes the problem much simpler and doable with sub-scalp recordings.
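The idea described above, noisy per-phoneme probabilities narrowed down by context, can be sketched as a simple rescoring step. This is a minimal illustration under stated assumptions, not Fluent BCI's pipeline: the candidate phrases, the probability values, the context prior, and the function names are all made up for the example, and a real system would use an LLM rather than a hand-written prior.

```python
import math

# Hypothetical candidate phrases, each spelled as a phoneme sequence.
CANDIDATES = {
    "hi tim": ["HH", "AY", "T", "IH", "M"],
    "my dim": ["M", "AY", "D", "IH", "M"],
}

# Hypothetical decoder output: one probability table per time step.
# The noisy decoder slightly prefers the wrong phoneme at steps 0 and 2.
FRAMES = [
    {"HH": 0.4, "M": 0.5},
    {"AY": 0.9},
    {"T": 0.4, "D": 0.5},
    {"IH": 0.9},
    {"M": 0.9},
]

def sequence_log_likelihood(frames, phonemes, floor=1e-6):
    """Sum of log P(phoneme | brain data) over time steps."""
    return sum(math.log(frame.get(p, floor)) for frame, p in zip(frames, phonemes))

def rescore(frames, candidates, context_prior):
    """Add a log context prior to the decoder evidence and pick the best phrase.

    Candidates whose length does not match the number of frames are skipped.
    """
    scores = {}
    for phrase, phonemes in candidates.items():
        if len(phonemes) != len(frames):
            continue
        evidence = sequence_log_likelihood(frames, phonemes)
        scores[phrase] = evidence + math.log(context_prior[phrase])
    return max(scores, key=scores.get), scores
```

With a flat prior the decoder's errors win and "my dim" scores highest; once the assumed context says that greeting Tim is far more likely, the same noisy evidence resolves to "hi tim".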
And I’m confident the signal is there because I’ve finished my PhD now, and it was all about signal quality in the sub-scalp space. Between the scalp and the bone the quality is good enough. Toward the end of this year we’ll be doing our first in-human work with electrodes in the sub-scalp space. These are patients in hospital, where we’ll do speech work with them to collect data and build a model that we’ll later use with sub-scalp data on our future device.
One great thing we’ve discovered is that sub-scalp brain recordings are very similar to non-invasive recordings, the really high-density systems where you shave your head and stick electrodes directly on the scalp. We found you can collect data non-invasively, in massive amounts, to build these models, which can then be used with sub-scalp data later. So we don’t need lots and lots of patients undergoing a procedure to build the models. We’ve already collected the largest data set of its kind in the world, non-invasively, and we’ve built a model that maps brain-data snippets to speech snippets with 96% accuracy.
So it’s a combination of analysis from the brain and from the surroundings. Which parts of the pipeline are the most compute-intensive?
We want to run all of it at the edge, because inference speed is very important. It needs to be fast for someone to use this intuitively. What people have available now are these iPads where they press a button to say a word — it’s not them joining a conversation, it’s very transactional, and the conversation carries on without them. So it needs to be fast, and it needs to work when there’s no internet, because people will be in places where that doesn’t work. Being completely local is the goal.
As for what’s most computationally intensive — that’s tricky. A lot of our future work is figuring out what’s actually important. Is digital history even that important compared to a transcript of the conversation so far? How many of these things can we take away if we don’t need them? That’s a lot of the work ahead, especially working with participants over a long period, capturing all this context. So it’s an open question.
This LLM that analyzes context: will it be open source, third party, or something you want to build yourself?
So far we’ve used open source, to save costs and get a proof of concept. But I can see a future where we set a benchmark with a frontier model and then try to save costs by going back to open source to see if we can reach that benchmark. It’ll be a lot of trial and error — figuring out the most cost-effective approach, but also which works best for the user.
Context, is that something that could be outsourced? Many companies are figuring out that the new big words are “context” and “memory”.
Neuroethics is this enormous field popping up right now — lots of governments are trying to build policy to catch up with where the technology is going. Who has access to the data? Who owns it? If a user wants to retract their data, what can they retract — can the model be retracted? All super interesting questions.
So far we’ve played around with both doing it ourselves and outsourcing, but only with data we’ve collected on ourselves, not yet with other participants. There would be an ethical question about whether we’d be allowed to outsource it, but I think it would be okay; it just needs to be in the agreement, and in the consent when we collect the data. People need to be aware of what’s happening with their data, and when.
There’s another part to that question: deciding what Fluent wants to own — whether we want to own the model itself. I think that’s part of what we want to do, because it’s a significant part of the product. But it’s hard to be sophisticated at both the hardware and the software end at the same time; you don’t want to bite off more than you can chew. We’ve already partnered with another company, and that’s going well — they’re called i14, based in Melbourne, and they have a new model with a really long context window, which works well for us. I could see that relationship continuing.

Fluent BCI has collected the largest non-invasive dataset of its kind and built a model that maps brain-data snippets to speech snippets with 96% accuracy. Image courtesy of Fluent BCI.
There must be a lot of false perception about you attempting to read someone’s thoughts, or even to control the brain. What’s your response? Where does the difference lie?
We get this question pretty much every time, even from investors: “Well, you’re reading my mind.” We are definitely not reading people’s inner thoughts. That’s actually really hard to do. How would you even train a model to understand what the truth is in that case? People have tried for a long time. No: we record from the part of the brain that controls the speech muscles. You need to have an intention to speak. You don’t actually need to move the muscles (even just thinking about moving them is enough), but you do need that intention to speak.
Controlling the brain? There’s none of that. But it’s a really good thing to be afraid of, because it is totally possible. There are groups right now building BCIs that sit in the occipital lobe, trying to restore sight by stimulating the part of the brain that creates vision. You can imagine that if you can stimulate it in certain ways, you could manipulate what someone is actually seeing, which is a very scary thought. But our device does no stimulation. It only records. It only reads information.
How do you imagine your product at its peak, in five or ten years? What’s the perfect world with Fluent reaching its success?
I love being able to think about that future. We want to help people with impaired speech, but for us that’s really a stepping stone to the bigger picture: removing all friction between people and AI. Being able to insert a device that’s so low-risk — it’s like getting a tattoo. You go into a clinic, get it done, pop back out in half an hour, and you’re all set. Without a keyboard or a phone you can stream a thought to an AI to offload a task because there’s no friction.
It basically becomes a third hemisphere that can do tasks for you and give information back. It’s not stimulating, but you could have an earpiece, or glasses with a visual display, so it’s this totally personal, frictionless way to connect seamlessly with AI. That’s where I’d love to see it going. I either get people saying “please, no” or “hell yeah.”
One of your mottos is “intelligence over invasiveness.” The bet is essentially that large language models and context can be engineered well enough that even the sub-scalp, outside-the-skull data is enough. Could you get into more detail on that? There’s quite some competition — Neuralink and all the very invasive technologies.
Take Neuralink. They started their journey in 2016. Large language models weren’t even a thing. Neuralink is building for a history that didn’t exist.
We came in at this perfect time: we’re building for language, just as this other phenomenon, AI, is coming up, which is perfect at generating language and understanding context. So now is the perfect time to build this.
As for how we know it’ll work: even without context or a language model, we’ve built a model that maps brain activity to speech with 96% accuracy. And we’ve done this in a simulation, not with real data. We built a simulator that constructs an entire scenario: a dossier of people’s relationships, a transcript of an ongoing conversation. It even generates an image from a person’s point of view, as if captured from glasses.
Then we generate the next sentence the person wants to say, insert errors at the phoneme level, and we can dial the error rate up. We found that, given the context we expect to capture in future, a large language model can correct a phrase even when up to 50% of its phonemes are wrong. So even if we only get 50% accuracy with the device, the context swoops in. And that’s at this stage; in future we’ll capture even more context, have a bigger context window, and the systems will be faster and smarter. Every time LLMs advance, our product advances. To me it’s a no-brainer. We just have to get in there and build it.
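The error-injection step described above can be sketched as follows. This is an illustrative assumption about how such a simulator might corrupt a phrase, not the team's actual code: it models substitution errors only, picks the wrong phoneme uniformly at random, and uses a small made-up phoneme subset; the names `corrupt` and `phoneme_error_rate` are hypothetical.

```python
import random

# Small illustrative subset; English has roughly 40 phonemes.
PHONEMES = ["AA", "AE", "B", "D", "IY", "K", "M", "S", "T"]

def corrupt(phonemes, error_rate, rng):
    """Swap each phoneme for a different random one with probability error_rate."""
    noisy = []
    for p in phonemes:
        if rng.random() < error_rate:
            noisy.append(rng.choice([q for q in PHONEMES if q != p]))
        else:
            noisy.append(p)
    return noisy

def phoneme_error_rate(reference, decoded):
    """Fraction of positions that differ (substitutions only, equal lengths)."""
    wrong = sum(r != d for r, d in zip(reference, decoded))
    return wrong / len(reference)

# Usage: dial the error rate up and measure what a corrector would face.
rng = random.Random(0)
reference = ["K", "AE", "T"]
noisy = corrupt(reference, error_rate=0.5, rng=rng)
```

In a setup like this, the corrupted sequence and the simulated context would be handed to a language model, and its output compared against the reference to see how much of the injected error it recovers.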
Another thing that works really well for us is being sub-scalp. There’s a very recent company that just got FDA clearance for a sub-scalp device. Sub-scalp recording is a very new thing, and they did it for epilepsy monitoring. Their device is basically a cochlear implant refactored for recording instead of stimulating. They got FDA clearance and showed us the pathway through the FDA as a Class II device. Neuralink, Paradromics, Synchron all develop Class III devices, so they have to go down this really long, expensive pathway to get clearance, and they’ll be at that for probably another couple of years.
This other company, Epiminder, showed us the path for a sub-scalp device. Being able to build ours on that same pathway means we can do it much more cost-effectively, and much faster. So even though the big companies have a head start on us, it’s not really a head start, because they have to take this long clearance path.
One other advantage might be the environment, perhaps? There are multiple teams working on various BCIs in Melbourne, besides Fluent. How did Melbourne become almost the global centre for this technology?
It is the legacy of the cochlear implant. The cochlear implant was developed in Melbourne in the '80s, and from that they created Cochlear, the company. From that came the Bionic Ear, and then the Bionic Eye Institute, and a lot of bionic-eye work — which all turned into the Bionics Institute. There’s lots of general bionics work happening in Melbourne as a result. That led to lots of professors at the University of Melbourne who had the skills, which built up the skill set.
But with speech, why infiltrate the brain in the first place? Think of Stephen Hawking, who still had a tiny muscle in his cheek that worked. It is also known that, for some mysterious reason, eye-blinking doesn’t stop unless it’s a very severe case.
If you boil it down, it’s all about speed. In some cases you have people who are completely locked in: they can’t speak or move their body, and in some cases can’t even move their eyes. You’re right that Hawking could move his cheek. But in his scenario he’d use a device that scanned through individual letters, and he’d twitch his cheek when it reached the right letter, then it would start again. It’s binary — and if you can only go binary, it’s a very slow process. Whenever you saw Hawking speaking, he’d have had to spend ten minutes forming a sentence, and then he’d hit play and it would play out.
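To see why a single binary switch is slow, here is a rough sketch of the scanning process described above. It assumes the simplest possible scheme, a cursor that steps through the alphabet one letter at a time and restarts after each selection; the layout, the dwell time, and the function names are assumptions for illustration, and real systems typically add shortcuts such as word prediction that this sketch ignores.

```python
import string

# Assumed layout: space first, then a to z.
ALPHABET = " " + string.ascii_lowercase

def scan_steps(text, layout=ALPHABET):
    """Total highlight steps needed to spell text with a one-switch linear scanner.

    Reaching the letter at index i costs i + 1 steps, then the scan restarts.
    Raises ValueError for characters that are not in the layout.
    """
    return sum(layout.index(ch) + 1 for ch in text.lower())

def spelling_time_seconds(text, dwell_seconds=1.0):
    """Time to spell text if each highlighted letter is shown for dwell_seconds."""
    return scan_steps(text) * dwell_seconds
```

Under these assumptions even the two-letter word "hi" takes 19 scan steps (9 for "h", 10 for "i"), which is why a full sentence can take minutes.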
And your expected speed is …
Speed of speech.
Another thing that surprises me a lot is the so-called co-adaptation phenomenon. It’s not just the device learning your brain, it’s your brain learning how to fire the signals so that the device works best.
Let’s take a look at a well-studied example. They put electrodes on top of your brain and the system is just looking for a certain frequency-amplitude increase. Depending on that, a cursor moves on a screen. You can’t tell someone, “Increase the beta band in your brain” — people have no idea what that is. But over time, as you sit there watching the cursor, it moves whenever your brain goes into a certain state, and the feedback loop tells your brain that whatever you just did was good. Eventually you learn to move the cursor. It’s like how babies learn to move their limbs — they want to reach for something but don’t know which part of the brain to activate, and eventually they get the dexterity to figure it out. So as long as the feedback is quick enough, you could map your brain to control anything it’s directly connected to — you could turn on a TV with your brain if there were a direct link.
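The feedback loop in that example can be sketched in a few lines: measure the power in one frequency band and move the cursor when it rises above a baseline. This is a toy version under stated assumptions, using a naive DFT, the common convention of 13 to 30 Hz for the beta band, and hypothetical function names and thresholds; real systems filter, window, and calibrate far more carefully.

```python
import cmath
import math

def band_power(samples, sample_rate, low_hz, high_hz):
    """Mean spectral power of samples between low_hz and high_hz (naive DFT)."""
    n = len(samples)
    total, bins = 0.0, 0
    for k in range(1, n // 2):  # skip the DC and Nyquist bins
        freq = k * sample_rate / n
        if low_hz <= freq <= high_hz:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(samples))
            total += abs(coeff) ** 2 / n
            bins += 1
    return total / bins if bins else 0.0

def cursor_step(samples, sample_rate, baseline, gain=1.0):
    """Move right when beta power (13 to 30 Hz) exceeds baseline, left otherwise."""
    power = band_power(samples, sample_rate, 13.0, 30.0)
    return gain if power > baseline else -gain

# Usage with synthetic one-second signals sampled at 128 Hz.
RATE = 128
beta_like = [math.sin(2 * math.pi * 20 * t / RATE) for t in range(RATE)]
alpha_like = [math.sin(2 * math.pi * 10 * t / RATE) for t in range(RATE)]
```

The user never sees the band power itself, only the cursor; the quick visual feedback is what lets the brain discover which state moves it.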
Why it would be difficult with our device is that the feedback needs to be immediate. If you’re waiting for a sentence to complete, it’s hard for your brain to figure out what went wrong across that whole sentence and adapt. There could be parts where, if you keep getting a certain word wrong, then once you get it right you think, “Okay, I need to phrase that differently in my brain when I say it through the device.” Ideally with Fluent it shouldn’t work that way — the device should just learn to work with the user. But the idea is fascinating.
Does that mean there’s not as much need to calibrate each device for each person?
If you collect enough data from one individual, or multiple individuals, and build a general model, it takes less time to train on a new individual. This is something we’ve seen, and it’s been shown a lot in research — the idea of “transfer learning”. No one will want a device if it means spending hours calibrating, or calibrating every day you want to use it.
A major part of what we’re doing over the next twelve-plus months is collecting data — we want someone in a Faraday cage all the time doing this speech work, building a massive data set, which gives us that general model.
Let’s try to specify the data volumes in numbers then. Contracting a cheek muscle is binary, blinking is binary, and neither produces much data to record. What data throughput are we talking about when someone reads your brain? Is it kilobytes, megabytes, gigabytes per second?
If we look at what we’re collecting, we’re planning on 64 channels, at about 16 bits per channel and about 1,000 hertz. So it’s about a megabit per second, or roughly 128 kilobytes per second. The data that actually goes through a large language model is probably a lot less, because you’ve pulled out features and down-sampled. But it’s a tricky question to answer. On the edge it can’t be an enormous amount; the real limiting factor is that you have to be able to transmit it over Bluetooth.
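Working through the figures quoted above: 64 channels × 16 bits × 1,000 samples per second gives about one megabit per second of raw data, which is roughly 128 kilobytes per second before any protocol overhead or compression. A minimal sketch of the arithmetic, where the function name and defaults are illustrative assumptions taken from the quoted numbers:

```python
def raw_data_rate(channels=64, bits_per_sample=16, sample_rate_hz=1000):
    """Return the raw recording rate as (bits per second, bytes per second)."""
    bits_per_second = channels * bits_per_sample * sample_rate_hz
    return bits_per_second, bits_per_second / 8

bits, bytes_per_second = raw_data_rate()
# bits is 1,024,000 (about 1 Mbit/s); bytes_per_second is 128,000.0 (128 kB/s)
```

Feature extraction and down-sampling on the device would shrink this further before anything is sent over the wireless link.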
Your device also reads from two sources at the same time: one being the motor cortex, and the other being the actual micro-sensations in the speech muscles themselves. Why is that?
Yes. A great thing about where our device sits, where the motor cortex is, is that if you record inward, it’s the motor cortex; but if you record outward, it’s the temporalis muscle, which sits here and controls your jaw. Normally a muscle would be noise for brain recording — you’d filter it out because you don’t want it. For us, because it controls the jaw muscle, it’s actually a signal. So it’s really fortunate that it’s there. Even if you just think about moving that muscle, you get micro-activations in it, and that can help us figure out what someone’s trying to say.
Well, that’s one great coincidence!
There are many. For example, that’s also the thinnest part of the skull, only a couple of millimetres. So the signal isn’t attenuated the way it is at the top or back of the head. It’s really fortunate.
That just makes it destiny at this point.
It is. The body is asking for us to do it.

Tim Mahoney: “Being able to insert a device that’s so low-risk, it’s like getting a tattoo.” Image courtesy of Fluent BCI.
