Will Wade
About code Archive Photos Bookmarks
  • Been working with Sherpa-Onnx TTS a lot over the last year. It’s a nice project providing an ONNX runtime for TTS across lots of different languages and interfaces. Just whipped together a Gradio demo to show off all the voices and hear them - most notably the MMS ONNX models: Sherpa-Onnx Demo

    → 6:25 PM, Dec 16
    Also on Bluesky
  • Finally published our py3-tts-wrapper Python library this week. It should power a lot of funky things. It supports all the major online TTS engines, plus new offline ones. Use it alongside this app to see what voices are available (API here)

    → 11:26 AM, Jul 31
    Also on Bluesky
  • A lot of dictation tools take a little while to get used to, as you have to mentally prepare what to say. I’ve long thought that, for much of the population we see who need these tools, this is a pretty cognitively demanding skill. Aqua Voice is a new solution to this.

    Aqua Voice feels as natural as talking to someone at a keyboard. It writes what you say, but it also does what you say.

    We achieve these interactions using an ensemble of models fused together in a way that allows them to understand your intent, not just what you literally said.

    Examples

    Remember that the meeting is on Thursday. - Friday, Friday.

    ->

    Remember that the meeting is on Friday.

    (Implicit Correction)

    It looks neat. $10/month. See the video demo

    → 8:25 AM, Mar 27
    Also on Bluesky
  • I’ve often pondered developing a rotary encoder with a push-down action as a technique for accessing our auditory scanning app, Echo. One day we might do it, and if so I’ll look at this project.

    GitHub - carlosefr/spinner-mouse: Arduino-based USB rotary controller for arcade Arkanoid, Tempest, etc. https://github.com/carlosefr/spinner-mouse?tab=readme-ov-file#spinner-mouse-game-controller

    Also

    • Rev-O-Mate www.tindie.com/products/…
    • Smartknob github.com/scottbez1…
    → 8:11 AM, Mar 27
    Also on Bluesky
  • Iterative rapid development Part 2 (aka Judith Part 2).

    So sometimes.. just sometimes.. a couple of months’ work pays off. Or I guess we could frame this post around successful product development: keeping the end user the focus of your development, and involving them at all stages of your development cycle, is key. We are proud to shout loudly about this.

    So two months ago we saw Judith. Technology has not been successful for her before (you can read more about why in my previous posts). So we rapidly developed a SwiftUI app that was unique in its operation - you dragged from one letter to the next. Trying it with Judith, we still had a problem: positional errors and missing letters were high. So we then rapidly worked on a “correct-a-sentence” system, settling on the Azure OpenAI API (GPT-3.5), and then developed our own custom model alongside it - something that worked offline and privacy-first was key (NB: the videos show this offline model, not GPT). A few iterations of this and it corrects as well as GPT-3.5 83% of the time.
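The 83% figure is an agreement rate against GPT-3.5. As a rough sketch of how that kind of comparison can be scored (the corrector functions and sentences below are toy stubs, not our actual models):

```python
# Sketch: scoring how often an offline corrector agrees with a reference
# corrector (e.g. GPT-3.5) on noisy input sentences. Both correctors
# below are toy stubs, not the actual models.

def agreement_rate(noisy_sentences, correct_local, correct_reference):
    """Fraction of sentences where both correctors give identical output."""
    matches = sum(
        1 for s in noisy_sentences
        if correct_local(s).strip() == correct_reference(s).strip()
    )
    return matches / len(noisy_sentences)

# Toy demonstration: the stubs agree on known fixes and disagree otherwise.
fixes = {"teh cat sat": "the cat sat", "helo world": "hello world"}
local = lambda s: fixes.get(s, s)
reference = lambda s: fixes.get(s, s.capitalize())

rate = agreement_rate(["teh cat sat", "helo world", "unknown txt"], local, reference)
```

In practice you would also want a human-judged "which correction is actually right" pass, since two correctors can agree on a wrong answer.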

    So two months later, what does this all mean for the end user? They can now write on technology to communicate for the very first time. Happy? Well, so far, yes. But long-term AAC use is complex and not just down to the technology. So let’s be curious and proud at the same time. There is more to do:

    • Small tweaks in the UI
    • It’s correcting a bit too much at times - sometimes adding more grammar changes than I would like and, equally dangerous, losing words. To fix this we could further train our custom model.

    and then more, I’m sure, the next time we go round the iterative development cycle.

    What does this process look like overall?

    We follow the double-diamond development cycle. It’s a pretty common approach, yet following it is not always easy.

    So lessons so far?

    • Iterate Fast: Rapid prototyping and adjustments are key to finding viable solutions for complex, unique problems. There is a much longer tail to development for solid, reliable products, but getting somewhere fast helps you see what’s important.
    • User-Centric Design: Keeping the user at the heart of all stages of development ensures that the technology we develop truly meets their needs. It’s not always easy, but it can be done. It’s important to do this as part of a team, though: it is all too easy to lose track of whether you are getting carried away or are on track. We pride ourselves at Ace Centre on being a transdisciplinary team where no one person ever sees the whole picture alone.
    • Continuous Learning: Every challenge presents a learning opportunity, pushing us to constantly improve. And heck - it’s good fun too, seeing big gains.
    • Passion and Persistence: A keen interest in making a difference and the drive to keep pushing forward are indispensable.
    → 9:38 PM, Mar 26
    Also on Bluesky
  • Objectified. "if we understand what the extremes are, the middle will take care of itself."

    Objectified (2009, 75 minutes) is a documentary film about our complex relationship with manufactured objects and, by extension, the people who design them. What can we learn about who we are, and who we want to be, from the objects with which we surround ourselves?

    It’s a great film (and free to watch from March 14-17!) but in particular, I love this quote from Dan Formosa, Design & Research, Smart Design, New York (around 6 minutes in)

    But really our common interest is in understanding people and what their needs are. So if you start to think, well really, what these guys do as consultants is focus on people, then it’s easy to think about what’s needed design wise in the kitchen, or in the hospital, or in the car.

    We have clients coming to us and saying here’s our average customer. For instance, she’s female, she’s 34 years old, she has 2.3 kids.

    And we listen politely and say, well that’s great, but we don’t care about that person. What we really need to do to design is look at the extremes, the weakest, or the person with arthritis, or the athlete, or the strongest, or the fastest person. Because if we understand what the extremes are, the middle will take care of itself.

    View it online at www.ohyouprettythings.com/free

    → 8:35 AM, Mar 15
    Also on Bluesky
  • There is a lot of evidence, for many students (mainly those with dyslexia in the research), that autocorrection software is far more efficient than using a word predictor1. In general, for increasing writing speed, it makes heaps more sense (note: predictor software can be better if your aim is supporting literacy development) - and I would argue the same for other areas of AT too2. Any AI tool (GPT or Bard - but even much smaller models will do this) will “fix” your sentence given a prompt, e.g. “correct this typo sentence including spaces”. But doing this simply, in real time without user interaction3, is complicated. Fixkey is getting there and worth keeping an eye on (it needs a shortcut key, which isn’t perfect - it really needs to run without that) https://www.fixkey.ai
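To make the idea concrete, here is a toy sketch of silent, real-time word correction - a tiny dictionary-snapping stand-in for the LLM-based approach described above (the vocabulary and threshold are purely illustrative):

```python
# Toy sketch of "correct as you type" autocorrection: each completed
# word is snapped to its closest dictionary entry. A stand-in for the
# LLM-based correction described above; the word list is illustrative.
import difflib

VOCAB = ["the", "meeting", "is", "on", "friday", "remember", "that"]

def autocorrect_word(word, vocab=VOCAB, cutoff=0.6):
    """Return the closest vocabulary word, or the input if nothing is close."""
    matches = difflib.get_close_matches(word.lower(), vocab, n=1, cutoff=cutoff)
    return matches[0] if matches else word

def autocorrect_sentence(sentence):
    return " ".join(autocorrect_word(w) for w in sentence.split())

corrected = autocorrect_sentence("remmeber taht the meetign is on firday")
```

A real tool has the harder job of doing this across whole sentences, with spacing errors and context, without ever asking the user anything - which is exactly why the LLM route is appealing.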


    1. This paper is a bit old now - but it was from the University of Cardiff onlinelibrary.wiley.com/doi/10.10… - note it was using Global AutoCorrect, which is now part of TextHelp. Good luck finding it. Others like this are also useful ↩︎

    2. I’m convinced autocorrect is needed more in AAC ↩︎

    3. Tools like Grammarly are great too - but for sure, you need to use it like a spell checker. I hacked together an OpenAI API-backed script to do this, and it’s not as straightforward as I would have liked - see it here if you are interested ↩︎

    → 7:40 AM, Feb 2
    Also on Bluesky
  • Rapid development in swiftUI for niche problems

    Some clients we see are fantastic with paper-based solutions. But sometimes, finding powered AAC systems which give them more independence is far trickier than you may think. Consider Judith. She doesn’t lift her finger from the paper. This continuous movement is surprisingly poorly supported in AAC. Your obvious thoughts are SwiftKey and Swype, but they require a lift at the end of a word or somewhere else. Next up, you may try a keyguard or TouchGuide. But then, for some users, this is too much of a change in user interaction. Even if you succeed, you often end up asking an end user to change the orientation or layout of their paper-based system.. and all in all, it’s just too much change. Abandonment is likely. The paper-based system is just more reliable.

    So what do we do? We could look at a bespoke system. But typically, that requires much thought, effort and scoping. That’s still needed, but you can draft something up far quicker these days using ChatGPT. It wrote the whole app after a 2-hour stint of prompt writing. That’s awesome. (Thanks also to Gavin, who tidied up the loose ends.) So, this app can be operated by detecting a change in angular direction, or by detecting a small dwell on each letter. We now need to trial this with our end user, see what is and isn’t likely to work, and work it up. We may need a way of writing without going to space (something we see quite a lot), and I can see us implementing a really needed feature: autocorrect. This is all achievable. But for now, we have a working solution to trial - a 500-line app made in less than a day’s work.
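As an illustration of the first of those two operation modes, here is a minimal sketch (in Python rather than SwiftUI, and not the app’s actual code) of registering a letter when the drag path changes direction sharply:

```python
# Sketch of one selection strategy: register a letter when the drag
# direction changes by more than a threshold angle. Pure geometry;
# the threshold and (x, y) point format are illustrative only.
import math

def turning_points(path, angle_threshold_deg=60.0):
    """Indices of points where the drag direction changes beyond the threshold."""
    hits = []
    for i in range(1, len(path) - 1):
        ax, ay = path[i][0] - path[i - 1][0], path[i][1] - path[i - 1][1]
        bx, by = path[i + 1][0] - path[i][0], path[i + 1][1] - path[i][1]
        dot = ax * bx + ay * by
        norm = math.hypot(ax, ay) * math.hypot(bx, by)
        if norm == 0:
            continue  # finger paused; the dwell detector handles that case
        angle = math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
        if angle > angle_threshold_deg:
            hits.append(i)
    return hits

# A drag right, then sharply up: one direction change, at index 2.
path = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
```

The dwell mode is the complementary case: instead of looking for sharp turns, you look for runs of consecutive samples that stay within a small radius for long enough.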

    → 11:04 PM, Jan 12
    Also on Bluesky
  • Stephen Hawking’s AAC setup in close-up

    At MOSI in Manchester today, I saw Stephen Hawking’s chair and other neat things from his office in Cambridge. Note the spaghetti of cables. It’s tricky to figure out where all the leads go, but I’ll give it a wild guess: the plugs look like either mini-XLR or the old PS/2 serial leads. Some questions, though. I’m unsure what the “Filter” box fits to, and why is the Words+ box even used? I thought the connection with Intel meant he was using ACAT. Why is that Words+ Softkey box the parallel version when there is clearly a lot of USB kicking about, too? Why are we plugging into something behind the chair when surely the tablet has the speakers anyway? There are as many questions as answers.

    • Words+ Archive page (This is the USB version of the softkey box)
    • Case for Original Synthesiser made by David Mason at Cambridge Adaptive
    • Chair details
    → 9:03 PM, Nov 3
    Also on Bluesky
  • Correlating Sounds for a sound switch

    Last week, I visited a client for work to test out a sound switch device. For one reason and another, the kit didn’t pan out on the day (NB: highly possible it might have been me.. I need to try again). But with the recordings we got, we can now do some fun work and create a mel-spectrogram correlation technique. It might work.. it certainly looks pretty reliable against background noise and talking. You can see our work in progress and try it yourself at github.com/acecentre…
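The repository has the real implementation; here is a toy sketch of the underlying idea - sliding a stored template over incoming audio features and firing on a normalized-correlation peak - with plain per-frame energies standing in for mel-spectrogram columns:

```python
# Toy sketch of the core idea: slide a stored sound "template" over
# incoming feature frames and fire when the normalized correlation peaks.
# Plain per-frame energies stand in for mel-spectrogram columns here.
import math

def normalized_correlation(a, b):
    """Cosine similarity between two equal-length feature sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def best_match(template, stream):
    """Best (score, offset) of the template within the stream."""
    scores = [
        (normalized_correlation(template, stream[i:i + len(template)]), i)
        for i in range(len(stream) - len(template) + 1)
    ]
    return max(scores)

template = [0.1, 0.9, 0.4]                 # energy envelope of the target sound
stream = [0.0, 0.05, 0.1, 0.9, 0.4, 0.02]  # quiet, then the sound occurs
score, offset = best_match(template, stream)
```

Normalizing the correlation is what buys robustness to loudness changes; using a full mel spectrogram (a 2-D template) rather than raw energies is what buys robustness to other sounds and talking.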

    → 10:51 PM, Oct 26
    Also on Bluesky
  • I was lucky to see the latest iteration of the Colibri (“Hummingbird”) from Colibri interfaces a few weeks ago. It’s a wireless head mouse - and blink switch. They also have a free web-based Scanning Speller, which is accessible with Blink from your browser. Portuguese only for now.

    → 8:19 AM, Jul 7
  • Midiblocks is a super neat idea. A block editor to program your gestures. Under the hood it’s using Handsfree.js, which is in turn a wrapper around MediaPipe. Similar to EyeCommander and Project Gameface. Talking of which, I’ll just leave this here. Eek. 👀 Not a great PR start for the Google project.

    → 12:29 AM, Jun 29
  • Having a great chat with our team of OTs about how we measure outcomes and sharing our old presentation from 2011, which still stands. Whatever happened to the adapted GAS for AT? Well, GAS Light looks interesting.

    Outcomes in Occupational Therapy (& Assistive Technology) from will wade
    → 4:32 PM, Jun 19
  • Looking forward to delivering day 9 today of our Assistive Technology Unit (with the University of Dundee). The focus is on Activity & Occupational Analysis, which I feel is an essential part of our AT assessment process. (Adapted from Activity & Occupational Analysis)

    Steps to Activity Analysis

    See also:

    • Occupational and Activity Analysis by Heather Thomas | Hatchards
    • OTPF-4 Domain and Process
    • Using Task Analysis to Support Inclusion and Assessment in the Classroom - M. Addie McConomy, Jenny Root, Taryn Wade, 2022
    → 11:58 PM, Jun 14
  • I met some OTs at #bci2023! Whoop! Check out the work from Canada looking at paediatric BCI (with CP, Rett syndrome and other diagnoses) for mobility and play. (I took some pictures, but their own tweets are better! twitter.com/sneakysho… and twitter.com/sneakysho…)

    → 5:17 PM, Jun 8
  • Thank the Lord Brexit hasn’t affected long queues entering our neighbours 🤬 (sarcasm, if you didn’t realise). I am arriving in Brussels (with my bike - if it’s made it through the journey) for the BCI symposium. (Postscript: it took an hour 20 to get through.)

    → 1:22 PM, Jun 5
  • I’m reading about some of the most recent work in BCI. Much of it is academic, but this is an easy read from NeuralEchoLabs on gaming with BCI. Gaming is interesting as it’s not as critical as AAC and has plenty of scope to play with UI. (And for a more academic read, see this paper.)

    → 10:01 AM, Jun 1
  • Sebastian Pape has been doing a ton of work on the original Dasher code base for his research on Dasher in VR. It’s pretty awesome. Some of the output can be seen here (and watch the video) - you can also watch an initial 3D demo from our meeting here. dasher.acecentre.net

    → 5:09 PM, May 25
  • Last week we released TextAloud on the App Store. You can read our blog for the full details of what it’s all about and why, but in brief, it’s v1 of a more extensive app we want to create to better support people in long streams of TTS. We have several ideas for this - but most importantly, we are putting users at the heart of the design process at all stages (using the double-diamond approach). Get in touch if you want to be part of the focus group. One idea, though, is using SSML to help mark up a speech. You can see one implementation idea below.
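As a flavour of the SSML idea: standard elements from the W3C SSML spec such as `<break>`, `<prosody>` and `<emphasis>` (engine support varies) could let a user pre-mark a speech with pauses, pacing and stress. A purely illustrative snippet, checked for well-formedness with Python’s stdlib XML parser:

```python
# Sketch: pre-marking a speech with standard SSML elements (W3C spec;
# engine support varies). xml.dom.minidom is only used here to check
# that the markup is well-formed.
from xml.dom.minidom import parseString

ssml = """\
<speak>
  <p>
    Good evening everyone. <break time="800ms"/>
    <prosody rate="slow">Thank you all for coming.</prosody>
  </p>
  <p>
    Tonight is about <emphasis level="strong">communication</emphasis>.
    <break time="500ms"/> Let us begin.
  </p>
</speak>
"""

document = parseString(ssml)  # raises if the SSML is not well-formed XML
```

The appeal for AAC is that the pauses and pacing travel with the prepared text, instead of the speaker fighting the TTS engine live.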

    There’s a much longer post due from me about why SSML hasn’t been used in AAC, but in short - the time is overdue.

    → 7:00 AM, May 21
  • RSS
  • JSON Feed