Hi everyone, hope you’re doing well and staying safe. My name is Trey and I’m totally blind, from the UK. I’m an electronic musician, currently using a combination of hardware synthesisers, a hardware step sequencer and macOS to make electronic music in my studio. For a while now I’ve been really interested in getting a hardware sampler that I can use to play live, but none of the currently available options I’ve tried have suited my needs. I came across Zynthian after posting on Reddit, link here: https://www.reddit.com/r/synthdiy/comments/1btjtqb/totally_blind_enquiring_about_starting_a_project/
I really think that zynthian could be a great option for myself and other totally blind users that are looking for a good groove box, and sampler option.
I have no coding experience myself, and I am unable to do any DIY electronics due to being totally blind and having the use of one hand because of cerebral palsy. However, I am looking for developers from this community to collaborate with me on a project to make Zynthian accessible to totally blind users via a text-to-speech interface. Some of you may be wondering about haptic feedback via motors. This, I believe, would be expensive, and if the motors break this could lead to other problems. Some of you may also be wondering about the use of braille in an accessible interface. Braille would be difficult to implement, and not all blind musicians are fluent enough in braille to take advantage of such an interface. With all this being said, I’m looking for collaborators to help make Zynthian accessible to myself and other blind users.
If anybody is interested in collaborating, please drop me a message or comment on this thread. Thank you very much for taking the time to read this post, everyone, and I look forward to collaborating with you all.
Hi @soundwarrior20! A very warm welcome to our wonderful community. You raise a subject that we have been discussing for some time but have not yet implemented all the elements required to make Zynthian as accessible as we would like.
Our high-level plan is to provide a separate audio output that carries information about the user interface. This would be mixed with the main audio output for a user monitor feed, separate from the main outputs used for front of stage. It would give audible information to the user about the current state including spoken word and tones.
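To make the monitor feed idea concrete, here is a minimal sketch of how it could be mixed. It is not how Zynthian does it: the function name `mix_monitor`, the `narration_gain` parameter and the use of plain float samples are all assumptions for illustration.

```python
def mix_monitor(main, narration, narration_gain=0.8):
    """Mix main audio and narration into one monitor feed.

    Both inputs are lists of float samples in the range -1.0..1.0.
    The shorter input is treated as silence once it runs out.
    """
    length = max(len(main), len(narration))
    mixed = []
    for i in range(length):
        m = main[i] if i < len(main) else 0.0
        n = narration[i] if i < len(narration) else 0.0
        sample = m + narration_gain * n
        # Hard-clip so the sum never leaves the valid sample range.
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed


# Hypothetical sample data: three samples of music, two of narration.
main = [0.5, 0.5, -0.9]
narration = [0.25, 1.0]

monitor = mix_monitor(main, narration, narration_gain=1.0)  # what the user hears
front_of_stage = main  # the main outputs never carry the narration
```

The point of the sketch is only that narration is added to the user's monitor path and never to the front of stage outputs.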
User input can be via various methods including the rotary encoders and switches and computer keyboard. There could be a set of commands that a user could type that drive the Zynthian.
I am keen to avoid interpreting the graphical user interface, e.g. with screen readers. We have the opportunity to implement a properly focused interface for blind and partially sighted users rather than try to emulate a user interface that is built for fully sighted users.
I also want to provide as much accessibility as possible, not just for visually impaired users but also physically impaired, etc.
FYI I am a member of the MIDI Association Special Interest Group for Accessibility.
We did some proof of concept testing with speech generation and audio mixing. We paused because the ability to drive the Zynthian core directly required a more advanced API and separation of the GUI from core functionality. Much of that was done during the development of Oram, the next version of Zynthian, but much remains to be completed. The API and separation of UI from core was not the primary aim during that development, but knowing it was required we did as much as was sensible during the refactor.
I think that improved accessibility can be driven by workflows and user stories. The technology changes required to facilitate this can be done iteratively. If we wait until all the building blocks are in place it may never happen.
Even help with language can be invaluable. Some people may use inappropriate terms whilst others may be concerned about using inappropriate language which may inhibit their willingness or ability to help.
So, something that you can do to help is describe some workflows that you currently do or aspire to do with any commentary on the challenges and opportunities to implement such workflows. It can often be simple things that we overlook that cause significant issues and a developer without a disability does not necessarily comprehend the challenges nor the skills and abilities that a user with a disability may have.
There is a thread on our forum where we discussed accessibility for sight impacted users. It is quite long, with a variety of people giving various ideas with varying amounts of experience. The thread is titled: Accessibility for sight impaired users.
I wonder whether we should merge these threads. @lewisalexander2020 was very vocal in that thread. Unfortunately we didn’t have the capacity to complete the work started there but we really should.
I’m still here, I just haven’t been involved further in this because I abandoned Linux accessibility due to two major problems, one concerning a developer, the other concerning someone conning me out of money for work that was to be undertaken, with nothing in return. So I’ve been away from all this.
I’ve been working for Modartt since December 2022 to make organteq 2 accessible for blind organists and as such, I’m building a physical console from scratch which is sponsored by a number of parties, manufacturers, etc.
It would be a huge help, but also a challenge, to make Zynthian accessible with Linux’s built-in screen reader. Orca is not as stable as some think, and worse on a Raspberry Pi due to key mappings, as an example. There’s a lot wrong that needs fixing.
All the best and yes, please do merge this into a suitable thread or area of the forum and keep the work going. If I can support this somehow, OK, but as I have no test environment for this, there’s not much I can do other than advise.
It is good to hear from you again. It sounds like you have had some bad luck or experiences since we last chatted but it is great that you are working with our friends at Modartt. I love what they do and they are a lovely bunch of people.
As we discussed before, Zynthian’s core is not suitable for a screen reader interface and we plan to improve accessibility for all with embedded solutions. The module in Zynthian that does benefit from a screen reader is webconf, its web-based configuration tool. I suspect there are parts of this that don’t work well for all users, e.g. there may be parts where screen readers fail, parts that are inaccessible by keyboard, or screens that have poor contrast. I believe there may be tools that can test some of this automatically and we should run webconf through such tools. Any advice from anyone with experience in this field is appreciated.
I am interested in your physical console. I would suggest that we keep the two threads separate now that we have a link to the old thread. Maybe you could update the old thread with your progress on the console. I think that fits better there, being slightly off-topic (in that it isn’t directly Zynthian related) but super interesting to us here.
Good luck with your various projects. Keep in touch and let us know if there is anything that we can help you with too.
I have some good news on this subject. I have a proof of concept implementation of audible feedback for the Zynthian Vangelis workflow. It provides spoken feedback of the navigation and control status plus some tones or beeps to indicate long-running processes, like loading large snapshots. I am quite pleased with progress and have used it to perform some core actions, totally blind. Of course I have the benefit of a deep understanding of the workflows and underlying code, which makes me a less representative tester.
Audio output is via a separate soundcard. I have been using a cheap Chinese dongle that cost a couple of dollars. It should work with any ALSA device that is not the main audio device and not being used as a hotplug USB device, although the code does support both hotplug USB and narration (as I have called the text-to-speech). The idea is that you have a monitor feed for the narrator that is separate from your main audio output, and that it doesn’t integrate with the JACK sound system, to reduce system overhead.
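As an illustration of picking a narration device, here is a minimal sketch that parses text in the format of `/proc/asound/cards` and returns the first card that is not already claimed. The card names in `SAMPLE_CARDS` and the `in_use` set are made up for the example, and how Zynthian itself decides which cards are in use is not shown here; this is an assumption-laden sketch, not the actual implementation.

```python
import re

# Hypothetical listing in the format of /proc/asound/cards.
SAMPLE_CARDS = """\
 0 [Headphones     ]: bcm2835_headpho - bcm2835 Headphones
                      bcm2835 Headphones
 1 [sndrpihifiberry]: HifiberryDacp - snd_rpi_hifiberry_dacplus
                      snd_rpi_hifiberry_dacplus
 2 [Device         ]: USB-Audio - USB Audio Device
                      C-Media Electronics Inc. USB Audio Device
"""

# Matches the first line of each card entry: index, then the padded id in brackets.
CARD_LINE = re.compile(r"^\s*(\d+)\s+\[([^\]]*?)\s*\]:", re.MULTILINE)


def list_cards(text):
    """Return (index, id) pairs for every card in the listing."""
    return [(int(num), card_id) for num, card_id in CARD_LINE.findall(text)]


def first_unused_card(text, in_use):
    """Return the first (index, id) whose id is not in in_use, or None."""
    for num, card_id in list_cards(text):
        if card_id not in in_use:
            return num, card_id
    return None


# Assumption: the main audio device and any hotplug devices are passed in by id.
choice = first_unused_card(SAMPLE_CARDS, in_use={"Headphones", "sndrpihifiberry"})
print(choice)
```

On a real system the text would come from reading `/proc/asound/cards` rather than from a sample string.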
This is only in a development branch, not in the Vangelis testing branch. It is incomplete and may need further optimisation but it is a usable proof of concept.
I’m testing the TTS interface and it works fine! I’m using a super-cheap USB audio interface (5€) and it was auto-detected and configured, including the volume control.
I just added the needed configuration so TTS is available and accessible from all standard Zynthian kits. Once your development branch is merged into Vangelis, enabling or disabling TTS will be as easy as:
V5 => long + F4
V4 => long + S4
I hope we can engage some visually impaired users to have real feedback and keep improving on this area.
Good news! Text to speech narrator is now in Vangelis.
Long press F4/S4 to enable and disable.
Announcement is made for most activities, e.g. entering a view, list navigation, chain selection.
A configuration menu in the admin menu.
Short press F4/S4 to silence the current announcement.
Short press F4/S4 to announce context, e.g. name of view and some context detail.
Reads help pages. Left/right arrow buttons or rotate encoder 3 to move through help document. Short press encoder 3 to toggle pause.
By default, audio is sent to the first unused soundcard. I have used a cheap ($2) USB soundcard.
Pulses a short tone when busy, e.g. loading a snapshot. Sounds a higher pitched, longer tone when busy ends.
This is still in development but is mostly functional. Indeed, I have been able to navigate the UI and perform most workflows without sight.
No resource impact when disabled. Minimal impact when enabled.
Two TTS engines with robotic or more natural sounding voices. The former may sound less appealing but users may find it more intelligible or accurate.
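For anyone curious how busy tones like the ones described above can be generated, here is a minimal sketch using only the Python standard library. The frequencies, durations, amplitude and fade length are assumptions chosen for illustration, not the values used in Zynthian.

```python
import math
import struct

SAMPLE_RATE = 44100  # samples per second (assumed)


def tone(freq_hz, duration_s, amplitude=0.3, rate=SAMPLE_RATE):
    """Return a sine tone as 16-bit little-endian mono PCM bytes.

    A short linear fade in and out avoids clicks at the edges.
    """
    n = round(rate * duration_s)
    fade = max(1, min(n // 2, int(rate * 0.005)))  # up to 5 ms fade
    samples = []
    for i in range(n):
        envelope = min(1.0, i / fade, (n - 1 - i) / fade)
        value = amplitude * envelope * math.sin(2 * math.pi * freq_hz * i / rate)
        samples.append(int(value * 32767))
    return struct.pack("<%dh" % n, *samples)


# Short pulse repeated while busy; higher, longer tone when busy ends.
busy_pulse = tone(440.0, 0.05)
busy_done = tone(880.0, 0.25)
```

The resulting bytes could be written to a file with the `wave` module or sent to whichever audio device carries the narration.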
I appreciate that Vangelis is our test branch and that users may be hesitant or have challenges to use it but would appreciate any user feedback. We may reduce the quantity of voices but left a selection to allow some preview for shortlisting.
I am really pleased to have been able to deliver this long-time aspiration to zynthian and believe it could be one of the most complete and integrated assistive technologies in a musical instrument.
I have one of those cheap USB soundcards, so I just gave it a go.
With the default settings (Susan voice, I think), I noticed a very small cut at the beginning of phrases in some context/view changes, although I wasn’t able to find a rationale. Some other similar menu options sound quite alright. I played with speed and other characters, and I believe that initial cutoff happens only with some of the available voices.
The intelligibility depends a lot on the accents familiar to each user, nevertheless I think we may want to do something with the final “#option” of “#total options”. At least for me, it gets really confusing when it’s output at the same volume and immediately concatenated with the selection narration. Also, I realized we may need to have (some) alternate help pages. In some cases, what makes sense visually as abbreviations becomes a mental mess when read aloud, for example, the MIDI Input Devices help.
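One possible way to handle the position information, sketched here as a suggestion only: return the selection and its position as two separate utterances, each with its own gain, so the narrator could speak the position more quietly or after a pause. The function name, the gain value and the example label are all hypothetical.

```python
def selection_utterances(label, index, total, position_gain=0.6):
    """Split a list selection into (text, gain) utterances.

    index is 1-based. The position is a separate, quieter utterance
    instead of being concatenated onto the label.
    """
    return [
        (label, 1.0),
        (f"{index} of {total}", position_gain),
    ]


# Hypothetical menu entry: third item in a list of twelve.
for text, gain in selection_utterances("Audio Mixer", 3, 12):
    print(f"say {text!r} at gain {gain}")
```

Keeping the two parts separate would also make it easy to drop the position entirely for users who do not want it.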
Another issue, instead of announcing the standby after a timeout, the voice narrator rereads the last selected option.
Anyway, I’m really impressed. What a fantastic achievement @riban
For that cutout with that cheap USB sound card thing, I’ve noticed that with my phone and some podcasts. Is it perhaps some kind of ramp built into the sound card itself causing that? As in, does it perhaps take a little while for it to ramp the volume to actually be audible?