Episode: 1782
Title: HPR1782: ChorusText - a Non-visual Text Editor Open Assistive Device Project
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr1782/hpr1782.mp3
Transcribed: 2025-10-18 09:12:35
---
This is HPR episode 1782 entitled, ChorusText: A Non-visual Text Editor Open Assistive
Device Project. It is part of the series Accessibility, it is hosted by first-time host
kurakuradave, and it is about 18 minutes long.
The summary is: introducing ChorusText, a non-visual text editor open assistive device
project.
This episode of HPR is brought to you by AnHonestHost.com.
Get 15% discount on all shared hosting with the offer code HPR15, that's HPR15.
Better web hosting that's honest and fair, at AnHonestHost.com.
Hi, my name is David, and this is my first podcast for Hacker Public Radio.
I've been a listener for a good few years now.
I just thought that it's time for me to record a podcast and make a contribution.
And at the same time, I'd like to share a little project that I have been working on called
ChorusText, which is an open assistive device made from an Arduino, a Linux single-board
computer, eSpeak, and Node.js.
So it's basically a non-visual text editor that lets a person with visual impairments do
text editing by means of touch and hearing alone, without using eyesight.
Text editing is a very sight-led activity. In order to do text editing effectively,
one must be able to locate the position of the cursor and gain an understanding of where
one is in the text by reading the text around the cursor, and all of this is done using
eyesight.
That's the primary means of doing text editing.
And yet in our modern life, we have to do so much electronic text editing and reading.
So on the surface of ChorusText, there are three physical sliders that the user can reach
out to at any time.
These sliders are similar to those found on audio mixer boards, and changing the position
of a slider causes the system to pull out the corresponding part of the text and
pass it to the text-to-speech engine, which speaks that portion of the text out loud.
The first slider is the line slider.
So changing the position of this slider from top to bottom will cause the system to read
the text progressively line by line.
The second slider is the word slider and changing the position of the word slider from left
to right will cause the system to read out the words in the current line progressively
word by word.
And the third slider is the character slider, and changing the position of the character
slider from left to right causes the system to spell out the characters that make
up the current word, letter by letter.
This way the user can read the text he's working on with ease and can drill down to the
level of character-by-character spelling if he needs to.
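To make the idea concrete, here is a minimal Node.js sketch of how a slider reading could
be mapped to a line of text and spoken with the espeak command-line tool. It is not the
actual ChorusText code; the sample text, the 0-1023 reading and the speech rate are purely
illustrative.

    // Minimal sketch (not the actual ChorusText code): map a slider reading
    // to a line of text and speak it with the eSpeak command-line tool.
    var execFile = require('child_process').execFile;

    var textLines = [
      'Hello Hacker Public Radio.',
      'This is the second line.',
      'And this is the third line.'
    ];

    // sliderValue is assumed to be a 0-1023 analog reading forwarded by the Arduino.
    function onLineSlider(sliderValue) {
      var index = Math.min(
        textLines.length - 1,
        Math.floor(sliderValue / 1024 * textLines.length)
      );
      // -s sets the speaking rate in words per minute.
      execFile('espeak', ['-s', '160', textLines[index]]);
    }

    onLineSlider(512); // would speak the middle line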
And what's more, these sliders are motorized sliders, so I can programmatically move them
to any spot along the track and control them with the Arduino.
So when doing text editing, as the user types, the sliders continuously reposition
themselves to reflect, or manifest, the contents of the text in real time.
So for example, if the user types three more letters, three more characters, the character
slider would advance three positions, three steps to the right.
And if he deletes one character, the slider would move one step backwards.
And likewise, if he added two words to the current line, the word slider would move
two steps to the right.
And if he deletes one line, the line slider would step back one step.
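As a rough illustration of the other direction, here is a sketch of how the application
could tell the Arduino to move a motorized slider as the user types. It assumes the
node-serialport package (whose constructor details differ between versions) and an invented
one-line command protocol, so treat it as a sketch rather than the actual ChorusText
firmware interface.

    // Minimal sketch, assuming node-serialport and an invented protocol
    // "C<position>\n" that the Arduino firmware would read and act on by
    // driving the character slider's motor to that position.
    var SerialPort = require('serialport');
    var port = new SerialPort('/dev/ttyACM0', { baudRate: 9600 });

    var charPosition = 0;

    // Called after the user types or deletes characters; delta is the number of
    // characters added (positive) or removed (negative).
    function updateCharSlider(delta) {
      charPosition = Math.max(0, charPosition + delta);
      port.write('C' + charPosition + '\n');
    }

    updateCharSlider(3);  // user typed three characters: slider steps three to the right
    updateCharSlider(-1); // user deleted one character: slider steps one back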
So gone is the concept that the cursor is this abstract blinking thing that only lives
inside the monitor and for which the only means of finding where it is on the screen is
by eyesight.
It is now physically manifested by the three sliders.
Simply reach out to these three sliders with your hand at any time and the text is immediately
accessible and navigable.
No eyesight required.
If one needs to verify the text that one has just typed, simply move the slider around,
right, move the line slider up to read the previous line, move the slider down again to read
the current line and verify words and spelling in the same manner.
And there are no updates drawn on the monitor.
There is no monitor.
What we have are these three sliders that are constantly moving, repositioning as the
user types.
Other things on the surface of this device are two knobs, one for adjusting the speech
rate and one for adjusting the spelling rate when the user navigates character by character.
There are also buttons that are essentially equivalent to Home, Page Up and Page Down,
four buttons for each slider, buttons for speaking the contents of the current line,
the current word and the current character, one button for speaking the content of
the whole text, and the mode switcher dial.
So the mode switcher dial is for switching focus.
Like in a browser, we can switch focus to the address bar by clicking on the address
bar, and switch focus to the text box by pointing and clicking on the text box.
Likewise with the mode switcher dial, the user needs to turn the dial to a certain position
to switch the focus.
So right now there are only a few modes available, main text and settings and a few placeholder
modes like chat and search.
So if the user turns the knob to settings, he will enter the settings area and there are
two buttons next to the dial and the user can switch languages by pressing these two buttons.
So right now the languages supported are English, Chinese and Bahasa Indonesia.
Those are the three languages that I speak.
But I am using eSpeak, and eSpeak supports many more languages; it's just that I don't speak
any other languages, so I can't verify whether it is truly speaking French
or how well it works. But if there is anyone interested in adding support for other
languages, please feel free.
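Switching languages really just comes down to picking a different eSpeak voice. Here is a
minimal sketch using the standard eSpeak voice codes for the three languages mentioned; the
function names around it are illustrative, not taken from the ChorusText code.

    // Minimal sketch: switching eSpeak voices for English, Mandarin Chinese
    // and Bahasa Indonesia. 'en', 'zh' and 'id' are standard eSpeak voices.
    var execFile = require('child_process').execFile;

    var voices = { english: 'en', chinese: 'zh', indonesian: 'id' };
    var currentVoice = voices.english;

    function setLanguage(name) {
      currentVoice = voices[name] || voices.english;
    }

    function speak(text) {
      // -v selects the voice/language, -s the speaking rate.
      execFile('espeak', ['-v', currentVoice, '-s', '160', text]);
    }

    setLanguage('indonesian');
    speak('Selamat pagi'); // "Good morning" in Bahasa Indonesia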
Okay, so in future there will be a chat mode and a Wikipedia search mode. Going to the chat
mode will automatically bring the user to a chat room where he can read the chat messages,
type chat messages and send them to other people in the room, and the Wikipedia search mode
would let the user type in keywords, search queries, that will be sent to Wikipedia. Then, when
the results come back, they will be loaded into ChorusText and made available to the
user via the sliders, so all he needs to do is type in the query and reach out to the sliders
to read the text and access the content of Wikipedia that way.
So component-wise, it's using an Arduino Uno right now and an Adafruit Motor Shield
version 2, which is excellent, and three motorized linear potentiometers, the three sliders, and
one motorized rotary potentiometer, the mode switcher dial, about 20 buttons, just simple
push buttons, and two rotary potentiometers, normal non-motorized potentiometers, for adjusting
the speech rate and spelling rate, and also a pcDuino, where the ChorusText application
is running; it handles communication with the Arduino, executes eSpeak for text-to-speech, and runs
a web interface so users with low vision can go to the address and read the text in an
enlarged font, or the user's significant others can also access the device via this web
interface and import text or follow the reading progress of the user.
So all of that is done through a web browser; they can come in from another computer,
from a tablet or a smartphone, and communication is done via Socket.IO in real
time, and there's also AngularJS that handles the DOM updates in real time.
It sounds very complicated but trust me it's not as complicated as it sounds.
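To give a flavour of how small the web-interface side can be, here is a minimal Express plus
Socket.IO sketch that pushes the current line to any connected browser. The event names,
port and handlers are made up for illustration; the real ChorusText application is in the
repository mentioned below.

    // Minimal sketch of the web-interface idea: an Express + Socket.IO server
    // that pushes the current line to any connected browser in real time.
    var app = require('express')();
    var http = require('http').Server(app);
    var io = require('socket.io')(http);

    app.get('/', function (req, res) {
      res.sendFile(__dirname + '/index.html'); // page with enlarged font, AngularJS, etc.
    });

    // Call this whenever the slider position (and therefore the current line) changes.
    function broadcastCurrentLine(lineNumber, lineText) {
      io.emit('currentLine', { line: lineNumber, text: lineText });
    }

    io.on('connection', function (socket) {
      // A significant other could push imported text from the browser.
      socket.on('importText', function (text) {
        console.log('received ' + text.length + ' characters from the web interface');
      });
    });

    http.listen(3000, function () {
      console.log('web interface on http://localhost:3000');
    });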
I have the code and the design files up on my GitHub. Simply go to
github.com/kurakuradave/ChorusText, that is K-U-R-A-K-U-R-A-D-A-V-E slash ChorusText.
Everything is inside it: the Arduino code is inside it, the Node.js ChorusText application is
inside it, and the hardware design files are also inside it. Oh, I forgot to mention there's
one button on the surface of the device that is intended for one-click OCR text import. So
connect the pcDuino to a scanner via USB, then put the text to be scanned on the scanner
and press this one button on the ChorusText device, and it will initiate a scan, pass it to
Tesseract, and load the converted OCR text into the device, where it is made
available to the user via the three sliders.
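A rough sketch of what that one-button OCR import could look like, driving SANE's scanimage
and Tesseract from Node.js; the file paths and handler name are illustrative, not taken from
the ChorusText code.

    // Minimal sketch of the one-button OCR import: scan a page with scanimage,
    // OCR it with Tesseract, then read back the recognised text.
    var exec = require('child_process').exec;
    var fs = require('fs');

    function onOcrButtonPressed(callback) {
      // tesseract writes its output to /tmp/scan-out.txt ("scan-out" plus ".txt").
      exec('scanimage --format=tiff > /tmp/scan.tiff && tesseract /tmp/scan.tiff /tmp/scan-out',
        function (err) {
          if (err) { return callback(err); }
          fs.readFile('/tmp/scan-out.txt', 'utf8', callback);
        });
    }

    onOcrButtonPressed(function (err, text) {
      if (err) { return console.error(err); }
      console.log('imported ' + text.split('\n').length + ' lines, ready for the sliders');
    });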
Also, another thing that I'm contemplating right now is to make use of MaryTTS as the speech
engine for ChorusText, but to do that I would probably use another machine to host the MaryTTS
application server and let the pcDuino send queries to this more powerful machine and play back
the WAV file received from the server machine. The reason being: eSpeak is great, and it's
lightweight and it's fast and it's robust and it has a lot of man-hours behind it, but it sounds
pretty robotic. I think it's great, I have no issues with it whatsoever, but
people new to text-to-speech technologies, basically text-to-speech technologies on Linux,
might be repulsed by the robotic voice. Again, this is a completely personal thing; it differs
from person to person, but I think it would be interesting to add support for MaryTTS. Unfortunately,
MaryTTS's requirements can barely be met by single-board computers like the pcDuino3;
it needs more than 512 MB of RAM and a decent processor. The pcDuino3 can run the MaryTTS server, but
it's definitely not as responsive as if it were running on an x86 i7 or i5 or i3 machine.
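For the curious, querying a MaryTTS server over the network is a single HTTP request to its
/process endpoint (59125 is MaryTTS's default HTTP port). A minimal sketch follows; the
server address is an assumption, and playback is handed to aplay.

    // Minimal sketch: ask a MaryTTS server on a more powerful machine for a WAV
    // file and pipe it straight to aplay on the pcDuino.
    var http = require('http');
    var spawn = require('child_process').spawn;
    var querystring = require('querystring');

    function speakWithMary(text) {
      var params = querystring.stringify({
        INPUT_TEXT: text,
        INPUT_TYPE: 'TEXT',
        OUTPUT_TYPE: 'AUDIO',
        AUDIO: 'WAVE_FILE',
        LOCALE: 'en_US'
      });
      // 192.168.1.10 is an assumed address for the machine hosting MaryTTS.
      http.get('http://192.168.1.10:59125/process?' + params, function (res) {
        var player = spawn('aplay', ['-']); // aplay reads the WAV from stdin
        res.pipe(player.stdin);
      });
    }

    speakWithMary('Hello from ChorusText');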
So describing how the device works by using a podcast is a very poor way of doing that,
and I humbly invite you to visit the webpage at www.chorustext.org, where I have demo videos and
write-ups on the website. So if this is something that tickles your fancy, or if you are looking for
your next Arduino project, I would be thrilled to work alongside you and to provide more details;
just shoot me an email at kurakuradave@gmail.com, that's K-U-R-A-K-U-R-A-D-A-V-E at gmail.com, and we'll be talking soon.
So I don't want to be a lone developer working in a cave, and now that I have a physical
prototype ready that I can bring to a table and put down for people to try, I'm
sharing this with as many people as I can. I'm going to GNOME.Asia Summit this week in my home
country, Indonesia, and after that I am participating in Maker Faire Singapore 2015, which I believe
is on 11 to 12 July 2015, and it would be awesome to get as much feedback as I can. If this
device is to be something that is truly useful, I think it should go through the brainstorms of many
minds. So comments, feedback, anything at all, just send me an email. Alright, thank you
so much for listening.
You've been listening to Hacker Public Radio at HackerPublicRadio.org. We are a
community podcast network that releases shows every weekday, Monday through
Friday. Today's show, like all our shows, was contributed by an HPR listener like yourself. If
you ever thought of recording a podcast, then click on our Contributing link to find out how easy it
really is. Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club,
and it's part of the binary revolution at binrev.com. If you have comments on today's show, please
email the host directly, leave a comment on the website or record a follow-up episode yourself.
Unless otherwise stated, today's show is released under a Creative Commons Attribution-ShareAlike
3.0 license.