Episode: 2309
|
|
Title: HPR2309: Crowdsourcing Accessibility
|
|
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2309/hpr2309.mp3
|
|
Transcribed: 2025-10-19 01:07:36
|
|
|
|
---
|
|
|
|
This is HPR episode 2,309 entitled Crowdsourcing Accessibility and is part of the series Accessibility.
|
|
It is hosted by John Kulp and is about 23 minutes long and carries a clean flag.
|
|
The summary is a show about my efforts to get lots of students to help correct transcription of my lectures.
|
|
This episode of HPR is brought to you by AnHonestHost.com.
|
|
Get 15% discount on all shared hosting with the offer code HPR15, that's HPR15.
|
|
Better web hosting that's honest and fair at AnHonestHost.com.
|
|
Hey everybody, this is John Kulp in Lafayette, Louisiana.
|
|
Man, it has been a long time since I have contributed an episode and it's mainly because I've just been really, really busy with my job.
|
|
I got, I don't know if a promotion is the right word for it, but about a year and a half ago I was made director of the School of Music and Performing Arts here at UL Lafayette.
|
|
And that has changed a lot in my job description: a lot more time commitments and less time for stuff like this.
|
|
But summer is upon us and I'm in the office this morning on a Sunday taking care of a couple of things and I thought I would take advantage of the opportunity to record an episode.
|
|
This episode is going to be about something that I've just started trying to do with respect to accessibility in my online class, and before I get going I need to start something.
|
|
What I'm doing, in general terms, the project is to get transcriptions, text transcriptions of the audio part of all of my lectures.
|
|
So if in my online Music Appreciation class I've got 20 video lectures and in order to have even basic compliance with accessibility standards there needs to be a text alternative to anything that is audio or visual for that matter.
|
|
So that means alternate text for images if you've got online content like web pages and stuff like that with images in it or in the case of spoken word you need a text alternative.
|
|
This isn't as much of a concern for music class because I've been teaching for 20 years now and I've never had a deaf person in a music class.
|
|
And normally the captioning and the transcripts are for deaf people but in a music class I've never had a deaf person.
|
|
I've had blind people a number of times but never a deaf person because being able to listen to the music is kind of a fundamental part of the class.
|
|
That doesn't mean that I don't think it's a good idea to have the captions or in this case transcripts because I don't know about you guys but sometimes I like to have options in how I consume things whether it's audio or video or what.
|
|
And if I'm faced with 20 lectures that are in a video format like mine, well, back when I did them the guy who did the video for me used a pretty locked-down video format, and so it doesn't have things like high-speed viewing like I would like to have now, where I could speed myself up.
|
|
So I like to have options and if I had the option of simply reading the text of all of these lectures at my own pace I would like that and so I'm doing this in part for myself and in part to come into compliance with accessibility standards.
|
|
And it's a big job to take the audio from 20 lectures each of which is anywhere from 25 to 40 minutes about and to transcribe this.
|
|
I've got a transcription tool that I use to do a lot of the heavy lifting.
|
|
It's part of the Dragon Dictate program for Mac and what I was about to do a moment ago when I said let me do something first was to start the transcriber on an audio file.
|
|
Lesson number 19 out of 20 and the way I do that is this is on a Mac by the way.
|
|
I don't know if the Dragon version for Windows has the transcription tool or not but the one for Mac does and I've got a computer in my office here.
|
|
It's actually a triple-booting iMac; it can boot either the Mac OS or Windows 10 or Ubuntu 16.04.
|
|
I know the people over in the tech department here and they trust me enough where they will let me have a certain amount of control over my own computer.
|
|
So they allow me to triple boot my iMac.
|
|
So I'm going to go into the Dictate menu under Tools and choose Transcription.
|
|
And choose the Lesson 19 WAV file, double-click on that, and in just a moment it will start transcribing.
|
|
And it does a pretty good job.
|
|
It knows my voice. I've trained Dictate to understand my speech patterns and my voice, but in a music history lecture there are going to be a lot of words that are not in its dictionary.
|
|
So it's going to mess some stuff up and apart from that transcribing an audio file that was not originally intended to be listened to by one of these dictation tools means that it will not have capitalization and punctuation and paragraph breaks and stuff like that.
|
|
All of these things you can do. If I were just dictating into the transcription tool I could tell it when I want a period, a comma, a semicolon, a new paragraph, a new line. All of these things are very easy to tell it to do and it does a great job.
|
|
I can tell it which words to capitalize or make all caps but if I'm just talking to an audience I'm not going to be putting in audible punctuation like that.
|
|
And so it gives me this raw transcript. It's already transcribed about two minutes of the lecture. It does it at about four times the actual speed; I think it's already got probably four hundred words transcribed. It's pretty impressive. I think I did a show about this a couple of years ago where I did some audio where I was actually doing the whole thing as if I was talking to Dragon Dictate, and then in real time talked to you guys while it was doing the transcription, and then posted the transcript as the show notes. Now I'm not going to do that here; that would be extremely tedious.
|
|
But so it's off and running now, and I'm going to keep talking to you guys while I make a little trip around the music building here. We've had some problems recently with students coming in and propping open doors and leaving stuff very insecure, and security is something that we want to take seriously.
|
|
I'm going to check the rear entrance here this one looks okay nobody's coming back here right now just because we've got construction going on and the construction company has blocked it off.
|
|
But they ripped up all these landscaping bricks and pavers that were back there that had gotten completely messed up every time any heavy equipment came along.
|
|
They ripped all that up and poured nice fresh concrete that will not get damaged it doesn't look as pretty initially but long term it's going to be much better.
|
|
I'm just noticing that they've stripped the finish off the tile floor in the hallway too. It looks like, since classes are out right now, the cleaning crews are doing their biannual stripping and waxing of the floors to make everything look nice and shiny for when classes start back up.
|
|
All the desks are piled in the hallway so that they can get to the floor of the classroom.
|
|
It looks like this door down here is secured walking all the way down by the elevator now.
|
|
The door is locked good everything should be locked up tight on a Sunday morning.
|
|
Certain music students actually do have access to the building with their IDs.
|
|
If they're music majors they need to be able to get in here and practice their instruments and stuff so music majors have card access even on weekends and deep into the night.
|
|
It shouldn't be very surprising that music students sometimes find the best time to practice is 11 o'clock at night because there's no one around no distractions and they can just work on their music.
|
|
And so they have ID access for that the problem is when students who are not supposed to be in here somehow have gotten in and then they want their friends to come in and they'll prop open the door for them.
|
|
Okay these doors are also secure good good.
|
|
It's incredible: there's this one door where I put a sign on it that says keep this door closed and locked, because I kept finding it unlocked and it should never be unlocked.
|
|
And not surprisingly people just ignore the sign and just keep leaving it unlocked and sometimes propped open.
|
|
This morning looks like everything is all locked up I'm very pleased about that.
|
|
Back before I was the director of the school of music this wouldn't have necessarily been my problem.
|
|
I mean it's everyone's responsibility to make sure the building is secure.
|
|
But now whenever there's a problem like this I'm the one they call.
|
|
And so I've got to kind of deal with it.
|
|
Anyway back to the topic of the podcast.
|
|
So what I'm doing right now while I'm walking around is letting the machine run its raw transcript and I mentioned that it does not have any punctuation or paragraph breaks or capitalization.
|
|
And so that's what needs to be done. That's where the many hours of labor are required to get it into a decent shape where it would be readable.
|
|
And I had an idea not too long ago. I mean this is too much work for me to do.
|
|
I've tried a couple of ways to get this work done in the past by having a graduate student assistant assigned to me for the semester where they have to spend a couple of hours a week working on it.
|
|
Or sometimes having student workers from the office who are technically reporting to my secretary.
|
|
If they're out of stuff to do and sitting there just fiddling around on their phones I'll have them work on these things a little bit.
|
|
But I can't seem to make a whole lot of headway and I don't have time to do a lot of it myself.
|
|
And so I had an idea that was inspired partly by the amazing effort that is used to correct the text for Project Gutenberg books.
|
|
You guys may have never heard of this but there's an effort called Distributed Proofreaders.
|
|
Distributed Proofreaders is a website where users around the world can go and help to do proofreading and correction for texts that have been scanned at high speed in libraries and then dumped somewhere.
|
|
Waiting for corrections so that the corrected e-book can be uploaded to Project Gutenberg for the world to enjoy.
|
|
I mean all those books that you get on Gutenberg, I get on Gutenberg, I don't know how many of you guys go make use of Project Gutenberg but I use it all the time because I have a fancy for 19th century fiction and stuff that's in the public domain.
|
|
So I've read tons and tons of stuff off there but that stuff doesn't get there just by magic. It takes a lot of work of people.
|
|
And what they do is they crowdsource it. So volunteers will proofread anywhere from one page to entire books.
|
|
You can volunteer to do as much or as little as you have time or inclination to do.
|
|
And in the end stuff gets proofread and then uploaded to Project Gutenberg, and they don't catch everything. I mean, I actually will routinely find errors in the books that I'm reading and I will submit errata reports, and then they go and fix those.
|
|
But I had the idea based on that to try to distribute the work for this correction of these raw transcripts. I've got about seven or eight of them left to do.
|
|
I've finished 11, where I use Markdown format and I put in paragraph breaks, capitalization, sometimes italics or boldface if there are special terms that I want to call attention to.
|
|
I will also insert section headings to help keep readers on track as to the big topics that they should be thinking about as they read.
|
|
I find that helpful for myself and definitely for the students. And also you have to just fix the things that the transcription tool got wrong.
|
|
It takes a lot of work, and I've finished 11 lessons and am about halfway through the 12th. But I had the idea to try to crowdsource this effort just the same way that Distributed Proofreaders does, and have my students do it in my classes and get extra credit for it.
|
|
The students are always after extra credit. If they get to the end of the semester and realize that their grade is just a little bit below what they want, they'll come and ask me, is there anything I can do for extra credit?
|
|
Well, now I have something. They can help me proofread these lectures, and for each one-minute segment of audio that they proofread they can get one extra credit point.
|
|
Now to manage this I needed a way to make available distinct one-minute sections of audio that they could easily access and listen to, and then a way also for them to get the corrected text to me.
|
|
So I wrote a script, not surprisingly. First of all I found a cool command called mp3splt. It's actually spelled M-P-3-S-P-L-T, pronounced MP3 split, and what that will do is take an input audio file and split it up into segments of whatever length you tell it to.
|
|
And I'm going to put a couple of commands in the show notes here. The one that I use is to split it into one-minute-length segments, so the command is mp3splt space dash t.
|
|
The t flag, I guess, is to tell it what time you want to split at, then space 1.0.0, or one period zero period zero, so that tells it one minute, zero seconds, zero milliseconds. I think that's what that means.
|
|
So after you tell it the time unit to use for splitting you put another space and then you just put the name of the input file.
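The command just described can be sketched as a small shell helper. This is only an illustration: the function names and the file name lesson19.mp3 are hypothetical and do not come from the show. As far as the mp3splt manual describes it, the time format is minutes.seconds.hundredths, so the last field is hundredths of a second rather than milliseconds, which makes no difference when it is zero.

```shell
#!/bin/sh
# Print the mp3splt invocation described in the episode for one input file.
# The name split_command is hypothetical.
split_command() {
    printf 'mp3splt -t 1.0.0 %s\n' "$1"
}

# Actually run the split, but only when mp3splt is installed and the
# input file exists. The name split_lecture is hypothetical.
split_lecture() {
    if command -v mp3splt >/dev/null 2>&1 && [ -f "$1" ]; then
        # -t 1.0.0 = time mode, one piece per 1 minute 0 seconds
        mp3splt -t 1.0.0 "$1"
    else
        echo "skipping: need mp3splt and an existing file ($1)" >&2
        return 1
    fi
}
```

Calling `split_command lesson19.mp3` only prints the command line, while `split_lecture lesson19.mp3` would run it and leave the one-minute pieces next to the original file.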
|
|
And in like about a tenth of a second it will have split your audio file into one minute segments it really takes hardly any time at all it's not even noticeable.
|
|
And so then I've got all these audio files for a 30 minute lecture there will be 30 files and then I need to post all those 30 files on my web server in such a way where the students can access them.
|
|
And so I wrote another script that will create an HTML page with 30 little audio players and the file name displayed for each one of those.
|
|
And so what it does, it uses a for loop: for i in asterisk dot mp3, do the following things. It'll get a file name from the input file and create a little HTML audio player for each one of those.
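A minimal sketch of that kind of loop is below. It is not the script from the show: the function name make_player_page, the page layout, and the title argument are all assumptions made for illustration, and file names are written into the page without HTML escaping.

```shell
#!/bin/sh
# Write an HTML page to stdout with one audio player per MP3 segment.
# $1 = directory holding the segments, $2 = page title (both hypothetical).
make_player_page() {
    dir=$1
    title=$2
    printf '<!DOCTYPE html>\n<html>\n<head><meta charset="utf-8"><title>%s</title></head>\n<body>\n<h1>%s</h1>\n' "$title" "$title"
    for i in "$dir"/*.mp3; do
        # Skip the unexpanded pattern when the directory has no MP3 files.
        [ -e "$i" ] || continue
        name=$(basename "$i")
        # Show the file name, then a player that points at the same file.
        printf '<p>%s<br><audio controls src="%s"></audio></p>\n' "$name" "$name"
    done
    printf '</body>\n</html>\n'
}
```

Running `make_player_page . "Lesson 19" > index.html` in the directory of segments would produce a page that can sit next to the audio files on the server.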
|
|
And then when it's done I also have the script push all of the audio files over to my server to put them in place where you know as soon as it's done pushing them over there you can access them from a web page.
|
|
It also creates an Ogg version of all those MP3 files so that whatever browser you're using will be able to play the audio.
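The conversion and upload steps might look roughly like the sketch below. The show does not name the tools involved, so ffmpeg, rsync, the assumption that the second format is Ogg Vorbis, and every function name here are guesses for illustration only; the destination in the usage note is a placeholder.

```shell
#!/bin/sh
# Derive the .ogg file name for an MP3 segment (pure string handling).
ogg_name() {
    printf '%s\n' "${1%.mp3}.ogg"
}

# Convert every MP3 in directory $1 to Ogg Vorbis.
# ASSUMPTION: ffmpeg built with libvorbis; the show does not say which
# converter the real script uses.
convert_segments() {
    for i in "$1"/*.mp3; do
        [ -e "$i" ] || continue
        ffmpeg -loglevel error -y -i "$i" -c:a libvorbis "$(ogg_name "$i")"
    done
}

# Copy the contents of directory $1 to the remote destination $2.
# ASSUMPTION: rsync over SSH; the real transfer method is not stated.
push_segments() {
    rsync -av "$1"/ "$2"
}
```

A run might then be `convert_segments . && push_segments . user@example.org:/var/www/lectures/lesson19/`, with the destination replaced by a real server path.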
|
|
So that's how I put all of the files over on the server and give a web page that lists all the files and I also put on that same page a link to the raw text that they are supposed to use as their starting point.
|
|
Now to manage the getting of extra credit for it, I set up a discussion forum on our course management system, or learning management system, called Moodle.
|
|
Moodle is an open source learning management system that we've used here for about 10 years I guess.
|
|
And so I set up a discussion forum and start a new thread for each lesson that they're working on, and ask students to post a response in the discussion forum with the corrected text.
|
|
And what that does is give me a central place to keep all those, and it gives me a little place to give them one point of credit that shows up right there in the grade book.
|
|
And I had, I think, three students take me up on this. I started it late in the semester, so I didn't have a real good chance to see how it was going to work.
|
|
But at least three students took me up on it and got some extra credit.
|
|
And then I also had some students we've got this class for music majors where every week they have to go to what's called recital seminar.
|
|
So every Wednesday at 10 o'clock they've got to go and just sit there and listen to their colleagues perform or listen to a lecture or something like that and they get zero credit hours for it.
|
|
But they have to go and they have to get a grade of satisfactory by attending at least ten of these Wednesday recital hours and then also five concerts in the evening or on the weekends.
|
|
And inevitably five or six students at the end of the semester failed to show up. I mean I always tell them right at the beginning of this class.
|
|
Seriously guys all you have to do is show up for this and you get your credit and you need eight of these to graduate.
|
|
Well, if you get to the end of the semester and you've missed a couple of the attendances, you're going to get an unsatisfactory, and that could keep you from graduating on time.
|
|
And so this semester I think I had three students who came to me wondering how they could make up the credit and so I had them help me with these lecture transcriptions.
|
|
So depending on how many recital hours they missed I would assign them to do more or fewer minute long segments.
|
|
I had one person I think I had do ten segments of a lecture, and another one about five or six.
|
|
But so I got some more help that way and this summer I'm going to be offering extra credit to students in my music appreciation class for helping with this as well.
|
|
And they do a pretty good job overall I have to go through and check it but most of the stuff they do just fine and so I'm pretty happy with it.
|
|
I'm hoping to finish this job pretty quickly this summer or if not this summer then in the fall when I've got another music history class where students might need some extra credit.
|
|
And in this way we'll get there in the end and I won't have had to do all of the work myself.
|
|
They will have helped me and they'll get extra credit in the process and I'm also hoping to give a conference presentation at one of the distance learning conferences.
|
|
They always want to hear papers about accessibility issues and how you deal with that.
|
|
And so I've got a proposal in to talk about this project and so hopefully that'll get accepted and I'll get to go and give a paper about it.
|
|
So that'll add a line to my resume as well.
|
|
Anyway it's done with the next file now and I think I'm pretty much done telling you what I wanted to about it.
|
|
And so I'm going to sign off and I really hope that I didn't set the recording level too high.
|
|
A couple of times I looked and it looked like it might have been peaking but let's hope not.
|
|
It's about 20 minutes. That's bloviating long enough.
|
|
I learned that word recently. Look up bloviate. It's a great word.
|
|
It means to talk excessively and boringly about something.
|
|
I've done a lot of bloviating on HPR, so sorry about that, but hopefully some of you guys enjoyed it anyway.
|
|
And I will talk to you next time. Bye.
|
|
You've been listening to Hacker Public Radio at HackerPublicRadio.org.
|
|
We are a community podcast network that releases shows every weekday, Monday through Friday.
|
|
Today's show, like all our shows, was contributed by an HPR listener like yourself.
|
|
If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is.
|
|
Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club and is part of the Binary Revolution at binrev.com.
|
|
If you have comments on today's show, please email the host directly, leave a comment on the website or record a follow-up episode yourself.
|
|
Unless otherwise stated, today's show is released under the Creative Commons Attribution-ShareAlike 3.0 license.
|