Episode: 2430
Title: HPR2430: Scanning books
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2430/hpr2430.mp3
Transcribed: 2025-10-19 02:51:53

---

This is HPR episode 2430 entitled Scanning Books. It is hosted by Ken Fallon, is about
12 minutes long, and carries an explicit flag.
The summary is: Ken explains how and why he is scanning books.
This episode of HPR is brought to you by AnHonestHost.com. Get 15% discount on all shared
hosting with the offer code HPR15, that's HPR15. Better web hosting that's honest
and fair at AnHonestHost.com.
Hi everybody, my name is Ken Fallon, and you're listening to another episode of Hacker
Public Radio.
Today I want to talk to you about scanning schoolbooks and how I do it.
Some of you, if you have children, they probably have schoolbooks.
Schoolbooks are much preferred for the education system over PDFs or e-books, in my opinion.
The simple reason for this is that if you're doing something like geography and want to refer
to a map or something, you have the map on one page, turn the book over and refer to
the questions, and go back and forward continually between the word lists at the back, translations
at the front or whatever.
So physical schoolbooks are a thing; there's a reason they've been so successful and they
continue to be.
That said, you don't particularly want your child to be lugging home all the books every
day, and photocopying certain pages is a bit of a bore.
My daughter's school, they have rented the books. By the way, in the Netherlands the children
get the books for free as well.
So the schoolbooks are provided by the schools themselves, but in my daughter's
school they rent the books from a book company, and then you have to pay a 50 euro deposit,
and if the books are in good condition, you get the deposit back.
They also have an option where, if you're a customer with them, for another
50 euros you can rent a complete set of the books for home use, so that you don't need
to be lugging your books to school and back the whole time.
And that's also very useful, but that's not the case for my son.
They have bought the books outright: a commune of schools or a co-op of schools has bought
all the books outright, so we can't avail of the second rental system.
So what I've decided to do is scan all the books.
The books themselves, if you were to buy them, are quite expensive: they're between
70 and 150 euros per book, so it is fairly hefty.
So what to do, what to do? The answer, of course, is to scan the books. As you will
know, I have a printer scanner, I've already done an episode on this and on a continuous
ink supply system. It is a Brother MFC-J5910DW, and that has a scanning option.
As part of that, it comes with some tools that allow you to scan over the network using
scanimage, and you can give it a device name, a brother4 network device in my case.
However, those tools, as I found out, are proprietary, I think,
and I think they're only available on Intel platforms. I actually wanted to scan
to a Raspberry Pi, but that was not an option that I had.
Now, one thing I could do is connect a Raspberry Pi directly and then scan over USB
straight into the Raspberry Pi.
So if you're going to do that, that's a definite option, and then scanimage will work.
This isn't really a technical show as such, because it's more "this is what I've done".
There are a lot of manual steps involved; it's a process, a lot more than I'd like, mostly
due to the book publishers insisting on using non-standard formats for the shape of their
books.
Why all the books can't be A4 or similar, I don't know; actually, everything should be.
Once you've done transferring from Fahrenheit to Celsius, you might want to transfer to
the standardized ISO A4, A3, A8, A0 paper sizing system.
But anyway, let the flame wars begin.
So what I'm doing here is setting up a variable. This is a bash script that basically
runs an infinite loop. I set the image path, and then I build the file name under the image
path from the date command, where I specify the date format as plus percent, well,
it's the ISO 8601 date, which is year, month, day, then an underscore is nice, then hours,
minutes, seconds.
And that gives me a file name. Then I run the scanimage program, which comes by default with
the SANE package, and I select the device and set the resolution to 300 DPI.
For other things I use 600 DPI, but that's interpolated, so it's just higher at a
little cost, I guess.
And then dash dash format equals jpeg, output to the file name.
And then I open Gwenview with the file name so I can preview the file. If it's not a good
take, I delete it from within Gwenview; otherwise I loop back and continue the process, scanning
the next page.
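
The scan loop described above might look roughly like the following bash sketch. The scan
directory, the SCAN_DEVICE variable, the exact timestamp layout and the "run" argument are
assumptions made for illustration; the script in the show notes may well differ. Running
`scanimage -L` lists the device names available on your own system.

```shell
#!/usr/bin/env bash
# Sketch of the scan loop: scan a page, preview it, repeat.
# image_path and SCAN_DEVICE are hypothetical names, not taken from the show.
image_path="${IMAGE_PATH:-./scans}"

# Build a sortable file name: year-month-day_hour-minute-second.jpg
next_filename() {
    printf '%s/%s.jpg\n' "${image_path}" "$(date +%Y-%m-%d_%H-%M-%S)"
}

# Loop forever: wait for Enter, scan one page at 300 DPI, open it in Gwenview.
scan_loop() {
    device="${SCAN_DEVICE:?set SCAN_DEVICE to a name reported by scanimage -L}"
    mkdir -p "${image_path}"
    while true; do
        read -r -p "Place the next page and press Enter (Ctrl-C to stop) " _
        filename="$(next_filename)"
        scanimage --device-name "${device}" --resolution 300 --format=jpeg > "${filename}"
        # Preview the page; delete it from within the viewer if it is a bad take.
        gwenview "${filename}"
    done
}

# Only start scanning when asked to, so the functions can be sourced safely.
if [ "${1:-}" = "run" ]; then
    scan_loop
fi
```

Because the names are timestamps, a plain alphabetical listing of the directory keeps the
pages in the order they were scanned, which is what the later steps rely on.
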
Now, if all the books were A4 format, I could just simply sit here next to
the scanner with a Bluetooth keyboard and press enter, turn the page, press enter, turn
the page, press enter, and watch the screen.
But in this case, because the book doesn't actually totally fit onto the scanner, and
sometimes I need to bring it in a little bit on the left or a little bit on the right,
I've had to add the Gwenview option, and that is the standard KDE image viewer, to see
whether I need to move my image left or right.
So once I have all that done, what I do is go in and find a representative page that
I want to crop down.
All my images will be saved with a date and time stamp, so they will all be in order,
starting when I began scanning the book, from the first page to the end of the book.
And it doesn't really matter what the timestamps are; it's just important that they're
in sequence.
And when you're scanning books, the first page will be
the right way around, the second page will be upside down, the next one will be the right
way around, the next one will be upside down.
When you open up all the images, there will always be an area of the flatbed
scanner that's excess, so there'll be a gray bar at the bottom to get rid of.
So the first thing, once you've finished all your scans, is to make a backup
of the scans that you have.
And then pick one of the images, save it somewhere else and then crop it, to get rid
of that black part at the side.
So that is the area of the image that you're interested in.
Now, if you're scanning using XSane or something like that, that gives you the option per book,
per scan, to identify the areas.
To be honest, I wanted to keep it a little bit more generic, and it's actually trivial
to post-process the whole thing, rather than scanning and then accidentally truncating
the last two millimeters of the page with some critical word on it. So I scan the whole
flatbed; it's just a lot easier.
So then I use gm identify. Now, gm is what's called GraphicsMagick.
I heard on TuxJam that ImageMagick was not being maintained
and that GraphicsMagick is a plug-in replacement, and lo and behold,
I already had GraphicsMagick available to me, and I was blown away.
It's a lot easier to use: you type gm and then press enter and you get some help.
So gm, space, identify will identify the image, which was previously the identify command.
In ImageMagick, all the tools had different names and you had no idea which one to call.
With GraphicsMagick, you have gm and then the command that
you want; in my case, I want identify.
So that identifies that it's a JPEG image and then gives its dimensions.
In my case here, it's 2477x2600+0+0, and then DirectClass, blah, blah, blah.
So that information, 2477x2600, is actually what
I want to crop all the images to.
So you can do that using gm mogrify, space, dash crop, space, just paste that number in,
and then space, asterisk dot jpg. Boom, all your images get cropped to that size.
So now all your images are the right size, with that piece of the flatbed scanner gone.
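
As a rough sketch of that identify-then-crop step: the function names and the idea of passing
in a hand-cropped sample page are assumptions for illustration, while `gm identify -format`
and `gm mogrify -crop` are standard GraphicsMagick commands. Because mogrify overwrites the
files in place, this should only be run on a copy of the scans.

```shell
#!/usr/bin/env bash
# Sketch: crop every scan to the size of one hand-cropped sample page.

# Turn a width and a height into the geometry string that -crop expects,
# anchored at the top-left corner (+0+0).
crop_geometry() {
    printf '%sx%s+0+0\n' "$1" "$2"
}

# Read the sample's dimensions and crop all JPEGs in the current directory.
# $1 is the path to the sample page, ideally stored outside this directory.
crop_all_to_sample() {
    sample="$1"
    width="$(gm identify -format '%w' "${sample}")"
    height="$(gm identify -format '%h' "${sample}")"
    gm mogrify -crop "$(crop_geometry "${width}" "${height}")" ./*.jpg
}
```

With the sample from the episode this would amount to `gm mogrify -crop 2477x2600+0+0 *.jpg`;
a call such as `crop_all_to_sample ../sample.jpg` just saves pasting the numbers by hand.
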
So rather than losing anything, I have all the information, because I took a backup of all
the original images before. And if you didn't, that's an important step.
Take a backup.
Also, if I haven't mentioned it: in the Netherlands, it is legal to copy books for your own
personal use.
What I'm doing may very well be illegal in your country, but it is perfectly legal here.
So now I've cropped all the images.
The only problem now is that the first page is the right way around and the second page is
vice versa.
So I've written a very small program that makes use of gm again: gm mogrify, dash
rotate, space, 180 and then the image name.
So what I do here is set skip to one, it's a variable, and then for image in
asterisk dot jpg, do: if skip is equal to one, then echo "skipping" image and set skip equal
to zero.
Then when it loops, else: echo "rotating" image, gm mogrify, space, dash rotate, 180, image,
and then set skip equal to one.
So when you loop through that, it's skipping this image, rotating that, skipping this
image, rotating that.
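
A minimal sketch of that skip-and-rotate loop follows. The function name and the choice to
pass the file list as arguments are assumptions; the logic follows the description above,
leaving the first image alone and rotating every second one by 180 degrees.

```shell
#!/usr/bin/env bash
# Sketch: rotate every second image so all pages end up the same way round.
rotate_alternate() {
    skip=1
    for image in "$@"; do
        if [ "${skip}" -eq 1 ]; then
            # Odd-numbered scans are left as they are.
            echo "skipping ${image}"
            skip=0
        else
            # Even-numbered scans were made upside down, so flip them in place.
            echo "rotating ${image}"
            gm mogrify -rotate 180 "${image}"
            skip=1
        fi
    done
}
```

Called as `rotate_alternate *.jpg`, it relies on the shell expanding the timestamped names in
scan order, so a missed or doubled page will throw off everything after it.
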
Now, depending on your flatbed scanner and the way you chose to scan the first page,
all the images will now be the same orientation.
They may all be upside down or they may all be the right way up.
If they're all upside down, you just go gm mogrify, dash rotate, 180, asterisk, and
there you go. What you will notice, though, is that there might be a few pages at the end
that are the wrong way around.
Now, one thing I forgot to mention is that after you've done the scan, the first thing
you actually should do is quickly
check to see that you've got all the pages: right way round, upside down, right way
round.
If you get two the right way round in a row, you've missed a page, and what I do is go back
and scan it, then find out what the name of the one before it was, nudge that timestamp up
slightly, and use that as the file name to save.
So make sure that you have them all: right way, wrong way, right way, wrong way,
the whole way through, that you haven't missed any pages and that you haven't scanned any
pages twice.
So this is a little bit of a laborious process.
So once you've done that, you zip them all up and save them. Once you've done that, identify
the crop information from one of the images, crop all of the images, then run this
rotate-every-second-image bash command, and if you need to, you can follow up with gm rotate
180 on everything.
And then the only thing that's left to do is convert all of these into a PDF.
Now, I found that it got too big. The program complained; whether I used ImageMagick or
GraphicsMagick, both of them complained about creating a large PDF file. So what I ended up
doing was looking into each of the individual books and breaking it up per chapter, or usually
per section.
So each of these books usually has about five sections, a section per semester.
And then I make a directory with the name of the book, move all the images
into that, and make subdirectories in there: chapter one, two, up to five or whatever.
And then I physically copy the images in there.
And then I run a simple script: for i in asterisk, do convert dollar i forward slash asterisk
dot jpg into dollar i dot PDF.
So that's all about it; I'll put a copy of all of these things into the show notes.
And what that does is, for every subdirectory that there is, it will run the convert and create
a PDF with the name of that subdirectory, so chapter one dot PDF, chapter two dot PDF.
And that's pretty much it.
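
The per-chapter conversion could be sketched as below. The function name is hypothetical,
and `gm convert` is used here as an assumption; ImageMagick's `convert` takes the same
arguments for this simple case. It is meant to be run from inside the book's directory,
where each subdirectory holds the images for one chapter.

```shell
#!/usr/bin/env bash
# Sketch: build one PDF per subdirectory, named after that subdirectory.
pdf_per_directory() {
    for dir in */; do
        # Skip the unexpanded pattern when there are no subdirectories.
        [ -d "${dir}" ] || continue
        name="${dir%/}"
        # Pages are added in file-name order, which is the scan order.
        gm convert "${name}"/*.jpg "${name}.pdf"
    done
}
```

Splitting by chapter keeps each PDF small enough to avoid the complaints about one very
large file mentioned above.
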
So for the most part, this is an easy enough thing to do.
It takes me about two and a half, sorry, about an hour and a half per book,
and I'll just sit there, usually watch a few
bigclive videos or YouTube videos, and just sit beside the scanner, press enter on my
keyboard, flip the book, press enter.
And you get kudos that you're doing something for your kids while at the same time enjoying
an electronics video.
Anyway, that's it. Hopefully Murphy has not messed this one up too much for me, so I will
go and post it.
And tune in tomorrow for another exciting episode of Hacker Public Radio!
You've been listening to Hacker Public Radio at HackerPublicRadio.org.
We are a community podcast network that releases shows every weekday, Monday through Friday.
Today's show, like all our shows, was contributed by an HPR listener like yourself.
If you ever thought of recording a podcast, then click on our contribute link to find
out how easy it really is.
Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer
Club, and is part of the binary revolution at binrev.com.
If you have comments on today's show, please email the host directly, leave a comment on
the website or record a follow-up episode yourself.
Unless otherwise stated, today's show is released under the Creative Commons
Attribution-ShareAlike 3.0 license.