Initial commit: HPR Knowledge Base MCP Server
- MCP server with stdio transport for local use
- Search episodes, transcripts, hosts, and series
- 4,511 episodes with metadata and transcripts
- Data loader with in-memory JSON storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
466
hpr_transcripts/hpr2378.txt
Normal file
@@ -0,0 +1,466 @@
Episode: 2378
Title: HPR2378: Why Docbook?
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2378/hpr2378.mp3
Transcribed: 2025-10-19 02:00:51

---

This is HPR episode 2,378 entitled Why Docbook?
It is hosted by Klaatu and is about 40 minutes long and carries a clean flag.
The summary is: Klaatu talks about why Docbook is the greatest.
This episode of HPR is brought to you by AnHonestHost.com.
Get 15% discount on all shared hosting with the offer code HPR15.
That's HPR15.
Better web hosting that's honest and fair at AnHonestHost.com.
In my previous episode I discussed how to use Docbook.
I gave you the nitty and the gritty.
I gave it to you in excruciating detail.
I think it took me about an hour when all I really should have said was: download Docbook, start writing Docbook, put it into Pandoc and you're done.
That's that episode, pretty much summarized.
But gosh, there's so much more about Docbook to talk about because there's kind of the why of Docbook.
I mean, that was the how of Docbook, but there's this why question that keeps burning people, because, I mean, even today I see projects migrating away from Docbook for something simpler, something that they perceive to be simpler, such as Markdown or AsciiDoc or reStructuredText.
And I'm not the kind of person, at least in this case, to say that those are bad alternatives, because I've actually used those alternatives quite frequently myself.
I use them to this day, to be honest.
There are many times where I'm making notes or writing down a recipe or something.
I'm not going to do that in Docbook.
I'm not going to write a whole article on how to make cinnamon sticky buns.
You know, I mean, I just want something quick and easy.
So I write that in Markdown.
And it also helps that my Nextcloud installation can parse that Markdown into something pretty.
And so it kind of all integrates with the Notes app of Nextcloud, which is a great one.
You should check it out.
So Markdown is a valid thing.
It's not the most valid thing, but it's a valid thing. reStructuredText, great.
AsciiDoc, actually, I've never used, but let's ignore that fact for now since I'm going to be commenting on it in this episode.
So Docbook is great.
It just might not always be the thing that you want to use.
And I think that's understandable, and I think that's totally valid.
And I think it's something important to remember.
So this is not an episode on why Docbook is the best and the only thing you should use.
It's simply: why is Docbook still relevant?
So let me tell you how I found Docbook.
When I first started using Linux, one of the first purchases that I made was a Slackware DVD and a Slack book, or I think it was just called Slackware Essentials or something like that.
And that book has gone with me so far.
I mean, that thing has been in my gym bag, my gym bag that held all my books, for years and years and years.
It's gone with me to jobs, to many different places.
It's been all over the place.
It's no longer something that I own because I had to move, and it wouldn't fit in my suitcase.
No physical books fit in my suitcase.
There were no physical books traveling with me.
But in investigating how the Slack book was written, I saw that it was written in something called SGML and processed with something I think was called Jade.
And researching that further led me to the later incarnation, I guess, of those tools, something called Docbook.
And Docbook was just this set of rules, really, written by a guy named Norman Walsh, and people seemed to use it for technical documentation.
So I thought, you know what?
It's a good thing to learn.
So I started learning it, and I started using it, and I actually really, really liked it.
And I liked it, I think, to be fair, because it was the only thing that I knew of at the time.
And to be fair, again, it might have actually been the only thing at the time.
I remember distinctly hearing about what I gathered was a new concept when I started going to technical conferences, this Markdown idea, and no one seemed to know what this Markdown idea was.
So I kind of got the impression that it was just coming out.
And I could look that up.
I guess I could look up when Markdown hit the world scene.
But I really think that I might have been there for sort of the birth of Markdown.
And by there, I just mean sort of in the tech scene.
So I learned about Docbook.
I loved it.
I didn't really know of any other alternative other than word processors.
And I knew that I hated word processors.
So now let me give you a little bit of history about myself with word processors.
I didn't actually use word processors early on.
I just used this hand-me-down computer that my dad didn't want to use anymore because it was too old and too clunky.
And the word processor on it, such as it was, was kind of like this command-liney thing where you tagged things, and then when you printed it out, those tags would turn into formatting options.
So my very, very first experience with the idea of, oh my gosh, I can type stuff into a computer and then print it out, and I don't have to do it with handwriting, was this program that, I mean, I don't know what it was, I don't remember.
But it was like a black screen with just plain text, like console text; it wasn't like a font or anything.
You know, and that was how you did it.
So I was very used to this idea of being explicit in how I want my presentation to end up.
And then when word processors kind of got introduced to me, I thought, okay, well, this is kind of cool.
There's a lot of menus here.
I don't really like that, but, okay, this is kind of interesting.
And I kind of fell into it and I started using it because, you know, in your foolish youth, you think, well, I should learn this stuff because all my teachers tell me that this is the big thing in the business world and I guess that's what I'm going to do.
And so you're learning all this stuff and, you know, within two days you know more than the teacher, that sort of thing. We've all been there.
And give it about five or seven years and I go to open up a file that I'd done and suddenly the thing won't work anymore.
Thing is, the thing's deprecated.
The whole file format doesn't work.
There's no application on the face of the planet that will open that file format.
Why?
The file format was closed, and the application that gave birth to that format has closed its doors and no longer works, or it's been updated and they decided, ah, that old format, who cares about that?
Nobody used that anyway.
We'll just do this new one.
That happened to me about five different times, I swear to you, and I've heard other hosts on HPR and on other podcasts say similar things.
And in the Fedora project, I know a couple of people who I've kind of overheard idly say, like, oh yeah, I remember back in the day when I was using this and that got deprecated.
And I feel like there's a not insignificant number of people who have been pushed towards open source by that very experience.
It's the idea that something as simple as a word processor document has given us so much trouble that we won't touch word processors anymore.
And I'm boldly and proudly one of those people.
I will not touch a word processor except in very rare cases.
I might not actually touch one at all.
I can't remember the last time that I've authentically used a word processor, and I just don't do it.
So Docbook was very exciting to me because it was plain text.
So I knew it would open, because, I mean, you go online and you look at BBS postings from 1983 and they still open.
They still work.
There's nothing deprecated about it.
And the ASCII art that they include is still there.
It's still exactly as it was before, you know.
That's powerful stuff.
So I knew plain text was good for longevity, and this Docbook stuff just seemed so exciting because I thought, I could do this Docbook and then put it through a processor and it will translate all the tags for me, and then I can have output in any format that I want, including plain text without all the tags.
But it'll still maintain that structure.
So that was really exciting and I absolutely loved it.
And then later, when I found out about Markdown, I thought, wow, that's a really powerful idea, because Markdown doesn't have all those tags.
It's just a way for us to all sort of agree: hey, we're going to write plain text, but we're going to do a gentleman's agreement here that we'll all do it in the same way.
We'll do it in a predictable and repeatable way.
So instead of just typing a chapter heading, we'll put a hash in front of the chapter heading.
And if it's a section within a chapter, we'll do two hashes.
And then for another section within that section, we'll do three.
It's, as I said in the last episode, maybe not intuitive, but once you see the structure, you see the logic in it.
And the same may be true for Docbook, but Docbook's a lot more verbose than just, hey, put a hash in front of your header and now you've got a chapter heading, and we'll even parse that as a chapter heading and put it in your table of contents for you if you want us to.
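To make the hash convention concrete, here is a minimal Python sketch of how a processor might collect hash-prefixed headings into a table of contents. The function name `table_of_contents` and the sample text are made up for illustration, and the sketch ignores details a real Markdown processor has to handle (for example, hash characters inside code blocks).

```python
import re

# Matches hash-style headings: one to six hashes, a space, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")

def table_of_contents(markdown_text):
    """Return (level, title) pairs for every hash-prefixed heading."""
    entries = []
    for line in markdown_text.splitlines():
        match = HEADING.match(line)
        if match:
            # The number of hashes is the nesting level of the heading.
            entries.append((len(match.group(1)), match.group(2)))
    return entries

# Hypothetical sample document.
sample = """# Chapter one
Some text.
## A section
### A subsection
"""

# Print the headings indented by level, like a simple table of contents.
for level, title in table_of_contents(sample):
    print("  " * (level - 1) + title)
```

One hash, two hashes, and three hashes map to levels one, two, and three, which is all the structure the format asks the writer to declare.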
It's powerful stuff, and that's very appealing and very attractive and very exciting.
And hey, I'll admit it, I've flirted with Markdown. I've tried it, you know, I spent a little bit of time with Markdown just like everybody else.
No, but seriously, I do use it still.
I think the problem is that the way that Markdown is kind of talked about in a lot of communities is like it is the end of all other formats.
You know, it's like this great thing, and it's simple, and it doesn't need to be that complex.
Because it's Markdown, or it's reStructuredText, or it's AsciiDoc, and we do it right because we're simple and we're minimalist and you can learn this in an afternoon.
And yeah, that's true, sure.
I will admit, you have a one-page printout of a Markdown cheat sheet and you're probably good to go.
You'll probably be writing a document within the hour, definitely, because there are no declarations, there's no namespacing, there are no tags.
It's just, hey, if you want to do a chapter heading you do this, if you want to do a section heading you do that.
That's all very easy.
But then you do have to stop, mind you, when you're doing your URLs, right?
Because you might think, well, this will be easy, I just do a URL... oh wait, how do I make a word a hyperlink?
I know that it'll automatically link http colon slash slash, but I don't want the reader to see the http slash slash.
I want them to just see foo, and when they click on foo it'll take them to example dot com.
And so you have to look that up: square bracket, foo, square bracket, parenthesis, http colon slash slash example dot com, close parenthesis.
And now that works.
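Written out, the bracket-and-parenthesis form described above looks like `[foo](http://example.com)`. The following Python sketch shows one way a processor could turn that into an HTML anchor. It is an illustration only: the regular expression is a simplification, the label `foo` and the function name `render_links` are placeholders, and real Markdown parsers handle many more cases (nested brackets, titles, reference links).

```python
import re
from html import escape

# Inline link: [label](url).
# Labels containing "]" and URLs containing ")" or spaces are not handled.
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")

def render_links(text):
    """Replace every [label](url) in text with an HTML anchor."""
    def to_anchor(match):
        label, url = match.group(1), match.group(2)
        return f'<a href="{escape(url, quote=True)}">{escape(label)}</a>'
    return LINK.sub(to_anchor, text)

print(render_links("Read about [foo](http://example.com) here."))
```

The reader sees only the label; the URL is carried along in the parentheses and ends up in the `href` attribute.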
And again, looking at it after the fact, you're like, yeah, I see what they did there.
Square brackets was the term, parentheses was sort of the expansion of that term, which parentheses kind of are used for in real life anyway.
So it totally makes sense and it reads well.
It reads as though it was plain text, although you probably wouldn't write it exactly that way in real life.
You would probably put, like, foo, space, parenthesis, example dot com, close parenthesis, and that would be the totally natural way to do it.
So Markdown is doing a little bit of scoping there with the square brackets, but it's still all within the same realm, within that margin of, like, yeah, I could get used to that.
And you do get used to it, you get used to it very quickly, and that's the power of Markdown.
The problem with Markdown, though, I would argue, first of all, is this: you show me a person who has used Markdown and has not fallen back on HTML.
It can't be done.
You won't be able to find anybody under the sun who's done that.
I mean, not anyone, sure; someone has written a README.md file for GitHub or GitLab and has not had to use an HTML tag.
But I'm saying someone who's really working in Markdown: it's a classic, classic trope that they're going to fall back on HTML eventually.
And once you start falling back on HTML, it's an illusion-shattering moment, because everything was going so well and then suddenly you're typing out tags again, and then you're just like, wait, so why am I using Markdown again?
And that's what happens.
I've seen it happen several times, because at my previous job we had to juggle several formats and one of them was Markdown.
And every single time, someone would be writing something about this plugin that they'd written for, you know, the 3D stuff, and they would be trying to do something, and Markdown just would not do it.
The processors for Markdown just aren't smart enough, or, well, the processors are processors, right? The Markdown spec itself just doesn't have an allowance for certain things.
I can give you an example, and this may be a painfully common example.
Maybe you've seen people complain about this online already, and someone's probably already come up with a solution, well, there is a solution, or someone's probably explained this one away.
But I'm going to cite it anyway, just because it is a pretty common thing and I feel like it's representative of some of the quirks.
So let's say that you're writing a Markdown file.
You've got your hash, and then "this is Markdown".
Okay, so there's your chapter heading. Cool, that's easy.
And then you've got your paragraph, so that's just all the way to the left: "let's see what this Markdown does".
And then you start a bulleted list, a numbered list rather.
So you want a list that says one, two, three, four, five, six, with six items, but on the third item you want to have a code block.
That's a pretty common thing, especially in the technical world, right?
If you're describing how to use software, on Linux especially, you might want to have a code block in your list.
In fact, it might be so common that that's the main thing that you want to do with Markdown, because maybe you're writing about a command line application.
So we'll do a one-dot and then "this is item one", and then one-dot "this is item two", and one-dot "this is item three".
In Markdown it doesn't matter what numbers you use, as long as it's an integer and a dot, and it'll order it correctly.
So this is line three of the list, and it's going to have a code sample.
Now, as we all know, the code insertion method for Markdown is four spaces from the left margin.
So one, two, three, four: now you're four spaces in and you do, I don't know, dollar sign, hello world.
And then you go to the next line, all the way back to the left margin: one-dot "this is the fourth item", one-dot "fifth item", one-dot "sixth item".
Okay, you render that, and the hello world item will not be a code block.
It'll just be part of the previous line, so it doesn't even see that it was supposed to be a block within that list.
So then maybe you say, okay, well, that didn't work, so you buffer it with some blank lines.
We go back in and we put a return after the third item and a return after the code block.
So now we've got one, one, one, a blank line, then our code block four spaces in, then a blank line, and then one, one, one.
Render that, and now the code block is just a paragraph in the middle of the list, or, no, it might be a code block actually, but now the list does not continue sequentially.
It's one, two, three, code block, one, two, three.
That's not what you wanted. You wanted one, two, three, code block, four, five, six.
Okay, so you open it back up, you fiddle around with it, you complain, you look it up online on forums, and finally you realize that the answer to all of your problems is to do the blank line after the third item, and then one, two, three, four, five, six, seven spaces, and then the HTML code tag, and then dollar sign, hello world, and then close the code tag, and then continue with your blank line, and then continue with your list.
So now I'm going to pipe that through, well, not literally pipe, I'm going to send that through Pandoc and end up with a test dot... let's just do PDF for rendering.
Nope, can't do that. Let's do HTML instead. There we go.
And then let's look at that in something, let's use Konqueror.
And you've got one, two, three, and then the code block, and four, five, six.
So it completely worked, it totally worked.
And it only worked because you used probably some magical number of spaces, I'm not even sure if that actually matters, that's just where I ended up, let's put it that way.
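For reference, here is a small Python sketch that generates the list layout being described. It follows the CommonMark rule as I understand it, which is an assumption here, since the episode used Pandoc and Pandoc's Markdown dialect has its own indentation rules: content belonging to a list item is indented to the column where the item text starts, and an indented code block needs four more spaces on top of that. For a `1. ` marker that adds up to seven spaces. The function name `numbered_list` and its parameters are invented for this example.

```python
def numbered_list(items, code_after=None, code=""):
    """Build a Markdown numbered list, optionally nesting a code block.

    items      -- list item texts, in order
    code_after -- zero-based index of the item the code block belongs to
    code       -- the code line(s) to nest under that item
    """
    marker = "1. "                      # every item may use the same number
    indent = " " * (len(marker) + 4)    # item text column (3) + 4 = 7 spaces
    lines = []
    for index, item in enumerate(items):
        lines.append(marker + item)
        if index == code_after:
            lines.append("")            # blank line before the code block
            lines.extend(indent + row for row in code.splitlines())
            lines.append("")            # blank line after the code block
    return "\n".join(lines)

names = ("one", "two", "three", "four", "five", "six")
items = [f"This is item {name}" for name in names]
print(numbered_list(items, code_after=2, code="$ hello world"))
```

Whether a given processor keeps the numbering going after the code block depends on the dialect it implements, so the output is worth checking in the actual tool chain.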
And then you used HTML, and that happens all the time.
I've never seen a person seriously using Markdown who has not had to fall back on HTML tags.
And you think, well, that's not that big of a deal because everybody knows HTML, blah, blah, blah.
But that's actually exactly the point of Markdown: it's supposed to be plain text that then gets parsed into many other formats.
So if we're falling back on HTML every time it's important, then why are we using Markdown?
Honestly, reStructuredText does a lot to answer those problems.
I don't know if Markdown is something that just isn't maintained anymore, or if it's just had so many forks at this point that the original maintainer just doesn't care.
I don't know why Markdown is lacking in the ways that it is.
Or maybe it's feature complete in the eyes of the maintainer, and he just doesn't feel the need to make it more complex with different use cases.
But reStructuredText, which is used by the Python community, who do a lot of their documentation in reStructuredText with Python's Sphinx, works really well.
If I had to put it on a spectrum between Docbook and Markdown, then it would be somewhere in the middle.
You look at it and it's like, yep, that looks like plain text.
The chapter headings are underlined with dashes, so that delimits those, and paragraphs look normal, and the code blocks are prefaced with this weird sort of dot dot code colon sort of thing, and then it's indented in a funky way.
I see what they're going for there, they're kind of labelling it, so I understand what that means.
But yeah, there are a lot of weird little markup things that you have to do in reStructuredText to have valid reStructuredText, RST.
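As a rough illustration of the two reStructuredText conventions mentioned here, the underlined heading and the labelled, indented code block, this Python sketch generates both. The helper names `rst_heading` and `rst_code_block` are made up, the three-space indent is a common convention rather than a requirement, and `bash` is just an example language label.

```python
def rst_heading(title, underline="-"):
    """Return a heading line followed by an underline of the same length."""
    return f"{title}\n{underline * len(title)}"

def rst_code_block(code, language="bash"):
    """Return a 'code' directive with its body indented under it."""
    body = "\n".join("   " + row for row in code.splitlines())
    # A blank line separates the directive line from its indented body.
    return f".. code:: {language}\n\n{body}"

print(rst_heading("Installing"))
print()
print("Run the program from a terminal.")
print()
print(rst_code_block("$ hello world"))
```

The underline has to be at least as long as the heading text, which is why the helper measures the title instead of printing a fixed run of dashes.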
So it's better, I would say, but I wouldn't say it was perfect, because there are quirks to it as well, I have found.
And a lot of times, once again, you kind of end up falling back on HTML, because that's just, in the end, what works best.
And so if you're using HTML, I keep asking myself, why are we using the Markdown-type formats then?
Why don't we just use HTML and get it over with?
AsciiDoc I've never used, so I actually can't complain about AsciiDoc, although if I used it for a day I'm sure I could find something.
Although maybe not, maybe it's great.
It has been highly regarded by many people.
I actually know some people who write in AsciiDoc and then, I think, translate it over to Docbook, and that's totally valid.
I've actually done that before myself, not with AsciiDoc, I think with plain text: I used to write in plain text and then process it into Docbook and then go through and correct all the Docbook tags.
Point being, I don't believe that these quote-unquote simplified methods are actually all that simplified, and that becomes especially true if you start actually caring about the style, which again is true of Docbook as well.
But I just find people saying, oh, Markdown and reStructuredText and AsciiDoc, they're so simple, you'll never look back.
And it's kind of like, well, I've been down that road and I have looked back, because I've been to the place where I'm super, super happy with the workflow, with the tool chain, until I decide, well, I would really rather the color of that heading be red instead of black.
And you think, well, that's super simple, I can just do this. Nope, that doesn't work.
Well, okay, I can do this. Nope, that doesn't work.
Oh, it gets that from the template file? Okay, well, I'll do that.
Well, now you have to learn Docutils or something like that.
It becomes this big, big project, and at that point the simplicity of the format basically falls apart for me, and I think, well, there's no point to it anymore.
Now I might as well use something that is highly structured, and I'll just keep using that, because there's no advantage to this other thing.
Yeah, so my default is Docbook, obviously, but it might not be your default.
Should it be your default? Well, yes, it should.
So, as I think I might
have mentioned briefly in the previous episode, or I might not have, Docbook has a lot of what would be called semantic information.
That is to say that in HTML, for instance, or even in Markdown, let's do Markdown because I'm picking on Markdown, you might be typing something and you might think, okay, well, there's this thing here, there's this variable name, an environment variable specifically, and I want to somehow separate that from the rest of the text.
And that's a pretty natural thing for us to try to do, for one reason or another.
Sometimes we want to do it just because we feel, ooh, this is a super important term, I want to make it blink.
And other times we think, well, it really is confusing if this thing is not highlighted in some way; people will not understand where the environment variable ends and where the rest of my text begins.
So you might want to delimit that some way, separate that some way from the rest of the document.
So you might go in there and give it two asterisks to make it bold, or you might go in there and give it two backticks to make it a code sort of thing.
And that's fine, and that kind of works.
Once you render that through your processor, it makes that environment variable name bold or monospace or whatever, and that's great, and that's down to styling and stuff like that.
And even in the source you do have some indication, hey, this word is special: it's either got asterisks around it or it's got backticks around it, or three single quotes, I think, is what it is in reStructuredText, or maybe it's two backticks, I don't know.
But it does have something there.
But in Docbook you could say, okay, I want this environment variable to be separate from the rest of the text, and give it an envar tag, e-n-v-a-r.
Now you know exactly what that environment variable is, not just by looking and saying, oh yeah, it's got some asterisks around it.
You know that it's an environment variable because it is tagged as such.
The same goes for a lot of different things.
Key combinations like Control-C, Control-X, Control-Z, anything like that, you can tag that stuff in Docbook as a keyboard combination, a key combo, and then you can tell exactly what that is.
What kind of key is it? Is it the thing that is literally printed on the key, or is it some other kind of key?
You can get really, really specific
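A short Python sketch may help show what this tagging buys you. The sample paragraph below is hypothetical; it uses the Docbook inline elements `envar`, `keycombo`, and `keycap`, and it omits the XML namespace that Docbook 5 documents normally declare, to keep the example small. The function names are made up.

```python
import xml.etree.ElementTree as ET

# A hypothetical paragraph using Docbook inline elements (namespace omitted).
SAMPLE = """<para>Set <envar>EDITOR</envar> before you start, and press
<keycombo><keycap>Ctrl</keycap><keycap>C</keycap></keycombo> to cancel.
The <envar>PATH</envar> variable is read at login.</para>"""

def find_terms(docbook_xml, tag):
    """Return the text of every element with the given tag name."""
    root = ET.fromstring(docbook_xml)
    return [elem.text for elem in root.iter(tag)]

def key_combos(docbook_xml):
    """Return each key combination as a string such as 'Ctrl+C'."""
    root = ET.fromstring(docbook_xml)
    return ["+".join(cap.text for cap in combo.iter("keycap"))
            for combo in root.iter("keycombo")]

print(find_terms(SAMPLE, "envar"))
print(key_combos(SAMPLE))
```

Because the markup says what each term is, listing every environment variable or every key combination is a lookup by tag name, not a guess based on which words happen to be bold.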
with the data that is contained in your document.
And there are two great reasons, three great reasons, for that.
One is consistency.
Consistency in documentation, especially technical documentation but also other kinds of documentation, is really, really important.
When you read something and it's consistently laid out or processed or rendered or whatever, that makes you think that there's some form of professionalism in that document.
You know, they've actually taken the time to make sure that when they said Control-C on the previous page it looked like this, and now when I turn the page and Control-C is there again, it still looks like that.
Every time I see Control-C it's exactly like that, or every time I see an environment variable it is styled the same way.
So there's a virtue there to making sure that you're using the same tag for the same kinds of elements.
And sure, you could do that in Markdown as well, or whatever, but with Docbook you're doing it because of the tag itself, rather than just remembering, okay, every time I use a keyboard one I'll do two asterisks, every time I do any kind of variable name I'll do the two backticks, and that's how it will be.
No, with Docbook you've got specific tag types around specific terminology, or specific words or letters or whatever, and then in your styling you can do whatever you want to with that, and it won't matter, because it's got a tag especially for it.
And we were really bad about that with HTML.
Very frequently I see source code in HTML that just abuses tags.
Maybe there's not a tag for something, so they just repurpose tags: well, nobody uses the em tag in my business, so I'll repurpose it in my CSS, and anything that is in the em tag we won't italicize, we'll just use it to do such and such instead.
I've seen that all over the place.
So Docbook avoids that kind of thing, because it's got a tag for so many different things, and that's great.
And heck, I mean, it's XML, so you can actually add your own tags.
Docbook frowns upon that, because they say if you modify Docbook you're not using Docbook, but you could bring in your own namespace and have your own tags, so that's a possibility.
And that kind of speaks to the next benefit of all this, which is the semantic meaning of data: the fact that the actual term itself is of a certain kind.
So maybe it's a product name, or maybe it's a vendor name, or something like that.
It becomes meaningful now, because it is tagged in a certain way.
And so now you're not searching by, what did we do to our product names? Did we make them emphasis? Yeah, I think so.
Okay, well, let's look for all the emphasis tags, and then within those emphasis tags we'll look for all the product names or business names or whatever we're looking for this time.
You actually know what you're looking for, because it is tagged in a certain
way.
And here's a true story from New Zealand. It's quite funny, but I swear it is a true story.
There was a company called Telecom, and they got bought by a company called Spark.
So for about three days on the internet, in all their public documentation and their website and everything, every time the word telecommunication appeared, instead of saying telecommunication it said Spark communication.
You can see exactly what some code monkey did: they did a find and replace on the business name, Telecom, and completely didn't think of the fact that telecom was a very, very common prefix in their industry, for the word from which it comes, telecommunications.
And so when they did a find and replace blindly on every document in their business, they turned every instance of the actual word telecommunication into Spark communication.
It was hilarious and embarrassing, and it was just a facepalm moment, you know.
It was just like, yeah, I see what you did there, and I think you should have maybe thought about that first.
And I can only imagine, when someone found out what they had done, they must have been just really hoping that nobody noticed.
But yeah, that's the kind of thing that can happen when you have no idea what the content of your document is.
So with Docbook you can actually identify what the content is, rather than just having random words in it.
And maybe that's important to you,
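The find-and-replace story can be sketched in a few lines of Python. This is an illustration under assumptions, not a reconstruction of what the company actually ran: the sample sentences are invented, the Docbook `orgname` element is used here as one plausible way to tag a company name, and the function names are made up.

```python
import re
import xml.etree.ElementTree as ET

def naive_rename(text, old, new):
    """Replace every occurrence of old, ignoring case and word boundaries."""
    return re.sub(re.escape(old), new, text, flags=re.IGNORECASE)

def tagged_rename(docbook_xml, old, new, tag="orgname"):
    """Replace old with new only where it is the whole text of a tag element."""
    root = ET.fromstring(docbook_xml)
    for elem in root.iter(tag):
        if elem.text == old:
            elem.text = new
    return ET.tostring(root, encoding="unicode")

# Hypothetical sample sentences, untagged and tagged.
plain = "Telecom is a telecommunication company."
tagged = "<para><orgname>Telecom</orgname> is a telecommunication company.</para>"

print(naive_rename(plain, "Telecom", "Spark"))    # also rewrites the ordinary word
print(tagged_rename(tagged, "Telecom", "Spark"))  # only the tagged company name changes
```

The blind replacement cannot tell the company name from the ordinary word that contains it, while the tagged version only touches text that was explicitly marked as an organisation name.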
|
||||
maybe it's not but that's the third thing which is um I don't know I think of it as future
|
||||
proofing but I guess technically it's it's not really but in a way it's it's opening yourself up
|
||||
for future uses of data that you might not actually consider yet so for instance
|
||||
five years ago maybe or not 10 years ago 10 years ago there was no real concept
|
||||
of this idea that in a web page you might have a phone number and you might want
|
||||
to make that phone number clickable because why would any why why would that be something like
|
||||
why would you need to click a phone number phone number you use on your phone you don't use it on
|
||||
the web and then suddenly smartphones started coming out and people started actually interacting
|
||||
with the web on their smartphones and it became almost a problem that no one could identify where
|
||||
a phone number was versus a date or a zip code post code whatever um and they thought oh man
|
||||
wouldn't it be neat if we had a tag for this and of course such a tag has since come about and now
|
||||
we can tag things as a telephone number you can click it and it goes to your cell phone telephone
|
||||
app and and you can dial it without actually you know pressing the buttons you just clicked on
|
||||
a number but if we if we think of that really broadly outside of that specific example I mean the
|
||||
same can be true for for documentation uh that you're writing right now there may be concepts in
|
||||
there that you'll never think you know why would I possibly need to do this with with all of this
|
||||
all with all of these words they're just words but I mean at some point you might find that
|
||||
that that some of that data becomes important if identified and with something as explicit
|
||||
and verbose as DocBook is you're kind of doing that inherently you know you're actually
|
||||
classifying the data that you're typing into your document rather than just typing data into
|
||||
the document and just saying ah the reader will the reader can just figure out the the context of
|
||||
all or the type of all of this type and I mean they can and that's great but that's a
|
||||
human that's the human reading the words what about the computer that's going to read the words
|
||||
at some point and that's why DocBook to me is really a clever system so why am I even talking
|
||||
about DocBook or Markdown or reStructuredText or AsciiDoc at all and that is an important
|
||||
question because a lot of people are just thinking well I don't even want to get involved I'm just
|
||||
going to write in plain text or I'm going to write in a word processor so first of all word
|
||||
processors today are writing in XML you don't know it you don't want to think about it but it is
|
||||
that's what's happening underneath all that fancy GUI stuff that you're looking at
|
||||
it's it's most of the word processors are yeah generating XML and you can look inside certainly
|
||||
of an ODT it's just a zip file you can unzip it and look right at it it's a bunch of XML you can
|
||||
look inside of an EPUB and certainly that's got a bunch of HTML in it I think a little bit
|
||||
of XML for that index part and then I think DOCX is famously XML as well I'm pretty
|
||||
sure I mean I could be wrong I haven't really dealt with DOCX in any recent time but I'm pretty
|
||||
sure that's what the X actually stands for is XML so XML is here to stay man but I mean really
|
||||
it's something that we're all using secretly anyway but aside from that
|
||||
technicality the important thing here actually whether you're using DocBook or Markdown or
|
||||
reStructuredText or AsciiDoc is structure and predictability a long time ago I was trying to
|
||||
get rid of all my physical books because they were heavy and I was tired of carrying them around
|
||||
so I wanted them all really on my computer because I didn't have an e-reader at that time so I
|
||||
thought if I just get these books into some format that I can have them on my computer they'll
|
||||
be easier to carry around and at the time this was a very long time ago that was really hard to do
|
||||
because the publishing industry hadn't really jumped on board the whole concept of let's do this
|
||||
digitally yet in fact it's pretty arguable that they still have yet to figure that out luckily it
|
||||
doesn't really matter because there are enough independent publishers at this point to really
|
||||
make that almost a moot point for for what I'm interested in but at the time I thought okay well
|
||||
I'll just I'll take pictures of them on a digital camera and then I'll load them in and that's
|
||||
what I'll do and yeah I had a digital camera even that long ago because I was in film school at
|
||||
the time so it kind of worked out so I thought well that's what I'll do that'll be great
|
||||
you know to make my own little PDFs and I actually did that almost to a whole book
|
||||
and the resulting PDF file was something like 200 megabytes or something insane like that because
|
||||
they were all like PNGs so I made them all JPEGs and I I spent like a week trying to figure out how
|
||||
to process you know 200 images because I wasn't doing Linux yet so I had no idea how to do that
|
||||
finally sort of figured it out anyway long story short it was not working well for me eventually
|
||||
I found that some people were doing that like on Gutenberg like you could find lots of books on
|
||||
Gutenberg and they were in HTML and I think well I don't think I know TXT and HTML formats at
|
||||
that time I don't think there was any kind of ebook format really I mean there may have been
|
||||
something but certainly nothing I was familiar with and so I did I I've read a lot of books
|
||||
from Gutenberg so and and that was just plain text but one thing I was I started to notice
|
||||
about plain text was that it wasn't predictable so once I got savvy on Linux and figured out oh
|
||||
I could script stuff cool so I went to go take some of my plain text books and and process them into
|
||||
some other formats you know like HTML let's say I don't actually remember but let's say it was
|
||||
HTML and I was finding that you know in one file the chapter headings would be chapter one
|
||||
Foo and then in the next file it would just be one dot space Foo and then in the next file there
|
||||
would be no chapter numbers at all it would just be Foo you know and and then maybe in the next
|
||||
file it would be chapter one hard carriage return Foo so there was no predictability about what
|
||||
I was going to see in one text file to the next and that meant that there was no way to script some
|
||||
kind of process to say hey when you see the word chapter make that whole line an H1 element
|
||||
or whatever and then when you see I don't know something else do something else with it you
|
||||
know you could not do that it was not scriptable because no file was the same so when I found out
|
||||
about well certainly DocBook but even later when I found out about Markdown and stuff
|
||||
and kind of realized that it was imposing a kind of an agreement on how we were going to format
|
||||
things then I realized the true power of of structure of of giving your data some kind of
|
||||
predictable scriptable structure where we can say yes we know exactly what we're going to expect
|
||||
in this data file because we've been told it is a book so we know that there will be headings
|
||||
and we know that there will be sections and we will know that there will be URLs at some points
|
||||
and maybe places to drop in images by our processor all that kind of thing all of that stuff becomes
|
||||
very very predictable and you can reproduce the same set of processes across several several
|
||||
several different files that's the power of structure and it's why you should be using
|
||||
Markdown reStructuredText AsciiDoc and frankly probably DocBook that's all for me about
|
||||
DocBook folks thanks for listening I hope you found it interesting and you should try DocBook
|
||||
out it's really quite quite cool it's a lot of fun and once you start getting into styling it
|
||||
with XSL it's just super super satisfying and you know what best part about it it'll keep you out
|
||||
of those word processors that's what I really care about talk to you next time
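The find-and-replace story told earlier in this episode can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample paragraph is invented, the function names are hypothetical, and DocBook's `orgname` element is used to mark the business name (the episode itself talks about searching inside emphasis tags). It contrasts a blind rename with one that only touches text tagged as an organization name.

```python
import re
import xml.etree.ElementTree as ET

# Invented sample text: the business name is tagged, the ordinary word is not.
SAMPLE = (
    "<para><orgname>Telecom</orgname> provides telecommunication "
    "services. Contact <orgname>Telecom</orgname> support.</para>"
)


def blind_rename(text, old, new):
    """Replace every case-insensitive occurrence of `old`, tagged or not."""
    return re.sub(re.escape(old), new, text, flags=re.IGNORECASE)


def tagged_rename(xml_text, old, new, tag="orgname"):
    """Replace `old` only where it is the full content of a `tag` element."""
    root = ET.fromstring(xml_text)
    for element in root.iter(tag):
        if element.text == old:
            element.text = new
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # The blind version also rewrites the start of "telecommunication".
    print(blind_rename(SAMPLE, "Telecom", "Spark"))
    # The tag-aware version leaves the ordinary word alone.
    print(tagged_rename(SAMPLE, "Telecom", "Spark"))
```

The tag-aware version works only because the author said what each word was when writing it, which is the argument the episode makes for explicit markup.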
|
||||
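The chapter-heading problem described in this episode can also be sketched in Python. This is a minimal sketch, not the host's actual script: the function name and sample lines are made up, and the only rule assumed is the Markdown convention that a line starting with "# " is a top-level heading. With that agreement in place one rule is enough, while the plain-text variants mentioned in the episode slip past it.

```python
import re

# Assumed convention: a Markdown level-1 heading is "# " followed by the title.
HEADING = re.compile(r"^# (.+)$")


def mark_headings(lines):
    """Wrap Markdown level-1 headings in <h1>; leave other lines unchanged."""
    result = []
    for line in lines:
        match = HEADING.match(line)
        if match:
            result.append("<h1>" + match.group(1) + "</h1>")
        else:
            result.append(line)
    return result


if __name__ == "__main__":
    # Structured input: every file uses the same heading convention.
    print(mark_headings(["# Foo", "some body text", "# Bar"]))
    # Unstructured input: the heading styles from the episode, none detected.
    print(mark_headings(["Chapter 1 Foo", "1. Foo", "Foo"]))
```

A script for the plain-text files would need a separate guess for every heading style, and a bare title like "Foo" cannot be told apart from body text at all.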
you've been listening to Hacker Public Radio at hackerpublicradio.org we are a community
|
||||
podcast network that releases shows every weekday Monday through Friday today's show like all our
|
||||
shows was contributed by an HPR listener like yourself if you ever thought of recording a podcast
|
||||
then click on our contributing link to find out how easy it really is Hacker Public Radio was founded
|
||||
by the Digital Dog Pound and the Infonomicon Computer Club and is part of the binary revolution
|
||||
at binrev.com if you have comments on today's show please email the host directly leave a comment
|
||||
on the website or record a follow up episode yourself unless otherwise stated today's show is
|
||||
released under the Creative Commons Attribution-ShareAlike 3.0 license
|
||||