Initial commit: HPR Knowledge Base MCP Server

- MCP server with stdio transport for local use
- Search episodes, transcripts, hosts, and series
- 4,511 episodes with metadata and transcripts
- Data loader with in-memory JSON storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
Lee Hanken
2025-10-26 10:54:13 +00:00
commit 7c8efd2228
4494 changed files with 1705541 additions and 0 deletions

hpr_transcripts/hpr2114.txt Normal file

@@ -0,0 +1,222 @@
Episode: 2114
Title: HPR2114: Gnu Awk - Part 1
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2114/hpr2114.mp3
Transcribed: 2025-10-18 14:30:26
---
This is HPR episode 2,114 entitled "Gnu Awk - Part 1", and it is part of the series Bash Scripting.
It is hosted by b-yeezi and is about 23 minutes long.
The summary is: an introduction to the awk text parsing tool.
This episode of HPR is brought to you by AnHonestHost.com.
Get 15% discount on all shared hosting with the offer code HPR15.
That's HPR15.
Better web hosting that's honest and fair at AnHonestHost.com.
Welcome, Hacker Public Radio fans, this is b-yeezi once again.
This time I'm going to do a series of tutorials, you can call them, working in collaboration
with the famous Dave Morriss, which makes me really excited.
He's allowing me to do the intro, and he'll intro himself as well as go into a deep dive
as we proceed, but we are going to be doing a little tutorial on awk.
In particular I'm going to be focusing on gawk, which is very similar to the original
Unix version and has some additional features.
I don't know if we're going to go into the differences between awk and gawk, but we
are going to at least start with some of the basics right now.
So without any further ado, here's awk.
So from its man page: gawk is the GNU Project's implementation of the awk programming language.
It conforms to the definition of the language in the POSIX 1003.1 standard.
This version is in turn based on the description in The AWK Programming Language by
Aho, Kernighan, and Weinberger.
And gawk provides the additional features found in the current version of Brian Kernighan's
awk and a number of GNU-specific extensions.
So that's the beginning of the description of gawk in the man page.
Awk is a powerful text parsing tool, to be specific, and like the description says, it
is its own language.
Now Dave especially, but also myself, are going to go into how to put awk programs
inside of a text file, a .awk file if you will.
But I'm going to start off with some basic commands to get our feet wet with awk, because
you can just do it on the command line with simple inline code.
I use this tool all the time, both inline and in files.
The good thing about putting it in files is that it's easy to go back and run the same commands
over and over again on different files.
But inline is really handy if you don't feel like opening up, or you don't want to open up,
or you can't open up a tool like LibreOffice to parse CSV files, or if you just have
some really complex stuff that you might be pulling in from a pipe, like from a sed command
or a wget command where you're getting stuff off the internet, and you want to parse it in
real time and put it into a file or pass it to another tool that's going to do more
processing on it later.
So I'm going to try to see if I can get a file uploaded, but if not, I have example files
right in the show notes.
So all you have to do really is just copy and paste the example files right from inside
the show notes, put them into a text file, and you should be on your way.
So the basic syntax of awk is awk, then some options, then inside of single quotes
a pattern and, still inside the single quotes, the actions inside of curly brackets.
Then you close the single quotes and give the file that you want to do that
to, or the group of files that you want to do that to.
It kind of sounds hard, but it really is pretty simple to get started. You're just
going to do awk, dash something for the options, a pattern to search for (the pattern is optional),
then the action that you're going to do, and then file.txt, .csv, whatever you're working
with.
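As a rough sketch of that general shape, the pattern and the action can each be left out. The two-line input below is made up purely for illustration and is not one of the show-notes files; it is fed in through a pipe rather than from a file.

```shell
# General form: awk [options] 'pattern { action }' file...
# Hypothetical two-line input (not from the show notes).
input='alpha 1
beta 2'

# Action only: with no pattern, the action runs on every line.
first=$(printf '%s\n' "$input" | awk '{ print $1 }')

# Pattern and action: the action runs only on lines where the pattern is true.
matched=$(printf '%s\n' "$input" | awk '$2 == 2 { print $1 }')

# Pattern only: with no action, awk prints the whole matching line.
whole=$(printf '%s\n' "$input" | awk '$1 == "alpha"')

printf '%s\n' "$first" "$matched" "$whole"
```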
So for example purposes I created a file called file1.txt and a companion file with all
the same data called file1.csv. The difference between the two is that one is whitespace
delimited and the other one is comma delimited.
The delimiter is the way that you separate the different fields in the file.
So a comma-separated file, CSV, means that your delimiter, the limit of each column, is
the comma. In a whitespace one, the columns are separated by any whitespace,
and that's the default in awk: it's going to parse whatever
text string it's looking at by the whitespace and put it into
columns that way.
So if you look at file1.txt, the first row is the headers: name, color,
and amount. Under name I have a bunch of different fruit: apple, banana, strawberry,
grape, apple again, plum, kiwi, potato (I guess that's not a fruit), and pineapple. In the next
column over I have different colors: red for apple, yellow for banana, red for strawberry, purple
for grape, and then for that second apple I have green, so we have a green apple
and a red apple. Then I have purple for plum, brown for kiwi, brown
for potato, and yellow for pineapple. In my third column I have the amount of each one
of those items, so I have four apples, six bananas, three strawberries, 10 grapes, eight green
apples, two plums, four kiwis, nine potatoes, and five pineapples. Now this is going to be a cool
file, because we're going to be able to do a lot of things with it. In later episodes we're going
to be able to do arithmetic on these and do some aggregate functions, but for now we're going to
do something really simple. We're going to just do the command awk, and then inside of single
quotes you put curly brackets: single quote, open curly bracket, print $2, close curly
bracket, closing single quote, then a space and file1.txt. That is: awk '{print $2}' file1.txt.
What that says is: awk, print column two of file1.txt. Like I said before, the actions go inside
the curly braces. Since we didn't have anything before the curly braces, there was no pattern to
match, so it's just going to look in the entire file, and it's going to look in that second column.
And since I didn't give it any way to delimit the file other than its default, it's going to
use whitespace. In my example file I lined up the whitespace so that it
looks nice, but awk doesn't care about that. It will just parse on whitespace no matter what,
so whether it's one space or ten spaces in one column, or three spaces in another column and
25 spaces in another column, it doesn't care. It's going to parse them all the same, with each
column starting on the first non-whitespace character. A couple of things that
you can see: it's kind of intuitive that it starts with 1, it doesn't start with 0 like other
programming languages. So print $1 is going to print the first column, and print $2 is going
to print the second column. In this example file, if I say print $2, I'm going to print out
all the colors. It's going to first put out the header row, color, and then red, yellow, red,
purple, green, purple, brown, brown, yellow. One special column number
is 0: if you do $0, it's going to print all the columns. So that's just something to know.
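To try this out, here is a minimal sketch that rebuilds file1.txt from the spoken description and prints the second column. The exact padding between columns is an assumption (the real file is in the show notes), and the scratch directory is only there so nothing existing gets overwritten.

```shell
# Work in a scratch directory so no existing file is overwritten.
cd "$(mktemp -d)" || exit 1

# file1.txt rebuilt from the description; the column padding is a guess.
cat > file1.txt <<'EOF'
name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5
EOF

# No pattern, so the action runs on every line; $2 is the second
# whitespace-separated field, however many spaces precede it.
colors=$(awk '{ print $2 }' file1.txt)
printf '%s\n' "$colors"

# $0 is the whole line exactly as it was read, padding included.
all=$(awk '{ print $0 }' file1.txt)
```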
So going back to our example, I'm going to add to it a little bit. I'm going
to say awk, and now inside the first single quote I want to say $2 == and then
yellow in double quotes, then you can put a space or not, open the curly bracket, print $1,
close the curly bracket, close the single quote, file1.txt. That is:
awk '$2 == "yellow" {print $1}' file1.txt. What this is doing:
since we now have something before the curly brackets, before our action, we have our pattern,
and our pattern is $2 == "yellow". So look in the second column for the
word yellow, and print column 1 of file1.txt. If you remember the file, we had bananas and
pineapples. I have both of those in there as yellow, so it's going to just print out banana and
pineapple. It's going to skip the header row because the header row didn't have the term yellow
in it. That's one thing to understand about awk: it's not going to automatically print the
headers unless you tell it to, and we'll talk about that a little bit later in another episode.
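A small sketch of that pattern-plus-action command, again using a copy of file1.txt rebuilt from the description (the column padding is assumed):

```shell
cd "$(mktemp -d)" || exit 1

# Sample data rebuilt from the description in the episode.
cat > file1.txt <<'EOF'
name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5
EOF

# Pattern: second field equals the string "yellow".
# Action: print the first field of each matching line.
yellow=$(awk '$2 == "yellow" { print $1 }' file1.txt)
printf '%s\n' "$yellow"
```

The header line is skipped simply because its second field is "color", not "yellow".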
Now, so far we've been working with this file that is space-separated, which has a lot of uses,
especially on the command line when you're going to pipe other commands into awk.
You might want to do ls -l and then pipe that into
awk, and then you can separate by the columns that way. That's fine, but a lot of times when you're
working with data you're going to be working with either tab-separated files or comma-separated
files. So if you're not using a plain whitespace-separated file (I like to do pipe-separated
a lot of times, because then you don't have to worry about putting
double quotes around the text fields to get around commas inside of the text),
you might want to use a different field separator. There are different ways to set the
field separator in awk. I'm going to go over the most apparent, which is using an option, the
-F option (dash capital F). The character or characters that follow the -F are
your separator. So if you just do dash capital F followed by a comma, that tells awk to use
commas for the separator. You don't need to put a
space between them, and you don't need any other characters. If you just put dash capital F comma,
it's going to do that. If you do dash capital F period, it's going to be dot-separated. However,
sometimes you might want to use more complicated field separators that are more than one
character. In that case you want to put your field separator inside of double quotes. You might
see that sometimes in other people's examples: when they are just using commas they'll do
-F"," (dash capital F, double quote, comma, double quote, with no
spaces in between). That's going to do the same thing as dash capital F comma. So I have a
similar file called file1.csv, which is the same exact file but taking out the spaces and putting
a comma in between. If we run the same command, this time as
awk -F"," '$2 == "yellow" {print $1}' file1.csv,
it's going to give us the same exact output as if we were doing
the whitespace-delimited one without the -F option, which is banana and pineapple.
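A hedged sketch of the -F option follows. The contents of file1.csv are reconstructed from the description ("the same exact file" with commas instead of spaces), so treat them as an assumption.

```shell
cd "$(mktemp -d)" || exit 1

# Comma-delimited companion file, reconstructed from the description.
cat > file1.csv <<'EOF'
name,color,amount
apple,red,4
banana,yellow,6
strawberry,red,3
grape,purple,10
apple,green,8
plum,purple,2
kiwi,brown,4
potato,brown,9
pineapple,yellow,5
EOF

# -F sets the input field separator; the quoted and unquoted spellings
# reach awk as the same argument, so they behave identically.
with_quotes=$(awk -F"," '$2 == "yellow" { print $1 }' file1.csv)
without_quotes=$(awk -F, '$2 == "yellow" { print $1 }' file1.csv)

printf '%s\n' "$with_quotes"
```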
Inside of those patterns you can also use regular expressions. I have an example here
that's awk, inside of single quotes, $2 and a tilde, which on a US
keyboard layout is the key right above the Tab key
if you hit Shift. So tilde, space, and then the expression inside of forward slashes. Awk, for
regular expressions, kind of like Perl, likes the tilde to say this is going to be
a regular expression match, and inside of forward slashes goes the expression that you want to
evaluate. I'm not going to go into regular expressions, that's a whole other topic, but in this
example I'm doing p.+p (p, dot, plus, p), so I'm looking for a p, one or more of any character
in between, and then another p. After that, inside of curly
brackets, is our action, print $0, then the closing single quote and file1.txt. That is:
awk '$2 ~ /p.+p/ {print $0}' file1.txt. So
I'm looking in column two for any words that have the pattern of p, anything
in between, p, and it returns the entire line of grape purple 10, because purple, which is in
column two, has two letters in between the first p and the second p, and then also plum, whose
second column is also purple. So it's matching purple in both cases. Numbers can be evaluated in
the pattern as well, and it does this kind of intuitively. In our example we have numbers in our
third column, so if I say awk '$3 > 5 {print $1, $2}' file1.txt ($3 greater than 5, and then
inside of our action print $1, comma, space, $2, close the action, close the single quote,
file1.txt), I'm going to print both the first and the second column if the value in the third
column is greater than five. It's a good idea to go look at that example, but it's pretty
intuitive: you're going to say if column three is greater than five, print column one and column two.
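The two pattern styles just described, a regular-expression match and a numeric comparison, can be tried with a sketch like the one below. The sample file is rebuilt from the description, with assumed column padding.

```shell
cd "$(mktemp -d)" || exit 1

cat > file1.txt <<'EOF'
name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5
EOF

# ~ tests a field against a regular expression: a "p", one or more of
# any character, then another "p". Only "purple" in column 2 qualifies.
regex_hits=$(awk '$2 ~ /p.+p/ { print $0 }' file1.txt)
printf '%s\n' "$regex_hits"

# Comparison pattern on the third field. The header row is printed as
# well: "amount" is not a number, so awk compares it with 5 as a string.
big=$(awk '$3 > 5 { print $1, $2 }' file1.txt)
printf '%s\n' "$big"
```

The comma in print $1, $2 makes awk join the two fields with its output separator, a single space by default.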
I'm sure you can see applications for this if you ever have to work with data
that you have to manipulate. So continuing along with this, I give the output: you're
going to find banana, grape, apple, and potato, because those are all the ones that had values that
were higher than five in our example file. You could also take that and redirect the output
into a file. So I do that same exact thing, and
for this example I just want to show that the input separator doesn't matter, because it's still
going to print it out space delimited: awk, dash capital F comma, inside of the single quotes
$3 > 5, inside of our action curly braces print
$1, comma, space, $2, end the action, file1.csv,
then a greater-than sign and output.txt. It's going to put name
color in the first line, then banana yellow, grape purple, apple green, potato brown, in a file
called output.txt. So that's a nice way to be able to filter out things that you want
from a file and put them into another file. And here's a cool trick that I learned from one of
the references that I give at the end of the episode. You do this command:
awk, and inside of the single quotes, inside the curly braces, print, a greater-than
sign, $2, and then right next to the $2, inside of double quotes, .txt, close
the curly brace, close the single quote, file1.txt. That is:
awk '{print > $2".txt"}' file1.txt. I recommend
for any of these episodes that we're going to be doing in the series that, if you really want to
follow along and you don't want to just listen to our lovely voices, you get out the
show notes, because they're really helpful. But anyway, with that awk print command
we're actually doing a redirect inside of our print statement. That's what
print followed by a greater-than sign means.
The target is $2".txt", so we're looking at column two, and whatever is in there,
all the matching lines are going to go into their own file.
I'm not explaining this very well, so I'll do it again. Print, greater-than sign, $2,
and then .txt in double quotes, run on file1.txt, is going to create a group of files:
yellow.txt, red.txt, purple.txt, color.txt, brown.txt, and green.txt,
because those are all the different things that you can find in that second column.
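Both kinds of redirect can be sketched as follows, using a whitespace-delimited copy of the sample data rebuilt from the description. The parentheses around the output file name are an addition for this sketch, not what was read out in the episode; they keep the concatenation unambiguous across awk implementations.

```shell
cd "$(mktemp -d)" || exit 1

cat > file1.txt <<'EOF'
name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5
EOF

# Shell-level redirect: everything awk prints lands in one file.
awk '$3 > 5 { print $1, $2 }' file1.txt > output.txt

# awk-level redirect: each line is written to a file named after its
# second field, so lines that share a color end up in the same file.
awk '{ print > ($2 ".txt") }' file1.txt

# Lists the per-color files alongside file1.txt and output.txt.
ls
```

Because the header row's second field is "color", the header ends up alone in color.txt.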
And in my example it's going to print out all the data, all
the columns that are in each line, and the lines go into their own files. So it's a really
quick way to take a whole bunch of data that might be all intermingled and separate it all into
individual files of like information. If you were going to do this in Excel,
you'd have to do a filter, uncheck the boxes that you don't want,
pick the only one that you do want, highlight all those rows, copy them, paste into another file,
and save that file, and then do the same thing for the next option in your filter, and the next
option in your filter, and the next. This, in one command, automatically makes
a whole series of files based off of the pattern that you're matching. It's
really cool, I mean, at least to me. Maybe I'm just a dork, that's fine. But those are some of
the commands that you can do. Now one other thing I'm going to introduce, but not go into right now,
is that sometimes with awk you can get really complicated in how you set up how you're going
to parse the file. If in your pattern you want to do some pre-processing, and then do some more
processing on it, and then do some counts and some sums and some division and all that kind of
stuff, it's going to get really cumbersome on the command line. So you're
going to want to put all that in a file. A lot of times the convention is that it'll be a
file name ending in .awk, and then to get access to it you'll do awk -f (dash lowercase f),
filename.awk, and then file1.txt. I'm pretty sure that for the remainder of our
episodes we're going to be using files, because as we get more advanced in awk
it really does, like I said, get cumbersome to deal with awk on the command line when you have,
you know, 15 lines of commands that you want to put in. So that's the introduction.
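As a minimal sketch of the -f convention, the earlier yellow-matching one-liner can be moved into a program file. The name yellow.awk is hypothetical, chosen only for this illustration, and the data file is rebuilt from the description.

```shell
cd "$(mktemp -d)" || exit 1

cat > file1.txt <<'EOF'
name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5
EOF

# yellow.awk is a made-up file name; inside a program file the awk code
# needs no surrounding single quotes, and # starts a comment.
cat > yellow.awk <<'EOF'
# Print the name of every item whose color is yellow.
$2 == "yellow" { print $1 }
EOF

# -f (lowercase) reads the program from a file; -F (uppercase) is the
# field separator option, so the case matters.
from_file=$(awk -f yellow.awk file1.txt)
printf '%s\n' "$from_file"
```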
I'm excited to get into this series with Dave. Hopefully we are able to enlighten some
people and teach some new things, and hopefully I'll learn a couple of new things as we go. I've
already learned this new technique of separating things into individual files based on the
match, so it's pretty cool. I also have a couple of resources that I found
online to help. I don't know if anyone knows about linux.die.net: linux.die.net/man
is like the man page for everything in Linux, so linux.die.net/man/1/awk
is the section 1 man page of awk. Another really cool tutorial, and I'll be doing
some of my examples following it, is from www.theunixschool.com, and then some other ones are
from Tecmint. Upcoming in our series, we will be talking about more of the other options besides
-f (dash lowercase f) and -F (dash capital F). We will also be talking about some of the built-in
variables that are in awk, and we will do some arithmetic operations and some fancy text
manipulation, as much as we can without going into sed, and we'll go over the awk language and
its syntax. Once again, thank you for listening to Hacker Public Radio. This is b-yeezi signing out.
You've been listening to Hacker Public Radio at hackerpublicradio.org. We are a community podcast
network that releases shows every weekday, Monday through Friday. Today's show, like all our shows,
was contributed by an HPR listener like yourself. If you ever thought of recording a podcast,
then click on our contribute link to find out how easy it really is. Hacker Public Radio was founded
by the Digital Dog Pound and the Infonomicon Computer Club, and it's part of the binary revolution
at binrev.com. If you have comments on today's show, please email the host directly, leave a comment
on the website, or record a follow-up episode yourself. Unless otherwise stated, today's show is
released under the Creative Commons Attribution-ShareAlike 3.0 license.