Initial commit: HPR Knowledge Base MCP Server

- MCP server with stdio transport for local use
- Search episodes, transcripts, hosts, and series
- 4,511 episodes with metadata and transcripts
- Data loader with in-memory JSON storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Lee Hanken
2025-10-26 10:54:13 +00:00
commit 7c8efd2228
4494 changed files with 1705541 additions and 0 deletions

hpr_transcripts/hpr3911.txt Normal file

@@ -0,0 +1,233 @@
Episode: 3911
Title: HPR3911: An overview of the 'ack' command
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr3911/hpr3911.mp3
Transcribed: 2025-10-25 07:51:42
---
This is Hacker Public Radio Episode 3,911 from Monday 31 July 2023.
Today's show is entitled, An Overview of the 'ack' Command.
It is part of the series Lightweight Apps.
It is the 150th show of Dave Morris, and is about 21 minutes long.
It carries an explicit flag.
The summary is: a Perl-based grep-like tool that can search by file type.
Hello everyone, welcome to Hacker Public Radio.
My name is Dave Morris, and I'm talking today about a command, which I have put into the
category or the series of Lightweight Apps.
So what this is, is a Perl-based tool that behaves like grep.
It actually uses the name 'Beyond grep' on the website, which I'll refer to in the notes.
So this tool is called ack, not quite sure why, but anyway, it's a good thing.
And it's got three main features that I use.
I don't use it a huge lot.
It's not a thing I use for every search of a file, but it's great for certain things.
First of all, it can restrict the searches to files of a particular type.
So there's a way of classifying files in terms of type, which I'll talk about in a minute,
and it will search only those.
The regular expressions that it uses, I mean grep will handle plain text stuff
and regular expressions of various sorts, including a Perl one.
But this one is only Perl that it uses.
I think it's actually possible to simplify it, but the default is Perl anyway.
Perl, of course, has one of the most powerful, feature-rich types of regular expressions.
So that's, to me, that's fantastic.
And it's got features like you can limit the search area within a file if you want to.
I'm not going to go into that.
There's a lot to say here if I was to dig deeply into all aspects of this command.
And I'm not going to do it because I don't want to bore you to death.
It's almost a series in itself if I were to go into that level of detail.
It's fantastic, I believe, but it's a little complex to use.
And I use it mainly in special cases where I need the features I've just mentioned.
So I'll just give you a flavor of what it can do and leave you to research it more if it sounds interesting.
So you can install it in the usual sorts of ways.
I actually installed it as a package.
I use Debian, so I used apt to do it: sudo apt install ack.
And that's fine.
Only, as I was preparing the show, I noticed that I was using version 3.6.0.
And that's a little bit behind.
And there's a new version, 3.7.0, which you can get details about from the website.
It suggests that you might want to install it as a Perl module using CPAN.
But if you're not a Perl user, these are things you might not want to take on board.
Let's just talk briefly about Perl regular expressions.
As I said, they're very sophisticated and have grown tremendously over the years.
I think Perl was a groundbreaker in terms of regular expressions, because a lot of other regular expression engines follow Perl's lead.
I've used Perl for quite a long time now, and the power of it is quite spectacular.
What happened during the last number of years was that there came to be a thing called the Perl Compatible Regular Expressions library.
This was put together by a guy called Philip Hazel from Cambridge University. He did this in 1997.
I was particularly interested in this because, where I was working, we ran Exim, the mail transfer agent.
And PCRE, Perl Compatible Regular Expressions, was implemented within it.
And Philip Hazel was the author of Exim. So he did this.
And PCRE is available in quite a lot of other areas.
I think at one point Python used it. I'm not sure it does now, but anyway.
But since the original PCRE, there's come a version called PCRE2. So he's still developing it.
And it is very widely used.
The ack documentation refers you to various Perl manuals for details of how to use the regular expression syntax.
Although you can use it in a simplistic way, which is what I do mostly.
I don't use the full power of it, but it is quite nice to be able to get into some of the Perl stuff.
But there's plenty of documentation if you're interested.
In fact, GNU grep, which would be the one that most people will have on a Linux system,
can use Perl-compatible regular expressions when doing its matching.
But I can't remember what the option is, minus capital P, is it?
Not sure. But if you look at the documentation for GNU grep, it says this is experimental.
I've not really gotten into using that in a big way.
So let's talk about the file type issue I mentioned before.
So this ack command has got rules for recognizing file types.
And it does this by looking at the name extensions, fairly obviously.
So .html or .py.
It can also look at the file's contents, at the first line,
which has got, what do they call that?
Some sort of magic thingy that determines what the file is.
I've completely forgotten the terminology.
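
The two kinds of rule described here, by name extension and by first line, can be sketched in Python. This is only an illustration of the idea, not ack's implementation: the type table is a hypothetical, much-reduced one, and the function name `file_types` is made up for the example.

```python
import re
from pathlib import Path

# Hypothetical, much-reduced type table; ack's real rules are more extensive.
EXTENSIONS = {
    "python": {".py"},
    "shell": {".sh", ".bash"},
    "html": {".html", ".htm"},
}

# Patterns tried against the first line of a file (the "#!" line).
FIRST_LINE = {
    "python": re.compile(r"^#!.*\bpython"),
    "shell": re.compile(r"^#!.*\b(?:ba|k|z)?sh\b"),
}


def file_types(name: str, first_line: str = "") -> set:
    """Return every type whose extension rule or first-line rule matches."""
    ext = Path(name).suffix.lower()
    found = {t for t, exts in EXTENSIONS.items() if ext in exts}
    found |= {t for t, rx in FIRST_LINE.items() if rx.search(first_line)}
    return found


print(file_types("index.html"))             # matched by extension
print(file_types("backup", "#!/bin/bash"))  # no extension, matched by first line
```

A file with no extension can still be classified as long as its first line names an interpreter, which is why a script called simply `backup` is picked up as a shell file in this sketch.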
You can find out what ack knows about in terms of types by giving it the option
hyphen hyphen help hyphen types.
Or you can use ack space hyphen hyphen dump.
Some of the examples are cc, that's a type and that's for C files,
haskell for Haskell files, lua for Lua files, python for Python files,
shell for Bash and other shell command files.
And these names can be used with the options
hyphen and lower case t, followed by the type name,
or hyphen hyphen type equals the type name.
And also by preceding the type name with two dashes, if you wish.
I think that particular way of doing it might not be available in the future
because it says it's deprecated somewhere. That's what I use.
You can also say files not of this type by using a hyphen capital T
instead of a lower case t, or hyphen hyphen type equals no, followed by the type string.
Anyway, I've not used that much myself.
I don't usually want to look at files which are not shell scripts,
because that'll be everything else and I don't want to.
But it's good that you have it.
So I've got a little example here to check files in the current directory,
which are Shell scripts.
Then you might want to do this with the command ack, space,
hyphen hyphen shell, space, and then the search string, which in my case is declare.
And I'm actually in the directory where I prepare my HPR shows.
And it finds one entitled Bash snippet, using coproc with SQLite.
And it finds the shell script there and shows me that, on line 11,
there is the occurrence of the word declare.
And it's quite nice that it, by default, gives you the line number.
It also does coloring of these things when you're searching.
All of which you can turn off and mess around with to enormous degrees.
You can add your own file types to ack.
And there's a configuration file called .ackrc, which you can add more types to, and I'll talk about that next.
So that configuration file is .ackrc and it contains, as the manual says,
command line options that are prepended to the command line before processing.
So it's a useful way to add new types or modify existing ones.
There are a number of places it can be placed and the documentation will tell you that.
But I put mine in my home directory where I keep all my other configuration files.
It can be in other, more regular places, in the config directory or whatever.
You can actually create a new .ackrc with the option hyphen hyphen create hyphen ackrc.
So it will write out an example series of settings on standard out and you can just redirect it to the file.
That's all the defaults that are built into the script.
But it means that you've got them somewhere where you can change them if you wish.
Now I have a lot of markdown files in the directory where I do all my HPR talks.
I write everything in markdown.
And for some reason, I'm not sure why I did this.
I originally gave them an extension of dot mkd.
I must have seen somebody else do that or just seemed like the right thing and I wasn't sure what was better.
But ack recognizes dot md and dot markdown as signalling a Markdown file.
So I wanted to add dot mkd to the list and it was pretty simple to do.
There's two commands you can use within your ackrc.
There's one which is dash dash type dash add equals markdown colon ext colon mkd.
So that appends the particular extension to the existing list.
Or you can use dash dash type dash set equals markdown colon ext colon md comma mkd comma markdown.
What that does is to replace the existing settings.
That's why it uses set as opposed to add after type.
And you just give it the list, the currently existing list plus whatever else you've added to it.
You can put comments in the file too.
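
The difference between the add form and the set form can be sketched in Python. This is an illustration of the semantics described above, not how ack parses its options: the helper `apply_option` and the starting table are assumptions made for the example, and only the `ext` kind of rule is handled.

```python
def apply_option(table: dict, option: str) -> None:
    """Apply one '--type-add=...' or '--type-set=...' line to a type table.

    Hypothetical helper: it only understands 'name:ext:a,b,c' specifications.
    """
    flag, spec = option.split("=", 1)
    name, kind, values = spec.split(":", 2)
    if kind != "ext":
        raise ValueError(f"only 'ext' rules are handled in this sketch: {kind}")
    exts = values.split(",")
    if flag == "--type-set":
        # set: replace whatever was there before
        table[name] = exts
    elif flag == "--type-add":
        # add: keep the existing list and append anything new
        current = table.setdefault(name, [])
        for ext in exts:
            if ext not in current:
                current.append(ext)
    else:
        raise ValueError(f"unknown option: {flag}")


# Starting point as described in the talk: md and markdown are already known.
types = {"markdown": ["md", "markdown"]}

apply_option(types, "--type-add=markdown:ext:mkd")
print(types["markdown"])

apply_option(types, "--type-set=markdown:ext:md,mkd,markdown")
print(types["markdown"])
```

With the add form only the new extension has to be named; with the set form the whole list has to be repeated, because anything left out is dropped.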
So if you then dump the settings, I've done this.
You can see that markdown is listed with dot md, dot markdown and dot mkd as detectable extensions.
So if I do a search of the Markdown files in the directory I'm in, where my various HPR shows live,
I keep them all around forever:
ack, space, dash dash markdown, space, then in quotes, inner, all in lower case, space, ear.
I get back one file match, and then I get a bunch of lines that contain the string inner ear.
Now there are a lot of options to ack. The general usage pattern for the command is that you type ack followed by a list of options,
followed by a pattern, which is the thing you're matching,
usually in files, but I'll come onto that more in a moment,
and followed by an optional list of files or directories.
If you don't give a list of files or directories then it will look at the current directory and will recurse down into subdirectories.
And the pattern that I mentioned is the PCRE search string which is usually enclosed in single quotes so it doesn't get interpreted by the shell.
There are some cases where you don't use a pattern but we'll look at that briefly in a moment.
You can look at the full documentation for ack with the usual man ack command.
And the alternative to doing that is to use ack itself to report its man page, which is ack space dash dash man.
There is also an option dash dash help which gives a summary of all the available options, which I actually find more useful.
Because it's usually options I'm trying to remember, and it's easier to scan through that than the full documentation.
I've just got a few options to refer to here.
There's quite a lot that are specific to ack and, since it's meant to be a grep stand-in, there are some which are common to grep.
But I won't look at too many.
Well, the first one is one that you do find in grep, which is dash i, and that makes the pattern matching case insensitive.
It's possible to do that within the Perl expression, where you can say
I don't care about the case when matching, but it's a lot easier to use it as an option, I find.
Then we get to dash f which is about searching for files by name.
So it only prints the files that would be searched.
It comes back with the list of file names doesn't do any searching but it's useful for finding things within a directory which are of a particular type or match a pattern or whatever.
Dash g is the same as dash f, but in this case you use a pattern and you look for files whose names match the pattern, not the contents.
But that overlaps what you get with the type business so it's useful in some cases but can be a little bit misleading.
Then we get dash lower case l, which reports the file names which contain matches for a given pattern.
So it's not actually showing you the matches but it's showing you the files which would match, and dash capital L reports file names which do not match the pattern.
Then you've got dash c, which you also get in grep, which reports file names and the numbers of matches when you use it.
So it actually reports all files that match whatever it is you're matching against in terms of file types or names.
It reports them all and it gives you counts of zero if there are no matches.
Which I find a bit of a pain, not that useful, but if you use it with hyphen lower case l then you only see the names of files that have matches and a count of the matches.
So I've just used it yesterday looking through a bunch of files to see if any of them had a particular string in them because I needed to edit it because it was a grammatical error.
So using hyphen c and hyphen l was a great way to do it.
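
The effect of combining the count option with the list option can be sketched in Python. This is a sketch under assumptions rather than ack's code: it counts matching lines per file (which is my understanding of what the count reports), the file names and contents are hypothetical and held in memory, and `count_matching_lines` is a made-up name.

```python
import re


def count_matching_lines(files, pattern, only_matching=False, ignore_case=False):
    """Count, per file, the lines on which the pattern matches.

    With only_matching=True, files with a count of zero are dropped,
    which is the behaviour described for -c combined with -l.
    """
    rx = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    counts = {
        name: sum(1 for line in text.splitlines() if rx.search(line))
        for name, text in files.items()
    }
    if only_matching:
        counts = {name: n for name, n in counts.items() if n > 0}
    return counts


# Hypothetical file names and contents for the sketch.
files = {
    "a.mkd": "teh cat\nsat on teh mat\n",
    "b.mkd": "nothing wrong here\n",
}

# Like -c on its own: every file is listed, including zero counts.
print(count_matching_lines(files, r"teh"))

# Like -lc: only the files that actually contain the string.
for name, n in count_matching_lines(files, r"teh", only_matching=True).items():
    print(f"{name}:{n}")
```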
And then we have dash w, which forces the search pattern to match only whole words.
So a lot of times there is no way of doing that in grep that I know of.
Maybe I'm wrong actually, I think some of the regular expression capabilities of grep allow you to do that.
But it's sometimes useful to be able to say, look, that sequence of characters I've just given you that I want you to look for is a word, or are words, as opposed to just being A B C anywhere in any text.
So there's actually a lot of power in this, possibly too much.
I don't use it all by any means.
So I've got a few examples.
The first example is looking for all markdown files in a directory.
So the first option was to use the dash f option.
So I typed ack, space, dash dash markdown, space, dash f, space, and then the name of a subdirectory, Nitecore Tube torch.
That's a show I did some time ago.
And it comes back with a list of names which are the Markdown files within that directory.
They're all MKD files.
Now there are many other ways you could do that.
You could use the find command to find them.
That's that would have been what I would have used in the past, but I find ack just does a nicer job.
I give an alternative here using the dash g option.
So ack space dash g and then we're using a pattern.
The pattern is open quote, backslash dot, which matches an actual dot, MKD, dollar,
close quote, that's saying any file name which ends with dot MKD, and then the name of the directory, Nitecore Tube torch.
And it comes back with the same file names, achieved by a different method.
I think I would use the former in pretty much all cases.
But if you don't have a type that you can use to do the search then the hyphen g thing is an alternative.
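
The file name pattern read out above, a literal dot followed by mkd at the end of the name, can be tried in Python, whose regular expression syntax agrees with Perl's for this simple case. The file names are hypothetical and `names_matching` is a made-up helper; this is a sketch of the idea of matching names rather than contents, not ack itself.

```python
import re


def names_matching(names, pattern):
    """Return the file names (not contents) that the pattern matches."""
    rx = re.compile(pattern)
    return [name for name in names if rx.search(name)]


# Hypothetical directory listing.
names = [
    "torch/notes.mkd",
    "torch/photo.jpg",
    "torch/mkd_ideas.txt",
    "torch/old.mkd.bak",
]

# The backslash makes the dot literal and the dollar anchors the match
# to the end of the name, so only names ending in ".mkd" come back.
print(names_matching(names, r"\.mkd$"))
```

Without the dollar, `old.mkd.bak` would also be returned, which is one way that matching on names can be a little misleading compared with using a type.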
So what about finding the names of files listing the names of files that contain a match with some string and the number of matches per file.
So this one is the ack command followed by dash dash markdown.
So looking at Markdown files again.
This time I'm using dash lci.
So I've concatenated three of the options together, which you can do with single character options.
And by the way, these options, not in all cases but in many cases, have a single character version
and a double hyphen followed by a long version.
But I'm using the short versions here for demonstration purposes.
So lci means use the options dash l, dash c and dash i.
Now my match string, my pattern, is
open quote, backslash b, E A R, backslash b, close quote.
Now that's using one of the Perl regular expression capabilities which is to denote a boundary.
And in this case it's a word boundary.
So the backslash b can be an opening boundary or a closing one.
So it's saying look for ear, E A R, which is a standalone word.
Now there are other ways of doing that.
You can do something like that in grep.
But the regular expression syntax varies in the way that this word boundary thing is done.
And it gets a little bit messy.
But the point is that it's separating out the sequence E A R from, for example, the word pearl, which has E A R in it.
But you don't want pearl to be returned because it's not the word.
I mean it is a word, but it hasn't got the word ear in it.
So for example three we're just looking for words in a simpler way.
Similar to example two where we used the backslash b boundaries.
You can achieve this alternatively by making the pattern simpler, just the word ear, E A R, and preceding it with dash w.
Which as I mentioned before says treat this as a word.
Or words I think, but I might have to experiment with that.
But in both of these cases what you get back is a list of file names.
And each file name is followed by colon and a number which tells you how many matches for that word exists in the file.
And it's the same in both cases of course in example two and three.
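
The equivalence of examples two and three can be checked in Python, whose `re` module uses the same backslash b word boundary as Perl. The sample lines are made up, and `as_word` is a hypothetical helper that only approximates what a whole-word option does by wrapping the pattern in boundaries.

```python
import re


def as_word(pattern: str) -> str:
    """Wrap a pattern in word boundaries, roughly what a whole-word option asks for."""
    return rf"\b(?:{pattern})\b"


# Made-up sample lines.
lines = ["a pearl earring", "my ear hurts", "EAR, nose and throat"]

explicit = re.compile(r"\bear\b", re.IGNORECASE)    # example two: boundaries in the pattern
wrapped = re.compile(as_word("ear"), re.IGNORECASE)  # example three: plain word, wrapped

print([line for line in lines if explicit.search(line)])
print([line for line in lines if wrapped.search(line)])
```

Both lists contain the same two lines. The first sample line is skipped because in pearl and in earring the letters E A R are attached to other letters, so there is no boundary on at least one side.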
So I thought I'd better stop with examples here because it's getting a bit too far, a bit too long otherwise.
So that's it then. There's some references to the Perl reference manuals and tutorials and stuff.
And the site where ack can be found and details about it.
And that's the end and I hope you found that useful.
Okay then, bye.
You have been listening to Hacker Public Radio at HackerPublicRadio.org.
Today's show was contributed by an HPR listener like yourself.
If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is.
Hosting for HPR has been kindly provided by AnHonestHost.com, the Internet Archive and rsync.net.
Unless otherwise stated, today's show is released under a Creative Commons Attribution 4.0 International License.