Episode: 1099
Title: HPR1099: compilers part 2
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr1099/hpr1099.mp3
Transcribed: 2025-10-17 18:54:29
---
Hello, everybody.
My name is Sig Flup and welcome to Miscellaneous Radio Theatre, 4,096.
In this episode we're going to talk again about compilers.
This is a series where we talk about compilers.
Last time we described the stages of a compiler and, basically, what high level and low level
mean.
This time we're going to talk about at least one of the stages, we're going to talk about
parsing, which is the second stage of compilation.
The first stage is lexical analysis, but it's important that we talk about parsing first.
I'll explain why in a bit.
So the big picture here: imagine a circle with finite state machine written
in the middle.
That's not what I'm going to talk about here, but it's important to realize that it's
within an even bigger circle with context-free languages written in it.
That's where we're going to start.
Now there's an even bigger circle around context-free languages, called Turing machine languages.
This is a structure of machine complexity, and this structure is how the information about
a program flows.
It flows first through a lexical analyzer, which is a finite state machine, and then flows
through a parsing stage, which is a pushdown automaton, which is just a fancy
term for the mechanism that implements context-free languages.
Then it flows into Turing machine territory with code generation.
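As a rough sketch of that picture (my drawing, paraphrasing the episode):

    +---------------------------------------------------+
    | Turing machine languages   (code generation)      |
    |   +-------------------------------------------+   |
    |   | context-free languages   (parser / PDA)   |   |
    |   |   +-----------------------------------+   |   |
    |   |   | finite state machines   (lexer)   |   |   |
    |   |   +-----------------------------------+   |   |
    |   +-------------------------------------------+   |
    +---------------------------------------------------+

    source text -> lexer (FSM) -> parser (PDA) -> code generation (Turing machine)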
The reason why I'm not going to start with describing lexical analysis is because any
circle in this model that we're imagining can emulate circles within it.
So you could implement a lexical analyzer within a parser.
Typically we don't do that, but you could.
But let's talk about parsing.
A context-free grammar is a grammar where every production rule takes the form of a single
non-terminal producing a string of terminals and/or non-terminals.
Now you're thinking: Sig Flup, what's a production rule, what's a terminal, what's a non-terminal?
Let's relate it to English.
A non-terminal is something like a paragraph.
A paragraph produces more non-terminals, called sentences.
Sentences produce even more non-terminals: things like verbs, subjects, and nouns.
Verbs, subjects, and nouns produce words; words produce letters; and letters are our terminals.
So the flow of production in a grammar is the flow from non-terminals
to terminals.
This flow is called the parse tree, where terminals are the frontier.
Let's take a simple grammar:
S produces paren S paren, or S plus S, or S times S, or S divided by S, or a, b, or c.
This is a simple algebra grammar where we have terminals of a, b, and c, and also plus, times,
divide, and the parentheses.
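Written out in a BNF-like notation (my notation, not the episode's), that grammar is:

    S -> ( S )
    S -> S + S
    S -> S * S
    S -> S / S
    S -> a | b | c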
So let's say you're compiling source code that looks like this:
paren a plus b paren, times b, divided by c.
The top of the parse tree for this is going
to be S, because that's the start symbol of our grammar. S in our case produces S times
S; S times S produces paren S paren on one branch and S divided by S on the other;
paren S paren produces paren S plus S paren; and so on.
All these productions in the end produce a tree that describes our input, paren a plus
b paren times b divided by c, within its grammar constraints.
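Drawn out, that parse tree for ( a + b ) * b / c looks roughly like this, with the
terminals along the frontier:

    S
    |- S             <- left operand of *
    |   |- (
    |   |- S
    |   |   |- S -> a
    |   |   |- +
    |   |   |- S -> b
    |   |- )
    |- *
    |- S             <- right operand of *
        |- S -> b
        |- /
        |- S -> c

Reading the frontier left to right gives back ( a + b ) * b / c.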
The idea of a parser is validating that a particular input stream matches a grammar.
Now, we could extend this grammar all the way from having file as the starting symbol down
to having bytes as the terminals, but we don't extend it that far in a compiler.
In a compiler, the grammar is extended from file as the starting symbol down to what are
called tokens.
Token production is what lexical analysis does.
A token is a stream of terminals that can be assumed to be one terminal by the parser.
Like the number 1337, it's made up of four terminals, but it can be summed up with one
token, and that one token can be taken as one terminal by the parser.
Another token might be a string, for instance.
A file, say a source code file for a very simple language, produces declarations and functions.
Declarations produce a symbol token, or a symbol token plus an equals sign plus a constant,
and so on.
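As a sketch of what that token production can look like, here is a minimal, hypothetical
lexer (the names and token set are mine, not the episode's):

    import re

    # Each token class is a name plus a pattern; regular expressions are
    # the finite-state-machine circle from earlier.
    TOKEN_SPEC = [
        ("NUMBER", r"\d+"),           # e.g. 1337: four terminals, one token
        ("SYMBOL", r"[A-Za-z_]\w*"),
        ("EQUALS", r"="),
        ("OP",     r"[+*/()]"),
        ("SKIP",   r"\s+"),           # whitespace is skipped, not emitted
    ]
    TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

    def tokenize(text):
        """Turn a stream of characters (terminals) into a stream of tokens."""
        for match in TOKEN_RE.finditer(text):
            if match.lastgroup != "SKIP":
                yield (match.lastgroup, match.group())

    # list(tokenize("x = 1337")) == [("SYMBOL", "x"), ("EQUALS", "="), ("NUMBER", "1337")]

The parser then treats each token, like that NUMBER, as a single terminal.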
When we talk about input streams, we don't have to produce the entire parse tree in order
to do something.
There's work to be done when we produce sections of the parse tree.
For instance, in a function part of the parse tree, there might be a line of code section.
We match a line of code, we do something, we match a function, we do something.
And we match a declaration, we do something, and so on.
So how does matching an input stream to a parse tree actually work?
Well, it can be done with something called shift-reduce parsing.
This is a bottom-up parser, where we deal with tokens bottom-up instead of productions top-down.
You can also call a shift-reduce parser a left-to-right parser.
This is where we keep a marker on the input stream, and we reduce what's on the left and shift
the marker.
So our marker starts on the left of our input stream.
It's shifted, and some reduction is done to the left side.
It's shifted once again, and some reduction is done to the left side.
So the left side has non-terminals and/or terminals,
and the right side just has terminals.
Say we have a parser state that is A, B, C, marker, X, Y, Z.
We first shift, so we have A, B, C, X, marker, Y, Z.
Now let's assume that in our grammar we have a production rule: D produces C, X.
We then can reduce to A, B, D, marker, Y, Z.
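Here is a minimal sketch of that shift-reduce loop (hypothetical code; a real parser drives
this from generated parse tables rather than scanning rules, and would also build tree nodes
as it reduces):

    # Grammar with a single rule, D -> C X, stored as {right-hand side: left-hand side}.
    RULES = {("C", "X"): "D"}

    def shift_reduce(tokens):
        """Shift tokens onto a stack; reduce whenever a rule's right-hand
        side matches the top of the stack (an inverse production)."""
        stack = []                   # everything left of the marker
        remaining = list(tokens)     # everything right of the marker: terminals only
        while True:
            # Reduce as long as some rule's right-hand side matches the stack top.
            reduced = True
            while reduced:
                reduced = False
                for rhs, lhs in RULES.items():
                    n = len(rhs)
                    if len(stack) >= n and tuple(stack[-n:]) == rhs:
                        stack[-n:] = [lhs]   # replace the matched suffix
                        reduced = True
            if not remaining:
                break
            stack.append(remaining.pop(0))   # shift: move the marker one token right
        return stack

    # shift_reduce(["A", "B", "C", "X", "Y", "Z"]) returns ["A", "B", "D", "Y", "Z"]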
Reduction is the application of what's called an inverse production.
Every time we do a reduction, we move the state higher in the tree we're making until
we finally reach the top.
Since we have a single state during the production of every bit of the tree, we can check the left
side of the state against the production rules, and when it matches a certain production, say
file produces declaration, we can do something, bottom-up.
So what we do at every stage is build dangling nodes of what's called a syntax tree.
We stitch the nodes together as we move up the tree.
So our syntax tree of an entire program might look like this.
Imagine, if you will: file at the top, two declarations on the left, one function on the right;
from the function, two more declarations on the left and lines of code on the right.
That's what we stitched.
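Sketched out, that stitched syntax tree looks something like this:

    file
    |- declaration
    |- declaration
    |- function
        |- declaration
        |- declaration
        |- line of code
        |- line of code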
Performing all these actions, performing shift-reduce parsing, building parse trees, performing
actions at every matching part of the parse tree, we stitch together another tree called
a syntax tree, and that's how a syntax tree is made.
And that is parsing.
And that's the end of this episode.
Thank you for listening.
And I look forward to recording another.
Take care everyone.
Bye-bye.
You have been listening to Hacker Public Radio at HackerPublicRadio.org.
We are a community podcast network that releases shows every weekday, Monday through Friday.
Today's show, like all our shows, was contributed by an HPR listener like yourself.
If you ever consider recording a podcast, then visit our website to find out how easy
it really is.
Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer
Club.
HPR is funded by the Binary Revolution at binrev.com. All binrev projects are crowd-sponsored
by Lunar Pages.
From shared hosting to custom private clouds, go to lunarpages.com for all your hosting
needs.
Unless otherwise stated, today's show is released under a Creative Commons Attribution-ShareAlike
3.0 license.