An MCP (Model Context Protocol) server providing access to the Hacker Public Radio (HPR) knowledge base, including episodes, transcripts, hosts, series, and community comments.
## About HPR
Hacker Public Radio is a community-driven podcast where hosts contribute content on topics of interest to hackers. All content is released under Creative Commons licenses, making it freely available for learning and sharing.
## Features
This MCP server provides:
- **Episode Search**: Search through thousands of HPR episodes by title, summary, tags, or host notes
Any MCP-compatible client can connect to this server via stdio. The server will load all HPR data on startup and make it available through tools and resources.
## Available Tools
### 1. `search_episodes`
Search for episodes by keywords in title, summary, tags, or notes.
**Parameters:**
- `query` (string): Search query
- `limit` (number, optional): Maximum results (default: 20)
- `hostId` (number, optional): Filter by specific host
- `seriesId` (number, optional): Filter by specific series
- `tag` (string, optional): Filter by tag
- `fromDate` (string, optional): Filter from date (YYYY-MM-DD)
- `toDate` (string, optional): Filter to date (YYYY-MM-DD)
**Example:**
```
Search for episodes about "linux kernel" from 2020 onwards
```
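At the protocol level, a client turns a request like the one above into an MCP `tools/call` message. The following is a minimal sketch of what that message could look like, assuming the parameter names listed above. `buildToolCall` and the two interfaces are hypothetical names used for illustration and are not part of this server; treating `query` as required is also an assumption, based on it not being marked optional.

```typescript
// Argument shape for `search_episodes`, transcribed from the parameter list above.
// Assumption: `query` is required because it is not marked optional.
interface SearchEpisodesArgs {
  query: string;
  limit?: number;    // default: 20
  hostId?: number;
  seriesId?: number;
  tag?: string;
  fromDate?: string; // YYYY-MM-DD
  toDate?: string;   // YYYY-MM-DD
}

// JSON-RPC 2.0 envelope that MCP uses for tool invocations.
interface ToolCallRequest<A> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: A };
}

// Hypothetical helper: wraps tool arguments in a tools/call request.
function buildToolCall<A>(id: number, name: string, args: A): ToolCallRequest<A> {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// "linux kernel" episodes from 2020 onwards, capped at 10 results.
const request = buildToolCall<SearchEpisodesArgs>(1, "search_episodes", {
  query: "linux kernel",
  fromDate: "2020-01-01",
  limit: 10,
});

console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library builds this envelope for you; the sketch only shows how the parameters map onto the request.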
### 2. `get_episode`
Get detailed information about a specific episode.
**Parameters:**
- `episodeId` (number, required): Episode ID
- `includeTranscript` (boolean, optional): Include transcript (default: true)
- `includeComments` (boolean, optional): Include comments (default: true)
**Example:**
```
Get details for episode 16 including transcript and comments
```
- `query` (string, optional): Phrase to search for. Useful for exact-phrase lookups.
- `terms` (string[], optional): Explicit list of terms to search for; combine with `matchMode` for logical AND/OR searches.
- `matchMode` (`'phrase' | 'any' | 'all'`, optional): How to combine `query`/`terms`. Defaults to `'phrase'`. Use `'any'` to match if any term is present, `'all'` to require every term somewhere in the transcript.
- `limit` (number, optional): Maximum episodes to return (default: 20).
- `contextLines` (number, optional): Lines of context to include around each match (default: 3).
- `hostId` (number, optional): Only return matches for this host ID.
- `hostName` (string, optional): Only return matches for hosts whose name includes this value.
- `caseSensitive` (boolean, optional): Treat terms as case-sensitive (default: false).
- `wholeWord` (boolean, optional): Match whole words only (default: false).
- `maxMatchesPerEpisode` (number, optional): Maximum number of excerpts per episode (default: 5).
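To make the interaction between `matchMode`, `caseSensitive`, and `wholeWord` concrete, here is a rough sketch of how such matching could be implemented with regular expressions. It is not the server's actual code: `transcriptMatches`, `toPattern`, and `escapeRegExp` are hypothetical helpers, and how `query` and `terms` are merged in each mode is an assumption. Filtering, limits, and context extraction are left out.

```typescript
type MatchMode = "phrase" | "any" | "all";

interface MatchOptions {
  query?: string;
  terms?: string[];
  matchMode?: MatchMode;   // default: 'phrase'
  caseSensitive?: boolean; // default: false
  wholeWord?: boolean;     // default: false
}

// Escape regex metacharacters so user input is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one pattern per term, honouring wholeWord and caseSensitive.
function toPattern(term: string, opts: MatchOptions): RegExp {
  const body = escapeRegExp(term);
  const source = opts.wholeWord ? `\\b${body}\\b` : body;
  return new RegExp(source, opts.caseSensitive ? "" : "i");
}

function transcriptMatches(transcript: string, opts: MatchOptions): boolean {
  const mode = opts.matchMode ?? "phrase";
  if (mode === "phrase") {
    // Assumption: phrase mode matches `query` as one literal phrase,
    // falling back to the terms joined by spaces.
    const phrase = opts.query ?? (opts.terms ?? []).join(" ");
    return phrase.length > 0 && toPattern(phrase, opts).test(transcript);
  }
  // Assumption: in 'any'/'all' mode, `query` counts as one more term.
  const terms = [...(opts.terms ?? []), ...(opts.query ? [opts.query] : [])];
  if (terms.length === 0) return false;
  const hits = terms.map((term) => toPattern(term, opts).test(transcript));
  return mode === "any" ? hits.some(Boolean) : hits.every(Boolean);
}

const sample = "Today I compare Bash scripting with Python for sysadmin work.";
console.log(transcriptMatches(sample, { terms: ["bash", "python"], matchMode: "all" })); // true
console.log(transcriptMatches(sample, { terms: ["bash", "perl"], matchMode: "all" }));   // false
console.log(transcriptMatches(sample, { terms: ["bash", "perl"], matchMode: "any" }));   // true
console.log(transcriptMatches(sample, { query: "script", wholeWord: true }));            // false
```

The last call returns `false` because `wholeWord` requires a word boundary on both sides, and "script" only occurs inside "scripting".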
**Example queries:**
```
Find transcripts mentioning "virtual machine"
```
```
Find transcripts where klaatu talks about bash or python
```
- Query: `"pythoon"` → Finds episodes with **python** in the title *(fuzzy match, distance: 1)*
- Query: `"linx"` → Finds episodes with **linux** *(may match exactly in summary/tags, or fuzzy in title)*
### Distance Thresholds
- **Hosts**: Maximum distance of 2 characters (handles 1-2 typos)
- **Episodes**: Maximum distance of 3 characters (more lenient for longer titles)
### What the AI Agent Sees
When fuzzy matching is used, results include:
- `matchType: 'exact'` or `matchType: 'fuzzy'`
- `matchDistance: N` (for fuzzy matches, indicating how many character edits were needed)
This allows AI agents to provide context to users, such as: *"I found results for 'klaatu' (you typed 'klattu')"*
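As an illustration, a client or agent could turn these fields into a user-facing note roughly as follows. This is a sketch only: `MatchInfo` models just the two fields described above (the real result objects presumably carry more), and `describeMatch` is a hypothetical helper, not something the server provides.

```typescript
// Assumed shape of the match metadata described above.
interface MatchInfo {
  matchType: "exact" | "fuzzy";
  matchDistance?: number; // expected on fuzzy matches only
}

// Hypothetical helper: explain to the user what was actually matched.
function describeMatch(typed: string, found: string, info: MatchInfo): string {
  if (info.matchType === "exact") {
    return `I found results for '${found}'`;
  }
  return `I found results for '${found}' (you typed '${typed}')`;
}

console.log(describeMatch("klattu", "klaatu", { matchType: "fuzzy", matchDistance: 1 }));
```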
### Technical Details
The fuzzy matching uses the **Levenshtein distance algorithm**, which counts the minimum number of single-character edits (insertions, deletions, substitutions) needed to change one string into another.
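For reference, a compact Levenshtein implementation with the thresholds from the section above might look like the sketch below. It is not taken from the server's source: `levenshtein`, `MAX_DISTANCE`, and `withinThreshold` are illustrative names, and comparing whole lowercased strings is an assumption (the server may compare individual words or normalise input differently).

```typescript
// Classic dynamic-programming Levenshtein distance, keeping only two rows.
function levenshtein(a: string, b: string): number {
  // prev[j] = distance between the first i-1 chars of `a` and the first j chars of `b`
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,       // deletion
        curr[j - 1] + 1,   // insertion
        prev[j - 1] + cost // substitution (or match when cost is 0)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Thresholds as documented above: 2 for hosts, 3 for episodes.
const MAX_DISTANCE = { host: 2, episode: 3 } as const;

// Assumption: comparison is case-insensitive and applied to the whole string.
function withinThreshold(typed: string, candidate: string, kind: keyof typeof MAX_DISTANCE): boolean {
  return levenshtein(typed.toLowerCase(), candidate.toLowerCase()) <= MAX_DISTANCE[kind];
}

console.log(levenshtein("pythoon", "python"));          // 1
console.log(levenshtein("klattu", "klaatu"));           // 1
console.log(withinThreshold("klattu", "klaatu", "host")); // true
```

Keeping only the previous row holds memory at O(length of `b`), which matters little for short names and titles but costs nothing in clarity.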
**Note**: Transcript search uses regex-based matching and does not use fuzzy matching, as the flexible regex patterns already handle many variations.