Initial commit: HPR Knowledge Base MCP Server
- MCP server with stdio transport for local use
- Search episodes, transcripts, hosts, and series
- 4,511 episodes with metadata and transcripts
- Data loader with in-memory JSON storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
159
hpr_transcripts/hpr3012.txt
Normal file
@@ -0,0 +1,159 @@
Episode: 3012
Title: HPR3012: Sample episode from Wikipediapodden
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr3012/hpr3012.mp3
Transcribed: 2025-10-24 15:08:34

---

This is Hacker Public Radio, Episode 3012, for Tuesday, 18 February 2020.
Today's show is entitled, Sample Episode from Wikipediapodden.
It is the 170th show of Ken Fallon,
and is about 9 minutes long, and carries an explicit flag. The summary is:
An English episode of their Swedish-language podcast about Wikipedia.
This episode of HPR is brought to you by archive.org.
Support universal access to all knowledge by heading over to archive.org forward slash donate.
Hello, this is Wikipediapodden, with a special episode in English.
We're coming to you from WikiTech Storm 2019 in Amsterdam.
We are doing a couple of special episodes, both about the conference itself,
but also the fantastic people that are here and the projects that they care about.
So tune in and listen to our special episodes.
This is Wikipediapodden, with a special episode from the WikiTech Storm in Amsterdam.
I'm here with Sandra Fauconnier, and you're on the team of the structured data on Commons.
What is your role on that team exactly?
That's a good question. I work with the GLAM team at Wikimedia Foundation,
so the team that works with galleries, libraries, archives, museums,
and GLAM Wiki collaborations and the structured data on Commons project.
I make sure that we are also thinking about GLAMs as users and GLAM Wiki volunteers,
as users of structured data on Commons.
So I do pilot projects, little projects that involve GLAMs with structured data.
And what is structured data on Commons in a nutshell?
If it's possible to tell it in a nutshell?
Well, let me do my best.
Wikimedia Commons, I think many people, when they're Wikimedians, will know it,
because they've uploaded images there for Wikipedia, etc.
It has always been like Wikipedia itself, a Wiki with text.
So you describe files with text.
But people have always been, the problem with that is that a text is, in many cases,
only in one language. It's only in English mostly.
And probably if you're a Swedish speaker and you start searching for files in Swedish,
you will find fewer files than if you would search in English, right?
And for a long time, the community has said,
we want Commons to become multilingual.
And since Wikidata came around, so Wikidata, our, you know,
multimedia, multilingual knowledge base, has come around,
people have been starting to think it would be good to integrate Wikidata
to help describe files on Commons, to make them multilingual among other things.
And that's basically what we're doing in structured data on Commons.
So now, actually, many things are deployed, so it's live.
When you go to a file on Commons, you can, you see a tab that says structured data.
And if you go to that tab, you can describe a file with actually multilingual data,
with data from Wikidata.
So if you have a photo of a table, for instance, you say it depicts table,
and it will use the Wikidata item.
And even if someone, you know, searches Commons in Swahili,
they type the concept of table in Swahili, they will find your photo.
So that's basically what we've done.
We've added support for adding descriptions with Wikidata.
And what parts of this are already live that someone might have come across
that they don't even know of yet?
So if you have uploaded a file recently to Commons,
you will have seen that you have an extra step now in the upload wizard,
where you add structured data, you can add a caption,
which is a multilingual piece of text about that file that will help people to find your file.
And you can add statements, you can add something that is being depicted in a file,
and you can add other bits of information like who made it, etc.
And on the file pages themselves, also under the image,
you will see these two new tabs, file information and structured data,
and that's new stuff that we've added.
So if you've used Commons, you might have seen those things.
Maybe they're a little bit hidden for everyone,
they're not super, super visible, but they're there.
They're quite new, since a few months.
And what are some plans for the future? What's coming?
What's coming? So in fact, structured data on Commons was developed with an external grant,
so with funding that came from the Sloan Foundation, that's an American foundation.
And that grant period is now actually ending at the end of December.
So officially like the development framework is almost ending.
But we're still finishing up some things.
And one of the things we really want to get done,
and that will probably be worked on a bit after December still,
is going to be a query engine, just like, for people who know Wikidata,
being able to ask the complex kind of questions to the data, that you will also be able to do with Commons.
We are finishing actually searching depicts statements in multilingual ways,
that you can do that. When you enter structured data,
you will notice that you cannot see and cannot add dates yet,
or you cannot add geographical coordinates yet.
And so those tweaks will still be added in the upcoming months.
Those are the first things that we will still add in the upcoming period,
and that you will actually quite soon already see.
And are there some wild ideas that have come up during the work with this
that you haven't had the time to implement in this session,
but that you hope that someone will pick up in the future?
Yeah, for instance, we've only had the opportunity to do very basic functionalities,
like on one file page, you can add structured data per file.
But we already see that many community members or some community members
are developing tools to do more batch things.
We really hope that there will be some community members coming up with batch upload tools
that include structured data; that is not there yet.
One thing that we've been thinking about a lot
is it might change the nature of galleries on Wikimedia Commons.
So now they are hand-created.
If you go to, I don't know, Barack Obama as a search term on Commons,
you might find a gallery that is hand-created by someone.
But in the future with structured data,
you could imagine that these galleries are also automatically created.
Another thing that is definitely something that's difficult to develop
because it's super complex,
but it would be a beautiful kind of advanced search for Commons,
that you don't just type, as you currently do, a word
and you only find things that have that word in the description.
But that you would get a search that is a bit like the Google image search
where you get suggestions and you can filter, that would be super awesome to do.
But we've been exploring it a bit in the team,
what it would take to do it.
I'm just saying it's very complicated to do it.
It's not something you would develop in a month or so.
But it's definitely a longer term dream to have much better search on Commons as well.
So yeah.
And if we're thinking about impact to other projects and just searching
and finding things on Commons,
what do you see in the future, for example, for Wikipedia?
For Wikipedia, we think of situations where, for people who write articles,
it should become with structured data a whole lot easier to find appropriate images
for what they are writing about because of the link with Wikidata.
Under the hood actually, when you write a Wikipedia article about a topic
it is connected to a Wikidata item.
And then through that Wikidata item you can then go to Commons
and then find on Commons images that might be of already good quality
because that structured data about that image is already there.
So it might even help people who write Wikipedia.
But we also expect that it will help people who want to develop WordPress plugins
or something like that.
You just type a concept and you get the best images about that concept.
So searching for those kinds of functionalities and external tools should also be a lot easier.
Yeah. Things like that.
It should make it easier and more flexible for people to build tools
and more powerful tools and multilingual tools.
And we hope to see that in the future.
That sounds like a very interesting future.
I hope we get there soon.
Thanks for taking the time and talking to me and to the Wikipediapodden listeners.
Well, thank you and I hope you try it out.
You have just listened to one of our special episodes
from WikiTech Storm 2019 in Amsterdam.
We'll soon be back with more episodes.
You've been listening to Hacker Public Radio at HackerPublicRadio.org.
We are a community podcast network that releases shows every weekday, Monday through Friday.
Today's show, like all our shows, was contributed by an HPR listener like yourself.
If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is.
Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club,
and it's part of the binary revolution at binrev.com.
If you have comments on today's show, please email the host directly,
leave a comment on the website or record a follow-up episode yourself.
Unless otherwise stated, today's show is released under the Creative Commons
Attribution-ShareAlike 3.0 license.