Episode: 3035
Title: HPR3035: Decentralised Hashtag Search and Subscription in Federated Social Networks
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr3035/hpr3035.mp3
Transcribed: 2025-10-24 15:29:50
---
This is Hacker Public Radio Episode 3035 for Friday 20 March 2020.
Today's show is entitled Decentralized Hashtag Search and Subscription in Federated Social Networks
and is part of the series Social Media. It is hosted by Ahuka
and is about 11 minutes long
and carries a clean flag. The summary is
ActivityPub Conference 2019.
A proposal for how we can use hashtags to find and subscribe to content.
This episode of HPR is brought to you by AnHonestHost.com.
Get 15% discount on all shared hosting with the offer code
HPR15, that's HPR15.
Better web hosting that's honest and fair at AnHonestHost.com.
Music
Hello, this is Ahuka, welcoming you to Hacker Public Radio
and another exciting episode.
And I'm continuing my look at the ActivityPub Conference of 2019
and a talk by a young fellow named Trolli Schmittlauch,
and he's a computer science student at the Technical University in Dresden.
This talk is organized around some work he has done as a student
and is more of a proposal for discussion than a finished piece of work
though he has worked out a lot of his proposal.
And the name of his talk is Decentralised Hashtag Search and Subscription
in Federated Social Networks.
I don't know whether that sounds exciting, but actually the more I listened to it, the more interesting it got.
Now, hashtags may have started on Twitter.
I believe they did, but they've become a standard on virtually all social media platforms
which is a strong argument that they meet a need.
They are used for events, political discussions and general discussions.
They've been used to coordinate demonstrations and other activities
and for social movements like Me Too and Black Lives Matter.
And of course, hashtags are used in the Fediverse.
The problem here is that in a decentralized environment
you don't see all the posts around a given hashtag, just the ones on your node.
This could push people to move to larger nodes to see more posts,
but that's the opposite impetus to what federation is about.
So this is a problem.
Now, currently you can subscribe to someone on a different instance
if you know their username and the name of their instance.
As I mentioned before, I am @Ahuka@octodon.social.
So, someone on another instance wants to follow my particular posts.
They can use that full address to send a subscribe message to my instance
and from then on they will see all of my posts.
But they would not see the posts of anyone else on my instance unless they had explicitly subscribed to them.
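[Editor's sketch] The remote follow described above can be illustrated roughly as follows. This is a minimal sketch, assuming the common Fediverse practice of resolving a user@instance address through WebFinger and then sending an ActivityStreams Follow activity. The function names and the actor URIs in the usage note are hypothetical and are not taken from the talk.

```python
from urllib.parse import urlencode


def parse_handle(handle: str) -> tuple[str, str]:
    """Split '@user@instance' into (user, instance)."""
    user, _, host = handle.lstrip("@").partition("@")
    if not user or not host:
        raise ValueError(f"not a full user@instance address: {handle!r}")
    return user, host


def webfinger_url(handle: str) -> str:
    """Build the WebFinger lookup URL used to discover the account's actor."""
    user, host = parse_handle(handle)
    query = urlencode({"resource": f"acct:{user}@{host}"})
    return f"https://{host}/.well-known/webfinger?{query}"


def follow_activity(follower_actor: str, target_actor: str) -> dict:
    """Build the Follow activity the follower's instance would deliver.

    Both arguments are actor URIs; real ones come from the WebFinger lookup.
    """
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Follow",
        "actor": follower_actor,
        "object": target_actor,
    }
```

For example, webfinger_url("@Ahuka@octodon.social") points at octodon.social, and follow_activity("https://a.example/users/listener", "https://octodon.social/users/Ahuka") uses two made-up actor URIs purely for illustration. The Follow only covers that one account, which is why nobody else on the instance comes along with it.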
There is a partial solution right now, which is something called a relay.
And for Mastodon there is a relay, relay.mastodon.host.
And this is described as a Service-type ActivityPub actor
that will re-broadcast anything sent to it to anyone who subscribes to it.
Okay, that sounds like the fire hose, which is not what we need.
Anything sent to it is going to get bounced back out.
There are other problems with it.
It's a centralized actor relaying all incoming posts.
Therefore, it is a single point of failure.
I'm always a little leery of a single point of failure.
Bringing in all the posts is a huge load to place on a small instance.
And you only see the posts sent since you subscribed.
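[Editor's sketch] The relay behaviour described above, and two of its drawbacks, can be shown with a toy in-memory model. This is only an illustration under simplifying assumptions: the Relay class and its method names are hypothetical, and a real relay is an ActivityPub actor delivering over HTTP, not a Python object.

```python
class Relay:
    """Toy relay: re-broadcasts every incoming post to every subscriber."""

    def __init__(self) -> None:
        # subscriber name -> posts delivered to that subscriber so far
        self.delivered: dict[str, list[str]] = {}

    def subscribe(self, instance: str) -> None:
        # A new subscriber starts with an empty list: no history is replayed.
        self.delivered.setdefault(instance, [])

    def receive(self, post: str) -> None:
        # The fire hose: no filtering by hashtag or anything else.
        for inbox in self.delivered.values():
            inbox.append(post)


relay = Relay()
relay.subscribe("big.example")
relay.receive("post 1")
relay.subscribe("small.example")   # joins late
relay.receive("post 2")
# big.example has both posts; small.example only has "post 2".
```

Every subscriber gets every post, and a late subscriber never sees what was sent before it joined.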
So the proposal from Trolli is for an architecture that would do three things.
Relay and subscribe: instances can subscribe to all public posts of a given hashtag.
Store and query: instances can retrieve one year of history for a hashtag without needing to subscribe.
And it would be fully decentralized, so there's no single point of authority for all of the tags.
Now, to accomplish this, he proposes as a core idea a distributed hash table based on Chord
that would distribute responsibility for hashtags among instances.
Now, here's where we need to be careful since we now have the term hash in two different contexts.
Chord is a distributed hash table, DHT.
And what that means is it stores SHA-1 hashes.
And those hashes could represent anything.
It's simply a standard label and obviously unique.
I mean, that's the other thing about hashes is that they're unique or darn close to it.
In any case, the hash would be in this table.
And the idea is you could subscribe to a hashtag using that hash just as you can now subscribe to an individual.
So the way this would work is that you would calculate the hashes for all given hashtags and all nodes.
And they would share a namespace.
Each node would then keep a routing table or rooting table if you prefer.
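[Editor's sketch] The shared namespace described above can be sketched like this: hashtags and nodes are both hashed with SHA-1 onto one ring of identifiers, and a hashtag belongs to the first node at or after it on the ring (its successor, in Chord terms). This is a simplified illustration, not the author's implementation. It assumes every node is known up front, whereas real Chord routes through per-node routing tables. Lower-casing the hashtag is my own assumption, and the .example host names are made up.

```python
import hashlib
from bisect import bisect_left


def ring_id(label: str) -> int:
    """Map any label (hashtag or node name) onto the 160-bit SHA-1 ring."""
    return int.from_bytes(hashlib.sha1(label.encode("utf-8")).digest(), "big")


def responsible_node(hashtag: str, nodes: list[str]) -> str:
    """Return the node whose ring ID is the successor of the hashtag's ID."""
    ring = sorted((ring_id(name), name) for name in nodes)
    ids = [node_id for node_id, _ in ring]
    # Assumption: hashtags are compared case-insensitively.
    idx = bisect_left(ids, ring_id(hashtag.lower()))
    # Past the highest node ID, wrap around to the start of the ring.
    return ring[idx % len(ring)][1]


nodes = ["octodon.social", "mastodon.example", "pleroma.example"]
owner = responsible_node("HPR", nodes)
```

Any instance that runs the same calculation over the same set of nodes arrives at the same owner, which is what lets responsibility be spread out without a central directory.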
Then we can put it to work.
Now, the lifecycle of a new post would now look like the publishing instance calculates the hash of each hashtag added to the post.
And looks up the responsible relay instance on the distributed hash table for each included hashtag.
Then the publishing instance sends a post to the responsible relay instance.
The relay instance looks up the responsible storage node on the DHT.
Note that this implies that being a relay node and being a storage node can be two separate roles on two separate devices.
The relay instance verifies the incoming post's signature, then relays the post's URI, which is its identifier, to all subscribers and the storage node.
Subscribing instances can now retrieve the full authenticated post from the received post URI.
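[Editor's sketch] The lifecycle just described can be walked through with a small in-memory simulation. Several things here are assumptions made only for illustration: the "relay:" and "store:" key prefixes used to find the two roles, the Instance fields, and the signature check being reduced to a boolean. The talk and paper define the real mechanisms.

```python
import hashlib
from dataclasses import dataclass, field


def ring_id(label: str) -> int:
    return int.from_bytes(hashlib.sha1(label.encode("utf-8")).digest(), "big")


@dataclass
class Instance:
    name: str
    subscribers: dict = field(default_factory=dict)  # relay role: hashtag -> set of instance names
    history: dict = field(default_factory=dict)      # storage role: hashtag -> list of post URIs
    received: list = field(default_factory=list)     # subscriber role: URIs relayed to this instance


def lookup(key: str, network: dict) -> Instance:
    """Successor lookup over all known instances (stand-in for DHT routing)."""
    ring = sorted(network.values(), key=lambda inst: ring_id(inst.name))
    target = ring_id(key)
    for inst in ring:
        if ring_id(inst.name) >= target:
            return inst
    return ring[0]  # wrap around the ring


def publish(post_uri: str, hashtags: list, signature_valid: bool, network: dict) -> None:
    for tag in hashtags:
        # The publishing instance finds the responsible relay for this hashtag.
        relay = lookup("relay:" + tag, network)
        # The relay verifies the signature (placeholder: a boolean) before relaying.
        if not signature_valid:
            continue
        # The relay finds the responsible storage node, which may be another machine.
        storage = lookup("store:" + tag, network)
        storage.history.setdefault(tag, []).append(post_uri)
        # Only the URI goes out; subscribers fetch the full post from its origin.
        for name in relay.subscribers.get(tag, ()):
            network[name].received.append(post_uri)
```

Relaying just the URI keeps the relay's load small and means the subscriber gets the post, with its authentication, directly from the instance that published it.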
Now, note that there are other problems to be addressed, and he does address security, like stopping man-in-the-middle attacks that would suppress certain hashtags.
He mentions load balancing and redundancy. I'm not going to go into all the technical details. As I have mentioned before, I am not actually a programmer.
So I have an interest in free culture and open source software and all of that.
But you know, I was previously a project manager, and everyone knows project managers don't actually know anything.
We just somehow coordinate activity.
So if you want to get more details as with all of these talks that I'm discussing, there's a link in the show notes that would let you take a look at the video and you can get more information there.
You know, at the end, he opens it up for discussion, beginning with the social aspects. Do we even want global hashtags in the Fediverse?
Now, there are positive benefits in allowing more conversation and coordination of activities.
Think of Tahrir Square as an example of this, you know, the Arab Spring in Egypt. That was coordinated through stuff like this.
Now, they were using Twitter, but if we want to be able to do something similar in the Fediverse, having some way to use hashtags globally makes a lot of sense.
So that's very positive. Is there a downside? You know, that might facilitate things like spam and harassment.
Then there's the question of visibility level. Should this only apply to public posts or should it also include unlisted posts?
And do we need a new level to make this work? Also, none of the necessary architecture exists in ActivityPub right now, including routing for the DHT.
You know, from a security standpoint, we need to make sure no attacker can gain control over a given hashtag and also not introduce an arbitrary number of nodes.
You know, again, think of Tahrir Square and use the Egyptian government as the attacker and you start to see what the issue is and what is at stake.
Now, I thought it was an excellent presentation and provided a lot of food for thought because one of the big problems I've had with the federated media is I feel like I'm sort of limited to following the people I accidentally stumble across.
You know, and you know, when I think back on it, you know, there were a whole series of people I followed on Google Plus and then it disappeared and I lost most of those people.
I've stumbled across a few of them on Mastodon now and one or two more on Diaspora. And that's nice.
But where's everyone else? I don't know. And the idea of ActivityPub should be that regardless of which application they use, maybe they don't want to be on Mastodon, maybe they don't want to be on Diaspora.
If everyone used ActivityPub and I had some way of finding them, I could keep up with what they're doing.
And you know, the idea of this hashtag thing is maybe you can find people oriented around interests.
You know, what if there were some way of locating all of the Hacker Public Radio hosts that are on federated media?
You know, I might want to follow them. I'm following a few of them right now.
But you know, it's an interesting idea. Now, his talk is based on a paper and I've got a link to the paper as well, which has a lot more detail.
So if this is the sort of thing that, you know, interests you, you can get a copy of the paper and get some more detail on what he's doing.
But for now, this is Ahuka signing off for Hacker Public Radio and reminding you to support free software. Bye-bye.
You've been listening to Hacker Public Radio at HackerPublicRadio.org.
We are a community podcast network that releases shows every weekday Monday through Friday.
Today's show, like all our shows, was contributed by an HPR listener like yourself.
If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is.
Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club, and is part of the Binary Revolution at binrev.com.
If you have comments on today's show, please email the host directly, leave a comment on the website or record a follow-up episode yourself.
Unless otherwise stated, today's show is released under the Creative Commons Attribution-ShareAlike 3.0 license.