Initial commit: HPR Knowledge Base MCP Server
- MCP server with stdio transport for local use
- Search episodes, transcripts, hosts, and series
- 4,511 episodes with metadata and transcripts
- Data loader with in-memory JSON storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
198
hpr_transcripts/hpr0227.txt
Normal file
@@ -0,0 +1,198 @@
Episode: 227
Title: HPR0227: Local Squid
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr0227/hpr0227.mp3
Transcribed: 2025-10-07 14:22:29

---

So, hello there ladies and gentlemen, my name is Ken Fallon and today we will be talking
about Squid Proxy for local use.

Now first of all, what is Squid? It's a free, open source proxy server. What's a proxy
server? Well, that's nothing more than a server that clients on an internal network will
go through to have their requests forwarded out onto the internet. The reason you might
want to do that is that in a corporate IT environment you typically will block access to
the internet so that worms and viruses and all that sort of good stuff can't get out: if a
PC is infected, they can't get straight out to the internet.

So they go through a proxy server that may or may not require a username and password
to get out, and then on the proxy server dangerous URLs can be blocked. Or, more than
likely, what it has been used for in the past is to monitor employee access so that
they're not wasting time on the internet.

That of course is illegal in some countries, specifically Germany, so you want to check
the licensing laws in your country when applying a proxy server. However, that is not why
I would use a proxy, or what spurred me to install a proxy in the first place.

There are really two reasons. The first is getting a secure connection to the internet
when you're at a hostile location, and the other is getting around URL obfuscation,
or hiding of URLs. Well, let's talk about the first one first.

So say, for example, you've got a laptop and you're at an internet cafe and you want to
securely check your email or you want to do some banking. Now you have no idea who's
listening on that connection.

So what you would typically do is set up a proxy server on your home server, set up a
secure shell tunnel into that server, and then redirect your browser to use your
localhost, and forward a port over the SSH tunnel into your server at home.

There have been a few podcasts on that. I don't think there have been any on the HPR
network, but I'll give you all that you need to do.

First of all you need to install it, and that is apt-get install squid.

Once you've done that you've got a Squid proxy listening on the default port of 3128, and
I think by default in most distributions it allows the localhost to connect out,
specifically for this reason, but it won't let anything else.

So then you drop to a shell and you would type ssh, space, minus capital L, space, 3128
colon localhost colon 3128, and then username at server and whatever other options
that you have.

What that will do is open an SSH shell to your home server, and it will say: anything
coming in on port 3128 on this PC, my laptop in the hostile environment, you encrypt all
the traffic and you pump it out the other side of the tunnel and you dump it onto the
local port of 3128 on your server.

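The tunnel command dictated above could be written out roughly as follows. This is only a
sketch: the account name `user` and the host `home.example.org` are placeholders that do
not come from the episode, and the script prints the command instead of running it so it
can be checked first.

```shell
# Hypothetical values; replace with your own account and home server.
REMOTE_USER="user"
REMOTE_HOST="home.example.org"
PROXY_PORT=3128   # Squid's default port, as mentioned in the episode

# -L LOCAL:HOST:REMOTE forwards connections made to port LOCAL on this machine
# through the SSH connection to HOST:REMOTE as seen from the server.
TUNNEL_CMD="ssh -L ${PROXY_PORT}:localhost:${PROXY_PORT} ${REMOTE_USER}@${REMOTE_HOST}"

# Print the command rather than running it, so it can be reviewed first.
echo "$TUNNEL_CMD"
```

Running the printed command opens the tunnel; while that session stays open, anything
sent to localhost port 3128 on the laptop should arrive at Squid on the home server.
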
So then you open up a browser, go into your browser settings, and set the proxy server
to be localhost and the port to be 3128, and in Firefox you'll set it to use this for all
the other ports.

There's a nice little add-on called QuickProxy, I think, for Firefox that will allow you
to quickly turn the proxy server on and off.

Okay, so that's handy.

You could also use that, theoretically, if your company blocks access to the internet
but does allow SSH out. I would strongly advise you not to do that, because the traffic
on an SSH tunnel can be examined with deep packet analysis, and also it's kind of
unprofessional. So I would recommend that, unless your company has specific exemptions
that allow you to do that, you do it at home; otherwise just don't do it. There's no
point in putting your job at risk.

Now the other reason, the real reason that I installed this, was to get around URL
obfuscation, a word that I don't like, so we'll refer to that as URL hiding.

So what they try to do is make it very difficult for you to work out what the links are
on the page.

The way they do that is, first of all, they'll use a whole lot of tricks; if you go to
www.pc-help.org/obscure.htm you'll get a list of tricks. But it's also very common now
to use JavaScript to do that, so they'll have several include files and some functions,
and they'll generate the URL based on different parts in different locations of the file
and then put it all together. It's also quite common to use things like timed URLs, where
the URLs only remain valid for a certain period of time, and the logic is that by the
time you've figured out what the URL is, the URL is no longer valid.

So all that doesn't amount to a hill of beans, because if you redirect your browser to
use the Squid proxy, either on a local machine or on a remote machine, you can just tail
the Squid logs and your browser will have done all the hard work for you. That will
simply give you a list of GET commands or POSTs or whatever, typically GET commands, with
the URLs all reconstructed for you, and you can copy them and use wget to get whatever
you want.

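As a rough sketch of tailing the Squid logs for URLs: the helper name `squid_get_urls` is
made up for illustration, and it assumes Squid's default "native" access log layout, in
which the sixth field is the request method and the seventh is the URL. The log path shown
in the comment is the usual Debian location and may differ on other systems.

```shell
# Print the URL of every GET request found in Squid access-log lines on stdin.
# Assumes the default "native" log format: field 6 = method, field 7 = URL.
squid_get_urls() {
    awk '$6 == "GET" { print $7 }'
}

# Typical use (log path is an assumption, adjust for your system):
#   tail -f /var/log/squid/access.log | squid_get_urls
# Then fetch one of the printed URLs, for example:
#   wget "http://example.org/some/file"
```

If the log format has been customised in squid.conf, the field numbers would need to be
adjusted to match.
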
Now why would somebody bother to do that?

First of all, the most common reason is to prevent the loss of their intellectual
property, and the other reason I have seen is to do restrictions on streaming servers.

So let's deal with the first one first.

An example of that would be a site called marktplaats.nl, which is a Dutch version of
eBay; they're actually owned by eBay, they were bought over.

For some reason they don't want people to create databases of stuff on Marktplaats. So
what they do is hide the URLs and put a transparent GIF file over all the images, which
gets in the way if you're looking to create a scraper or something that would go out and
check the website for new deals or whatever.

So they have a transparent GIF over the image of, say, somebody selling a telephone or
whatever, and when you right-click on that and go Save As, you get the transparent GIF,
because that's on top.

It's also done on YouTube, but for a different reason, and that is that YouTube uses a
nice trick where all the graphics on the page are sent down as one single image. So the
YouTube logo, the up arrow and down arrow keys, the stars, all the stuff that you see in
a typical YouTube page is generated and sent down as one image. And then they use CSS to
show certain parts of that image on the web page.

I thought that was pretty cool.

To get on to the other reason why they would hide the URLs, that is for streaming
servers.

You have regional licenses in place, where TV traditionally has been broken up into
regions. For example, if you want the latest episode of Desperate Housewives, the
Republic of Ireland would have a different deal done than the island of England, and
they would have a different deal done with the Netherlands. So they would release it at
different times and charge different amounts depending on where you are and where you
live and your market segment, blah, blah, blah.

However, that doesn't hold any water in this internet age, where people can connect into
machines and where people can connect from anywhere in the world to your streaming
service.

So to get around that, they use a service called GeoIP, and that's at www.maxmind.com.
If you go to /app/support, you'll see that they actually provide free modules (I don't
know what the license is; I don't think it's GPL) for, let me have a look here, various
different programming languages. They provide a module for Apache; they provide a C
library, a Java class, Pascal, a Perl module, a PHP module, Python and Ruby. They also
provide Windows APIs, Pascal and various different things.

They also supply the GeoIP country database in various different formats: CSV, MySQL,
Oracle and that sort of thing. And they allow you to use the country data for free. The
one catch is that you need to automate the downloading of it yourself, and they don't
provide city, regional or company name level data, which they do provide if you sign up
to the service.

This is used by a lot of TV streaming people to block access to the playlists for the
streaming service.

An interesting side note is that quite often the streaming servers themselves are
proprietary blobs and they don't support GeoIP, and putting in firewall rules on
firewalls or whatever slows down the streaming service.

So quite often, if you get the URL somehow, for example by having a proxy server,
hypothetically, in your brother's house, you can use the Squid proxy over there to get
the RealAudio file, and you can look at that to see what the RealAudio stream actually
is. And hypothetically, purely for research and reference purposes, if you open up a
player from, say, another country, for example the Netherlands, you could hypothetically
stream directly from the hypothetical servers that might hypothetically be at RTE.ie.

However, that of course would probably be illegal. Well, or immoral.

Anyway, to install Squid, we did aptitude install squid, and here's a quick technical
tip for working with config files.

Now the Squid config file is very much like the traditional Unix or Linux config file in
that it's got a lot of comments in it. The config file itself is self-documenting, in
that the documentation on the values is in the config file, and the documentation is
typically commented out with the octothorpe, the hash character.

A quick tip that I've used here is how to filter out the comment lines and the blank
lines, so that if you want to know the configuration of a config file, rather than it
being 16 pages long, you can cut it down to just the meaty bits and get the very essence
of all the commands that are actually being run.

So for the Squid config file, which is kept in /etc/squid/squid.conf, what you would do
is run the command grep, space, minus v, and the v says rather than displaying all the
lines that you've found, don't display the lines that correspond to this search entry.

So we have grep, space, minus v, double quotes, and we have the caret character, which
is the circumflex accent, and it's typically over the six on the US keyboard; I'll
include a link to that. Some people have referred to it as a Chinese hat before. Then
you have the octothorpe character, and that is the tic-tac-toe, what is incorrectly
called the pound sign, and the double quotes, space, /etc/squid/squid.conf, and then
pipe that into grep, space, minus v, space, double quotes, the Chinese hat again and the
dollar sign, and the double quotes.

What that does: the first one says look for any hash marks that are at the beginning of
the line, and the second one says look for any beginning of the line and end of the line
where there's nothing between them, essentially a line with just a carriage return, and
don't show those.

So essentially then you get a nice list of just the commands in the file.

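Written out, the two-stage grep filter spelled out above might look like this. The
wrapper function name `active_lines` is an illustrative assumption; the two grep
invocations are the ones dictated in the episode.

```shell
# Print only the active (non-comment, non-blank) lines of a config file.
# Usage: active_lines /etc/squid/squid.conf
active_lines() {
    # The first grep drops lines that start with '#';
    # the second drops lines with nothing between start and end.
    grep -v "^#" "$1" | grep -v "^$"
}
```

Note that this only removes comments that start in the first column; a comment that is
indented, or that trails a setting on the same line, would still be shown.
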
So I'll include links in the show notes to that. And actually, if anyone is looking for
an idea for a Hacker Public Radio episode and doesn't want to do an LPI certification
module (although I don't know why, because all the documentation is there, all you have
to do is do it), then a topic on regular expressions would be very cool, because while
the syntax changes a little bit between Perl and bash and different programs, the ideas
more or less remain the same.

Okay, the important config changes that you need to make in that file are: acl localhost
src 127.0.0.1/32. More than likely that will already be allowed, and to allow access
from the localhost you need to have the line http_access allow localhost also
uncommented.

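One way to confirm that both lines are active is a small check like the one below. It is
a sketch only: the function name `acl_allowed` is invented for illustration, and it
simply looks for an uncommented `acl NAME src ...` line and an uncommented
`http_access allow NAME` line, without understanding the rest of Squid's rule ordering.

```shell
# Report whether a Squid ACL is both defined and allowed in a config file.
# Usage: acl_allowed /etc/squid/squid.conf localhost
# Returns 0 only if uncommented "acl NAME src ..." and
# "http_access allow NAME" lines are both present.
acl_allowed() {
    conf="$1"
    name="$2"
    grep -Eq "^acl[[:space:]]+${name}[[:space:]]+src[[:space:]]" "$conf" &&
        grep -Eq "^http_access[[:space:]]+allow[[:space:]]+${name}([[:space:]]|\$)" "$conf"
}
```

The same check can be pointed at any other ACL name defined in the file.
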
Now if you want to have a proxy server serving the machines on your local network, and
you use private networking, private subnets, you also need to have the lines acl
localnet src and then one of the three, 10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16,
whichever one of those you're using, uncommented; it probably already is. Then you need
to scroll down a little bit further in your file and look for another line which says
http_access allow localnet. Once you do that, save the file and restart Squid by calling
/etc/init.d/squid restart, and then you will be able to proxy out from the internal
network.

Now one good thing about Squid is that if for some reason you're not allowed out, you
will still see a Squid error message coming up in your browser, so you'll know that that
part is working. If you don't see anything, then your tunnel is probably not working
correctly or you can't communicate with your proxy server. So that's a good way to know
where the error lies. If you don't see anything, it's a tunnel issue, communication with
the proxy server. If you see an error from the proxy server, you know it's a permissions
issue on the proxy server.

Another thing that you can do is go to whatismyip.com or ipchicken.com or moremyip.com,
and you will be able to see that your IP address has changed to the IP address of your
machine at home.

Okay ladies and gentlemen, well that's been another episode of Hacker Public Radio. It's
actually the fourth time that I've recorded it. I tried to record it this morning on the
train platform, going to work on my bicycle and on the train, following advice from
Davids, and that did not work out at all. Okay, I hope you found something interesting
in that. I am available as always; the email address is ken.fallon at gmail.com. You can
also see comments on hackerpublicradio.org for this episode, or it should also be
available on my blog, kenfallon.com. Feel free to send me your comments and suggestions,
and that's all I have to say for now, and I wish you all a very good day. Thank you for
listening to Hacker Public Radio.

HPR is sponsored by caro.net so head on over to C-A-R-O-DOT-E-C for all of us in the