Initial commit: HPR Knowledge Base MCP Server
- MCP server with stdio transport for local use
- Search episodes, transcripts, hosts, and series
- 4,511 episodes with metadata and transcripts
- Data loader with in-memory JSON storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
184
hpr_transcripts/hpr0279.txt
Normal file
@@ -0,0 +1,184 @@
Episode: 279
Title: HPR0279: cfengine
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr0279/hpr0279.mp3
Transcribed: 2025-10-07 15:26:29

---

music
Hello everybody and welcome to today's episode, which is on CFEngine. From Wikipedia:
CFEngine is a policy-based configuration management system written by Mark Burgess
from Oslo University. Now, I first came across CFEngine when I started working at Schuberg
Philis. A colleague of mine, Ian Seldom, was of great help in explaining not only how
it works but also the philosophy behind it, so I thought this would be an excellent opportunity
to get Ian around the table here so that we can explain what CFEngine is and what it does.
So welcome to you. Hello. How long have you been using CFEngine?
I first used CFEngine back in the 90s, probably about '98 or '99. Has it changed a lot?
Not particularly. No, I mean, obviously there's the normal sort of feature creep
you get, where you get new features coming, but the fundamental idea behind CFEngine has remained
pretty much the same since the beginning. So what can you do with it? Oh, maybe we need to
wind back a bit. Why did we ever need CFEngine? We used to have IT departments, and in our IT
department we would have a big minicomputer, I don't know, a Honeywell or a DPX/20 or whatever.
One machine, one operating system, 60-odd users, all on green screens. Then we came into the
client-server sort of model of computing, and the explosion of the internet,
where we suddenly had hundreds of hosts, and trying to manage our hundred hosts to make sure that
they're all the same, that the hosts files are the same, that the DNS configurations are the same,
that the Apache web server configurations are the same, is just a nightmare, and it's not what
you want to be using your brain for. So I think what most system administrators with a bit of
clue at the time, or at least those with a bit of clue, decided to do was to start writing all
sorts of rsync scripts and rsh scripts and SSH scripts which went out and tried to manage these
systems, and you've basically got two big disadvantages here. Firstly, every system admin takes a different
approach. Somebody's written one thing in Perl, somebody else has written another thing in shell
or whatever, and so there's no real consistency to how you approach the problem. And the second problem
is that you run your magic installation script, but one week later one of your colleagues is going
to change something and your beautiful configuration is no longer consistent. In fact your systems
are diverging, and after one or two years they'll become sufficiently divergent that it
starts to cause you problems. They actually will cause you downtime or system problems or
inaccurate data or information, or any of the other things that we here in IT just don't want to have.
I come from the school where you take a server, you put in a Debian CD, you do a minimal install,
you install the packages you want, and then you manage that system by pushing out scripts via
SSH. What's wrong with that idea? Well, there's nothing in principle very much wrong with that,
providing, one, you're the only administrator, so there isn't anybody else doing it some other way,
doing it in a way you're not expecting, or putting the files in some other place, or editing
them by hand, so that you actually go and regress your previous changes when you run your
script. I'm familiar with this concept. So I mean, great: if you've got 50 hosts and you manage
them yourself, and you're very controlled and you let nobody else in, that system will probably
work. Of course the second problem you've got is that in the meanwhile you've probably been
writing some little web page somewhere, or in modern-day parlance a wiki page, or some
documentation in your boss's favorite Microsoft Word format or whatever, and after about 10 days
that documentation is also completely out of date and, again, it's useless. So CFEngine
offers you a way of doing all these things you used to do with your ad hoc scripts,
but in a structured meta-language, which means everybody does it the same way. It's very, very
flexible, and it offers a high degree of self-documentation, which is incredibly useful, so
somebody who comes after you can see what you've done and how you've done it. I mean,
the value of that is currently understated, in actual fact. And the second thing is, of course,
it has self-healing, so the policy within your CFEngine rules is going to be
continually, continuously applied to the servers which belong to the class, or the servers
which you've defined. Can I just stop you there, because I think we need to go back and explain
to everybody what CFEngine can do. So I'll give you a rundown from what I found on the web.
It can check file permissions and ownership and fix them. It can restart failed daemons,
it can install software, it can edit files, it can execute commands, it can configure networks,
it can compress files, and so on. So essentially you can make scripts that can do all these
things and basically manage it. You can define rules which do these things, okay, and theoretically
you can do anything with CFEngine, because ultimately there is always the action type shell commands,
and in action type shell commands you can run whatever the heck you like. It requires a slightly
different way of thinking, and the thing you shouldn't forget is of course that all of these things can
be combined. So quite often you're not looking at doing a single thing. So for instance you have a
web server. Now for your web server, first you have to create some directories, you have to set the
permissions correctly on the directories, you have to send in a web server configuration file.
When the web server configuration file updates, you want to restart the daemon. You want to make sure
the logs are in a certain place and the logs get rotated each evening, and after a certain time
they get deleted. So in actual fact what you're doing is using something like seven different functionalities
of CFEngine in a single workflow. So if I change these directories, or if I
update the web server configuration, I need to restart the daemon. If the daemon is no longer running,
because it's crashed for instance, I need to restart the daemon. If it is the first quarter after
midnight each day, I need to go and rotate the logs and restart the daemon. At four o'clock in the
morning I need to go and delete any files which are more than 222 days old or whatever. And all of
that is just simply done for you. These are the sort of things which classically kill systems:
disks get full because log files don't get rotated properly. You've got a farm of 14 Apache
web server instances and each web server configuration is slightly different from the others.
Your permissions are maybe inconsistent, so one of your web servers may have very, very liberal
permissions, which allows your server to be hacked or content you'd prefer wasn't there to appear there,
or it doesn't allow your users to actually upload files which maybe they should be
allowed to, so they have a 1 in 14 chance of failing or something. With this sort of approach, those
sorts of problems disappear. It installs on basically all the versions of Unix that there are,
and through Cygwin you can install it on Windows. Correct. There is actually a native
compile for Windows, but it requires some patching and it's a bit messy.
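The web server workflow described in this part of the conversation (a changed configuration or a dead daemon triggers a restart, and old logs are purged) might be sketched in Python roughly as follows. This is an illustration of the idea only, not CFEngine syntax: the function names, the `daemon_running` and `restart` callbacks, and the age threshold parameter are all assumptions made for the example.

```python
import os
import time
from pathlib import Path


def sync_config(master: Path, local: Path) -> bool:
    """Copy the master config over the local one only if they differ.

    Returns True when the local file was changed.
    """
    wanted = master.read_bytes()
    if local.exists() and local.read_bytes() == wanted:
        return False
    local.write_bytes(wanted)
    return True


def purge_old_logs(log_dir: Path, max_age_days: int, now: float) -> list[str]:
    """Delete files in log_dir older than max_age_days; return their names."""
    cutoff = now - max_age_days * 86400
    removed = []
    for entry in sorted(log_dir.iterdir()):
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            entry.unlink()
            removed.append(entry.name)
    return removed


def run_once(master, local, daemon_running, restart, log_dir, max_age_days, now):
    """One pass over the workflow; returns a list of the actions taken."""
    actions = []
    changed = sync_config(master, local)
    if changed:
        actions.append("config updated")
    # Restart on either trigger: a new configuration or a daemon that has died.
    if changed or not daemon_running():
        restart()
        actions.append("daemon restarted")
    removed = purge_old_logs(log_dir, max_age_days, now)
    if removed:
        actions.append(f"purged {len(removed)} log file(s)")
    return actions
```

Running `run_once` repeatedly on a healthy host returns an empty list, which is the point made in the interview: the rules only act when something has drifted.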
I've got three servers. I want to manage two, with one as an admin server.
Oh well, yeah, if Mark was here, Mark the author, he probably wouldn't like how I'm going to describe
this, because the way we use CFEngine is to use it in a centralized way. So we have a policy
host which defines how everything else is going to look. Of course CFEngine doesn't have to work
that way. It can have multiple policy hosts. It can be very, very distributed.
That I'll leave as an exercise to the reader. There are some links, I believe, which come with this
broadcast, where people can read more on that, but we'll look at it from the centralized point of view,
because effectively we're a data center, a managed services company, so we're not
interested in giving people freedom. We're interested in making sure things are right and
keep working. The centralized model works very well for us, and so effectively you're going to
define a policy host, or multiple policy hosts for redundancy. Some of our customers require
that we're able to work across geographical distance, in separate data centers, etc.
Within this you then have a repository. The repository can contain your files and your rules which
you're going to use. So, the files you're going to distribute: that's just a directory tree? It is
just a directory tree, for which within your server configuration you've said, I will allow CFEngine
clients to access files within these directory trees. You put your control files in there and
the other files. Do the clients come in and get them, or do we push them out, or
how does that work? It's on a pull basis. Effectively, this is one of Mark's key
things: the idea is that the server never does something to a client without the client's
permission. So there's no idea of duress or force or "I'm the boss and this is how it's
going to be". The clients ask for the new configuration, and when they get
the configuration they then run the rules locally, and that may involve getting some files off the
policy host, it may involve running some editfiles rules or changing some permissions or
updating the local password file, any number of things. This is actually the point that got me in
the whole presentation. We're talking about a presentation here; just for the listeners, the
company flew in Mark and he gave the presentation here. But it was this idea that got me away from my old
view of putting the CD in the server. He had a diagram of a black hole
and he had a diagram of a mountain, and the idea was that the CFEngine way is you're always
going to try and get to the center, so like in a pinball machine, you throw the pinball
up and no matter what happens your server will return to a well-known configuration.
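The "black hole" idea, where hosts end up in the same known state wherever they start, can be illustrated with a small Python sketch. It is not CFEngine code; the function name `converge_hosts` and the sample addresses are assumptions for the example, and it only looks at the first hostname on each line (aliases are ignored).

```python
def converge_hosts(text: str, wanted: dict[str, str]) -> str:
    """Return hosts-file text in which every name in `wanted` maps to its address.

    Lines for names that are not managed are left exactly as they are.
    """
    out = []
    seen = set()
    for line in text.splitlines():
        fields = line.split()
        managed = (
            len(fields) >= 2
            and not fields[0].startswith("#")
            and fields[1] in wanted
        )
        if not managed:
            out.append(line)
            continue
        name = fields[1]
        if name in seen:
            continue  # drop duplicate entries for a managed name
        out.append(f"{wanted[name]}\t{name}")
        seen.add(name)
    # Add managed names that were missing altogether.
    for name, address in wanted.items():
        if name not in seen:
            out.append(f"{address}\t{name}")
    return "\n".join(out) + "\n"
```

Two hosts that have drifted in different ways end up with the same managed entry, and applying the function a second time changes nothing. That is the difference from a step-by-step script, which assumes a known starting point.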
Yeah, as opposed to the standard way, which would be: you have a blind man at the base of
the mountain and you say, okay, you walk 10 feet this way, you go up, you cross the bridge, you go up
the canyon, and if anything changes down through time, if, you know,
an avalanche comes and you can no longer pass the path, your server is no longer in the same state.
That kind of got me: that you're never going to have a perfect server. No, you can't. I mean,
that's the whole point. There are four or five good configuration management tools out
there, most of them GPL, so they're entirely free to use, and for some people some
approaches may work better than CFEngine. CFEngine is all about accepting that there is no
perfect state. You're always moving towards convergence and eliminating divergence, but the
idea is that even if you actually build systems from images and you make them read-only,
yeah, immutable file systems, you've still got differences. You're still going to have
differences within your caches, within your memory footprint, within your logs,
so the systems are never identical. So the whole point of controlling systems with CFEngine
is identifying what matters and making sure that that is controlled, and accepting that the rest
doesn't need to be, and to some degree actually accepting it doesn't matter. Yeah, it simply
doesn't matter. This is what we wanted to do. Yeah, that's right, and there's a lot of things you
just really don't care about in the end, and that's, yeah, that's CFEngine.
Also, if you take convergence as being the primary idea, what's brilliant about CFEngine of course
is that if you already have a farm, even if it's a farm of hundreds of servers, you can begin very small
with convergence. You can say, I'm going to write one rule which just simply controls the hosts file
for all my systems, and then, you know, if you have some sort of imaginary goal of 100
percent convergence, you've just gone half a percent towards where you want to be. You write a second
rule which goes and makes sure all the system administrators' public keys are
put in the right place, and you're then two and a half percent closer to your black hole.
Okay, do we not have a security problem where we've got all these machines being controlled by one
central location? What's the deal here with that? Yeah, of course this introduces a
security implication, because it means you have to make sure you protect your CFEngine policy
host very carefully. But you also have to balance. I mean, if you do not know the
state of the machines within your park, and remember, at a company like Schuberg Philis we have literally
thousands and thousands of servers we're looking after here, I mean, we're not going to log on to all of
these servers every day, and in actual fact we don't want to. We're always aiming for a point where
a good, well-behaved server is one you never log on to. So in actual fact it's the other way around.
Some sysadmins like to sit there with multitail, watching log files go past every day and
logging on to all the machines and giving them a little stroke. We don't want to do this. We just want
them to just work. And so if we go back to the security implication, a non-managed system is
of course a huge security problem, because you don't know what's there. Yeah, you could,
if you've got a thousand hosts and ten of them have been in some way hacked, owned, whatever you
want to call it, you could be blissfully unaware of that, maybe until you get your bandwidth bill or
something, you know. Whereas of course for the policy host, provided your policy host is
adequately secured and you've made sure that you know who's allowed on it, who's using it, who
can edit the files, and you make sure that you've got CVS control and so on on the
file revisions, so you know exactly who's changed what, you're in a far stronger position. But you are
indeed introducing a centralized risk: if that machine gets owned then you're in a lot of trouble,
but you only have to watch one machine now. Okay, so we've got a server farm and we want to control
that. What are the steps? It's a greenfield site. Okay, first we're going to define a policy host.
Within our policy host, that's where our CFEngine server is going to run, that's where we're
going to keep our rules files, that's where we're going to keep our repository of files we're
going to distribute, so that's where all the other servers are going to come in. This is the magic
centralized server, and I stress again, as I said earlier, of course this is not the only way of doing
it; this just happens to be the way we do it. Yeah, the idea here: we have links in the show notes, and anybody who's
interested can go there and find out the other ways of doing it. But yeah, it is one of the
joys of CFEngine that it doesn't tie you down to one model. You can use hub and spoke, you can
use centralized, you can use distributed, it really doesn't matter. Okay, we've got our
CFEngine server running. Yeah, that's right, and I mean, in the simplest configuration we're basically going
to go and run a CFEngine agent on all of the hosts we wish to control, and the agent is going to,
at specific timings which we control, log onto the policy host, grab
the files, execute the rules, and in the end, in the modern CFEngine parlance, each policy
statement or rule is considered a promise, and it's going to then go through these files and
check each promise, and then at the end it'll even tell you that 100% of promises have been
kept, or maybe 90% of promises have been kept and 10% needed to be fixed. And the promise can
be anything. The most simple thing is going to say, my /etc/shadow file must have permissions of
400, yeah, and if your /etc/shadow file has permissions of 644, which would be rather
unusual, but yeah, it's going to change it back to 400, and that's going to be repairing
the broken promise. Okay, and thanks very much. It's a lot to digest, but again, all the links
for this are in the show notes, and I hope to be pulling Ian in here again to do some more
episodes. Well, that's great, and thanks for your time.
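The promise example from the end of the interview (a file must have mode 400, and a file found at 644 is repaired) could be sketched in Python as below. This is a hedged illustration of the kept/repaired idea rather than how CFEngine itself is implemented; `promise_mode` and `run_promises` are hypothetical names, and a real run against /etc/shadow would need root privileges.

```python
import os
import stat


def promise_mode(path: str, wanted: int) -> str:
    """Check one permissions promise.

    Returns 'kept' if the file already has mode `wanted`; otherwise fixes
    the mode and returns 'repaired'.
    """
    current = stat.S_IMODE(os.stat(path).st_mode)
    if current == wanted:
        return "kept"
    os.chmod(path, wanted)
    return "repaired"


def run_promises(promises: list[tuple[str, int]]) -> dict[str, float]:
    """Evaluate (path, mode) promises; return percentages kept and repaired."""
    outcomes = [promise_mode(path, mode) for path, mode in promises]
    if not outcomes:
        return {"kept": 0.0, "repaired": 0.0}
    total = len(outcomes)
    return {
        outcome: 100.0 * outcomes.count(outcome) / total
        for outcome in ("kept", "repaired")
    }
```

On a first run against a file left at 644, the report would show the promise as repaired; on every later run it shows as kept, which mirrors the "100% of promises kept" summary described above.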
Thank you for listening to Hacker Public Radio.
HPR is sponsored by caro.net, so head on over to C-A-R-O dot N-E-T for all of those