Episode: 71
Title: HPR0071: Beowulf Cluster Introduction
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr0071/hpr0071.mp3
Transcribed: 2025-10-07 10:59:41
---
Hey, it's deepgeek, and welcome to Hacker Public Radio, episode 71 for Tuesday, April
8, 2008.
Today's topic is the Beowulf cluster.
You know, you can make a career out of certain kinds of computing, but I don't want to make
a mini-series out of this topic even though I could, so I plan to talk in terms of concepts,
so you can figure out what to do if you want to build one of these on your own.
By that I mean that I don't intend to give a real hand-holding experience; I just
want to try to keep this as short as I possibly can.
So let me draw for you a little roadmap in your mind's eye of where this kind of compute
cluster sits in the scheme of cluster computing.
Basically, there are two broad categories of cluster computing, high availability and high
performance.
High availability is, for example, where there is an application that cannot go down and
so a cluster of computers is formed in such a way that if any computer fails, the other
computers in the cluster can carry out the work so the application stays up, although the
performance of the application can be affected.
The other category, high performance, is where the Beowulf cluster design fits into
place.
The Beowulf is built for speed and, more specifically, for parallel processing.
I'm sure you know what a cluster is, there's a bunch of computers linked together to get
something done faster.
If the computers are linked together in any old way on a LAN, it is called a COW
design, which stands for cluster of workstations.
Some call it a NOW design, which stands for network of workstations.
They're essentially the same thing.
But a Beowulf is different, and the biggest difference is in the topology of the LAN
it is on.
A Beowulf has a head node and, on a private network (which is to say a network dedicated
only to the Beowulf cluster), a bunch of other nodes; some call this the master and a
bunch of slave nodes.
The head node sometimes has an additional network connection to the regular LAN, so people
can secure-shell into the cluster.
But all the slaves and the head share a non-commingled network.
This is the major difference between a plain old cluster and a Beowulf.
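Just to give you a rough picture of that layout, it looks something like this (the number
of slave nodes and the interface names here are only for illustration):

    regular LAN ---- eth0 [ head node ] eth1 ---- private cluster network
                                                    |---- slave node 1
                                                    |---- slave node 2
                                                    |---- slave node 3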
There are two more criteria that set apart the Beowulf-type cluster, and the first
of those is that a Beowulf cluster consists of COTS, C-O-T-S, which stands for commodity
off-the-shelf hardware.
Imagine calling up a retailer like Dell or Gateway and saying, hello, send me five of
your cheapest Athlon 64s.
That's the kind of concept we're talking about.
Now, online and linked off the beowulf.org website is a book called Engineering a
Beowulf-Style Compute Cluster by Robert Brown of Duke University.
He says it best, quote: the point of Beowulfery has never been to glorify Linux
per se, but rather to explore building supercomputers out of the mass-market electronic
equivalent of coat hangers and chewing gum.
I love that quote.
The other major difference is that all the nodes run FOSS, F-O-S-S, or free open-source
software.
Some of these clusters can get to be in the hundreds of nodes, and the idea of either
paying license fees, or not being able to fix something because closed-source systems
just don't let you get under the hood of the computer, is anathema to this form
of computing.
So what are these used for?
Well, many of them are in scientific and academic computing.
Things where physicists and astronomers have to run really big chains of calculations.
I'm talking about multi-day runs of calculations on many computers.
The closest thing we have to that kind of need as hackers is pre-computing the hashes of
all the passwords in a certain space of passwords, or distributed cracking of a
password-protected file.
I'm not talking black-hat stuff here.
It is totally possible that a hacker may want to test the strength of a security scheme
by breaking it as a test of the system, or that a hacker may be employed by a law firm
to decrypt files under court order.
Personally, I had what can be called a status desire.
Now, I admit it, I like to have a muscle computer.
And what says muscle computer better than a Beowulf cluster?
You know, can you imagine: oh, I got a Dell, I got a Gateway, what do you got?
I got a two-node Beowulf cluster.
What?
You know, that kind of thing.
The thing is that I already own a muscle computer and I did not feel the need to actually
do this until recently.
I want to give a shout-out to Klaatu for his excellent HPR series on video encoding.
And now that I want to save my collection of Japanese anime both in the original format
I got it in and in the free Theora format, I have the need for more computing power.
So that is my example.
And it does not even require you to be a nuclear physicist to want to do this.
Just be a film or anime buff and you can put this to good use.
Here is how I did it.
Once again, I would like to do this as a conceptual overview as I feel this episode is already
long enough.
So I'm going to move quickly.
Perhaps, if the feedback indicates a desire for detail on a certain point, I will have
the inspiration for another HPR episode.
Okay, I have an AMD 64 monster on the desk and a Pentium 4 laptop, and I want to link
these as a Beowulf cluster.
I can always go for more nodes later if need be.
Step one is hardware.
A separate LAN card and a crossover cable will do fine.
It's that simple.
After installing the LAN card, it was a matter of defining a second network.
So I had 192.168.1-whatever as my regular LAN, which is how I get out to the internet
and to my other computers, and I have 192.168.2-whatever as my second network.
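Defining that second network looks roughly like this (just a sketch; the interface names
and the exact host addresses are assumptions for illustration):

    # on the desktop (head node), bring up the new LAN card on the private network
    ifconfig eth1 192.168.2.1 netmask 255.255.255.0 up

    # on the laptop (slave node), at the other end of the crossover cable
    ifconfig eth0 192.168.2.2 netmask 255.255.255.0 up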
So why a second network?
It is an obvious question, because I could just as well have done all of this over the
regular network.
Well, by closing off the slave computer so it talks to the head computer only, we can
stop worrying about security headaches.
Once my slave node came up, I installed a plain old telnet server on it.
Normally SSH would be used, but encryption robs us of compute cycles, which I want to
use for the video encoding.
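On a Debian-style system that can be as simple as the following (a sketch, assuming the
telnetd package and the addresses from the network sketch above):

    # on the slave (laptop)
    apt-get install telnetd

    # from the head node, log in over the private network
    telnet 192.168.2.2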
The second piece of software for this project was NFS, the Unix network file system.
Again, no worries about outside intrusion.
You define in the configuration files that you are only exporting file shares to the
second network.
That is all it took.
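As a rough idea of what that configuration looks like, an /etc/exports entry along these
lines exports a movies directory only to the second network (the path and the options
here are assumptions, not my exact lines):

    # /etc/exports on the head node (desktop)
    /home/me/movies   192.168.2.0/255.255.255.0(rw,sync)

    # then reload the export table
    exportfs -ra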
I then brought up the laptop, ran a script that I wrote that brought up telnet and the
second network, and telnetted into my laptop.
I used a tabbed xterm and had one tab be my movies directory on my big desktop computer,
and the other tab was the telnet session to the laptop, where I changed directory to the
exported file share that was my movies directory on my desktop.
In this way, I was able to run ffmpeg2theora on the same directory from two computers at once.
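Concretely, the two tabs end up looking something like this (a sketch with made-up file
names and mount point; each ffmpeg2theora run turns one input file into a Theora/Ogg version):

    # tab 1: local shell on the desktop
    cd /home/me/movies
    ffmpeg2theora episode-01.avi

    # tab 2: telnet session on the laptop, working on the NFS share
    # (mounted earlier with something like: mount 192.168.2.1:/home/me/movies /mnt/movies)
    cd /mnt/movies
    ffmpeg2theora episode-02.avi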
Basically, the desktop is twice as fast as the laptop.
So now, in the time it used to take to convert two anime episodes to Theora, I can convert
three, since the desktop does two while the laptop does one.
Very nice.
I might take another computer out of mothballs and go even faster, but that is a project
for another day.
So I am going to wrap it up now in the interest of keeping it short.
So thank you for listening to Hacker Public Radio.
Feedback is always appreciated at hpr at deepgeek.us.
Thank you for listening to Hacker Public Radio.
HPR is sponsored by caro.net, so head on over to caro.net for all of your hosting needs.