Episode: 316
Title: HPR0316: Raid LVM
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr0316/hpr0316.mp3
Transcribed: 2025-10-07 16:07:06

---
[music]
Hi everybody, just a quick note before the episode begins. We recorded this episode and then we went to Jacobabia Gradyo and saw that Kevin Benker has released an episode on LVM2, episode 314. He deals with LVM2 in great detail in his episode, and I suggest all listeners who are interested also listen to that show. In this show I deal with LVM at a much higher level, and it's at the end of the show, so if anybody doesn't want to hear LVM2 mentioned again, they can just ignore the last part of the show. Thank you very much, and now on with the show.

Hello everybody, it's Mark Clark here and welcome to another episode of Jacobabia Gradyo. This episode I'm going to talk about RAID and the Logical Volume Manager, LVM, and how these can be used by both home users and small businesses to great advantage.
First I'll deal with RAID before moving on to the Logical Volume Manager, or LVM. So the first question is: what is RAID? RAID stands for redundant array of inexpensive disks. Now there are a couple of principles behind RAID which you need to know in order to understand how RAID works. One is that RAID provides redundancy. This means that if one disk goes, your system keeps on operating. It might be in a reduced state, but it keeps operating and your data will be safe. The other is that it also provides speed improvements. This only comes in when you talk about striping: basically you have two disks reading at the same time, feeding data through to the CPU, and obviously you get higher throughput. These two principles are combined in different ways to come up with the categories of how RAID is defined. Typically there are three well-known categories of RAID: RAID 0, RAID 1 and RAID 5. RAID 0 was the first RAID that came out, and basically what it did was to stripe the data across the disks. So as I was saying, you're reading data off two disks at the same time when you're reading it into memory, so it's a lot faster and you get the speed improvement. There's no redundancy in RAID 0, it's just got striping. RAID 1 came next, also known as mirroring, where basically two drives are mirrored. So what happens is there are essentially two disks, but only one disk is usable from the operating system's point of view; the other disk is basically a copy of the data. Should the one disk fail, the other disk can take over and the machine can continue to operate in a reduced state. Then came RAID 5, and RAID 5 combines these two principles. You have striping, and you need at least three disks for RAID 5. For mirroring obviously you need just the two disks. For RAID 5 you have three disks, and the data is striped across the three disks, as well as what they call parity data being put across the three disks. So if one disk fails, the other two can continue to operate in a reduced state. This way you get the redundancy of mirroring and you also get the throughput gains. The nice thing about RAID 5 is that as you keep adding disks to the RAID, you still only lose one: say you've got five disks in there, you're still only losing one disk's worth of space. So as a ratio of space lost to redundancy gained, RAID 5 is quite scalable from that point of view.
Now, you should know there are many other RAID levels out there these days, like RAID 6 and RAID 10 (or RAID 1+0, depending on how you call it). These are basically just more permutations of the above. So if you want to find out more about the newer, more advanced sorts of RAID out there, you can go to Wikipedia and have a look at where they've got them documented. But those are subject to change, and the first three I outlined are the well-known ones, which you'll hear people talk about when you're talking to them about RAID.
Another aspect of RAID that you might come across is hot swappable. Now what does hot swappable mean? Hot swappable means that there is a spare drive sitting in your RAID, typically in a RAID 5, so that when one drive fails the spare will automatically take over from it and the RAID will start rebuilding onto the spare drive. It also sometimes refers to the fact that you can hot swap a drive: you don't have to turn the machine off when one of your drives fails, you remove it from the RAID, plug in a new drive, add it to the RAID and away you go, without having to bring down your machine. So these are things you will also come across when people talk about RAID. The utility on Linux that we're going to be using, mdadm, the multi-disk software RAID, allows you to do this as well, especially with SATA drives. You can use it on IDE drives as well, but the hot swap component isn't recommended with IDE drives. Okay, so one of the things you'll come across when dealing with RAID are the different
types of RAID: there's hardware RAID, there's software RAID, and there's something which is called fake RAID. Now, hardware RAID you'll find in higher-end servers and other machines; essentially it's a piece of hardware that handles the RAID itself, so the operating system is completely unaware of it, it just sees one disk. I won't really be dealing with that in this episode, I'll be dealing more with software RAID. Software RAID is where the RAID is done in the operating system itself, at operating system level. Okay, and this requires some CPU time and resources, and that's why sometimes on the larger machines you use hardware RAID, to offload the work to the dedicated hardware controller, the disk controller. Another thing which you hear about these days is called fake RAID. Now fake RAID: on most modern motherboards with SATA drives on them, they will let you configure RAID at the BIOS level. This is called fake RAID because essentially it's software RAID handled in the BIOS. And, you know, most people recommend that you do not use this when you can use the software RAID that comes with Linux. The only argument I've come across for using this kind of built-in BIOS-level RAID is when you dual boot with Windows and want RAID on the Windows partition. That's something which I never do, so it's not something which I've tried out.
Okay, so let's have a look at some of the benefits of RAID. I mean, some of this is quite obvious in terms of speed improvement and redundancy, but it's not only businesses that need to look at these things or that can get an advantage from them. Home users can also benefit, because one thing that Linux software RAID allows you to do is use only certain partitions of two drives in a RAID configuration. So essentially you don't have to lose an entire disk to your RAID setup. If, for example, you only want to have redundancy on your home directory, and everything else can just be on a normal partition because you can reinstall it by popping in a new disk and reinstalling the machine, then you can do that. And I'll cover that when I talk about mdadm RAID, or multi-disk RAID. For businesses as well, why would you want to use software RAID as opposed to buying a server with a RAID controller in it? Obviously the hardware is much more expensive when you have a built-in controller. So for smaller businesses, SMEs, if you want some redundancy, you want some of the benefits of RAID, but you don't want to pay for the more expensive machine, then you should definitely consider using software RAID. And the big thing, obviously, for small businesses as well is that you can have little downtime with any kind of RAID, but with software RAID as well. So for example, when one of your disks dies you don't have to have your machine down, you can continue to operate, you can continue to provide services to your staff and your customers, and then you can replace the disks, you know, when the appropriate maintenance time arrives or as soon as you can; it's always better to replace as soon as possible. But you only have to take the machine offline when you actually replace the drive as such.
One last thing I want to talk about before actually going into multi-disk RAID is linear RAID. Now linear RAID, basically, you can use it with the software RAID in Linux, and it makes two drives look like one big disk. However, I don't suggest people use software RAID for this; LVM is a far better solution for combining disks into one larger disk.
Okay, so let's start looking at software RAID. The utility in Linux is called mdadm, that's MD ADM, and basically it's how you administer the software RAID, which is also known as MD, which stands for multi-disk. I'm not quite sure how you would pronounce it, so I'll refer to it as MD admin, the utility, throughout this episode. Now one of the key things to understand when trying to get your mind around how RAID works in Linux is to have a conceptual model that enables you to understand what's going on. Because most of us, when we approach RAID, like myself, are used to physical drives with physical partitions, and then we tell the Linux operating system that this is the physical partition it's going to use. What happens with software RAID is that the partition the operating system sees is actually abstracted away from the physical disks, and this is an important concept to keep in mind. So for example, your operating system won't be writing to, like, sda1 or sda2, it will be writing to the multi-disk devices, and those look like /dev/md0, /dev/md1. So essentially think of those now as playing the same role, from a mounting point of view, as sda1 and sda2 did, for example, when you're using Linux without RAID. So conceptually, you're going to create these new fake block drives for the operating system, and this is where the mdadm utility comes in. Essentially you tell it: this drive, the md0, md1, md2, is made up of the following physical hard disk partitions. So with that in mind, what you'll do is you'll take your hard disk, you'll partition it, and then you'll add the partitions to the multi-disk drive, and then the operating system just sees a multi-disk drive. So why is this conceptual model important?
It comes into play, for example, when you take a multi-disk drive out of one machine and you want to mount it on another machine. You cannot mount that partition directly as sda1. Well, you can try, but essentially you'll probably destroy the data on that disk, because it has to first be added to a RAID, because you've got a whole lot of RAID information on it, and then you mount the RAID device, the md0. So typically if a machine comes in and for some reason it's not booting or something's wrong with it, and you need to get the data off that disk, what you'll do, if this is a RAID 1, is take the one surviving RAID 1 drive out, create a multi-disk drive from it, and then mount the multi-disk device under Linux, and then you can start reading the data off of it. So for example, you shouldn't run fsck directly on sda1 if it's part of a multi-disk RAID; you have to assemble and mount it as md0, and then run the check on that. So I'm not sure if I explained that very well, but you know, what I do find is that the software is actually very simple to understand once you understand conceptually how it works; but it's getting that conceptual model in your mind sorted out, so that you understand what's going on at the operating system level, that's very important. So that's why I've laboured the point a little bit there, and I hope it's clear. You know, I did my best to try and explain it, but hopefully it's given you enough so that you can go on and build your own conceptual model of how this works.
So I'm going to go on to the steps of how to create software RAID in Linux. I'm not going to read out the commands as you'd type them at the screen, because I find that doesn't work that well in podcast format, but at least I'll guide you a bit, and I'll also provide some links in the show notes to online tutorials that will explain it better for you. One thing to note is that software RAID is very, very robust; you don't have to be afraid of it. The key thing is just to start using it. Once you start using it, you'll be amazed at how robust this software really is, and how great it is for actually getting RAID very cheaply into your infrastructure.
The first time you'd set up RAID is usually during the installation process of your distribution. Distributions differ when you get to the partitioning screen, and some of them don't really expose the possibility of RAID very well. I found the Ubuntu installer much easier to use than the Red Hat installer; the Red Hat installer was very confusing for setting up RAID, but the steps are basically the same. When you're setting up RAID, whether you do it at installation time or later when you're adding another partition to your system, first of all you have to partition the hard drive as you normally do: sda1, sda2, sda3. What you need to do then, for the various partitions that you want to use as RAID, is this. You're always going to have at least two disks, and you're going to partition them to similar sizes. You can have one partition bigger than the partition you're mirroring to, but that's just wasted disk space, because you're only going to mirror up to the minimum space that you have there. So you partition your drives, and when you do that, using fdisk, you must toggle the type of the partition and set it to RAID autodetect, which is normally just "fd", which is the flag used to mark it as a RAID autodetect partition.
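For example, a rough sketch of that partitioning step might look something like this (the device name /dev/sdb is just an example, not from the show; fdisk is interactive, so the letters below are the keystrokes you'd type at its prompt):

  fdisk /dev/sdb   # open the second disk in fdisk
    n              # create a new partition
    t              # change the partition type
    fd             # "Linux raid autodetect"
    w              # write the partition table and exit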
So now you've done the first part: you've partitioned the drives. That does not mean that you have set up the multi-disk device yet; you've only set up the partitions. Now you need to go and set up the actual multi-disk drive, and this is where the md0, md1 and md2 come in. So then, if you're using the installer, you say you want to set up a multi-disk RAID, and you add the partitions that you want to the RAID. If you're doing this at the command line after you've installed, it's much easier to understand conceptually: you use the mdadm create command, and you say, I'm going to create md0, and I'm going to add the following two devices to it. At creation time you tell it what kind of RAID this is: a RAID 1, which is mirroring, or a RAID 5, or a RAID 0. So at that point you tell it what type of RAID you're constructing. One thing to remember is that your boot partition can only be on a RAID 1 device, it can't be on any other RAID level, because your machine just won't boot; so stick to RAID 1 for your boot partition, and for other partitions you can use the other types of RAID that are available. Okay, so once that's done, you can then start installing your machine, or setting up your new partition, as you would any other partition like sda1, but instead using /dev/md0 and /dev/md1.
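As a rough sketch of what that create step looks like at the command line (the device names and filesystem here are illustrative examples, not from the show):

  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1   # RAID 1 mirror of two partitions
  mkfs.ext3 /dev/md0                                                       # put a filesystem on the new RAID device
  mount /dev/md0 /mnt/data                                                 # and mount it like any other partition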
Okay, once your RAID is set up and you've booted into your machine, you can see the status of your RAID devices by using the mdadm utility, the MD admin utility, or you can just go and cat the /proc/mdstat file and see what state your drives are in.
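For example (a minimal sketch; /dev/md0 is again just an illustrative device name):

  cat /proc/mdstat          # quick overview of all multi-disk devices and any rebuilds in progress
  mdadm --detail /dev/md0   # detailed status of one RAID device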
Now, typically if a drive fails, you will have to mark the drive as faulty using the mdadm utility, and then remove the drive. Then, when you add a new drive back in, the RAID will not automatically just start rebuilding onto it. First of all you have to partition the drive, mark the partitions as RAID autodetect partitions, and then basically add them back to the RAID device, and then the RAID will start rebuilding onto it. So it's not like some of the hardware RAIDs, where you just shove the disk in and away it goes and starts rebuilding itself; it takes a little bit of work for the rebuild to happen.
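A rough sketch of that replacement sequence (again, the device names are just examples):

  mdadm /dev/md0 --fail /dev/sdb1     # mark the failed partition as faulty
  mdadm /dev/md0 --remove /dev/sdb1   # remove it from the array
  # ...swap in the new drive, partition it and set the type to "fd" as before...
  mdadm /dev/md0 --add /dev/sdb1      # add the new partition; the rebuild then starts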
In my experience with mdadm, the software RAID is extremely robust, and it really can handle quite a bit of abuse, and your data is still fine, it doesn't get lost. So don't be too afraid of playing with it; obviously don't do this in a production environment, but in your lab, while you're getting used to it, really don't be afraid of actually trying it out and seeing if it works and how it works. Your data will still be there in most cases, you just have to mount it properly and you can then read the data off the drives.
So one thing I would like to repeat is that software RAID is quite flexible in how you architect a RAID solution for your machine, because you don't have to have all the partitions on one drive given over to redundancy or to your RAID. You can pick and choose partitions across your drives to use in a RAID. So you can have three disks in there, for example, with two added to the RAID, one completely dedicated to it and the other with only some of its space dedicated to it. Say, for example, you have one terabyte disk and two 500 gig disks: you can use the one 500 gig disk completely, mirror 500 gigs from the terabyte disk onto the other 500 gig disk, and still use the remaining 500 gigs on the terabyte disk as another partition. This is one of the great advantages of software RAID: its flexibility. Okay, so that's it for software RAID on Linux using the mdadm utility.
I'm now going to talk about the Logical Volume Manager, or LVM for short. The Logical Volume Manager essentially allows you to dynamically resize your disks, and to grow and shrink them as you need space. I mean, you've all come across this problem, whether as a home user or in business: you've got a terabyte disk or a 500 gig disk, and when you're partitioning your drive at setup time you wonder, you know, how much should I set aside; on this terabyte disk should I give 500 gigs to partition A and 500 gigs to partition B? And you're not quite sure which is going to run out of disk space sooner. This manifests itself later, when suddenly you've got like 300 gigs left on the one partition, but the other partition is now running out of space. So what do you do? Normally you then have to get another disk, get it partitioned, go and move data around manually, or set up extra partitions in your fstab file to mount at boot time, all those kinds of things. It's a really inefficient use of disk space, and it can also be time consuming to fix when you start running out of disk space. LVM solves these problems in a really elegant and efficient manner. What it allows you to do with your partitions, which the operating system sees as partitions, is to dynamically resize them: either grow them, or reduce them, so if you want to release disk space that isn't being used so another partition can use it, you can do that.
Okay, so once again, what's important is to have your conceptual model of how the operating system sees disks. As with software RAID, the operating system now sees what are called logical volumes as the partitions in which the operating system and all its data are stored. So another important point: instead of sda1 or sda2 or sda3, you now refer to devices under /dev, normally under the name of the volume group, something like /dev/vol0, and they're given names like logical volume 1, logical volume 2, logical volume 3. So those are the devices where the operating system sees the data being stored.
Okay, so when talking about LVM there are three concepts that you need to be aware of. One is the physical volume. A physical volume is either a partition on a disk, or a whole disk if you're going to dedicate the whole disk to the logical volume manager. Then you have the volume group: a volume group is a collection of physical volumes. And a logical volume is basically a partition over that volume group, so you can partition the volume group into different partitions. So it works in quite a logical way. You take your disk, your physical disk, and either you've got partitions on that disk or you use the whole disk, and you create a physical volume on it. If you're going to use partitions to add to the volume group, you need to toggle the partition type in fdisk to 8e, I think it is, which sets it up as a Linux LVM partition. Or, if you're going to use the whole disk, then the disk can simply remain with no partitions on it.
You then run the pvcreate command, which marks that partition or that disk as a physical volume. The physical volume is then added to a volume group. Now, a volume group is essentially like a space or a name holder; it allows you to add more than one disk or partition to the volume group. So if you've got, let's say, two disks that are 500 gigs each, you can add them to the volume group and essentially you have a one terabyte volume group. You then create partitions on top of the volume group, and they're called logical volumes. So you divide that up according to how you require your disks, your partitions, and how you want to set up your system. So in theory you can have, say, three disks, and you can add them to the volume group. Say you've got three 500 gig disks, so you've got 1.5 terabytes of space; but you want to see that as one 800 gig disk and one 700 gig disk, for example. Then you can do that: you just add the disks to the volume group, and then on the volume group you divide it up however you want it. This is the great thing about the Logical Volume Manager: it allows you to treat your disks as a resource, essentially as a pool of resources that you can allocate at will and use as you want. So once you've got your volumes set up, you can then install onto them.
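As a rough sketch of those steps at the command line (the device names, group name and sizes here are illustrative examples, not from the show):

  pvcreate /dev/sdb1 /dev/sdc1       # mark the partitions (type 8e) as physical volumes
  vgcreate vg0 /dev/sdb1 /dev/sdc1   # pool them into a volume group called vg0
  lvcreate -L 800G -n data vg0       # carve an 800 gig logical volume out of the pool
  mkfs.ext3 /dev/vg0/data            # put a filesystem on the logical volume
  mount /dev/vg0/data /srv/data      # and mount it like any other partition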
So what is the advantage? Let's say a couple of months down the road you start running out of disk space on your logical volume 1, let's say that's your root volume. You look at one of your other volumes and you realise that there are 200 gigs there that you can use. You can actually shrink that logical volume down, and you can increase your root volume by those 200 gigs. And this you can do on the system while it's live: you don't have to reboot the machine, you don't have to bring the machine down, you don't have to basically move data around and start re-partitioning and backing up and copying data all over the place. So the point is, it really is an efficient use of the resources that you have available to you. You can also, of course, if you're running out of disk space, always add another disk: you can add it in, create a physical volume on it, add it to the existing volume group, and then expand your logical volume to take up that space. So it really makes disk allocation a dynamic affair. You don't have to worry too much about planning it and getting it right up front, or about having all this wasted disk space lying around that could be better and more efficiently used elsewhere.
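A rough sketch of growing a logical volume into free space in the group (the names are illustrative; note that growing an ext filesystem can usually be done while it is mounted, but shrinking one generally requires unmounting it first):

  lvextend -L +200G /dev/vg0/root   # grow the logical volume by 200 gigs
  resize2fs /dev/vg0/root           # then grow the ext filesystem to fill the new space
  # to shrink, do it in the reverse order: shrink the filesystem first, then lvreduce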
One thing to bear in mind when using the Logical Volume Manager on your system: your boot partition cannot be an LVM partition. Your boot partition effectively must be a conventional, normal partition, otherwise your machine will not boot. Besides the ability to dynamically resize partitions and disks, and to allocate your disk resources as a pool of disk space rather than as individual disks and partitions, LVM also allows you to do
a thing called snapshots. So you can say, for example, I want a snapshot of my disks as they are at this point in time, and it then makes a copy of them; and what LVM does is a thing called copy-on-write: all changes since the time you took your snapshot get written to a separate part of the disk. So why would you want to take a snapshot of your disk? This is great for backups. Typically one of the hardest things to do when you're trying to take a backup of a large system is that if you wanted to do, like, a cloning of it, you'd have to take the machine down, take it offline, and then clone it. With a snapshot, you can take the snapshot, it will stay static, and you can then take a consistent backup of that snapshot. Once you're finished doing the backups, you can then delete the snapshot and release the space back to the machine. And really, I cannot overemphasise how useful this is in production environments, which cannot have any downtime, when you need to take backups of the state of your disks. So that is really another one of the key advantages of LVM.
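A minimal sketch of a snapshot-based backup (the volume and mount point names are illustrative):

  lvcreate -s -L 10G -n root_snap /dev/vg0/root   # copy-on-write snapshot of the root volume
  mount -o ro /dev/vg0/root_snap /mnt/snap        # mount the static snapshot read-only
  tar czf /backup/root.tar.gz -C /mnt/snap .      # take a consistent backup from it
  umount /mnt/snap
  lvremove /dev/vg0/root_snap                     # delete the snapshot and release the space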
So that covers LVM pretty quickly. As I say, to get used to LVM and RAID, the hard part, as with the multi-disk RAID, is getting the conceptual model in your mind worked out. After that, it's pretty easy to understand what the utilities are doing and how they're doing it. Now, what's great with LVM and RAID is that you can use these two in combination. So what you can do is treat your disks as a pool of resources and have redundancy and speed improvements at the same time. And this is what's great about the Linux approach to solving problems: you build small modules that each deal with a particular problem, and then you combine them together to build up a solution which is bigger than the sum of its parts.
So what you can do, for example, is set up RAID 1 devices md0, md1, md2, and you add those, rather than the physical partitions, to your volume group, and then you basically allocate your logical volumes over that. So now you can have hot swappable drives, you can have redundancy, you can have speed enhancements, and you can have the ability to dynamically resize your disks.
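As a rough sketch of that combination (the device and group names are illustrative):

  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1   # a mirrored pair
  pvcreate /dev/md0                                                        # use the RAID device itself as the physical volume
  vgcreate vg0 /dev/md0                                                    # pool it into a volume group
  lvcreate -L 100G -n home vg0                                             # and carve logical volumes out of that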
And the benefits of that, as well as being able to take backups with snapshots and everything, really make it worthwhile for any business or any home user considering using it to put in the effort of investigating how those two utilities work, and to spend some time and effort getting to know them. Okay, that's it for now. I hope everybody got something out of this tutorial, or out of this episode. If there are any errata, which I'm sure there are, please feel free to let me know and correct my errors, and I'll put them up in the show notes if there are any errors or corrections that come in from any listeners. Thank you for listening and hope to see you again soon. Bye.