Episode: 316
Title: HPR0316: Raid LVM
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr0316/hpr0316.mp3
Transcribed: 2025-10-07 16:07:06

---
[music]
Hi everybody, just a quick note before the episode begins. We recorded this episode and then we went to Jacobabia Gradyo and saw that Kevin Benker has released an episode on LVM2, episode 314. He deals with LVM2 in great detail in his episode, and I suggest all listeners who are interested also listen to that show. In this show I deal with LVM at a much higher level, and it's at the end of the show, so if anybody doesn't want to hear LVM2 mentioned again, they can just ignore the last part of the show. Thank you very much, and now on with the show.

Hello everybody, it's Mark Clark here and welcome to another episode of Jacobabia Gradyo. This episode I'm going to talk about RAID and the Logical Volume Manager, LVM, and how these can be used by both home users and small businesses to great advantage.
First I'll deal with RAID before moving on to the Logical Volume Manager, or LVM. So the first question is: what is RAID? RAID stands for redundant array of inexpensive disks. Now there are a couple of principles behind RAID which you need to know in order to understand how RAID works. One is that RAID provides redundancy. This means that if one disk goes, your system keeps on operating. It might be in a reduced state, but it keeps operating and your data will be safe. The other is that it also provides speed improvements. This only comes in when you talk about striping: basically you have two disks reading at the same time, feeding data through to the CPU, and obviously you get higher throughput. These two principles are combined in different ways to come up with the categories of how RAID is defined. Typically there are three well-known categories of RAID: RAID 0, RAID 1 and RAID 5. RAID 0 was the first RAID that came out, and basically what it did was to stripe the data across the disks. So as I was saying, you're reading data off two disks at the same time when you're reading it into memory, so it's a lot faster and you get the speed improvement. There's no redundancy in RAID 0, it's just got striping. RAID 1 came next, also known as mirroring, where basically two drives are mirrored. So what happens is there are essentially two disks, but only one disk is usable from the operating system's point of view; the other disk is basically a copy of the data. Should the one disk fail, the other disk can take over and the machine can continue to operate in a reduced state. Then came RAID 5, and RAID 5 combines these two principles. You have striping, and you need at least three disks for RAID 5. For mirroring obviously you need just the two disks. For RAID 5 you have three disks, and the data is striped across the three disks, as well as what they call parity data being put across the three disks. So if one disk fails, the other two can continue to operate in a reduced state. This way you get the redundancy of mirroring and you also get the throughput gains. The nice thing about RAID 5 is that as you keep adding disks to the RAID, you still only lose one: say you've got five disks in there, you're still only losing one disk's worth of space. So as a ratio of space lost to redundancy gained, RAID 5 is quite scalable from that point of view.
Now, you should know there are many other RAID levels out there these days, like RAID 6 and RAID 10 (or RAID 1+0, depending on how you call it). These are basically just more permutations of the above. So if you want to find out more about the newer, more advanced sorts of RAID out there, you can go to Wikipedia and have a look at where they've got them documented. But those are subject to change, and the first three I outlined are the well-known ones, which you'll hear people talk about when you're talking to them about RAID.
Another aspect of RAID that you might come across is hot swappable. Now what does hot swappable mean? Hot swappable means that there is a spare drive sitting in your RAID, typically in a RAID 5, so that when one drive fails the spare will automatically take over from it and the RAID will start rebuilding onto the spare drive. It also sometimes refers to the fact that you can hot swap a drive: you don't have to turn the machine off when one of your drives fails, you remove it from the RAID, plug in a new drive, add it to the RAID and away you go, without having to bring down your machine. So these are things you will also come across when people talk about RAID. The utility on Linux that we're going to be using, mdadm, the multi-disk software RAID, allows you to do this as well, especially with SATA drives. You can use it on IDE drives as well, but the hot swap component isn't recommended with IDE drives. Okay, so one of the things you'll come across when dealing with RAID are the different
types of RAID: there's hardware RAID, there's software RAID, and there's something which is called fake RAID. Now, hardware RAID you'll find in higher-end servers and other machines; essentially it's a piece of hardware that handles the RAID itself, so the operating system is completely unaware of it, it just sees one disk. I won't really be dealing with that in this episode, I'll be dealing more with software RAID. Software RAID is where the RAID is done in the operating system itself, at operating system level. Okay, and this requires some CPU time and resources, and that's why sometimes on the larger machines you use hardware RAID, to offload the work to the dedicated hardware controller, the disk controller. Another thing which you hear about these days is called fake RAID. Now fake RAID: on most modern motherboards with SATA drives on them, they will let you configure RAID at the BIOS level. This is called fake RAID because essentially it's software RAID handled in the BIOS. And, you know, most people recommend that you do not use this when you can use the software RAID that comes with Linux. The only argument I've come across for using this kind of built-in BIOS-level RAID is when you dual boot with Windows and want RAID on the Windows partition. That's something which I never do, so it's not something which I've tried out.
Okay, so let's have a look at some of the benefits of RAID. I mean, some of this is quite obvious in terms of speed improvement and redundancy, but it's not only businesses that need to look at these things or that can get an advantage from them. Home users can also benefit, because one thing that Linux software RAID allows you to do is use only certain partitions of two drives in a RAID configuration. So essentially you don't have to lose an entire disk to your RAID setup. If, for example, you only want to have redundancy on your home directory, and everything else can just be on a normal partition because you can reinstall it by popping in a new disk and reinstalling the machine, then you can do that. And I'll cover that when I talk about mdadm RAID, or multi-disk RAID. For businesses as well, why would you want to use software RAID as opposed to buying a server with a RAID controller in it? Obviously the hardware is much more expensive when you have a built-in controller. So for smaller businesses, SMEs, if you want some redundancy, you want some of the benefits of RAID, but you don't want to pay for the more expensive machine, then you should definitely consider using software RAID. And the big thing, obviously, for small businesses as well is that you can have little downtime with any kind of RAID, but with software RAID as well. So for example, when one of your disks dies you don't have to have your machine down, you can continue to operate, you can continue to provide services to your staff and your customers, and then you can replace the disks, you know, when the appropriate maintenance time arrives or as soon as you can; it's always better to replace as soon as possible. But you only have to take the machine offline when you actually replace the drive as such.
One last thing I want to talk about before actually going into multi-disk RAID is linear RAID. Now linear RAID, basically, you can use it with the software RAID in Linux, and it makes two drives look like one big disk. However, I don't suggest people use software RAID for this; LVM is a far better solution for combining disks into one larger disk.
Okay, so let's start looking at software RAID. The utility in Linux is called mdadm, that's MD ADM, and basically it's how you administer the software RAID, which is also known as MD, which stands for multi-disk. I'm not quite sure how you would pronounce it, so I'll refer to it as MD admin, the utility, throughout this episode. Now one of the key things to understand when trying to get your mind around how RAID works in Linux is to have a conceptual model that enables you to understand what's going on. Because most of us, when we approach RAID, like myself, are used to physical drives with physical partitions, and then we tell the Linux operating system that this is the physical partition it's going to use. What happens with software RAID is that the partition the operating system sees is actually abstracted away from the physical disks, and this is an important concept to keep in mind. So for example, your operating system won't be writing to, like, sda1 or sda2, it will be writing to the multi-disk devices, and those look like /dev/md0, /dev/md1. So essentially think of those now as playing the same role, from a mounting point of view, as sda1 and sda2 did, for example, when you're using Linux without RAID. So conceptually, you're going to create these new fake block drives for the operating system, and this is where the mdadm utility comes in. Essentially you tell it: this drive, the md0, md1, md2, is made up of the following physical hard disk partitions. So with that in mind, what you'll do is you'll take your hard disk, you'll partition it, and then you'll add the partitions to the multi-disk drive, and then the operating system just sees a multi-disk drive. So why is this conceptual model important?
It comes into play, for example, when you take a multi-disk drive out of one machine and you want to mount it on another machine. You cannot mount that partition directly as sda1. Well, you can try, but essentially you'll probably destroy the data on that disk, because it has to first be added to a RAID, because you've got a whole lot of RAID information on it, and then you mount the RAID device, the md0. So typically if a machine comes in and for some reason it's not booting or something's wrong with it, and you need to get the data off that disk, what you'll do, if this is a RAID 1, is take the one surviving RAID 1 drive out, create a multi-disk drive from it, and then mount the multi-disk device under Linux, and then you can start reading the data off of it. So for example, you shouldn't run fsck directly on sda1 if it's part of a multi-disk RAID; you have to assemble and mount it as md0, and then run the check on that. So I'm not sure if I explained that very well, but you know, what I do find is that the software is actually very simple to understand once you understand conceptually how it works; but it's getting that conceptual model in your mind sorted out, so that you understand what's going on at the operating system level, that's very important. So that's why I've laboured the point a little bit there, and I hope it's clear. You know, I did my best to try and explain it, but hopefully it's given you enough so that you can go on and build your own conceptual model of how this works.
So I'm going to go on to the steps of how to create software RAID in Linux. I'm not going to read out the commands as you'd type them at the screen, because I find that doesn't work that well in podcast format, but at least I'll guide you a bit, and I'll also provide some links in the show notes to online tutorials that will explain it better for you. One thing to note is that software RAID is very, very robust; you don't have to be afraid of it. The key thing is just to start using it. Once you start using it, you'll be amazed at how robust this software really is, and how great it is for actually getting RAID very cheaply into your infrastructure.
The first time you'd set up RAID is usually during the installation process of your distribution. Distributions differ when you get to the partitioning screen, and some of them don't really expose the possibility of RAID very well. I found the Ubuntu installer much easier to use than the Red Hat installer; the Red Hat installer was very confusing for setting up RAID, but the steps are basically the same. When you're setting up RAID, whether you do it at installation time or later when you're adding another partition to your system, first of all you have to partition the hard drive as you normally do: sda1, sda2, sda3. What you need to do then, for the various partitions that you want to use as RAID, is this. You're always going to have at least two disks, and you're going to partition them to similar sizes. You can have one partition bigger than the partition you're mirroring to, but that's just wasted disk space, because you're only going to mirror up to the minimum space that you have there. So you partition your drives, and when you do that, using fdisk, you must toggle the type of the partition and set it to RAID autodetect, which is normally just "fd", which is the flag used to mark it as a RAID autodetect partition.
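For example, a rough sketch of that partitioning step might look something like this (the device name /dev/sdb is just an example, not from the show; fdisk is interactive, so the letters below are the keystrokes you'd type at its prompt):

  fdisk /dev/sdb   # open the second disk in fdisk
    n              # create a new partition
    t              # change the partition type
    fd             # "Linux raid autodetect"
    w              # write the partition table and exit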
So now you've done the first part: you've partitioned the drives. That does not mean that you have set up the multi-disk device yet; you've only set up the partitions. Now you need to go and set up the actual multi-disk drive, and this is where the md0, md1 and md2 come in. So then, if you're using the installer, you say you want to set up a multi-disk RAID, and you add the partitions that you want to the RAID. If you're doing this at the command line after you've installed, it's much easier to understand conceptually: you use the mdadm create command, and you say, I'm going to create md0, and I'm going to add the following two devices to it. At creation time you tell it what kind of RAID this is: a RAID 1, which is mirroring, or a RAID 5, or a RAID 0. So at that point you tell it what type of RAID you're constructing. One thing to remember is that your boot partition can only be on a RAID 1 device, it can't be on any other RAID level, because your machine just won't boot; so stick to RAID 1 for your boot partition, and for other partitions you can use the other types of RAID that are available. Okay, so once that's done, you can then start installing your machine, or setting up your new partition, as you would any other partition like sda1, but instead using /dev/md0 and /dev/md1.
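As a rough sketch of what that create step looks like at the command line (the device names and filesystem here are illustrative examples, not from the show):

  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1   # RAID 1 mirror of two partitions
  mkfs.ext3 /dev/md0                                                       # put a filesystem on the new RAID device
  mount /dev/md0 /mnt/data                                                 # and mount it like any other partition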
Okay, once your RAID is set up and you've booted into your machine, you can see the status of your RAID devices by using the mdadm utility, the MD admin utility, or you can just go and cat the /proc/mdstat file and see what state your drives are in.
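For example (a minimal sketch; /dev/md0 is again just an illustrative device name):

  cat /proc/mdstat          # quick overview of all multi-disk devices and any rebuilds in progress
  mdadm --detail /dev/md0   # detailed status of one RAID device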
Now, typically if a drive fails, you will have to mark the drive as faulty using the mdadm utility, and then remove the drive. Then, when you add a new drive back in, the RAID will not automatically just start rebuilding onto it. First of all you have to partition the drive, mark the partitions as RAID autodetect partitions, and then basically add them back to the RAID device, and then the RAID will start rebuilding onto it. So it's not like some of the hardware RAIDs, where you just shove the disk in and away it goes and starts rebuilding itself; it takes a little bit of work for the rebuild to happen.
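A rough sketch of that replacement sequence (again, the device names are just examples):

  mdadm /dev/md0 --fail /dev/sdb1     # mark the failed partition as faulty
  mdadm /dev/md0 --remove /dev/sdb1   # remove it from the array
  # ...swap in the new drive, partition it and set the type to "fd" as before...
  mdadm /dev/md0 --add /dev/sdb1      # add the new partition; the rebuild then starts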
In my experience with mdadm, the software RAID is extremely robust, and it really can handle quite a bit of abuse, and your data is still fine, it doesn't get lost. So don't be too afraid of playing with it; obviously don't do this in a production environment, but in your lab, while you're getting used to it, really don't be afraid of actually trying it out and seeing if it works and how it works. Your data will still be there in most cases, you just have to mount it properly and you can then read the data off the drives.
So one thing I would like to repeat is that software RAID is quite flexible in how you architect a RAID solution for your machine, because you don't have to have all the partitions on one drive given over to redundancy or to your RAID. You can pick and choose partitions across your drives to use in a RAID. So you can have three disks in there, for example, with two added to the RAID, one completely dedicated to it and the other with only some of its space dedicated to it. Say, for example, you have one terabyte disk and two 500 gig disks: you can use the one 500 gig disk completely, mirror 500 gigs from the terabyte disk onto the other 500 gig disk, and still use the remaining 500 gigs on the terabyte disk as another partition. This is one of the great advantages of software RAID: its flexibility. Okay, so that's it for software RAID on Linux using the mdadm utility.
I'm now going to talk about the Logical Volume Manager, or LVM for short. The Logical Volume Manager essentially allows you to dynamically resize your disks, and to grow and shrink them as you need space. I mean, you've all come across this problem, whether as a home user or in business: you've got a terabyte disk or a 500 gig disk, and when you're partitioning your drive at setup time you wonder, you know, how much should I set aside; on this terabyte disk should I give 500 gigs to partition A and 500 gigs to partition B? And you're not quite sure which is going to run out of disk space sooner. This manifests itself later, when suddenly you've got like 300 gigs left on the one partition, but the other partition is now running out of space. So what do you do? Normally you then have to get another disk, get it partitioned, go and move data around manually, or set up extra partitions in your fstab file to mount at boot time, all those kinds of things. It's a really inefficient use of disk space, and it can also be time consuming to fix when you start running out of disk space. LVM solves these problems in a really elegant and efficient manner. What it allows you to do with your partitions, which the operating system sees as partitions, is to dynamically resize them: either grow them, or reduce them, so if you want to release disk space that isn't being used so another partition can use it, you can do that.
Okay, so once again, what's important is to have your conceptual model of how the operating system sees disks. As with software RAID, the operating system now sees what are called logical volumes as the partitions in which the operating system and all its data are stored. So another important point: instead of sda1 or sda2 or sda3, you now refer to devices under /dev, normally under the name of the volume group, something like /dev/vol0, and they're given names like logical volume 1, logical volume 2, logical volume 3. So those are the devices where the operating system sees the data being stored.
Okay, so when talking about LVM there are three concepts that you need to be aware of. One is the physical volume. A physical volume is either a partition on a disk, or a whole disk if you're going to dedicate the whole disk to the logical volume manager. Then you have the volume group: a volume group is a collection of physical volumes. And a logical volume is basically a partition over that volume group, so you can partition the volume group into different partitions. So it works in quite a logical way. You take your disk, your physical disk, and either you've got partitions on that disk or you use the whole disk, and you create a physical volume on it. If you're going to use partitions to add to the volume group, you need to toggle the partition type in fdisk to 8e, I think it is, which sets it up as a Linux LVM partition. Or, if you're going to use the whole disk, then the disk can simply remain with no partitions on it.
You then run the pvcreate command, which marks that partition or that disk as a physical volume. The physical volume is then added to a volume group. Now, a volume group is essentially like a space or a name holder; it allows you to add more than one disk or partition to the volume group. So if you've got, let's say, two disks that are 500 gigs each, you can add them to the volume group and essentially you have a one terabyte volume group. You then create partitions on top of the volume group, and they're called logical volumes. So you divide that up according to how you require your disks, your partitions, and how you want to set up your system. So in theory you can have, say, three disks, and you can add them to the volume group. Say you've got three 500 gig disks, so you've got 1.5 terabytes of space; but you want to see that as one 800 gig disk and one 700 gig disk, for example. Then you can do that: you just add the disks to the volume group, and then on the volume group you divide it up however you want it. This is the great thing about the Logical Volume Manager: it allows you to treat your disks as a resource, essentially as a pool of resources that you can allocate at will and use as you want. So once you've got your volumes set up, you can then install onto them.
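As a rough sketch of those steps at the command line (the device names, group name and sizes here are illustrative examples, not from the show):

  pvcreate /dev/sdb1 /dev/sdc1       # mark the partitions (type 8e) as physical volumes
  vgcreate vg0 /dev/sdb1 /dev/sdc1   # pool them into a volume group called vg0
  lvcreate -L 800G -n data vg0       # carve an 800 gig logical volume out of the pool
  mkfs.ext3 /dev/vg0/data            # put a filesystem on the logical volume
  mount /dev/vg0/data /srv/data      # and mount it like any other partition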
So what is the advantage? Let's say a couple of months down the road you start running out of disk space on your logical volume 1, let's say that's your root volume. You look at one of your other volumes and you realise that there are 200 gigs there that you can use. You can actually shrink that logical volume down, and you can increase your root volume by those 200 gigs. And this you can do on the system while it's live: you don't have to reboot the machine, you don't have to bring the machine down, you don't have to basically move data around and start re-partitioning and backing up and copying data all over the place. So the point is, it really is an efficient use of the resources that you have available to you. You can also, of course, if you're running out of disk space, always add another disk: you can add it in, create a physical volume on it, add it to the existing volume group, and then expand your logical volume to take up that space. So it really makes disk allocation a dynamic affair. You don't have to worry too much about planning it and getting it right up front, or about having all this wasted disk space lying around that could be better and more efficiently used elsewhere.
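A rough sketch of growing a logical volume into free space in the group (the names are illustrative; note that growing an ext filesystem can usually be done while it is mounted, but shrinking one generally requires unmounting it first):

  lvextend -L +200G /dev/vg0/root   # grow the logical volume by 200 gigs
  resize2fs /dev/vg0/root           # then grow the ext filesystem to fill the new space
  # to shrink, do it in the reverse order: shrink the filesystem first, then lvreduce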
One thing to bear in mind when using the Logical Volume Manager on your system: your boot partition cannot be an LVM partition. Your boot partition effectively must be a conventional, normal partition, otherwise your machine will not boot. Besides the ability to dynamically resize partitions and disks, and to allocate your disk resources as a pool of disk space rather than as individual disks and partitions, LVM also allows you to do
a thing called snapshots. So you can say, for example, I want a snapshot of my disks as they are at this point in time, and it then makes a copy of them; and what LVM does is a thing called copy-on-write: all changes since the time you took your snapshot get written to a separate part of the disk. So why would you want to take a snapshot of your disk? This is great for backups. Typically one of the hardest things to do when you're trying to take a backup of a large system is that if you wanted to do, like, a cloning of it, you'd have to take the machine down, take it offline, and then clone it. With a snapshot, you can take the snapshot, it will stay static, and you can then take a consistent backup of that snapshot. Once you're finished doing the backups, you can then delete the snapshot and release the space back to the machine. And really, I cannot overemphasise how useful this is in production environments, which cannot have any downtime, when you need to take backups of the state of your disks. So that is really another one of the key advantages of LVM.
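A minimal sketch of a snapshot-based backup (the volume and mount point names are illustrative):

  lvcreate -s -L 10G -n root_snap /dev/vg0/root   # copy-on-write snapshot of the root volume
  mount -o ro /dev/vg0/root_snap /mnt/snap        # mount the static snapshot read-only
  tar czf /backup/root.tar.gz -C /mnt/snap .      # take a consistent backup from it
  umount /mnt/snap
  lvremove /dev/vg0/root_snap                     # delete the snapshot and release the space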
So that covers LVM pretty quickly. As I say, to get used to LVM and RAID, the hard part, as with the multi-disk RAID, is getting the conceptual model in your mind worked out. After that, it's pretty easy to understand what the utilities are doing and how they're doing it. Now, what's great with LVM and RAID is that you can use these two in combination. So what you can do is treat your disks as a pool of resources and have redundancy and speed improvements at the same time. And this is what's great about the Linux approach to solving problems: you build small modules that each deal with a particular problem, and then you combine them together to build up a solution which is bigger than the sum of its parts.
So what you can do, for example, is set up RAID 1 devices md0, md1, md2, and you add those, rather than the physical partitions, to your volume group, and then you basically allocate your logical volumes over that. So now you can have hot swappable drives, you can have redundancy, you can have speed enhancements, and you can have the ability to dynamically resize your disks.
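As a rough sketch of that combination (the device and group names are illustrative):

  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1   # a mirrored pair
  pvcreate /dev/md0                                                        # use the RAID device itself as the physical volume
  vgcreate vg0 /dev/md0                                                    # pool it into a volume group
  lvcreate -L 100G -n home vg0                                             # and carve logical volumes out of that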
And the benefits of that, as well as being able to take backups with snapshots and everything, really make it worthwhile for any business or any home user considering using it to put in the effort of investigating how those two utilities work, and to spend some time and effort getting to know them. Okay, that's it for now. I hope everybody got something out of this tutorial, or out of this episode. If there are any errata, which I'm sure there are, please feel free to let me know and correct my errors, and I'll put them up in the show notes if there are any errors or corrections that come in from any listeners. Thank you for listening and hope to see you again soon. Bye.