Episode: 1032
Title: HPR1032: LiTS 011: du - disk usage
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr1032/hpr1032.mp3
Transcribed: 2025-10-17 17:42:01

---

Welcome to Linux in the Shell, Episode 11: the du command. My name is Dann Washko, I will be your host today, and I would like to remind you that this show is hosted by Hacker Public Radio, and if you like this show, please head over to Hacker Public Radio and see how you can support the overall project. It's a fantastic project. Also, this audio component is a supplement to the write-up on the website. For the full skinny on the du command, head over to linuxintheshell.org and check out episode 11's entry.

Well, let's get into it: the du command. Basically, du stands for disk usage, and it displays the usage of all the files or directories in a certain location. By default, it displays those in blocks, kilobyte blocks to be specific. So if you were to execute the du command in your home directory, it's going to show you, line by line, each directory and how many kilobyte blocks it's taking up, recursing all the way through the sub-directories. The files inside are counted in those figures, and adding the -a option lists each file on its own line as well. Then at the end, when it's finished, you get a sum total of all the blocks: how many blocks are being utilized by that directory.

You can change the block size from kilobytes, where a kilobyte equals 1,024 bytes, to just about any other acceptable value: megabytes, gigabytes, terabytes, petabytes, you name it. Very much like you can with the df command, you just pass it the -B option and then specify the label for what you want. So for instance, -BK is the default, kilobytes. Now, you can also specify -k for kilobytes, or --block-size=K. So -B is the same thing as --block-size= and then the value. You can do -m for megabytes, but you could also do -BM, or --block-size=M. Now for gigabytes, terabytes, petabytes, exabytes, zettabytes, and so on, there isn't a single-dash equivalent. It's -B and then G for gigabytes, T for terabytes, P for petabytes, E for exabytes, and so on, or --block-size= and the equivalent letter. Now keep in mind, much like the df command, if you pass -BT for terabytes and you don't have a terabyte drive, more than likely what it's going to show is a value of one for being used. There's not a full terabyte's worth of blocks being used, but there is some value, and because there is some value, it shows one.

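As a rough sketch of the block-size options just described, assuming GNU du and dd: the scratch directory and the file name sample are made up for illustration, and /dev/zero is only a convenient source of data.

```shell
# Scratch directory so the example does not touch your real files.
demo=$(mktemp -d)

# Create a 3 MiB file (3072 blocks of 1024 bytes).
dd if=/dev/zero of="$demo/sample" bs=1024 count=3072 2>/dev/null

du -k "$demo/sample"               # kilobyte blocks, the default unit
du -BK "$demo/sample"              # same unit, number carries a K suffix
du -m "$demo/sample"               # megabyte blocks
du --block-size=M "$demo/sample"   # same unit, number carries an M suffix
du -BG "$demo/sample"              # gigabyte blocks, rounded up to 1G

# --apparent-size is added here only so the captured numbers do not
# depend on how the file system allocates blocks.
size_m=$(du -m --apparent-size "$demo/sample" | cut -f1)
size_g=$(du -BG --apparent-size "$demo/sample" | cut -f1)
echo "megabytes: $size_m, gigabytes: $size_g"
```

The last line illustrates the rounding mentioned above: a 3 MiB file still reports as one whole gigabyte block when the unit is far larger than the file.
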
If you try to specify something like petabytes, exabytes, or zettabytes, you might get an error message that comes back and says the -B argument, capital Z or capital E or P, whatever you have passed, is too large. So keep that in mind: you want to keep it relative to what the size of those files is actually going to be. You can circumvent this on most newer versions of du with the -h, or human-readable, option, which is one of my favorites. So if you were to do du -h, it puts the output in human-readable form, and like the other ones, df and free, it kind of truncates it to the nearest three-digit value. So if it's under a meg, it's probably going to show it in kilobytes. If it's over a meg but under a gigabyte, it's going to show it in megabytes. And once it reaches over a gigabyte, it's going to show it in gigabytes, as opposed to megabytes. So -h can be very handy for how the output is displayed.

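A small sketch of the human-readable option next to an oversized unit, assuming GNU du and dd. The file names small and large are invented for the demo.

```shell
demo=$(mktemp -d)

# One file under a megabyte, one over.
dd if=/dev/zero of="$demo/small" bs=1024 count=500  2>/dev/null
dd if=/dev/zero of="$demo/large" bs=1024 count=5120 2>/dev/null

du -h "$demo/small" "$demo/large"   # unit chosen per file (K, M, G, ...)
du -BT "$demo/large"                # terabyte blocks: rounds up to 1T

# Apparent sizes are captured so the values are the same on any file system.
small_h=$(du -h --apparent-size "$demo/small" | cut -f1)
large_h=$(du -h --apparent-size "$demo/large" | cut -f1)
echo "small: $small_h, large: $large_h"
```

With -h each line picks its own unit, so the two files can be compared at a glance without guessing a block size up front.
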
Now, when you execute the du command on its own, it starts in the current directory and reports on everything in that directory, recursing through all the sub-directories. You can pass it a file or a directory and get the value for just that file or directory. Now, remember, if you pass it a directory, it's going to recurse into each individual sub-directory and provide you with that information, whereas if you pass it a file, it's only going to provide you the information for that file. When it's finished on a directory, it will show you a total on the last line: the value of that directory in kilobytes.

Now, if you wanted to find out how much space, how many blocks, your home directory is taking up, you can pass it the -s or --summarize option, which, instead of showing you each individual entry and how much space it takes, forgoes all that and just shows you the total amount of blocks taken up by that location. It can be a file or it can be a directory. For instance, du -sm /home/dwashko would show the total amount of space, in megabyte blocks, taken up by my home directory: not each individual entry, just the overall total.

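To make the difference between the full listing and the summary concrete, here is a minimal sketch assuming GNU du. The directory layout (docs, docs/old, pics) is hypothetical.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/docs/old" "$demo/pics"
echo "notes" > "$demo/docs/notes.txt"
echo "draft" > "$demo/docs/old/draft.txt"
echo "image" > "$demo/pics/cat.png"

du "$demo"       # one line per directory, the total for $demo comes last
du -s "$demo"    # only the total
du -sm "$demo"   # only the total, in megabyte blocks

# Count the output lines of each form.
full_lines=$(du "$demo" | wc -l)
summary_lines=$(du -s "$demo" | wc -l)
echo "full listing: $full_lines lines, summary: $summary_lines line"
```
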
You can also limit the directory recursion with the -d or --max-depth= option and some value. That would be a numeric value, from zero up to however many directories deep you actually want to go. If you specify zero, that means don't recurse into any sub-directory; it's essentially the same thing as passing the -s, or summarize, option for that location. Whereas if you were to pass one, it's only going to go one level down: you will see a total for each sub-directory at that first level, but not a separate line for what's in the second level of sub-directories, just their amounts rolled up into the totals. And then you'll get the overall total at the bottom again. So be aware of that; that's how max depth operates.

There's an option to exclude files with --exclude= and then you could pass it a file name if you wanted to exclude just one file, or you could pass it a wildcard like *.txt, which would exclude any text files that it finds in there, or *.png to exclude any PNG files from the totals.

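A short sketch of depth limiting and pattern exclusion, assuming GNU du. The tree under the scratch directory is invented, and --apparent-size with a one-byte block size is used only to get numbers that are easy to compare.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/a/deep/deeper" "$demo/b"
echo "text" > "$demo/a/notes.txt"
printf 'png!' > "$demo/b/pic.png"    # exactly 4 bytes

du -d 0 "$demo"            # a single line, like du -s
du --max-depth=1 "$demo"   # $demo/a and $demo/b, then the overall total

depth0=$(du -d 0 "$demo" | wc -l)
depth1=$(du -d 1 "$demo" | wc -l)

# Totals in bytes with and without the PNG file.
all_bytes=$(du -s --apparent-size --block-size=1 "$demo" | cut -f1)
no_png=$(du -s --apparent-size --block-size=1 --exclude='*.png' "$demo" | cut -f1)
saved=$(( all_bytes - no_png ))
echo "excluding *.png removes $saved bytes from the total"
```

Quoting the pattern keeps the shell from expanding *.png itself before du sees it.
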
The other option is -X or --exclude-from= and then you pass it a file, and in that file you list all the files that you want to exclude from being processed by the du command. Now, again, you can specify not just files but patterns, so if you wanted to exclude all image files, like *.png, *.jpg, *.gif, you could put those into a file, pass that to the du command, and it would exclude all of them.

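The exclude file might look like the sketch below, assuming GNU du. The pattern file and the sample file names are hypothetical, and byte-sized apparent totals are used so the effect is easy to see.

```shell
demo=$(mktemp -d)
patterns=$(mktemp)    # kept outside $demo so it is not counted itself

# Three 4-byte "images" and one file that should still be counted.
printf 'aaaa' > "$demo/one.png"
printf 'bbbb' > "$demo/two.jpg"
printf 'cccc' > "$demo/three.gif"
printf 'keep' > "$demo/notes.txt"

# One pattern per line.
cat > "$patterns" <<'EOF'
*.png
*.jpg
*.gif
EOF

all_bytes=$(du -s --apparent-size --block-size=1 "$demo" | cut -f1)
filtered=$(du -s --apparent-size --block-size=1 -X "$patterns" "$demo" | cut -f1)
skipped=$(( all_bytes - filtered ))
echo "patterns in $patterns skipped $skipped bytes"
```
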
So it's pretty flexible as to what you can do. Now, if you listened to what I have been saying, du displays its output in blocks. I didn't say file size, I said blocks, and there's a difference there. It's an important difference, because I believe it's what differentiates the df command from the du command: how stuff is reported, and that's in blocks.

Now, what a block is: your file system is separated, or formatted, into blocks of data, and typically the block size for your file system is going to be roughly 4K. 4K is equivalent to 4,096 bytes, and 4K blocks are typically what most distributions format with on the ext4, ext3, and ext2 file systems. You can check what your block size is, if it's an ext file system, by running dumpe2fs on your device. For instance, I have sda2 for my root file system and sda5 for my home file system here. I can do dumpe2fs /dev/sda5, pipe it to grep and, in double quotes, "Block size", with a capital B, and it will show me what my block size is on that file system. Now, chances are you might have to run that through sudo or run it as root, not as a standard user, but that will report the block size.

And then understand what that means. A block can have data or not. If the block has data in it, it's full to du; if a block has no data, it's empty. But if a block is half full, it's still considered a block of used space to the file system and to du.

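The block-size check could be wrapped up roughly as follows. This is a sketch that assumes the e2fsprogs dumpe2fs tool and uses /dev/sda5 purely as the example device from the episode; substitute your own ext2, ext3, or ext4 partition.

```shell
# Example device only; yours will likely differ.
dev=/dev/sda5

if [ -b "$dev" ] && command -v dumpe2fs >/dev/null 2>&1; then
    # -h restricts dumpe2fs to the superblock summary, where the block size lives.
    report=$(dumpe2fs -h "$dev" 2>/dev/null | grep "Block size" || true)
    # Empty output usually means the device could not be read as this user.
    [ -n "$report" ] || report="no output for $dev (try again as root, for example with sudo)"
else
    report="skipped: $dev is not a block device here, or dumpe2fs is not installed"
fi
echo "$report"
```

The guard keeps the snippet harmless on machines where that device does not exist.
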
Now, if you have a file that is 4 kilobytes in size, that's one block, and du is going to report that as one block's worth, which on a typical 4K-block file system shows as 4 in its default kilobyte units. If you have a file that, let's say, is 7 kilobytes in size, that is going to be reported by du as two 4-kilobyte blocks being used, which shows as 8. So you're saying, okay, one 4-kilobyte block is 4,096 bytes; two 4-kilobyte blocks are going to be 4,096 and 4,096, which comes out to 8,192 bytes. So anything with a file size of, say, 4,097 to 8,192 bytes is going to take a second block, regardless of how much of that second block it fills. You could be losing anywhere from 4,095 bytes down to one byte of space. Be aware of that, but du is going to count it as either one or two blocks depending on how much space the file needs.

You might be saying to yourself: if I have a ton of files in there that are not breaking on 4-kilobyte boundaries, is du accurately reflecting how much "space" the files are taking up? Yes and no. Say you want a number that's closer to what you would expect. This file is only 7,258 bytes, so it's not filling two blocks (well, it is taking up two blocks, just not all of the second one), and you're not getting an accurate representation of what that file's size actually is. In that case you can pass it --apparent-size. That'll show you how much apparent space the file is taking up, not by blocks on disk but in whatever unit you're passing it; by default that is going to be kilobytes. So for a file that may be 7,000 bytes in size, you run du on that file and it's going to show two blocks' worth by default, but if you pass --apparent-size it's going to show the size of the data itself: 7,000 if you ask for bytes, or 7 in the default kilobytes.

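The rounding described here is plain integer arithmetic. A sketch, assuming a 4,096-byte block size and the 7,258-byte file from the example:

```shell
block=4096    # assumed file system block size in bytes
size=7258     # file size in bytes, from the example above

# Round up to whole blocks: any remainder claims one more block.
blocks=$(( (size + block - 1) / block ))
allocated=$(( blocks * block ))
slack=$(( allocated - size ))
reported_k=$(( allocated / 1024 ))   # what du would print in kilobyte units

echo "$size bytes -> $blocks blocks, $allocated bytes allocated"
echo "$slack bytes unused, reported as ${reported_k}K"
```
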
You can muck around with this using the dd command to test it. If you do dd if=, which stands for input file, set to a source under /dev, then of=, the output file, equals test, then bs= for a block size of 4096 and count=1, that's one block. You use du on that file and it's going to show you one block's worth. Whereas if you were to pass bs=7000 and a count of 1, then with apparent size in bytes du is going to show you 7,000, as opposed to the block figure, which is actually going to be two blocks. So be aware of that: there is a slight difference in how it calculates.

I've focused on this, and I've made mention of it, because du by default reports block usage, not apparent file size. You're not getting the sizes of the files in there; you're getting how many blocks they are actually taking up in your file system. That is a very, very important differentiation in calculating things, because you might have a file system that's capable of holding a gig of space, right? But technically speaking, that's only a gig of space in blocks. So if you have data in there, you can exhaust your space before the sizes of all the files in there add up to that total. It's probably not going to be that great of a difference in most cases, but it can be a little significant, so it's important to be aware of that. So remember, apparent size shows the actual size of the files, as opposed to how many blocks they're taking up.

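A sketch of the dd experiment, assuming GNU du and dd. The input /dev/zero is an assumption (any readable source would do), and the output names are invented.

```shell
demo=$(mktemp -d)

# One file of exactly 4,096 bytes and one of exactly 7,000 bytes.
dd if=/dev/zero of="$demo/one_block" bs=4096 count=1 2>/dev/null
dd if=/dev/zero of="$demo/odd_size"  bs=7000 count=1 2>/dev/null

# Allocated space: depends on the file system's block size.
du "$demo/one_block" "$demo/odd_size"

# Apparent size: depends only on the file contents.
du -k --apparent-size "$demo/one_block" "$demo/odd_size"
du --apparent-size --block-size=1 "$demo/one_block" "$demo/odd_size"

apparent_k=$(du -k --apparent-size "$demo/odd_size" | cut -f1)
apparent_b=$(du --apparent-size --block-size=1 "$demo/odd_size" | cut -f1)
echo "odd_size: ${apparent_k}K apparent, $apparent_b bytes apparent"
```

The first du line is the one that varies from system to system; the apparent figures should not.
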
Another thing to be aware of is how disk usage, or du, handles links, both hard and symbolic links. By default it doesn't dereference links, meaning it doesn't follow symbolic links, and it will not count multiple instances of a hard link. The option for not dereferencing symbolic links is -P or --no-dereference: it doesn't follow them, and that's the default. If you want to change that, you can pass the -L or --dereference flag, and then du will follow symbolic links to their original files and include those in the values. The lowercase -l, or --count-links, will count a hard-linked file each time an instance of it is encountered. By default, if du encounters a hard link, it only counts the file once, no matter how many hard links to it you may have in the tree you're looking at; whereas if you pass -l, it'll count each one of those hard links in the total instead of ignoring it.

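The link handling can be tried out with a sketch like this one, assuming GNU du and a scratch file system that supports hard links. The names original, hardlink, and symlink are made up.

```shell
demo=$(mktemp -d)
dd if=/dev/zero of="$demo/original" bs=7000 count=1 2>/dev/null
ln    "$demo/original" "$demo/hardlink"   # second name for the same file
ln -s "$demo/original" "$demo/symlink"    # pointer to the file

# Default: the hard-linked file is reported once.
counted_once=$(du --apparent-size --block-size=1 \
    "$demo/original" "$demo/hardlink" | wc -l)

# -l / --count-links: every hard link is reported.
counted_each=$(du -l --apparent-size --block-size=1 \
    "$demo/original" "$demo/hardlink" | wc -l)

# -P (default) measures the symlink itself, -L measures its target.
du -P --apparent-size --block-size=1 "$demo/symlink"
followed=$(du -L --apparent-size --block-size=1 "$demo/symlink" | cut -f1)

echo "default: $counted_once line, -l: $counted_each lines, -L: $followed bytes"
```
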
Okay, so that's that. Now, aside from looking at just the amount of blocks that a file takes up, you can get some other information out of du, particularly different time values. For instance, if you pass it --time, it will show you the time of the last modification of any file in that directory or any of its sub-directories. So by default --time is going to show you the mtime, or modification time, and that can be pretty handy. Now, you can change that with --time= and then a word. There are three values here: mtime, which is the modification time, the last time the file was modified; that's the default, so you don't need to specify it. Then there's atime, which is access time: that's the last time a file was accessed or read. And then there's ctime, which is the last time the inode was changed.

You might say, what is the difference between atime, ctime, and mtime? Well, when mtime changes, so does ctime. Ctime is when the inode was changed; mtime is when the file's contents were changed. So ctime and mtime kind of go hand in hand: the file changes, then it changes the ctime. Ctime is based on the inode, and the inode holds information about a file, including the time values, permissions, ownership, etc. So any time that you modify a file it changes ctime, but also if you change a file's permissions it alters the ctime while it doesn't alter the modification time. Okay, so be aware of that. Atime is a little different: access time is essentially the last time somebody looked into that file or opened that file. So if a file was modified, it's going to change the mtime and it's going to change the ctime, but if you're changing the permissions on it, it's only going to change the ctime, not the mtime or the atime. So those are some of the differences. You can look at those values by passing the proper --time= word, atime or ctime; for mtime you don't have to pass anything, because it is the default. So be aware of that.

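A sketch of the time options, assuming GNU du and GNU touch. The file name and the 2012 timestamp are arbitrary choices for the demo.

```shell
demo=$(mktemp -d)
echo "hello" > "$demo/file"

# Backdate the modification time only, then change permissions.
touch -m -d '2012-07-10 12:00' "$demo/file"
chmod 600 "$demo/file"    # updates ctime, leaves mtime alone

du --time "$demo/file"          # modification time (the default)
du --time=atime "$demo/file"    # last access
du --time=ctime "$demo/file"    # last inode change

mtime_line=$(du --time "$demo/file")
ctime_line=$(du --time=ctime "$demo/file")
echo "mtime: $mtime_line"
echo "ctime: $ctime_line"
```

The mtime line keeps the backdated value while the ctime line shows the moment the permissions were changed, which is the distinction drawn above.
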
Now, you can change the way that the time is displayed. By default it uses an ISO-style format that shows year, month, day, hour, and minute: what I mean by that is a four-digit year, two-digit month, two-digit day, two-digit hour in military time, and two-digit minute. You can change that if you really wanted to with --time-style= and then a plus sign followed by the same format codes that you can pass to the date command. So if you just want to show the year and the hour, you can pass it the plus and then, in double quotes, %Y %H, and it would show you the four-digit year and the two-digit hour. Did I say capital M? %Y %M would show the year and the minute. Let me double check: yes, capital Y is the four-digit year, capital H is the two-digit hour in military time, and if you wanted the minute you could do capital M, a two-digit minute from 00 to 59. So look at the man page for the date command, which shows you all of that information, and man du will show you how to use it here.

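The custom time style could be exercised like this, as a sketch assuming GNU du and GNU touch. The timestamp is arbitrary.

```shell
demo=$(mktemp -d)
echo "hello" > "$demo/file"
touch -m -d '2012-07-10 14:30' "$demo/file"

# du prints size, time, and name separated by tabs; field 2 is the time.
year_hour=$(du --time --time-style=+"%Y %H" "$demo/file" | cut -f2)
year_min=$(du --time --time-style=+"%Y %M" "$demo/file" | cut -f2)

echo "year and hour:   $year_hour"
echo "year and minute: $year_min"
```
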
So again, to recap: the disk usage command, the du command, is a great way to see how much space a file or a directory is using, in blocks by default, on your file system. If you want to get more fine-tuned and see how large a file actually is in bytes, kilobytes, megabytes, whatever, --apparent-size is the key there. So just be aware of that. I thank you very much. If anything was unclear to you, head on over to the website, or if you haven't headed over to the website before listening to this, do so. Read up on the du command, follow the links in there, which will hopefully solidify anything that I have talked about if you're unclear on it, and watch the video of the du command in action. Thank you very much, thank Hacker Public Radio for all their support, and I hope to see you in a fortnight.

You have been listening to Hacker Public Radio at HackerPublicRadio.org. We are a community podcast network that releases shows every weekday, Monday through Friday. Today's show, like all our shows, was contributed by an HPR listener like yourself. If you ever considered recording a podcast, then visit our website to find out how easy it really is. Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club. HPR is funded by the Binary Revolution at binrev.com. All binrev projects are proudly sponsored by Lunar Pages. From shared hosting to custom private clouds, go to lunarpages.com for all your hosting needs. Unless otherwise stated, today's show is released under a Creative Commons Attribution-ShareAlike 3.0 license.