Episode: 1032
Title: HPR1032: LiTS 011: du - disk usage
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr1032/hpr1032.mp3
Transcribed: 2025-10-17 17:42:01
---
Welcome to Linux in the Shell, Episode 11: the DU command.
My name is Dan Washco, I will be your host today, and I would like to remind you that
this show is hosted by Hacker Public Radio, and if you like this show, please head over
to Hacker Public Radio and see how you can support the overall project. It's a fantastic
project. Also, this audio component is a supplement to the write-up on the website. For the full
skinny on the DU command, head over to linuxintheshell.org and check out episode 11's entry.
Well, let's get into it: the DU command. Basically, DU stands for disk usage; it
displays the usage of all the files or directories in a certain location. By default, it displays
those in blocks, kilobyte blocks to be specific. So if you were to execute the DU command in
your home directory, it's going to show you, line by line, each file and how many kilobyte blocks
it's taking up, and for each directory in there it'll show each file in that directory and recurse
all the way through any subdirectories, providing a listing of how many kilobyte blocks each file
is taking up, and then at the end, when it's finished, a sum total of all the blocks being
utilized by that directory.

You can change the block size from kilobytes, where a kilobyte equals 1,024 bytes, to just about any
other acceptable value: megabytes, gigabytes, terabytes, petabytes, you name it. Very much like
you can with the DF command, you just pass it the dash capital B option and then specify the label
for what you want. So for instance, dash capital B, capital K is the default, kilobytes. Now you can
also specify dash K for kilobytes, or dash dash block dash size equals K. So dash capital B is
the same thing as dash dash block dash size equals, and then you specify the value. You can do dash M
for megabytes, but you could also do dash capital B capital M, or dash dash block dash size equals
M for megabytes. Now for gigabytes, terabytes, petabytes, exabytes, zettabytes, and so on, there
isn't a single-letter dash equivalent; it's dash capital B, and then G for gigabytes, T for terabytes,
P for petabytes, E for exabytes, and so on, or dash dash block dash size equals and a lowercase
equivalent of one of those values.

Now keep in mind, much like the DF command, if you pass something like
dash capital B capital T for terabytes, and you don't have a terabyte drive, more than likely
what it's going to show is a value of one for being used. There's not a full terabyte's worth
of blocks being used, but because there is some value, it shows one.
If you try to specify something like petabytes, exabytes, or zettabytes, you might get an error
message back that says the dash B argument, capital Z, or capital E, or P, or whatever you
have passed, is too large. So keep that in mind: you want to keep the unit relative to what
the size of those files is actually going to be. You can circumvent this on most newer
versions of DU with the dash H, or human readable, output, which is one of my favorites. So if you were
to do DU, space, dash H, it puts it in human readable form, and like the other ones, DF and
free, it kind of truncates it to the nearest three-digit value. So if it's under a
meg, it's probably going to show it in kilobytes. If it's over a meg but under a gigabyte, it's
going to show it in megabytes. And once it reaches, you know, over a gigabyte, it's going to show it
in gigabytes, as opposed to megabytes. So the dash H can be very handy for how it is displayed.
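
As a rough illustration of the unit options just described, here is a sketch that assumes GNU du and dd on Linux. The scratch directory, the file name sample.bin, and the use of /dev/zero as a data source are assumptions made for the demo and do not come from the episode. The exact numbers printed depend on the file system.

```shell
# Scratch directory and file name are made up for this demo.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT

# Write 3 MiB of real data (3 blocks of 1 MiB each); dd's summary is silenced.
dd if=/dev/zero of="$demo/sample.bin" bs=1M count=3 2>/dev/null

du -k "$demo/sample.bin"               # kilobyte blocks, the usual default unit
du --block-size=K "$demo/sample.bin"   # long form of -B K
du -m "$demo/sample.bin"               # megabyte blocks
du -B M "$demo/sample.bin"             # same unit requested through -B
du -B T "$demo/sample.bin"             # far too large a unit: any usage rounds up to one
du -h "$demo/sample.bin"               # human-readable: du picks the unit itself
```

Comparing the -m and -B T lines shows the rounding-up behavior mentioned above: a small file still occupies "one" of whatever unit was requested.
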
Now, when you execute the DU command with no arguments, it starts in the current directory and displays
everything in that current directory, recursing through all subdirectories and displaying each
individual file it finds. You can pass it a file or a directory and get the value for just that
file or directory. Now, remember, if you pass it a directory, it's going to recurse into each
individual subdirectory and provide you with that information, whereas if you pass it a file,
it's only going to provide you the information for that file. When it's finished with a directory,
it will show you a total on the last line of the value of that directory in kilobytes. Now,
if you wanted to find out how much space, how many blocks, your home directory is taking up,
you can pass it the dash S or the dash dash summarize option, which instead of showing you each
individual file and how much space it uses, forgoes all that and just shows you the total amount of blocks
taken up by that location. So it can be a file or it can be a directory. For instance, DU dash S
M slash home slash dwashco would show the total amount of space, in megabyte blocks, taken up by
my home directory: not each individual file, just the overall total. You can also
limit the directory recursion with the dash D or dash dash max dash depth equals option and some
value. That would be a numeric value from zero up to however many directories deep you actually
want to go. If you specify zero, that means don't list any subdirectories; it's essentially
the same thing as passing the dash S or summarize option for that location. Whereas if you were to
pass one, it's only going to list one subdirectory level deep and show you any files that are
in there and the summary for any subdirectories that are in there. So what you're going
to get is a listing of all the files in that current directory and all the subdirectories. It's
only going to go down one level, but in that one level you will see any files in
there with their amounts, and any subdirectories, though not what's in that second level of
subdirectories, just a total for each. And then you'll get the total at the bottom again if you're just
passing the dash D. So be aware of that; that's how max depth operates. There's an option
to exclude files with dash dash exclude equals, and then you could pass it a file name if you
wanted to exclude just one file, or you could pass it a wildcard like asterisk dot txt, which would exclude
any text files that it finds in there, or asterisk dot png to exclude any PNG files from the totals.
The other option is dash capital X, or dash dash exclude dash from equals, and then you're going
to pass it a file, and in that file you would list all the files that you want to exclude
from being processed by the DU command. Now again, you can specify not just files but patterns,
so if you wanted to exclude all image files, like asterisk dot png, asterisk dot jpg, asterisk dot gif,
you could put those into a file and pass that to the DU command and it would exclude all of those.
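
The summary, depth, and exclusion options covered so far can be tried out on a throwaway tree. This is a sketch assuming GNU du; the directory layout, file names, and patterns.txt are all invented for the demo. One caveat worth stating: GNU du prints a line per directory by default and only adds a line per file when -a (--all) is given, so the sketch passes -a wherever individual files should appear.

```shell
# Hypothetical scratch tree; every name below is invented for the demo.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/tree/notes" "$demo/tree/photos/raw"
printf 'remember the milk\n' > "$demo/tree/notes/todo.txt"
head -c 5000 /dev/zero > "$demo/tree/photos/a.png"
head -c 5000 /dev/zero > "$demo/tree/photos/raw/b.jpg"

# One line only: the total for the whole tree.
du -s "$demo/tree"

# Depth 0 gives the same single line as -s.
du -d 0 "$demo/tree"

# Depth 1: the top directory plus its immediate subdirectories.
du -d 1 "$demo/tree"

# -a adds a line per file; everything ending in .png is left out.
du -a --exclude='*.png' "$demo/tree"

# A pattern file, one pattern per line, passed with -X (--exclude-from).
printf '%s\n' '*.png' '*.jpg' '*.gif' > "$demo/patterns.txt"
du -a -X "$demo/patterns.txt" "$demo/tree"
```

Excluded files are also left out of the directory totals, which is why the totals shrink between the last two commands.
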
So it's pretty flexible as to what you can do. Now, if you listened to what I have been saying,
DU displays its output in blocks. I didn't say file size, I said blocks, and there's a
difference there. It's an important difference, because I believe what differentiates the
DF command from the DU command is how stuff is reported, and that's in blocks.
Now, what a block is: your file system is separated, or formatted, into blocks of data, and
typically the block size for your file system is going to be roughly 4K.
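
One way to peek at that number without root access is the stat command. This is a sketch assuming GNU stat; the episode does not mention stat, and the variable name fs_block is made up here.

```shell
# Ask the file system that holds the current directory for its block size.
# "stat -f" reports on the file system rather than a file; %S is its
# fundamental block size in bytes.
fs_block=$(stat -f -c %S .)
echo "block size here: $fs_block bytes"
```
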
4K is equivalent to 4,096 bytes. So 4K blocks are typically what most distributions format with on the
ext4, ext3, and ext2 file systems. You can check what your block size is on your file system,
if it's one of the ext file systems, by running dumpe2fs on your device. For instance, I'm on sda2 for
my root file system and sda5 for my home file system here. I can do dumpe2fs slash dev slash sda5,
pipe, grep, and in double quotes capital B Block, space, size, so it's Block size in
quotes, and it will show me what my block size is on that file system. Now, chances are you might
have to run that through sudo, or run it as root rather than as a standard user, to see that, but it
will report the block size. And then understand what that means. Okay, a block can
have data or not. If the block has data in it, it's full to DU. If a block has no data, it's empty.
But if a block is half full, it's still considered a block of used space to the file system and to DU.
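
The rule that a partly filled block still counts as a whole block is a ceiling division. Here is a small sketch in shell arithmetic. The function names blocks_needed and slack_bytes are made up for illustration, and the 4,096-byte block size is simply the typical value mentioned above.

```shell
# blocks_needed SIZE BLOCK: whole blocks required to hold SIZE bytes.
blocks_needed() {
    local size=$1 block=$2
    echo $(( (size + block - 1) / block ))
}

# slack_bytes SIZE BLOCK: bytes allocated but unused in the last block.
slack_bytes() {
    local size=$1 block=$2
    echo $(( $(blocks_needed "$size" "$block") * block - size ))
}

blocks_needed 4096 4096   # prints 1
blocks_needed 4097 4096   # prints 2
blocks_needed 7000 4096   # prints 2
slack_bytes 4097 4096     # prints 4095
slack_bytes 7000 4096     # prints 1192
```

GNU du prints 1 KiB units by default, so one 4 KiB block shows up as 4 and two blocks as 8. Passing -B 4K makes it count in 4 KiB units instead.
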
Now, if you have a file that is 4 kilobytes in size, that's one block; DU is going to count that as one block.
Now, if you have a file that, let's say, is 7 kilobytes in size, that is going to be counted by DU
as two four-kilobyte blocks being used. So you're saying, okay, one four-kilobyte block is
4,096 bytes. Two four-kilobyte blocks are going to be what? That's 4,096 and 4,096, which comes out to
8,192 bytes. So anything with a file size of, say, 4,097 to 8,192 bytes is going to take a second block,
regardless of how much of that second block it takes. So you could be losing anywhere from
4,095 bytes down to one byte of space. Be aware of that. But DU is going to count it
as either one or two blocks, depending on how much space. You might be saying to yourself, if I have a ton of
files in there that are not breaking on four-kilobyte boundaries, is DU accurately reflecting how much
quote-unquote space the files are taking up? Yes and no. But if you want to get a number that's closer
to the actual size of the files, what you would expect by saying, okay, this file is only
7,258 bytes, so it's not taking up two blocks, well, it is taking up two blocks, but not all of the second one,
and I'm not getting an accurate representation of what that file size actually is, you can pass it
the dash dash apparent dash size option, and that'll show you how much apparent space it is taking up,
not by blocks but by whatever unit you're passing it. By default that is going to be kilobytes. So
whereas a file that may be, let's see, a file that may be 7,000 bytes in size, you point DU at that
file and it's going to count two blocks by default. But if you pass apparent size, it's going to show
the 7,000 bytes, or more than likely that figure in kilobytes. You can muck around with this using
the DD command to test this. If you do DD with IF, which stands for input file, equals a source under slash dev,
output file, OF, equals test, and then BS equals a block size of 4096 and count equals one, that's
one block. You use DU on that file and it's going to show you one block. Whereas if you were
to pass a block size of 7,000, BS equals 7,000, and a count of one, when you pass that file to DU
with apparent size it's going to show you like 7,000 bytes, as opposed to by blocks, where it's actually
going to be two blocks. So be aware of that; there is a slight difference in how it
calculates. I've focused on this, and I've made mention that DU by default reports block size, not apparent
file size. So you're not getting the file size in there; you're
getting how many blocks it is actually taking up in your file system. And that is a very, very,
very important differentiation in calculating things, because you might have a file system that's
capable of holding a gig of space, right, but technically speaking that's only a gig of space in
blocks. So if you have data in there, you can exhaust your space before the sizes of
all the files in there actually add up to that total, because space is used up block by block. It's probably
not going to be that great of a difference in most cases, but it can be, you know, a little
significant. So it's just important to be aware of that, very
important. So remember, apparent size shows the actual
size of the files, as opposed to how many blocks they're taking up. Another thing to be aware of
is how disk usage, or DU, handles links, both hard and symbolic links. By default it doesn't
dereference links, so it doesn't follow symbolic links, and it will not count
multiple instances of a hard link.
The option for not dereferencing symbolic links is dash
capital P or dash dash no dash dereference; it doesn't follow those, and that's the default. If you want
to change that, you can pass the dash capital L or dash dash dereference flag, and then DU will
follow symbolic links to their original files and include those in the values. The dash lowercase
L or dash dash count dash links option will count multiple instances of a hard link each time
that instance is encountered. So by default, if it encounters a hard link, it only counts it once,
no matter how many hard links to it you may have in the file system that you're looking at, whereas if
you pass the dash lowercase L, it'll count each one of those hard links in the total instead of ignoring it.
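
The link handling described above can be checked with a quick experiment. This is a sketch assuming GNU du; the file names are invented. The -b option (shorthand for --apparent-size with a block size of one byte) is used only so that the numbers come out the same on every file system.

```shell
# Hypothetical files: one real file, one hard link to it, one symbolic link.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
head -c 10000 /dev/zero > "$demo/big.bin"
ln "$demo/big.bin" "$demo/hard.bin"      # second name for the same inode
ln -s big.bin "$demo/soft.bin"           # symbolic link pointing at big.bin

# Default: the second hard link is recognized and skipped.
du -b "$demo/big.bin" "$demo/hard.bin"

# -l (--count-links): each name is counted.
du -b -l "$demo/big.bin" "$demo/hard.bin"

# Default (-P): the symbolic link itself is measured, not its target.
du -b "$demo/soft.bin"

# -L (--dereference): the link is followed and the target's size is shown.
du -b -L "$demo/soft.bin"
```
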
Okay, so that's that. Now, aside from looking at just the amount of blocks that a file takes up,
you can get some other information out of there, particularly different time values. For
instance, if you pass it dash dash time, it will show you the modification time
of any file or directory in that directory or any subdirectory. So by default, dash dash time
is going to show you the mtime, or modification time, and that can be pretty handy. Now, you can
change that with dash dash time equals and then a word. There are three values
here. mtime, which is the modification time, the last time the file was modified; that's the default, so you
don't need to specify it. Then there's atime, which is access time, the last time
a file was accessed or read. And then there's ctime, which is the last time the inode was changed.
You might say, what is the difference between atime, ctime, and mtime? Well,
when mtime changes, so does ctime. Okay, ctime is when the inode was changed, and mtime is when the file was
changed, so ctime and mtime kind of go hand in hand: the file changes,
then it changes the ctime. But atime is a little different. ctime is based on the inode, and the
inode holds information about a file, including time values, permissions, ownership, etc. So
any time that you modify a file it changes ctime, but also if
you change a file's permissions it alters the ctime, but it doesn't alter the modification time.
Okay, so be aware of that. And then access time is essentially the last time somebody looked into
that file or opened that file. So if a file was modified, it's going to change the mtime and
it's going to change the ctime, but if you're changing the permissions on there, it's only going to
change the ctime, not the modification time or the atime. Okay, so those are some of the differences.
So you can look at those values by passing the proper dash dash time equals word, atime or ctime.
For mtime you don't have to pass anything, because it is the default. So be aware of that.
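
To see the three timestamps side by side, here is a sketch assuming GNU du and GNU touch; the file name report.txt and the dates are invented. The GNU manual lists atime, access, use, ctime, and status as the words accepted by --time, with modification time used when no word is given, so the sketch never passes mtime as a word. The last line uses the --time-style format option that the episode turns to next.

```shell
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
printf 'quarterly numbers\n' > "$demo/report.txt"

# Set known timestamps so the output is predictable.
touch -m -d '2020-01-02 03:04:05' "$demo/report.txt"   # modification time
touch -a -d '2021-06-07 08:09:10' "$demo/report.txt"   # access time

du --time "$demo/report.txt"          # modification time (the default)
du --time=atime "$demo/report.txt"    # last access
du --time=ctime "$demo/report.txt"    # last inode change

# Changing permissions updates the inode, so ctime moves but mtime does not.
chmod 600 "$demo/report.txt"
du --time "$demo/report.txt"
du --time=ctime "$demo/report.txt"

# Custom display: a plus sign followed by a date-style format (year and hour).
du --time --time-style=+%Y-%H "$demo/report.txt"
```
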
Now, you can change the way that the time is displayed. By default it uses the
ISO format, and that shows year, month, day, hour, minute. What I mean by
that is a four-digit year, two-digit month, two-digit day, two-digit hour in military time, and
two-digit minute. You can change that if you really wanted to with dash dash time dash
style equals, and then it accepts the formats that you can pass to the date command. So if you just
want to show the year and the hour, you can pass it
the plus and, in double quotes, percent capital Y percent capital M. Did I say capital M? That would show
the year and the minute. Let me rephrase that: percent capital Y
percent capital H would be the four-digit year and the two-digit hour. I
think I have that right; let me double check. Yes, capital Y is four-digit year and capital H is
two-digit hour in military time. If you wanted the minute, you could do capital M for minute, 00 to 59,
a two-digit minute. So look at the man page for the date command; it shows you all that
information, and man DU will show you how to pass it in there.

So again, to recap: the disk usage command, the DU command, is a great way to see how much space a file or
directory is using, in blocks by default, on your file system. If you want to get more fine-tuned about
how large a file is in bytes, kilobytes, megabytes, whatever, dash dash
apparent dash size is the key there. So just be aware of that. I thank you very much. If anything
was unclear to you, head on over to the website, or if you haven't headed over to the website before
listening to this, do so. Read up on the DU command and follow the links in there, which will hopefully
solidify anything that I have talked about if you're unclear on it, and watch the video
of the DU command in action. Thank you very much, thank Hacker Public Radio for all their support,
and I hope to see you in a fortnight.

You have been listening to Hacker Public Radio at HackerPublicRadio.org.
We are a community podcast network that releases shows every weekday, Monday
through Friday. Today's show, like all our shows, was contributed by an HPR listener like yourself.
If you ever consider recording a podcast, then visit our website to find out how easy it really is.
Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club.
HPR is funded by the Binary Revolution at binrev.com. All binrev projects are proudly sponsored by
Lunar Pages. From shared hosting to custom private clouds, go to lunarpages.com for all your hosting
needs. Unless otherwise stated, today's show is released under a Creative Commons
Attribution-ShareAlike 3.0 license.