Initial commit: HPR Knowledge Base MCP Server

- MCP server with stdio transport for local use
- Search episodes, transcripts, hosts, and series
- 4,511 episodes with metadata and transcripts
- Data loader with in-memory JSON storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

195
hpr_transcripts/hpr1032.txt
Normal file
@@ -0,0 +1,195 @@
Episode: 1032
Title: HPR1032: LiTS 011: du - disk usage
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr1032/hpr1032.mp3
Transcribed: 2025-10-17 17:42:01

---

Welcome to Linux in the Shell, Episode 11: the du command. My name is Dann Washko, and I will
be your host today. I would like to remind you that this show is hosted by Hacker Public Radio,
and if you like this show, please head over to Hacker Public Radio and see how you can support
the overall project. It's a fantastic project. Also, this audio component is a supplement to the
write-up on the website. For the full skinny on the du command, head over to
linuxintheshell.org and check out episode 11's entry.

Well, let's get into it: the du command. Basically, du stands for disk usage, and it displays
the disk usage of all the files or directories in a certain location. By default, it displays
those in blocks, kilobyte blocks to be specific. So if you were to execute the du command in
your home directory, it's going to show you, line by line, each file and how many kilobyte
blocks it's taking up. For each directory in there, it'll show each file in that directory and
recurse all the way through the sub-directories, providing a listing of how many kilobyte blocks
each file is taking up, and then at the end, when it's finished, a sum total of all the blocks
being utilized by that directory.

You can change the block size from kilobytes (a kilobyte equals 1,024 bytes) to just about any
other acceptable value: megabytes, gigabytes, terabytes, petabytes, you name it. Very much like
you can with the df command, you just pass it the -B option and then specify the label for what
you want. So, for instance, -BK is the default, kilobytes. You can also specify -k for
kilobytes, or --block-size=K. So -B is the same thing as --block-size= followed by the value.
You can do -m for megabytes, but you could also do -BM or --block-size=M for megabytes. Now for
gigabytes, terabytes, petabytes, exabytes, zettabytes, and so on, there isn't a single-letter
equivalent: it's -B and then G for gigabytes, T for terabytes, P for petabytes, E for exabytes,
and so on, or --block-size= and a lowercase equivalent of one of those values.

Now keep in mind, much like the df command, if you pass something like -BT for terabytes and you
don't have a terabyte drive, more than likely what it's going to show is a value of one for
being used, because there's not a full terabyte's worth of blocks being used, but there is some
value, and because there is some value, it shows one. If you try to specify something like
petabytes, exabytes, or zettabytes, you might get an error message back saying that the -B
argument (capital Z, or E, or P, whatever you passed) is too large. So keep that in mind: you
want to keep it relative to what the size of those files is actually going to be.

You can circumvent this on most newer versions of du with the -h, or human-readable, output,
which is one of my favorites. If you were to do du -h, it puts the output in human-readable
form and, like df and free, it kind of truncates it to the nearest three-digit value. So if it's
under a meg, it's probably going to show it in kilobytes. If it's over a meg but under a
gigabyte, it's going to show it in megabytes. And once it reaches over a gigabyte, it's going to
show it in gigabytes, as opposed to megabytes. So -h can be very handy for how the output is
displayed.
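
The block-size spellings described above can be compared side by side. This is a minimal sketch that assumes GNU coreutils du; the scratch file, its name, and its 3,000,000-byte size are made up for illustration. Exact counts depend on the file system, so the script only prints them.

```shell
#!/bin/sh
# Scratch file for the demo; the path and the 3,000,000-byte size are arbitrary.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
sample="$workdir/sample.bin"
head -c 3000000 /dev/urandom > "$sample"

# Kilobyte blocks three ways: -k, -BK and --block-size=1K.
# With a bare unit letter (K, M, T) GNU du appends that letter to the number.
kib_short=$(du -k "$sample" | cut -f1)
kib_label=$(du -BK "$sample" | cut -f1)
kib_long=$(du --block-size=1K "$sample" | cut -f1)

# Megabyte blocks: -m and -BM report the same count.
mib_short=$(du -m "$sample" | cut -f1)
mib_label=$(du -BM "$sample" | cut -f1)

# A unit far larger than the file still rounds up to one block.
tib_label=$(du -BT "$sample" | cut -f1)

# Human-readable output picks the unit for you.
human=$(du -h "$sample" | cut -f1)

printf 'kilobytes: %s / %s / %s\n' "$kib_short" "$kib_label" "$kib_long"
printf 'megabytes: %s / %s\n' "$mib_short" "$mib_label"
printf 'terabytes: %s\n' "$tib_label"
printf 'human:     %s\n' "$human"
```

Counts are rounded up to whole blocks, which is why a file of a few megabytes still shows as one terabyte block.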

Now, when you execute the du command, it launches in the current directory and displays
everything in that current directory; it recurses through all sub-directories, displaying each
individual file on hand. You can pass it a file or a directory and get the value for just that
file or directory. Remember, if you pass it a directory, it's going to recurse into each
individual sub-directory and provide you with that information, whereas if you pass it a file,
it's only going to provide the information for that file. When it's finished with a directory,
it will show you a total, on the last line, of the value of that directory in kilobytes.

Now, if you wanted to find out how many blocks your home directory is taking up, you can pass it
the -s or --summarize option, which forgoes showing you each individual file and how much space
it takes, and just shows you the total amount of blocks taken up by that location. It can be a
file or a directory. For instance, du -sm /home/dwashko would show the total amount of space, in
megabyte blocks, taken up by my home directory: not each individual file, just the overall
total.

You can also limit the directory recursion with the -d or --max-depth= option, set to some
value. That would be a numeric value from zero up to however many directories deep you actually
want to go. If you specify zero, that means don't recurse into any sub-directory; it's
essentially the same thing as passing the -s, or summarize, option for that location. Whereas if
you pass one, it's only going to recurse one sub-directory deep in any directory and show you
any files that are in there and the summary for any sub-directories that are in there. So what
you get is a listing of all the files in the current directory and all the sub-directories. It
only goes down one level, but in that one level you will see any files and their amounts, and
any sub-directories: not what's in that second level of sub-directories, just a total for each.
And then you'll get the total at the bottom again if you're just passing -d. So be aware that
that's how max depth operates.

There's an option to exclude files: --exclude= and then you could pass it a file name if you
wanted to exclude just one file, or you could pass it a wildcard like *.txt, which would exclude
any text files that it finds in there, or *.png to exclude any PNG files from the totals.
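
The summary, depth-limit, and exclude options just described can be tried on a throwaway tree. This is a sketch assuming GNU coreutils du; the directory and file names are invented for the example. One caveat worth checking on your own system: the GNU du I am familiar with lists only directories unless -a (--all) is also given, so the script shows the depth-limited listing both ways.

```shell
#!/bin/sh
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Hypothetical tree: one file at the top, one a level down, one two levels down.
mkdir -p top/sub/deep
head -c 5000 /dev/urandom > top/notes.txt
head -c 5000 /dev/urandom > top/sub/photo.png
head -c 5000 /dev/urandom > top/sub/deep/more.txt

# -s and a depth of zero both print a single total line for the location.
summary=$(du -s top)
depth0=$(du --max-depth=0 top)

# Depth one: the sub-directory directly below, then the total for top.
depth1=$(du -d 1 top)
depth1_lines=$(printf '%s\n' "$depth1" | wc -l)

# Adding -a also lists the files found within that depth.
depth1_all=$(du -a -d 1 top)
depth1_all_lines=$(printf '%s\n' "$depth1_all" | wc -l)

# Leave PNG files out of the total; quote the pattern so the shell does not expand it.
total_all=$(du -s top | cut -f1)
total_no_png=$(du -s --exclude='*.png' top | cut -f1)

printf '%s\n' "$summary" "$depth1" "$depth1_all"
printf 'total: %s, without PNG files: %s\n' "$total_all" "$total_no_png"
```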

The other option is -X, or --exclude-from=, and then you pass it a file, and in that file you
list all the files that you want to exclude from being processed by the du command. Again, you
can specify not just files but patterns, so if you wanted to exclude all image files, like
*.png, *.jpg, and *.gif, you could put those into a file, pass that to the du command, and it
would exclude all of them.
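
A pattern file like the one described can be built and used as follows. This is a sketch assuming GNU coreutils du; the directory name pics, the file names, and patterns.txt are all hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Hypothetical directory holding three image files and one text file.
mkdir pics
for name in a.png b.jpg c.gif keep.txt; do
    head -c 5000 /dev/urandom > "pics/$name"
done

# One pattern per line in the exclude file.
printf '%s\n' '*.png' '*.jpg' '*.gif' > patterns.txt

total_all=$(du -s pics | cut -f1)
filtered_long=$(du -s --exclude-from=patterns.txt pics | cut -f1)
filtered_short=$(du -s -X patterns.txt pics | cut -f1)

printf 'everything: %s\n' "$total_all"
printf 'images excluded: %s (long option), %s (-X)\n' "$filtered_long" "$filtered_short"
```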

So it's pretty flexible as to what you can do. Now, if you have listened to what I have been
saying, du displays its output in blocks. I didn't say file size, I said blocks, and there's a
difference there. It's an important difference because, I believe, what differentiates the df
command from the du command is how stuff is reported, and that's in blocks.

Now, what a block is: your file system is separated, or formatted, into blocks of data, and
typically the block size for your file system is going to be roughly 4K.

4K is equivalent to 4,096 bytes. So 4K blocks are typically what most distributions format with
on the ext4, ext3, and ext2 file systems. You can check what the block size is on your file
system, if it's one of the ext file systems, by running dumpe2fs on your device. For instance, I
have sda2 for my root file system and sda5 for my home file system here. I can do
dumpe2fs /dev/sda5 | grep "Block size", with "Block size" (capital B) in double quotes, and it
will show me what my block size is on that file system. Chances are you might have to run that
through sudo, or run it as root rather than as a standard user, to see that, but it will report
the block size.

Then understand what that means. A block can have data or not. If the block has data in it, it's
full to du. If a block has no data, it's empty. But if a block is half full, it's still
considered a block of used space to the file system and to du.

Now, if you have a file that is 4 kilobytes in size, that's one block, and du is going to report
that as one.
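
The counting rule described so far (any block that holds data counts as a whole block) can be sketched as plain arithmetic. The 4,096-byte block size is the figure from the episode; the function names and the sample sizes are my own choices for illustration.

```shell
#!/bin/sh
# Assumed file-system block size in bytes (the 4K figure from the episode).
block_size=4096

# Ceiling division: a partly filled block still counts as a whole block.
blocks_needed() {
    echo $(( ($1 + block_size - 1) / block_size ))
}

# Bytes allocated to the file but not used by its data.
slack_bytes() {
    echo $(( $(blocks_needed "$1") * block_size - $1 ))
}

for size in 4096 4097 7000 8192; do
    printf '%5d bytes -> %d block(s), %4d bytes of slack\n' \
        "$size" "$(blocks_needed "$size")" "$(slack_bytes "$size")"
done
```

A file one byte over a block boundary is the worst case: it claims a second block and leaves 4,095 bytes of it unused.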

Now, if you have a file that, let's say, is 7 kilobytes in size, that is going to be reported by
du as two 4-kilobyte blocks being used. So you're saying, okay, one 4-kilobyte block is 4,096
bytes; two 4-kilobyte blocks are going to be what? That's 4,096 and 4,096, which comes out to
8,192 bytes. So any file with a size of, say, 4,097 to 8,192 bytes is going to take a second
block, regardless of how much of that second block it uses. So you could be losing anywhere from
4,095 bytes down to one byte of space. Be aware of that. But du is going to report it as either
one or two, depending on how much space is used.

You might be saying to yourself, "If I have a ton of files in there that are not breaking on
4-kilobyte boundaries, is du accurately reflecting how much, quote, space the files are taking
up?" Yes and no. Say you want a number that is closer to what df would show, or to what you
would expect: "Okay, this file is only 7,258 bytes, so it's not taking up two blocks. Well, it
is taking up two blocks, but not all of them, so I'm not getting an accurate representation of
what that file size actually is." In that case you can pass it --apparent-size, and that'll show
you how much apparent space the file is taking up, not by blocks but by whatever unit you're
passing it. By default that is going to be kilobytes. So take a file that may be 7,000 bytes in
size: you pass that file to du and it's going to show a value of two by default, but if you pass
--apparent-size, it's going to show a value of 7,000 bytes, or more than likely like 1.2
[unclear].

You can muck around with this using the dd command to test it. If you do dd if=/dev/… (if
stands for input file), with the output file, of=, set to test, then bs= for a block size of
4096 and count=1, that's one block. You use du on that file and it's going to show you a value
of one. Whereas if you were to pass a block size of 7,000, bs=7000, and a count of one, then
when you pass --apparent-size in bytes it's going to show you something like 7,000, as opposed
to it being one, or, actually, it's going to be two blocks. So be aware that there is a slight
difference in how it calculates.

I've focused on, and made mention of, the fact that du by default reports block size, not
apparent file size. So you're not getting the file size in there; you're getting how many blocks
the file is actually taking up in your file system. That is a very, very, very important
differentiation in calculating things, because you might have a file system that's capable of
holding a gig of space, right, but technically speaking that's only a gig of space in blocks. So
if you have data in there, you can exhaust your space before the sizes of all the files in there
actually add up to that total. It's probably not going to be that great of a difference in most
cases, but it can be a little significant, so it's important to be aware of that. Very
important. So remember: apparent size shows the actual size of the files, as opposed to how many
blocks they're taking up.

Another thing to be aware of is how disk usage, or du, handles links, both hard and symbolic
links. By default it doesn't dereference links, so it doesn't necessarily follow the links. That
is, it will not count multiple instances of a hard link, and it won't dereference, or follow,
symbolic links. The option for not dereferencing symbolic links is -P, or --no-dereference: it
doesn't follow those, and that's the default. If you want to change that, you can pass the -L,
or --dereference, flag, and then du will follow symbolic links to their original files and
include those in the values. The -l (lowercase L), or --count-links, option will count multiple
instances of a hard link each time an instance is encountered. So by default, if it encounters a
hard link, it only counts it once, no matter how many hard links you may have in the file system
that you're looking at, whereas if you pass -l it'll count each one of those hard links in the
total instead of ignoring it.
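
The dd experiment, the apparent-size option, and the link options above can be combined into one scratch-directory test. This is a sketch assuming GNU coreutils; /dev/zero as the input device and all file names are my assumptions, not something taken from the episode. Note that the GNU du I am familiar with prints allocated space in 1,024-byte units by default, so a single 4 KiB block typically appears as 4 rather than 1; because that figure depends on the file system, the script prints it without relying on it.

```shell
#!/bin/sh
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# /dev/zero is an assumed input device; any readable source of bytes would do.
dd if=/dev/zero of=one_block bs=4096 count=1 2>/dev/null
dd if=/dev/zero of=partial bs=7000 count=1 2>/dev/null

# Allocated space (file-system dependent) versus apparent size in bytes.
du one_block partial
apparent_one=$(du --apparent-size --block-size=1 one_block | cut -f1)
apparent_partial=$(du --apparent-size --block-size=1 partial | cut -f1)

# A directory with a file, a hard link to it, and a symlink to a file outside it.
mkdir links
cp partial links/original
ln links/original links/hardlink
ln -s ../partial links/shortcut

# Default: the hard-linked data is counted once and the symlink is not followed.
counted_once=$(du -s --apparent-size --block-size=1 links | cut -f1)
# -l counts the hard-linked data each time it is encountered.
counted_each=$(du -s -l --apparent-size --block-size=1 links | cut -f1)
# -L follows the symlink and counts the file it points to.
followed=$(du -s -L --apparent-size --block-size=1 links | cut -f1)

printf 'apparent sizes: %s and %s bytes\n' "$apparent_one" "$apparent_partial"
printf 'default: %s, with -l: %s, with -L: %s\n' "$counted_once" "$counted_each" "$followed"
```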

Okay, so that's that. Now, aside from looking at just the amount of blocks that a file takes up,
you can get some other information out of there, particularly different time values. For
instance, if you pass it --time, it will show you the modification time of any file or directory
in that directory or any sub-directory. So by default --time is going to show you the mtime, or
modification time, and that can be pretty handy. You can change that with --time= and then a
word. There are three values here: mtime, which is the modification time, the last time the file
was modified (that's the default, so you don't need to specify it); atime, which is access time,
the last time a file was accessed or read; and ctime, which is the last time the inode was
changed.

You might say, what is the difference between atime, ctime, and mtime? Well, when mtime changes,
so does ctime. ctime is when the inode was changed; mtime is when the file was changed. So ctime
and mtime kind of go hand in hand: the file changes, then it changes the ctime. But ctime is a
little different. ctime is based on the inode, and an inode holds information about a file,
including time values, permissions, ownership, et cetera. So any time that you change anything
in that file, modify the file, it changes ctime; but if you change a file's permissions, it also
alters the ctime, and it doesn't alter the modification time. Be aware of that. And then access
time is essentially the last time somebody looked into that file or opened that file. So if the
file was modified, it's going to change the atime and it's going to change the ctime, but if
you're changing the permissions on there, it's only going to change the ctime, not the
modification time or the atime. So those are some of the differences.

You can look at those values by passing the proper --time= word, atime or ctime. For mtime you
don't have to pass anything, because it is the default, so be aware of that.
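
The difference between the three timestamps can be observed with a small experiment. This is a sketch assuming GNU coreutils (du with --time and touch -d); the file name and the 2020 date are arbitrary, and TZ is pinned to UTC only so the printed time is predictable.

```shell
#!/bin/sh
TZ=UTC
export TZ
unset TIME_STYLE   # GNU du may otherwise take its default time format from this variable

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Hypothetical file whose modification time is set to a fixed date in the past.
echo 'some data' > report.txt
touch -d '2020-01-02 03:04:05' report.txt

# --time alone shows the modification time.
mtime_shown=$(du --time report.txt | cut -f2)

# Changing permissions touches the inode (ctime) but not the contents (mtime).
chmod 600 report.txt
mtime_after_chmod=$(du --time report.txt | cut -f2)
ctime_shown=$(du --time=ctime report.txt | cut -f2)
atime_shown=$(du --time=atime report.txt | cut -f2)

printf 'mtime before chmod: %s\n' "$mtime_shown"
printf 'mtime after chmod:  %s\n' "$mtime_after_chmod"
printf 'ctime:              %s\n' "$ctime_shown"
printf 'atime:              %s\n' "$atime_shown"
```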

Now, you can change the way that the time is displayed. By default it uses the ISO format, which
shows year, month, day, hour, minute. What I mean by that is a four-digit year, two-digit month,
two-digit day, two-digit hour in military time, and two-digit minute. You can change that, if
you really want to, with --time-style= and then it accepts the formats that you can pass to the
date command. So if you just want to specify the year and the hour, you can pass it a plus sign
and, in double quotes, %Y and %M, and it would show you the year and the hour. Did I say M? That
would show the year and the minute. Let me rephrase that: it would be %Y and %H, which gives the
four-digit year and the two-digit hour. I think I have that right; let me double check. Yes:
capital Y is the four-digit year, capital H is the two-digit hour in military time. If you
wanted the minute, you could use capital M for minute, 00 to 59, a two-digit minute. So look at
the man page for the date command; it shows you all that information, and man du will show you
how to get it in there.

So again, to recap: the disk usage command, the du command, is a great way to see how much space
a file or directory is using, in blocks by default, on your file system. If you want to get more
fine-tuned and see how large a file is in bytes, kilobytes, megabytes, whatever,
--apparent-size is the key there. So just be aware of that.

I thank you very much. If anything was unclear to you, head on over to the website, or if you
haven't headed over to the website before listening to this, do so. Read up on the du command
and follow the links in there, which will hopefully solidify anything else that I have talked
about if you're unclear on it, and watch the video of the du command in action. Thank you very
much, thank Hacker Public Radio for all their support, and I hope to see you in a fortnight.
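
As a companion to the time-format discussion earlier in the episode, the following sketch shows date-style formats passed to du. It assumes GNU coreutils du, which accepts --time-style=+FORMAT; the file name and date are arbitrary, and the space between the two format codes is my choice.

```shell
#!/bin/sh
TZ=UTC
export TZ

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Hypothetical file with a known modification time.
echo 'some data' > report.txt
touch -d '2020-01-02 03:04:05' report.txt

# %Y is the four-digit year, %H the two-digit hour (00-23), %M the two-digit minute (00-59).
year_hour=$(du --time --time-style='+%Y %H' report.txt | cut -f2)
year_minute=$(du --time --time-style='+%Y %M' report.txt | cut -f2)

# The named style that matches the year-month-day hour:minute layout described in the episode.
long_iso=$(du --time --time-style=long-iso report.txt | cut -f2)

printf 'year and hour:   %s\n' "$year_hour"
printf 'year and minute: %s\n' "$year_minute"
printf 'long-iso:        %s\n' "$long_iso"
```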

You have been listening to Hacker Public Radio at HackerPublicRadio.org. We are a community
podcast network that releases shows every weekday, Monday through Friday. Today's show, like all
our shows, was contributed by an HPR listener like yourself. If you ever considered recording a
podcast, then visit our website to find out how easy it really is. Hacker Public Radio was
founded by the Digital Dog Pound and the Infonomicon Computer Club. HPR is funded by the Binary
Revolution at binrev.com. All binrev projects are proudly sponsored by Lunar Pages. From shared
hosting to custom private clouds, go to LunarPages.com for all your hosting needs. Unless
otherwise stated, today's show is released under a Creative Commons Attribution-ShareAlike 3.0
license.