Episode: 1032
Title: HPR1032: LiTS 011: du - disk usage
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr1032/hpr1032.mp3
Transcribed: 2025-10-17 17:42:01
---
Welcome to Linux in the Shell, episode 11: the du command. My name is Dann Washko, I will be your host today, and I would like to remind you that this show is hosted by Hacker Public Radio. If you like this show, please head over to Hacker Public Radio and see how you can support the overall project. It's a fantastic project. Also, this audio component is a supplement to the write-up on the website. For the full skinny on the du command, head over to linuxintheshell.org and check out episode 11's entry.
Well, let's get into it: the du command. du stands for disk usage, and it displays the usage of the files and directories in a certain location. By default, it displays those in blocks, kilobyte blocks to be specific. So if you were to execute du in your home directory, it is going to show you, line by line, how many kilobyte blocks each directory is taking up, recursing all the way through the subdirectories (add -a if you also want a line for each individual file), and then at the end, when it's finished, a sum total of all the blocks being used by that directory.
You can change the block size from kilobytes, where a kilobyte equals 1,024 bytes, to just about any other acceptable value: megabytes, gigabytes, terabytes, petabytes, you name it. Very much like you can with the df command, you just pass it the -B option and then specify the letter for the unit you want. So, for instance, -B K is the default, kilobytes. You can also specify -k for kilobytes, or --block-size=K.
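
The block-size spellings mentioned so far can be tried side by side. This is a minimal sketch, assuming GNU du from coreutils; the scratch directory and the data.bin file are invented for illustration. One detail it brings out is that GNU du appends the unit letter to each value when the size is given as a bare letter (-B K), while -k and --block-size=1K print a plain number.

```shell
#!/bin/sh
# Sketch only: assumes GNU du (coreutils). The scratch directory and the
# data.bin file are hypothetical, created just for this demonstration.
demo=$(mktemp -d)
head -c 10000 /dev/zero > "$demo/data.bin"   # a 10,000-byte test file

# du prints "<size><TAB><path>"; cut -f1 keeps only the size column.
k_short=$(du -k "$demo" | cut -f1)               # 1 KiB blocks, plain number
k_long=$(du --block-size=1K "$demo" | cut -f1)   # same request, long form
k_letter=$(du -B K "$demo" | cut -f1)            # bare letter: value ends in "K"
m_short=$(du -m "$demo" | cut -f1)               # 1 MiB blocks, rounded up

echo "-k              : $k_short"
echo "--block-size=1K : $k_long"
echo "-B K            : $k_letter"
echo "-m              : $m_short"

rm -rf "$demo"
```

The exact numbers depend on the file system the scratch directory lands on, so none are shown here; what should hold is that the first two values match and the third is the same figure with a K on the end.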
So -B is the same thing as --block-size= followed by the value. You can do -m for megabytes, but you could also do -B M or --block-size=M. Now for gigabytes, terabytes, petabytes, exabytes, zettabytes, and so on, there isn't a single-letter equivalent: it's -B and then G for gigabytes, T for terabytes, P for petabytes, E for exabytes, and so on, or --block-size= and one of those same letters.
Now keep in mind, much like the df command, if you pass something like -B T for terabytes and you don't have a terabyte of data, more than likely it is going to show a value of one. There is not a full terabyte's worth of blocks being used, but there is some value, and because there is some value, it shows one. If you try to specify something like petabytes, exabytes, or zettabytes, you might get an error message back that says the -B argument, capital Z or E or P or whatever you passed, is too large. So keep that in mind: you want to keep the unit relative to what the size of those files is actually going to be.
You can circumvent this on most newer versions of du with -h, the human-readable output, which is one of my favorites. If you run du -h, it puts the output in human-readable form and, like I said for df and free, it scales each value down to about three digits. So if it's under a meg, it's going to show it in kilobytes. If it's over a meg but under a gigabyte, it's going to show it in megabytes. And once it reaches over a gigabyte, it's going to show it in gigabytes, as opposed to megabytes. So -h can be very handy for how things are displayed.
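
To see the rounding-up behaviour and the human-readable output next to each other, here is a small sketch. It assumes GNU du from coreutils, and the directory and the sample.bin file are hypothetical names. Random data is used only so that the file is certain to occupy some blocks.

```shell
#!/bin/sh
# Sketch only: assumes GNU du (coreutils). Directory and file names are
# hypothetical.
demo=$(mktemp -d)
# About 200 KB of random data, so some blocks are definitely in use.
head -c 200000 /dev/urandom > "$demo/sample.bin"

in_gib=$(du -B G "$demo" | cut -f1)   # far less than 1 GiB in use, rounds up
human=$(du -h "$demo" | cut -f1)      # du picks the unit itself

echo "-B G : $in_gib"
echo "-h   : $human"

rm -rf "$demo"
```

With an oversized unit the first value comes back as 1G even though only a couple of hundred kilobytes are in use, while -h reports a figure in kilobytes whose exact value depends on the file system.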
Now, when you execute the du command, it launches in the current directory and reports on everything in that directory, recursing through all subdirectories. You can pass it a file or a directory and get the value for just that file or directory. Remember, if you pass it a directory, it is going to recurse into each individual subdirectory and provide you with that information, whereas if you pass it a file, it is only going to provide the information for that file. When it's finished with a directory, it will show you a total on the last line, the value for that directory in kilobytes.
Now, if you want to find out how much space, how many blocks, your home directory is taking up, you can pass it -s or --summarize, which forgoes the line-by-line listing and just shows you the total amount of blocks taken up by that location. It can be a file or it can be a directory. For instance, du -sm /home/dwashko would show the total amount of space, in megabyte blocks, taken up by my home directory: not each individual entry, just the overall total.
You can also limit the directory recursion with -d or --max-depth= some value. That is a numeric value from zero up to however many directories deep you actually want to go. If you specify zero, that means don't report on any subdirectory; it's essentially the same thing as passing -s to that location. Whereas if you pass one, it only goes one subdirectory deep: you get a line for each subdirectory directly under that location, each showing the summary total for everything beneath it (plus the files at that level if you add -a), and then the total for the location itself.
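
The effect of -s and -d is easiest to see by counting output lines on a small tree. The following is a sketch under the assumption of GNU du from coreutils; the directory layout and the count_lines helper are made up for the demonstration. Note that GNU du prints a line per directory only, unless -a is given.

```shell
#!/bin/sh
# Sketch only: assumes GNU du (coreutils). The tree below is hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
echo "top level" > "$demo/top.txt"
echo "nested"    > "$demo/a/b/deep.txt"

# Hypothetical helper: count the lines arriving on standard input.
count_lines() { wc -l | tr -d ' '; }

full=$(du "$demo" | count_lines)                # a/b, a, and the top directory
summary=$(du -s "$demo" | count_lines)          # one total line only
depth0=$(du -d 0 "$demo" | count_lines)         # same single line as -s
depth1=$(du -d 1 "$demo" | count_lines)         # a and the top directory
depth1_all=$(du -a -d 1 "$demo" | count_lines)  # -a adds a line for top.txt

echo "du            : $full lines"
echo "du -s         : $summary line"
echo "du -d 0       : $depth0 line"
echo "du -d 1       : $depth1 lines"
echo "du -a -d 1    : $depth1_all lines"

rm -rf "$demo"
```

Line counts are used rather than sizes because they do not depend on the file system: 3, 1, 1, 2, and 3 lines respectively for this tree.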
So it only goes down one level. In that one level you will see the subdirectories and their amounts (and, with -a, the files there too), but not what's inside that second level of subdirectories, just a total for each. And then you get the overall total at the bottom again if you're passing -d. So be aware that that's how max depth operates.
There's an option to exclude files: --exclude= followed by a file name if you want to exclude just one file, or a wildcard like *.txt, which would exclude any text files it finds in there, or *.png to exclude any PNG files from the totals. The other option is -X or --exclude-from=, to which you pass a file, and in that file you list all the files that you want excluded from processing by du. Again, you can specify not just file names but patterns, so if you wanted to exclude all image files, like *.png, *.jpg, and *.gif, you could put those patterns into a file, pass that to du, and it would exclude all of them. So it's pretty flexible as to what you can do.
Now, if you have been listening to what I have been saying, du displays its output in blocks. I didn't say file size, I said blocks, and there's a difference there. It's an important difference, because I believe how things are reported, in blocks, is what differentiates the du command from the df command. What a block is: your file system is formatted into blocks of data, and typically the block size for your file system is going to be 4K. 4K is equivalent to 4,096 bytes. So 4K blocks are typically what most distributions format with on the ext4, ext3, and ext2 file systems. And you can check what the block size is on your file system, if it's one of the ext file systems, by running dumpe2fs on your device.
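
Going back to the two exclude options described above, this sketch shows both forms on a handful of throwaway files. It assumes GNU du from coreutils; every file name here is hypothetical, and -a is added only so that individual files appear in the listing and the exclusions are visible.

```shell
#!/bin/sh
# Sketch only: assumes GNU du (coreutils). All file names are hypothetical.
demo=$(mktemp -d)
patterns=$(mktemp)          # pattern list lives outside the measured tree
for f in notes.txt photo.png pic.jpg anim.gif keep.dat; do
    echo "sample" > "$demo/$f"
done

# -a lists individual files so the effect of each exclusion is visible.
no_png=$(du -a --exclude='*.png' "$demo")

# One pattern per line in the exclude file, then pass it with -X.
printf '%s\n' '*.png' '*.jpg' '*.gif' > "$patterns"
no_images=$(du -a -X "$patterns" "$demo")

echo "--- without *.png ---"
echo "$no_png"
echo "--- without images ---"
echo "$no_images"

rm -rf "$demo" "$patterns"
```

The first listing should omit only photo.png, and the second should be left with notes.txt, keep.dat, and the directory total. Quoting the pattern keeps the shell from expanding the asterisk before du sees it.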
For instance, I have sda2 for my root file system and sda5 for my home file system here. I can do dumpe2fs /dev/sda5 | grep "Block size", with Block size in double quotes, and it will show me what my block size is on that file system. Now, chances are you might have to run that through sudo, or run it as root rather than as a standard user, to see that, but it will report the block size.
Then understand what that means. A block either has data in it or it doesn't. If the block has data in it, it's full as far as du is concerned. If a block has no data, it's empty. But if a block is half full, it is still counted as a whole block of used space by the file system and by du. Now, if you have a file that is 4 kilobytes in size, that's one block, and du is going to count it as one block (it displays as 4 in the default kilobyte units). If you have a file that is, let's say, 7 kilobytes in size, du is going to report that as two 4-kilobyte blocks being used. So you're saying, okay, one 4-kilobyte block is 4,096 bytes, so two blocks are 4,096 plus 4,096, which comes out to 8,192 bytes. Any file size from 4,097 to 8,192 bytes is going to take a second block, regardless of how much of that second block it fills, so you could be losing anywhere from 4,095 bytes down to one byte of space. Be aware of that: du is going to count either one block or two, depending on how much space the file needs.
You might be saying to yourself, if I have a ton of files in there that are not breaking on 4-kilobyte boundaries, is du accurately reflecting how much space, in quotes, the files are taking up? Yes and no. If you want a number that is closer to what you would expect, where you say, okay, this file is only 7,258 bytes, so it is taking up two blocks but not all of the second one, and I'm not getting an accurate representation of what that file's size actually is, you can pass --apparent-size, and that will show you how
much apparent space it is taking up, not by blocks but in whatever unit you're passing; by default that is going to be kilobytes. So take a file that may be 7,000 bytes in size. You run du on that file and, by default, it counts two blocks (8 in kilobyte units on a 4K-block file system). But if you pass --apparent-size, it reports the 7,000 bytes themselves (7 in kilobyte units, rounded up).
You can muck around with the dd command to test this. If you do dd with if=, which stands for input file, pointing at a device under /dev, then of=, the output file, set to test, then bs=4096 for a block size of 4,096, and count=1, that's one block. You run du on that file and it's going to show you one block's worth. Whereas if you were to pass bs=7000 with a count of one, the default output is going to be two blocks, but if you ask for bytes, or for apparent size, it's going to show you 7,000 rather than one or two blocks. So be aware that there is a slight difference in how it calculates.
I've focused on this, and I've made mention of it repeatedly: du by default reports block usage, not apparent file size. You're not getting the file sizes in there; you're getting how many blocks the files are actually taking up in your file system. And that is a very, very important differentiation in calculating things, because you might have a file system that's capable of holding a gig of data, right? But technically speaking that's a gig of space in blocks, so you can exhaust your space before the sizes of all the files in there add up to that total. In most cases it's probably not going to be that great of a difference, but it can be a little significant, so it's important to be aware of that. Very important. So remember,
--apparent-size shows the actual size of the files, as opposed to how many blocks they're taking up.
Another thing to be aware of is how du handles links, both hard and symbolic links. By default it doesn't dereference links. That is, it will not count multiple instances of a hard link, and it won't dereference, or follow, symbolic links. The option for not dereferencing symbolic links is -P or --no-dereference, and that's the default. If you want to change that, you can pass -L or --dereference, and then du will follow symbolic links to their original files and include those in the values. Lowercase -l, or --count-links, will count a hard-linked file each time one of its links is encountered. So by default, if it encounters a hard link, it only counts the file once, no matter how many hard links to it you may have in the location you're looking at, whereas if you pass lowercase -l, it will count each one of those hard links in the total instead of ignoring it. Okay, so that's that.
Now, aside from looking at just the amount of blocks that a file takes up, you can get some other information out of du, particularly different time values. If you pass it --time, it will show you the modification time of any file or directory in that directory or any subdirectory. So by default --time is going to show you the mtime, or modification time, and that can be pretty handy. You can change that with --time= and then a word. There are three values here. There's mtime, which is the modification time, the last time the file was modified; that's the default, so you don't need to specify it. Then there's atime, which is the access time, the last time a file was accessed
or read. And then there's ctime, which is the last time the inode was changed. You might ask, what is the difference between atime, ctime, and mtime? Well, when mtime changes, so does ctime. That's when the inode was changed, that's when the file was changed, so ctime and mtime kind of go hand in hand: the file changes, then the ctime changes. But ctime is a little different. ctime is based on the inode, and the inode holds information about a file that includes things like time values, permissions, ownership, and so on. So any time that you modify a file, it changes the ctime, but if you change a file's permissions, that also alters the ctime without altering the modification time. Be aware of that. And then access time is essentially the last time somebody looked into that file or opened that file. So if the file was modified, it's going to change the mtime and it's going to change the ctime, but if you're changing the permissions on it, it's only going to change the ctime, not the modification time or the atime. So those are some of the differences. You can look at those values by passing the proper --time= word, atime or ctime. For mtime you don't have to pass anything, because it is the default.
Now, you can change the way that the time is displayed. By default it uses an ISO-style format that shows year, month, day, hour, minute. What I mean by that is a four-digit year, two-digit month, two-digit day, two-digit hour in military time, and two-digit minute. You can change that if you really want to with --time-style= and then it accepts the formats that you can pass to the date command. So if you just want to specify the year and the hour, you can pass it, in double quotes, a plus sign followed by %Y%M, and it would show you the year and the hour. If you
wanted to just show the hour and the... did I say M? That would show the year and the minute. Let me rephrase that: %Y%H would be the four-digit year and the two-digit hour. I think I have that right, let me double check. Yes, capital Y is the four-digit year and capital H is the two-digit hour in military time. If you wanted the minute, you could use capital M, for the minute from 00 to 59, a two-digit minute. So look at the man page for the date command, which shows you all of that information, and man du will show you how to pass it in.
So again, to recap: the disk usage command, the du command, is a great way to see how much space a file or a directory is using, in blocks by default, on your file system. If you want to get more fine-tuned and see how large a file is in bytes, kilobytes, megabytes, whatever, --apparent-size is the key there. So just be aware of that.
I thank you very much. If anything was unclear to you, head on over to the website, or if you haven't been to the website before listening to this, do so. Read up on the du command and follow the links in there, which will hopefully solidify anything I have talked about that you're unclear on, and watch the video of the du command in action. Thank you very much, thanks to Hacker Public Radio for all their support, and I hope to see you in a fortnight.
You have been listening to Hacker Public Radio at HackerPublicRadio.org. We are a community podcast network that releases shows every weekday, Monday through Friday. Today's show, like all our shows, was contributed by an HPR listener like yourself. If you ever consider recording a podcast, then visit our website to find out how easy it really is. Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club. HPR is funded by the Binary Revolution at binrev.com. All binrev projects are proudly sponsored by
Lunar Pages. From shared hosting to custom private clouds, go to lunarpages.com for all your hosting needs. Unless otherwise stated, today's show is released under a Creative Commons Attribution-ShareAlike 3.0 license.