Episode: 1012
Title: HPR1012: LiTS 009: w command and linux load averages
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr1012/hpr1012.mp3
Transcribed: 2025-10-17 17:22:39

---

Welcome to Linux in the Shell episode 9, the w command. My name is Dan Washko. I'll be your host, and before I begin, I'd like to say thanks. Thank you for your time, and I'd like to remind you that if you have not already visited the website, please do so at your earliest convenience for the full write-up of the w command and the corresponding example video.

All right, the w command. What the w command does is essentially display information about the users who are currently logged on to the machine and their processes. It displays it in two separate sections.

The first section is the header, which gives general system information: the current time, the uptime, how many users are logged in, and the CPU load average. Now, I'm going to come back to the load average, because that's very important, and it's pretty much the reason I chose the w command for this episode.

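As a rough illustration of the header fields just described, here is a minimal Python sketch that pulls them out of one header line. The sample line is made up for illustration (only the three load averages come from this episode), and the regular expression assumes the English procps-style layout; other versions or locales of w may print it differently.

```python
import re

# Illustrative header line. The time, uptime, and user count are invented;
# the three load averages are the ones quoted later in this episode.
SAMPLE_HEADER = " 17:22:39 up 3 days,  2:01,  2 users,  load average: 0.93, 0.87, 0.80"

# Assumed layout: "<time> up <uptime>, <n> users, load average: a, b, c"
HEADER_RE = re.compile(
    r"^\s*(?P<time>\d{1,2}:\d{2}:\d{2})\s+up\s+(?P<uptime>.+?),\s+"
    r"(?P<users>\d+)\s+users?,\s+"
    r"load average:\s+(?P<l1>[\d.]+),\s+(?P<l5>[\d.]+),\s+(?P<l15>[\d.]+)\s*$"
)


def parse_w_header(line):
    """Split a w header line into time, uptime, user count, and load averages."""
    match = HEADER_RE.match(line)
    if match is None:
        raise ValueError("unrecognized header layout: %r" % line)
    return {
        "time": match.group("time"),
        # Collapse the column padding inside the uptime field.
        "uptime": " ".join(match.group("uptime").split()),
        "users": int(match.group("users")),
        # Load averages for the past 1, 5, and 15 minutes.
        "load": tuple(float(match.group(k)) for k in ("l1", "l5", "l15")),
    }
```

Calling parse_w_header(SAMPLE_HEADER) gives a dictionary with the four pieces of header information as separate values.
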
After the header, you have the body of the w output, which is a list of all the users logged into the system and some corresponding information.

Each line in the body lists the user, where they're logged in from, and their TTY, whether that's tty1, tty2, or tty3, or a pts. There's also the time they logged in, and how long they have been idle, if they are idle, shown in seconds, minutes, or hours. Then there's JCPU, which is the time used by all processes attached to the TTY they're logged into, and PCPU, which is just the time used by the current process that user is running at that moment. Last is the WHAT column, which names the process the user is currently running.

Now, the w command takes a few options. There's -h, or --no-header, which omits printing the header. Then there's -s, or --short, which uses the short format and does not print the login time, JCPU, or PCPU. And then there's -f, or --from, which toggles printing the FROM (remote hostname) field.

By default, chances are your version of w is compiled without showing the FROM field, which is where they're logged in from. If you pass -f when you run it, it adds an extra column between the TTY and the login time that says from where. If they're logged in on the local host, chances are it will show nothing. If they're in an X session and have a terminal window open, it will show the X display they're logged in from, so more than likely it'll be :0.0. If it's a remote session, it shows the host they're logged in from, either the hostname or the IP address.

Then there's -o, or --old-style, for the old-style output, which prints blank space for idle times of less than one minute.

And finally, you can choose to display information on just a single user by typing w and the username, and it will show only that user's logins and the corresponding information.

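The options above can be sketched as a small Python helper that assembles the command line. The function and parameter names are hypothetical; only the long option names (--no-header, --short, --from, --old-style) and the optional username argument come from the episode. The run_w helper assumes w is installed and is not needed just to build the argument list.

```python
import subprocess


def build_w_command(no_header=False, short=False, toggle_from=False,
                    old_style=False, user=None):
    """Build the argument list for w from the options covered in this episode."""
    argv = ["w"]
    if no_header:
        argv.append("--no-header")   # same as -h: omit the header
    if short:
        argv.append("--short")       # same as -s: no login time, JCPU, or PCPU
    if toggle_from:
        argv.append("--from")        # same as -f: toggle the FROM column
    if old_style:
        argv.append("--old-style")   # same as -o: blank idle times under a minute
    if user is not None:
        argv.append(user)            # restrict the listing to one user
    return argv


def run_w(**options):
    """Run w with the given options and return its text output (needs w installed)."""
    result = subprocess.run(build_w_command(**options),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

For example, build_w_command(no_header=True, short=True, user="dan") produces the equivalent of typing w --no-header --short dan, where "dan" stands in for any username.
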
To step back a second and understand the difference between JCPU and PCPU: JCPU is the time used by all the processes attached to the TTY. Any process running on that TTY counts, not just the current process being displayed, so JCPU shows the CPU time for that whole session.

For instance, I have a process called xinit running on the TTY that I started X from, and it shows that it's been idle for 17 hours and 21 minutes. The total JCPU, the CPU time that's been used by it, is 7 minutes and 46 seconds, so that's the total CPU time of all the running processes on that TTY.

Then there's PCPU, which is the time used by the current process named in the WHAT field. xinit is pretty much lying dormant right now, so it's not using any CPU time at the moment. That's basically what those two columns display.

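To compare IDLE, JCPU, and PCPU values numerically, it can help to convert them to seconds. The sketch below assumes the compact time formats that procps-style w appears to use ("3days", "17:21m" for hours and minutes, "7:46" for minutes and seconds, "0.00s" for seconds); treat those formats as an assumption and check them against your own output. The function name is hypothetical.

```python
import re


def w_time_to_seconds(field):
    """Convert a w IDLE/JCPU/PCPU field to seconds (assumed procps-style formats)."""
    field = field.strip()

    match = re.fullmatch(r"(\d+)days", field)          # e.g. "3days"
    if match:
        return float(int(match.group(1)) * 86400)

    match = re.fullmatch(r"(\d+):(\d{2})m", field)     # hours:minutes, e.g. "17:21m"
    if match:
        return float(int(match.group(1)) * 3600 + int(match.group(2)) * 60)

    match = re.fullmatch(r"(\d+):(\d{2})", field)      # minutes:seconds, e.g. "7:46"
    if match:
        return float(int(match.group(1)) * 60 + int(match.group(2)))

    match = re.fullmatch(r"(\d+\.\d+)s", field)        # seconds, e.g. "0.00s"
    if match:
        return float(match.group(1))

    raise ValueError("unrecognized time field: %r" % field)
```

With the xinit example from the episode, an idle time of 17 hours 21 minutes would be written "17:21m" and a JCPU of 7 minutes 46 seconds would be written "7:46" under these assumed formats.
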
So that's the difference between those two, but the big thing I wanted to talk about is load average. Load average is shown not only by the w command but also by a very helpful command called top. I wanted to cover load average with w because it gives me a little bit of time to do that, whereas with top there's a whole lot of other stuff to cover. It just makes it a little more convenient. So, a little foreshadowing: expect a show about top in the very near future, maybe even the next show. I haven't decided yet.

As for what load average is, you will see three values up there: the CPU load average for the past minute, the past five minutes, and the past 15 minutes. That's not a percentage of CPU usage time. It's the average load over each of those periods.

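The same three values can be read programmatically. This is a minimal Python sketch using the standard library's os.getloadavg(), which returns the 1-, 5-, and 15-minute averages as a tuple on Unix-like systems; the helper names and dictionary keys are hypothetical.

```python
import os


def label_load_averages(values):
    """Attach the time window to each of the three load average values."""
    one, five, fifteen = values
    return {"1 min": one, "5 min": five, "15 min": fifteen}


def current_load_averages():
    """Read the live load averages (raises OSError where they are unavailable)."""
    return label_load_averages(os.getloadavg())
```

current_load_averages() reads the live numbers, so its result changes from moment to moment, just like the header of w.
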
Now, if you think about CPU usage at a single instant, you can say a CPU has only two states: it's either being used or it's not, so it's either zero or a hundred percent usage. What load average measures, over the period of one minute, five minutes, and fifteen minutes, is how much the processor is being used and how many processes were waiting for their turn on the CPU during that period.

If you read a lot about this, the common analogy is a highway, or a traffic pattern, where the CPU is the highway and the on-ramp holds the processes waiting to be processed by the CPU. If your on-ramp is empty, that means the CPU is handling the processes as they come in without any backlog. It's running just fine, and the system is not under any significant load.

But if the highway lanes are filled up with your processes and nobody is waiting on the on-ramp, that means the CPU is running at full bore. That's not necessarily a bad thing, so to speak. It just means your CPU is perfectly suited to handle all the processes that are coming in and, you know, relatively no more. Nobody's waiting, but it's running full bore.

Okay, so what you would see in that case is a load average of one. If the CPU is running at a load average of one, that means it's handling all the processes and there are none waiting, but it's constantly running. Anything above one means processes are starting to back up on that on-ramp. The highway is full, the CPU is processing all this stuff, and there's a backup on the on-ramp, so you're starting to get a little traffic jam. That's when your load average starts to increase beyond one.

To look at it a different way, you can think of it like a percentage. Let's say your load average is 0.50. That means your CPU is capable of handling twice the load you currently have on it, so you're only using about half of the CPU's processing power. If it's at one, you're using 100% of the CPU's processing power. And if the load average goes to, say, two, you'd need twice as much processing power to handle the load. So you really need to think about your system. Now, those are rough estimates, and it's not exactly a one-to-one ratio, but that is kind of a way to look at it.

What you have to bear in mind is not just these values individually, but the values across the one-, five-, and 15-minute periods, because looking at just one value is not indicative of the typical load on the system. And be aware of what you're doing on the system. If you're not doing anything on it, or just doing normal activities, and you pop over to look at your load averages and see numbers between 0.50 and 1.2, you might want to start observing what's going on. Whereas if you're under, say, 0.70, that's a pretty good indication that your system's not being taxed at all.

Now, if you're normally under 0.70 and you start compiling a program while you're playing a game, the one-minute load average might spike up to maybe 1.5 or two, but then it drops back down, and you don't see that same level of spike in the five- and 15-minute values. In that case you're not necessarily taxing your system. It just means that when that application started up, it put a load on your system, then things went back to normal, and the system was able to handle what you were throwing at it on a regular basis.

But be aware of those numbers if you want to observe your system's performance and determine whether you have at least the processing power to do whatever it is that you need to do.

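The idea of comparing the one-minute value against the five- and 15-minute values can be sketched as a small Python rule of thumb. The function name, the labels it returns, and the default threshold of 1.0 (a single CPU running flat out) are all assumptions made for illustration, not anything defined by w.

```python
def describe_load(one, five, fifteen, threshold=1.0):
    """Classify a load average triple with a simple, hypothetical rule of thumb."""
    if one <= threshold:
        # The most recent minute is within capacity (the load may be falling).
        return "normal"
    if five <= threshold and fifteen <= threshold:
        # Only the one-minute value is high: likely a brief burst of work.
        return "short spike"
    # The longer windows are high too: the load has been there for a while.
    return "sustained load"
```

A compile job that briefly pushes the one-minute value to 1.8 while the other two stay low would be reported as a short spike, while 1.8, 1.6, 1.5 would be reported as sustained load.
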
Now, it's not just a one-to-one ratio of percentages here. Bear in mind that if you're on a multi-processor, dual-core, quad-core, or other multi-core system, it's not simply that a load average of one and above is a problem. It's essentially a value of one per processor.

Take a dual-core system, like this laptop I have right here, which is currently showing me a load average of 0.93 for the past minute, 0.87 for the past five minutes, and 0.80 for the past 15 minutes. Now, I had said you probably want to be careful and start observing what's going on if your load average goes above 0.7. But this is a dual-core system, so there are essentially two processors in here, and two processors means you've got to double those load average values. A value of two on a dual-core system means that both processors are being utilized at a hundred percent. Instead of it being one, it's a value of two. So you would want to be alarmed about what's going on in your system when you've reached roughly something like 1.7 instead of 0.7.

On a quad-core system, you have four. So a value of two or three is not necessarily a significant load on the system. But when you get to a value of four, that means all four cores are being utilized at a hundred percent of their capacity, and the system really can't take on any more load without processes having to wait. So again, load averages scale up with the number of CPUs or cores that you have on the system. Adjust accordingly.

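One way to "adjust accordingly" is to divide the load average by the number of cores, so that 1.0 always means every core is fully busy. A minimal Python sketch follows; the function name is hypothetical, and os.cpu_count() reports logical CPUs, which may count hardware threads as well as physical cores.

```python
import os


def load_per_core(load, cores=None):
    """Normalize a load average so that 1.0 means every core is fully busy."""
    if cores is None:
        # Fall back to the logical CPU count of the current machine.
        cores = os.cpu_count() or 1
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load / cores
```

With the numbers from the episode, a load of 2 on a dual-core system and a load of 4 on a quad-core system both normalize to 1.0, while a load of 2 on a quad-core system normalizes to 0.5.
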
As an example, at work we were load testing a new application to see whether the system we had for it was adequate and whether there were any runaway processes, memory problems, or anything else going on with the application. When we fired it up, the load skyrocketed, showing a load average between 5 and 13.

Now, this was on an 8-processor system, so between 5 and 7 is not necessarily bad, but you might want to keep an eye on that. If the normal load is between 5 and 7 and you get any excess load on that system, you'll probably tax it, and that's what happened when it went to 13 and above. That means the load we were putting on that system was exceeding its full capacity and heading toward twice as much. At 13 it was at roughly one and a half times the system's capacity, and had it reached 16, that would have meant twice the load the CPUs could process was being put on the system.

So that's load average and, in a nutshell, what it means. If you're still a little confused about it, I won't say the write-up I did on it is great, but there are some really great resources in the bibliography on the website that break this down, even into the equations and stuff like that. The one from Linux Journal was a really good article.

So that's it right there. Just remember, the w command shows a snapshot of the system in the header, and then it shows all the people who are logged into the system, what processes they're running, and how much CPU they're using at that time. Load average is really what I want you to take away from this, because when we cover top, that's going to be an important metric.

My name is Dan, thank you very much. This is Linux in the Shell. Again, I want to thank Hacker Public Radio, and I hope to see you at the next episode. Have a good one.

Thank you for listening to Hacker Public Radio. HPR is sponsored by Caro.net, so head on over to caro.net for all your hosting needs.