Episode: 2438
Title: HPR2438: Gnu Awk - Part 8
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2438/hpr2438.mp3
Transcribed: 2025-10-19 03:05:57

---

This is HPR episode 2,438 entitled Gnu Awk - Part 8, and is part of the series Learning Awk.
It is hosted by Dave Morris, is about 21 minutes long, and carries an explicit flag.
The summary is: more about loops.
This episode of HPR is brought to you by archive.org.
Support universal access to all knowledge by heading over to archive.org forward slash donate.

Hello everybody, welcome to Hacker Public Radio.
My name is Dave Morris and today I am continuing with the series called Learning Awk that I'm doing
in conjunction with b-yeezi. This one is episode 8, so I'd like to start with a bit of a recap of the
previous episode. In that episode b-yeezi took us through the various sorts of loops, starting off
with the while loop, which tests a condition and performs commands while the test returns true.
Then the do-while loop, which does its test at the end and repeats while the test is true.
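
The difference between these two loops can be sketched as follows. This is a minimal illustration, not an example from the show notes: the shell function names count_while and count_do_while and the variable n are made up here, and the Awk programs are wrapped in shell functions only so that they can be run directly.

```shell
# count_while N: a while loop tests first, so with N=0 the body never runs.
count_while() {
    awk -v n="$1" 'BEGIN {
        i = 1
        while (i <= n) {        # test comes before the body
            print "while", i
            i++
        }
    }'
}

# count_do_while N: a do-while loop tests last, so the body runs at least once.
count_do_while() {
    awk -v n="$1" 'BEGIN {
        i = 1
        do {
            print "do-while", i
            i++
        } while (i <= n)        # test comes after the body
    }'
}
```

Calling count_while 0 prints nothing, whereas count_do_while 0 still prints one line, because the test is only reached after the first pass through the body.
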
There's a for loop which is similar to the C format of for loop.
It initializes a variable, performs a test on it and increments the variable, and for each
iteration it performs commands while that particular test is true. There's the other type of
for loop, which sets a variable to the successive indices of an array and performs a
collection of commands for each index. These loops were demonstrated by examples in the last episode,
and there's a link to it in the notes. I've made long notes here, which is what I'm using
as the content for this particular show.

Unfortunately, as happens sometimes, a mistake
crept into the notes for the last episode. In the example for do-while there was an infinite loop
created. b-yeezi did mention that if you weren't careful you could make an infinite loop, and
somehow or other that happened. I like to think he did that as a test of audience alertness, so I'm
sure you all spotted that. You were just too polite to say anything about it. I've demonstrated what it
looks like and what happens if you run it in the notes here. It just runs away forever, so there's
not much point in showing you an infinite loop; the results are just a few samples. But the point
is that there's a variable in it called i, which is set to two to start off with. Then there's a
print which is executed and i is incremented, so it starts at two and then moves to three and
so forth. The test is i not equal to two. It was always going to be true: i starts off at two, but
by the time it gets to the test it's already three, so it's never going to become two again. So
it's an infinite loop. I did point this out to b-yeezi and checked he was happy with me pointing this out
and yeah, of course he was.

So let's go on and look at some more statements. There's a few more
loop-related ones I'm going to talk about in this episode, but I thought I'd start by looking at
the switch statement, which is not a loop, but it's hard to know where to fit these sometimes. I
thought this one would fit here. This one is only in the GNU version of Awk, so some of the other
versions that we've mentioned in the past don't have it, and it's disabled if you run gawk in
compatibility mode. It's very similar to the switch statement in the C language, which
I guess is where it originated from, but it wasn't in the original Awk. It consists of
the word switch followed by an expression in parentheses. Then there are a series of statements,
which I don't know whether they have to be enclosed in braces, but
pretty much the rule is yes: if there's going to be more than one statement
you have to enclose them, so I imagine that you always have to have the braces. Inside the
body of the statement, if you like, there's a series of lines which
consist of the word case followed by a value, and then the body of that particular case. So
effectively what it's saying is: if the expression matches this value then do this stuff. So it's like
a sort of if-then-else type structure; in fact it's if-then-else with another if-then-else embedded in it,
that type of thing, but it's done in a more succinct way. You can have any number of these case
elements, and there's a special one called default, which is for the instance where
none of the cases match, and then the default body is executed. Now the value after each case
is either a numeric or string constant, or it can be a regular expression. Now I haven't given
any examples of regular expressions here, but you can delve more deeply into the GNU Awk
User's Guide to find more examples. The expression after the switch can be of any sort of complexity
that you like, really. It can just be a simple variable (it would not be a constant, obviously),
and it can be something complicated like a numeric expression or something of that sort.
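
As a rough sketch of the three kinds of case value just described, the following switches on the first field of each input line. It assumes gawk is installed under that name, since switch is a GNU extension; the function name classify_token, the sample values and the messages are all invented for illustration and do not come from the show notes.

```shell
# classify_token: read lines on stdin and classify the first field of each.
classify_token() {
    gawk '{
        switch ($1) {
        case 42:                 # numeric constant
            print $1 ": the number forty-two"
            break
        case "awk":              # string constant
            print $1 ": the string awk"
            break
        case /^[0-9]+$/:         # regular expression
            print $1 ": some other whole number"
            break
        default:                 # none of the cases matched
            print $1 ": unmatched"
        }
    }'
}
```

Feeding it the four lines 42, awk, 7 and gawk should exercise the numeric, string, regular expression and default branches in turn.
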
Now I've given an example here, which I've also included as a downloadable file as part of
the show. It's a complete Awk script which is designed to process the file that we
called file1.txt, that we handed out in show 2129. I think that was the first show? I can't
remember which one it was actually now. No, the second show; I wrote it down here. That file
consisted of a series of lines containing fruit and vegetables, and stuff about their
colour and the number and so forth. It was a sort of inventory type thing, so I'm using that again
just because it's convenient to do so. So the script starts off with the test NR, in capitals,
greater than 1, which means if the record number that's being processed in the input file is greater
than 1 then do the rest of the rule. The rule actually consists of a printf, which we have
covered, but we need to look at printfs in a little more detail, which we'll do soon. It says
percent s, then "is classified as", colon, space, and then as the argument there, dollar one.
Dollar one is actually the name of the fruit or vegetable, so that's just part of a piece of text
we're going to put out, but there's no newline on it, so it's written out and needs a newline
before it makes any sense. Then comes the switch statement. The switch uses dollar one, that
fruit name, again, and it consists of a bunch of cases. The first case is checking to see whether
it's an apple. What it's doing is classifying the fruits. This is me being a biologist again,
just because it's interesting. I think it's interesting; you might disagree.
So it says case apple, and then it prints out the message "a fruit, pome", P-O-M-E. The classification
of an apple, of course, is that it's a fruit, but its botanical classification is a thing called a
pome, which you can look up to find out what that means. The print statement, remember, is a print, so
it produces a newline on the end by default, so if that's executed it will end the line that's
been partially composed. Then it's followed by the statement break, and what that does is terminate
the case branch within the switch and cause the script to
move on to the next thing after the switch, so it breaks out of the switch statement totally.
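
The part of the script described so far might look roughly like this. It is a reconstruction from the spoken description, not the downloadable script itself: it assumes gawk is available, the function name classify_fruit is made up, and only the apple case and the default are shown.

```shell
# classify_fruit: skip the header record, then classify the name in field 1.
classify_fruit() {
    gawk 'NR > 1 {
        printf "%s is classified as: ", $1   # no newline yet
        switch ($1) {
        case "apple":
            print "a fruit, pome"            # print supplies the newline
            break                            # leave the switch entirely
        default:
            print "[unclassified]"
        }
    }'
}
```

With a made-up three-line input (a header line, then "apple red 4", then "potato brown 9"), the header is skipped, the apple line is completed with "a fruit, pome" and the potato line ends with "[unclassified]".
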
You can omit that, but if you do then the switch will also execute all of the
statements that are part of the cases that follow till it hits a break, so that can be useful,
but I'm not doing that in this particular case. I'm not going to read the whole
thing out, but just do one more example. The next case is banana, and then there's nothing
after it but another case, grape, and another case follows that, kiwi. So there are three cases: case,
in quotes, banana, colon, and then grape, colon, and kiwi. The statement that's
actually going to be produced is that it's a fruit. Bananas, grapes and kiwis are fruits and they're
all classified as berries in the world of botany, so those three are all berries. Then we do
another break there to get out of the switch, because we've finished with that particular input line.
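
Stacking several case labels in front of one body, and leaving out a break, can be sketched like this. It assumes gawk; the function name classify_berries and the exact messages are invented, and the apple case deliberately omits its break (unlike the script described in the episode) purely to show what falling through looks like.

```shell
# classify_berries: demonstrate stacked case labels and a missing break.
classify_berries() {
    gawk '{
        switch ($1) {
        case "banana":      # no body and no break: falls through
        case "grape":       # likewise
        case "kiwi":
            print $1 ": a fruit, berry"
            break           # stops the fall-through here
        case "apple":
            print $1 ": a fruit, pome"
            # break omitted on purpose: execution runs on into default
        default:
            print $1 ": [reached default]"
        }
    }'
}
```

A line starting with banana, grape or kiwi produces the single berry message, while a line starting with apple produces two lines of output because nothing stops it before the default body.
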
We're in a standard rule, and the rule is iterating through every line of the file, so you'll
get a different message for each line as it's encountered. But then that's pretty
obvious; you already know enough about Awk to know that. There is a default case, which is never
triggered in this particular example, where it simply prints out in square brackets the word
unclassified, just to signal that it can't classify that particular thing. So
if you happen to stick another line on the end of the file, or indeed give it a different file, then
it's not going to be able to classify some. I have run this and captured the output, which is in the
notes and also downloadable if you really want to look at it, and there's an explanation
in the notes which pretty much covers what I've just said.

Moving on from switch, then, to
the break statement. We've just seen the break statement and seen something of what it does,
but let's look at it more deeply. It's effectively for breaking out of a for loop, or a while or
do-while loop, and it's also usable within the switch statement as we've just seen, but outside of those
break has no meaning. In a loop you often use the break statement when it's not possible to
work out how many iterations you're going to need beforehand. So it's a way of looping around
and then eventually deciding: right, I've had enough, I want to stop for some reason or other, and
breaking out of that loop. It terminates the loop entirely; the loop doesn't run again.
I've written down in the notes that it terminates the enclosing loop, because you might have a loop
within a loop within a loop or whatever, and it just breaks out of the loop that it's found in and executed in.
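
The point about the enclosing loop can be seen in a small sketch with one loop inside another. The function name nested_break and the loop variables outer and inner are made up for the illustration.

```shell
# nested_break: break leaves only the loop it is written in.
nested_break() {
    awk 'BEGIN {
        for (outer = 1; outer <= 3; outer++) {
            for (inner = 1; inner <= 3; inner++) {
                if (inner == 2)
                    break               # leaves only the inner loop
                print outer, inner
            }
            # execution resumes here and the outer loop carries on
        }
    }'
}
```

Running nested_break prints "1 1", "2 1" and "3 1": the inner loop is cut short every time, but the outer loop still completes all three of its iterations.
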
Now I couldn't come up with a good example of using break in a loop, so I cheated, I guess, and went
to the GNU Awk User's Guide and have used their example, which is working out the smallest
divisor of a number. What I've done is simply copied that particular script into the
notes here, but I've added some comments to it to try and make it a little bit easier to understand.
So it's a rule that operates on as many lines as it's given, and it
sets a variable num to the first field of each line. Then it executes a loop:
it sets up a variable called divisor and sets it to two, and it keeps repeating that loop, adding
one to divisor, and it does this infinitely. It's one of these three-element for loops, but the
middle element, where it does a test, is omitted. That's perfectly legal, and what it does is make an
infinite loop where divisor is being incremented by one ad infinitum. Inside the loop the first
thing is an if statement which checks the modulus of num divided by divisor;
in other words, what's the remainder if you divide num by divisor? Is it zero? If it is zero then
num is completely divisible by whatever divisor is, and it will report that, and then you see a break
which stops the loop. There's a couple of examples I'll come on to in a second. If that
test doesn't trigger then a second test is tried, which is: if you multiply divisor by itself, is it
greater than num? In other words, has divisor now got too large, meaning that the number has
no divisors and we can't find anything that it divides by? If that's the case then it's a prime number,
that's the definition of a prime number, so we print out that the number is prime and break, and that stops
the loop under those circumstances. So it'll keep iterating around till it finds the
smallest divisor. What I did was I ran it, echoing a number to it. So I echoed 67 through a pipe to
./divisor.awk, which is the name of the file, which is downloadable if you wish, and it
comes back and says 67 is prime. 67 is a prime number; it's the 19th prime, I just happen to
know that. Then I gave it 69, and that's not a prime number, and it finds that it's divisible by three;
that's the smallest divisor. So it tries two, it's not divisible by two, and it tries three: yes,
it is, you can divide into it with no remainder. So it's pretty simple, quite elegant.
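
The loop just described can be reconstructed along these lines. This follows the spoken description rather than copying the downloadable divisor.awk: the shell function name smallest_divisor is made up, and the wording of the "Smallest divisor" message is an assumption (only "67 is prime" is quoted in the episode).

```shell
# smallest_divisor: for each number on stdin, report its smallest divisor
# or say that it is prime.
smallest_divisor() {
    awk '{
        num = $1
        # No test in the middle of the for: only break can end this loop.
        for (divisor = 2; ; divisor++) {
            if (num % divisor == 0) {           # remainder zero: divisor found
                printf "Smallest divisor of %d is %d\n", num, divisor
                break
            }
            if (divisor * divisor > num) {      # gone past the square root
                printf "%d is prime\n", num
                break
            }
        }
    }'
}
```

Used as in the episode, echo 67 | smallest_divisor reports that 67 is prime, and echo 69 | smallest_divisor stops at 3.
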
Actually, I thought, I couldn't come up with anything cleverer than that.

Next statement... well, I shouldn't
say that, because there is a statement called next. We will look at the continue statement now.
It's similar to break in that it's used in a for, while or do-while loop, but you don't use it in
a switch statement. I think it will be rejected, or it will be ignored, yeah,
probably ignored, but I'm not sure whether it will produce a syntax error. Anyway, what it does
is skip the rest of the enclosing loop body and begin the next cycle. So again I've stolen an
example from the GNU Awk User's Guide, and I've called it continue_example.awk.
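
A minimal sketch of such a continue example is shown here. It is written from the description in the episode and is not the downloadable continue_example.awk; the shell function name is made up, and printing one number per line is an assumption about the layout.

```shell
# continue_example: print 0 to 20, leaving out 5.
continue_example() {
    awk 'BEGIN {
        for (x = 0; x <= 20; x++) {
            if (x == 5)
                continue        # skip the print for 5 and go on to x = 6
            print x
        }
    }'
}
```

The increment in the for statement still runs after continue, so the loop carries on with 6 rather than getting stuck on 5.
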
What this one does is simply print the numbers 0 to 20 and omit 5. Very trivial. It's
all part of a BEGIN rule, not reading any data, so it's just running in the BEGIN rule, and
it's a for loop with the three component parts to it. We set x to 0, we check to see if x is less
than or equal to 20, because we go from 0 to 20, and we increment x for each iteration.
There's a test that says if x is equal to five then we do continue. So when x becomes five it simply
obeys the continue, which causes it to skip the rest of the loop body and go on to the six, and if
it's not five then it prints out the number. So the result of running that is simply to print
the numbers 0 to 20 but without a five in it.

One more of these changing-of-execution-sequence
type statements is called next. Now this is not a loop statement; it's not used in loops, so
it's not like break and continue. But given that Awk itself is a loop, in that the rules that you give
it are executed on each line that's fed to it, it is a sort of loop-breaking-out type
statement. So what next does is cause Awk to stop processing the current input record
and go on to the next one. So I made an example this time, which consists of a script with
three or four rules (four if you count the END rule), and what it does is use
the file1.txt file that we like so much, the one with all the fruit stuff in it.
The first rule is NR equals one, so if we're on record one, which is a header if you remember,
it invokes the next statement, which means if it's record one
just skip it, ignore it completely. Then the next rule consists of length (length is a function)
and then in parentheses dollar two, less than six. So if the length of field two, which is the colour
of the fruit or vegetable, is less than six characters long then we trigger this particular
rule. What we do is write to an array which is called skip, and we write dollar zero to it, which
is the whole record that we've just read. The index of skip, since it's an array, is NR, which is
the record number, and then we invoke next. So what that rule does is, if it finds
a record that's been read in with less than six characters in field two, it will save it
in this array and otherwise ignore it. The third rule is simply, in braces, print. So if it's not
record one and it's not got this short colour name then print it. The final rule is an
END rule, and it consists of a printf which simply puts out the string Skipped with some newlines at
the front and end of it, and then it contains a loop which iterates over the array skip, printing out
each element and the index of it. So it will put out the record number followed by the contents of
the array element. It uses the print statement to do this. So I put in an example of running it.
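
The four rules described here could be sketched as below. This is a reconstruction from the description, not the downloadable script: the function name skip_short_colours, the exact layout of the "Skipped" heading and the "record: contents" output format are assumptions.

```shell
# skip_short_colours: print records with a colour of six or more characters,
# and list the skipped ones at the end.
skip_short_colours() {
    awk '
    NR == 1 { next }                    # header record: ignore it completely
    length($2) < 6 {                    # short colour name
        skip[NR] = $0                   # save the whole record by record number
        next                            # and stop processing this record
    }
    { print }                           # everything that was not skipped
    END {
        printf "\nSkipped:\n"
        for (n in skip)                 # traversal order is not guaranteed
            print n ": " skip[n]
    }'
}
```

Because next abandons the current record, a record caught by either of the first two rules never reaches the print rule. The for-in loop may list the skipped records in any order.
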
Well, there is only one example, because there's the one file that we like to use.
Both this example and the output are available for download if you want them.
You'll see that it's printed out banana, grape, plum and pineapple, because the colour name for
each one is greater than or equal to six characters long, but it's skipped apple, strawberry,
et cetera, et cetera. There's two apples, in fact, in it. So it's skipped record two, which is apple
red four, and record four, which is strawberry red three, et cetera, because the colour names are
too short. I mean, I can't imagine why you'd ever want to do this, but you know, it just shows
ways in which this type of thing can be done.

So I think it's time to wind up with this
particular one. I think we've covered enough in this particular episode. As usual there's a
whole bunch of files you can download if you want to. You can get the ePub version of these notes
if you wish, or the PDF version of them, and as I said, all these example scripts et cetera that you
can play with as the mood takes you. Okay, that's it then. Bye now.

You've been listening to Hacker Public Radio at HackerPublicRadio.org. We are a community
podcast network that releases shows every weekday, Monday through Friday. Today's show, like all our
shows, was contributed by an HPR listener like yourself. If you ever thought of recording a podcast,
then click on our contribute link to find out how easy it really is. Hacker Public Radio was founded
by the Digital Dog Pound and the Infonomicon Computer Club, and it's part of the binary revolution
at binrev.com. If you have comments on today's show, please email the host directly, leave a comment
on the website or record a follow-up episode yourself. Unless otherwise stated, today's show is
released under the Creative Commons Attribution-ShareAlike 3.0 license.