Episode: 2816
Title: HPR2816: Gnu Awk - Part 14
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2816/hpr2816.mp3
Transcribed: 2025-10-19 17:14:49

---

This is HPR episode 2816 entitled Gnu Awk - Part 14, and is part of the series Learning Awk. It is hosted by Dave Morris and is about 23 minutes long and carries an explicit flag. The summary is: Redirection of input and output, Part 1.

This episode of HPR is brought to you by archive.org. Support universal access to all knowledge by heading over to archive.org/donate.
Hello everybody, welcome to Hacker Public Radio. This is episode number 14 in the Learning Awk series that b-yeezi and myself are doing.

I wanted to talk about the subject of redirection in Awk programs, and I originally thought, yes, I can fit that into one episode, but as I started to write it I realised there was just too much. So I'm going to do it as two episodes, and this is the first of the pair, not surprisingly. This time I want to be looking at output redirection, and then in the next one I'll be looking at the getline command, which is used for input, explicit input I think is how they put it, which can include redirection.

So far in our Awk programs, pretty much all of them anyway, we have seen that when the script prints out using print or printf, the output is written to the standard output channel, which is pretty much the screen if you're running things from a terminal. The redirection feature in Awk allows output to be written somewhere else.

So the first thing you might want to do is to redirect to a file, and you would use print or printf. There's a sort of syntax diagram: print, space, items (that would be a list of items, often separated by commas), greater-than sign, and then the name of an output file. There's a simple example that I've shown here. It uses the infamous file of fruit data that we invented; it's actually b-yeezi that came up with it in episode number two. I've included the data file with this show just in case you find it useful to have it around.

So this is a very simple Awk script, just a one-liner, and I've demonstrated how it would be used. You would write as your program, after the command awk, in single quotes: capital NR greater than one (that's the number of records greater than one, so skipping the first line, which is a header). The rule that is triggered by that particular test simply consists of a print dollar one, the first field of the data, greater than, and then in double quotes fruit_names, all of this in curly brackets, close single quote, and then the name of the file, awk14_fruit_data.txt, which is, as I say, included with the show.
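Written out, the one-liner just read aloud looks roughly like the sketch below. The data file here is a small hypothetical stand-in (`fruit_sample.txt`, with an assumed header line and three columns), not the file distributed with the show, and the temporary working directory is only there to keep the demonstration tidy.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for the fruit data: a header line, then
# name, color and amount columns. The real file may differ.
cat > fruit_sample.txt <<'EOF'
name color amount
apple red 4
banana yellow 6
strawberry red 3
EOF

# Skip the header (NR > 1) and write field 1 of every other record
# to the file fruit_names. The file name is a quoted string.
awk 'NR > 1 {print $1 > "fruit_names"}' fruit_sample.txt

cat fruit_names    # apple, banana, strawberry, one per line
```

Nothing appears on the terminal from the awk command itself, because all of its output goes to the file.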
So what that's doing is taking the first field of every line and writing it to a file called fruit_names. I've shown the file being catted, and you can see its contents are the names of the fruit: apple, banana, strawberry, etc.

Now the things to note are that the name of the file is enclosed in double quotes, and that's because it's a string. This has to be a string; you'll get into trouble if you try to use anything other than a string there. The script will loop once per line of the input file, as I've said, and it will execute the redirection each time. What happens is that the output file is erased before the first output is written to it, and then subsequent writes to the same file don't erase it but append to it. It's important to be aware of this because, if you're used to doing this in shell scripts, it's not the same behavior.

Now this is not different in any significant way from simply writing the same script where you simply print dollar one, and then at the end of the line in your shell you put a greater than and then the name fruit_names. There's an example of it here. In this particular case you're using the shell to do the redirection to a file. That's fine; I would choose the latter one personally if I needed to do something like this. But things get more complicated if you want to be writing to multiple files from your script.

So I've prepared example one, which is downloadable, awk14_ex1.awk, and that writes to a collection of output files. I've listed it in the notes. Again it's using NR greater than one as the trigger for the single rule that exists in the script. It sets a variable color to column two; column two contains the color of the fruit. It makes a file name, which is stored in a variable fname, and it does this by concatenating the string awk14 underscore with the contents of color and then underscore fruit. Just so you can tell what it's doing, it prints the message "Writing percent s to percent s backslash n" (that's in a printf), and in that string it fills in the fruit name and the name of the file that it's going to write to. Then it actually does it: print dollar one greater than fname.

Now fname in this case, as I sort of alluded to before, is not quoted, because it is a variable containing a string, not a string constant. If it's a string constant you need to quote it. It would have been possible to put that string concatenation in the place of fname, and if you do that there's great scope for confusion: the Awk interpreter will get confused unless you enclose that concatenation in parentheses. (I was distracted by a cat in the background, which is running about the place.)
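The script described above can be sketched as follows. This is a reconstruction from the spoken description rather than the downloadable awk14_ex1.awk itself, and it runs against a hypothetical stand-in data file (`fruit_sample.txt`) whose contents are assumed.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: header line, then name, color, amount.
cat > fruit_sample.txt <<'EOF'
name color amount
apple red 4
grape purple 10
plum purple 2
kiwi brown 4
EOF

awk 'NR > 1 {
    color = $2
    fname = "awk14_" color "_fruit"     # file name built by concatenation
    printf "Writing %s to %s\n", $1, fname
    print $1 > fname                    # a variable, so it is not quoted
    # The same thing without the variable needs the parentheses:
    # print $1 > ("awk14_" color "_fruit")
}' fruit_sample.txt

cat awk14_purple_fruit    # grape, plum
```

Without the parentheses, different Awk implementations may group the expression after the greater-than sign differently, which is the confusion mentioned above.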
So running the script writes files called things like awk14_brown_fruit and similar. You can see it being run in the notes and see the names of the files. Since the output file names are generated dynamically, and are liable to change for each line read from the input file, the script is doing what was described earlier: it creates them, or empties them if they already exist, at the first instance of use, and then appends to them once open. So there will be multiple files open as the script runs. All the files are closed when the script exits.

I've shown that if I ls awk14 underscore asterisk underscore fruit I get back brown fruit, green fruit, purple fruit, red fruit, yellow fruit. I've catted the purple one and I get back grape and plum.

So how would you then append to an existing file? Well, not too surprisingly, you use a different type of redirection, where you use a double greater-than sign. The output file is expected to exist already, but if it doesn't then it will be created. If it does, then its contents are not erased but appended to.

Now when you redirect stuff in a shell script, you will see something like an echo followed by a greater than and the name of the output file, so that's writing the first line to the file. Then you will see later on subsequent lines being written to it, which are appending, and will use a double greater-than sign. So it's a similar sort of idea, but the way it behaves in the shell, whether it be Bash or Bourne shell or whatever, is somewhat different from the way that Awk works. That's partly because each redirection in the shell involves the file being opened and then closed again when it's done in this sort of way, whereas in Awk the redirection is being done to an open file: the file is opened by the first instance of redirection to it, and then the file will be closed when the script exits. But there's also a close command which will do this, and we'll look at that in a minute.
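The difference between the two behaviors can be seen in a small sketch like the one below. The file names `awk_out` and `shell_out` are made up for the demonstration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# In Awk, ">" empties the file once, when it is first opened; later
# writes in the same run go to the already open file.
printf 'one\ntwo\nthree\n' | awk '{print $0 > "awk_out"}'
cat awk_out      # one, two, three

# In the shell, every ">" opens the file afresh and empties it.
for word in one two three; do
    echo "$word" > shell_out
done
cat shell_out    # three

# ">>" in Awk keeps whatever the file held before this run started.
printf 'four\n' | awk '{print $0 >> "awk_out"}'
cat awk_out      # one, two, three, four
```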
So the next topic is redirecting to another program. This type of redirection uses a pipe symbol, and to the right of the pipe symbol is the command, which is a string: either a string literal or a variable containing a string. So print, space, items, space, vertical bar, space, command will do the job.

There's an example here using the famous fruit data: awk, open single quote, NR greater than one, then in curly brackets print dollar one, space, vertical bar, then in double quotes sort space minus u space vertical bar space nl, close double quotes, close curly brackets, close quote, then the name of the file. So this time you get a list of the fruit but with a few added extras. The command which is being redirected to is actually a double command; it's a pipeline in its own right. It starts with a sort command using the option minus u, or hyphen u, and that causes it to make sure that all the things that it sorts are unique. Then the output from the sort is piped to nl, which is a thing that just numbers lines.

As this script is run, when the first thing is written to this command, a subprocess is started up with these two commands in it, and they are sitting waiting for input. The first name is sent to this process, and then it repeats with each successive name, and the subprocess finishes when the script finishes. The way that sort works is that it accumulates all of the stuff that it's fed, and then when the input stops, because of the termination of the stream of data, it will do the sort and carry on. In this case it passes the results of the sort as a bunch of lines to nl. So in the demonstration of how it works you see everything sorted alphabetically and then numbered one through eight.

Now, as I mentioned before, there's a close command in Awk which will close the redirection to a file or to a command. The argument to close needs to be the exact command or file name which defined the process. It needs to be completely exact in every respect. It's worse with a command because you might have added extra spaces at various points. So that's why it's a good idea to store the commands or file names in an Awk variable if you need to do an explicit close.

Example two shows the variable cmd being used to hold the shell command, and in this case the connection is closed to show how it would be done, though there's no actual need to close the channel. This is essentially the same script, except it's a bit more involved. The first rule is a BEGIN rule where it sets up this cmd variable, which is the sort hyphen u vertical bar nl. Then the second rule is triggered by every line that doesn't have a record number of one, everything greater than one in other words, which prints dollar one to this command. And the final rule is an END rule which closes the cmd variable. So it just does exactly the same in a more complicated way, but it proves the point.

So I thought I'd throw in what I refer to as a real-world example; at least it's real in my world. You may wonder why, but when I'm preparing an HPR show like this, which involves a number of example scripts, I need to run them for testing purposes, to prove they really work and are not nonsense. I have a main directory for my HPR shows, and I work in that directory, and then have subdirectories per show. I like to make soft links to the examples in the subdirectory so I can run tests without having to hop around between directories.

In general I use the ln command with hyphen s as one of its arguments, which makes a soft link, and I use hyphen f, which forces it to make the link even if it exists. Normally if it already exists it won't make it, but sometimes I mess things up and want to overwrite a link with the real thing, and so I use hyphen f to force it. Then the arguments are the path to the actual file I want to make a link to, and then the name of the link. So I use the path of the file relative to where I am, and then use the base name of that to make a link. So if I'm pointing to where example one of awk14 is, that's awk14_ex1.awk, and I'm going to make a link called that.

So I wrote a little Awk script to help me do this. It takes path names as input and constructs shell commands, which it pipes to sh running as a subprocess, and it's here as example three. The script expects to be given one or more path names on standard input. It takes the path and splits it up based on the slash character, and it uses split to do this. split returns the number of elements that it finds, so that number, which I save in a variable called n, will index the last element. We check that this makes sense: if it wasn't a path but consisted just of the file name, then there's something a bit silly going on. So we check n, and if it's less than two then there's an error. One of the things I do here is a sort of look ahead, which is to print an error message, "Error in path" and then dollar zero, the actual path string I've just read in, and I send that to the file which is the string slash dev slash stderr, standard error, and I'm going to talk a little bit more about this. Then the command next causes this particular input line to be skipped.
Next I build the shell command, and this is partly because I want to demonstrate it being printed, but sometimes it's convenient to do things this way. The command is called cmd, and it's constructed by using sprintf, s-printf, which is a formatted print thing, except it doesn't write anything out; it simply returns the result as a value which can be stored in a variable. So what I'm doing here is building the string which will be a shell command. In square brackets inside this string I've got hyphen e, then percent s, which is a substitution point, close square bracket, double ampersand. That's checking for the existence of the file, in case I was daft enough to feed it the path to a file that doesn't exist. The double ampersand is followed by ln space hyphen s space hyphen f (for force), percent s and then percent s, close quotes. There's no newline on the end of this one. Then the arguments fed to this are dollar zero, the path name, dollar zero again, the same path name, and a square bracket n. I've stored the results from the split in an array called a, and the nth element is the last one, as I said already.

Then I simply print this out, preceded by a double chevron, and finally in this rule there's a printf, the format being percent s backslash n, and I feed cmd into that. That puts a newline on the end of it, which is necessary for the shell to receive it as a separate command. So it will send the command, which I've just printed out for demonstration purposes, to the shell, whatever the shell is. The shell, I think by default, is the Bourne shell, or whatever the operating system pretends is the Bourne shell; I think it's dash, for example, on Debian and derivatives, as opposed to Bash. Then the final rule in this script is an END rule which closes the sh command pipeline; not strictly necessary, but it's good practice to do.

I've got an example of how I would run it, how I do run it. My command on the command line is a printf (this is me in Bash actually) which sends percent s backslash n in double quotes to a pipeline, and the argument to that printf is the path to the examples. I won't read this path out; basically the last bit of it consists of awk14 underscore ex question mark dot awk. What that will do is what they call file name expansion, so it will return all of the matching files. There'll be three, because there are three examples in this particular show, and it returns them to printf. The way that printf works is that it'll just keep printing the arguments that you give it. The format only caters for one in this particular case (this is the way that the Bash version of printf works anyway); if you give it more than one it will just keep repeating that format over and over again. So the result is that it prints out the strings that are returned from the expansion one after the other with a newline on the end, and these are piped to an invocation of this particular script, dot slash awk14 underscore ex3 dot awk.

What we get back is three commands, which are the test to see that the file exists (it's a bit superfluous because they do, otherwise they wouldn't have expanded, but it depends how you create them in the first place), so check it exists and, if it does exist, then send it to an ln command to make the link. That's all; it shows the three instances.
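The link-making script and the way it is fed can be sketched as below. This is reconstructed from the spoken description, not copied from the downloadable awk14_ex3.awk, and the directory and file names (`show_dir`, `demo_ex1.awk`, `demo_ex2.awk`, `no_slash_here`) are invented for the demonstration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical layout: a per-show subdirectory holding two example files.
mkdir show_dir
touch show_dir/demo_ex1.awk show_dir/demo_ex2.awk

# Feed the expanded paths, plus one deliberately bad entry, to Awk.
printf '%s\n' show_dir/demo_ex?.awk no_slash_here |
awk '{
    n = split($0, a, "/")               # a[n] is the last path component
    if (n < 2) {
        print "Error in path", $0 > "/dev/stderr"
        next
    }
    cmd = sprintf("[ -e %s ] && ln -s -f %s %s", $0, $0, a[n])
    print ">>", cmd                     # show the command being sent
    printf "%s\n", cmd | "sh"           # one command per line to the shell
}
END { close("sh") }'

ls -l demo_ex1.awk demo_ex2.awk         # both are now soft links
```

As written, the generated commands are not quoted, so a path containing spaces or shell metacharacters would not be handled safely; the sketch assumes plain file names.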
So this is actually really useful. I did it as a demo here and then realized this is actually much more elegant than the way I've been doing things before. It probably needs to be a bit more foolproof; it needs to have more error checks in it and stuff. In particular, when you're using an Awk script to generate commands to send to your shell, there are potential pitfalls using quotes, because shells are often fussy about quotes and you're doing it in another language which is sort of also fussy about quotes. There's a particular section in the GNU Awk manual, section 10.2.9, which I've referenced here. But it's a useful thing to be able to do.
You might prefer to just write the whole of such a thing in a Bash script, but it's entirely up to you.

The final bit of the redirection I want to talk about is redirecting to a coprocess. This uses a pipe symbol and an ampersand to send output to a string containing a command or commands for the shell. Now this is a GNU Awk extension, and quite advanced. Unlike the previous redirection, which sends to a program, this form sends to a program and allows the program's output to be read back. So it's a two-directional connection to a running process, and that's the definition of this thing called a coprocess. By the time this show comes out you should have heard clacke's talk about coprocesses in Bash, so hopefully it will make a bit more sense as a consequence. We're going to talk a bit more about coprocesses in the next of this pair of shows, because it really makes sense to talk about it in the context of getline, which is the way of reading stuff back again. So I'm not going to say anything more about it, but the basic idea is you would do print, space, items, space, vertical bar ampersand, space, and then some command that you send stuff to and receive stuff back from.

The final point then is redirecting to special files. As you know, within Unix there are three standard channels called standard input, standard output and standard error (standard error output is the other way of expressing it). These are connected to the keyboard for standard input and the screen for standard output, and standard error usually goes to the screen as well. Normally a Unix program or script reads from standard input, writes to standard output and generates any error messages on standard error. There's a lot more to be said about this, and I think I'm going to go into it in a bit more detail in the Bash Tips series. But the way GNU Awk deals with these three special channels is that it has these special file names, which are slash dev slash stdin, which is standard input, slash dev slash stdout, which is standard output, and slash dev slash stderr, which is standard error output.

I used it in an earlier script, but you might want to send a message: print, space, in double quotes "Invalid number", greater than, and then in double quotes slash dev slash stderr, which will send a message to standard error. So if you're running a script that did this, then you can use your shell's redirection capability to send the standard error stuff somewhere special, log it or something if you wish, do something clever. Otherwise it will just look like output from the script. There's a lot more to be said about this; I haven't gone into detail. See section 5.7 in the GNU Awk guide, and there are other special names that you can use, see section 5.8, but I'm not going to go into more depth.
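A minimal sketch of the standard error idea follows. The input values, the doubling of valid numbers and the file names `results.txt` and `errors.log` are all made up for illustration; only the `"/dev/stderr"` redirection comes from the episode.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Valid numbers are doubled and go to standard output; anything else
# produces a message on standard error via the special file name.
printf '12\nabc\n7\n' |
awk '{
    if ($1 ~ /^[0-9]+$/) {
        print $1 * 2
    } else {
        print "Invalid number", $1 > "/dev/stderr"
    }
}' > results.txt 2> errors.log

cat results.txt    # 24 and 14
cat errors.log     # Invalid number abc
```

The shell separates the two streams here with `>` and `2>`; without the `2>` the error message would simply appear on the terminal alongside any other output.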
So I'll be continuing with the second half of this episode, which is pretty much written. I'll be doing that fairly soon; in the next few weeks is the plan. So, all right, that's it then. Bye now.
You've been listening to Hacker Public Radio at HackerPublicRadio.org. We are a community podcast network that releases shows every weekday, Monday through Friday. Today's show, like all our shows, was contributed by an HPR listener like yourself. If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is. Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club, and is part of the binary revolution at binrev.com. If you have comments on today's show, please email the host directly, leave a comment on the website or record a follow-up episode yourself. Unless otherwise stated, today's show is released under the Creative Commons Attribution-ShareAlike 3.0 license.