Episode: 2816
Title: HPR2816: Gnu Awk - Part 14
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2816/hpr2816.mp3
Transcribed: 2025-10-19 17:14:49
---
This is HPR episode 2816 entitled Gnu Awk - Part 14, and is part of the series Learning Awk.
It is hosted by Dave Morris and is about 23 minutes long and carries an explicit flag.
The summary is: Redirection of input and output - Part 1.
This episode of HPR is brought to you by archive.org.
Support universal access to all knowledge by heading over to archive.org/donate.
Hello everybody, welcome to Hacker Public Radio.
This is episode number 14 in the Learning Awk series that b-yeezi and
myself are doing. I wanted to talk about the subject of redirection in Awk programs,
and I originally thought, yes, I can fit that into one episode, but as I started to write it I realised
there was just too much. So I'm going to do it as two episodes, and this is the first of the pair,
not surprisingly. This time I want to be looking at output redirection, and then in the next one
I'll be looking at the getline command, which is used for input (explicit input, I think, is how
they put it), which can include redirection. So far in our Awk programs, pretty much all anyway,
we have seen that when the script prints out using print or printf, the output is
written to the standard output channel, which is pretty much the screen if you're running things
from a terminal. The redirection feature in Awk allows output to be written somewhere else. So the
first thing you might want to do is to redirect to a file, and you would use print or printf.
There's a sort of syntax diagram: print, space, items (that would be a list of items, often separated
by commas), a greater-than sign, and then the name of an output file. There's a simple example that
I've shown here. It uses the infamous file of fruit data that we invented; it was actually b-yeezi
that came up with it in episode number two. I've included the data file with this show just in case
you find it useful to have it around. So this is a very simple Awk script, just a one-liner,
and I've demonstrated how it would be used. You would write as your program, after the command
awk, in single quotes: capital NR greater than one (that's the number of records greater than one,
so skipping the first line, which is a header). The rule that is triggered by that particular
test simply consists of print dollar one (the first field of the data), greater than,
and then in double quotes fruit_names, all of this in curly brackets; close single quote,
and then the name of the file, awk14_fruit_data.txt, which as I say is included with the show.
So what that's doing is taking the first field of every line after the header and writing it to a file called
fruit_names. I've shown the file being catted, and you can see its contents are the names of
the fruit: apple, banana, strawberry, etc. Now the things to note are that the name of the file is enclosed
in double quotes, and that's because it's a string. This has to be a string; you'll get into trouble
if you try to use anything other than a string there. The script will loop once per line
of the input file, as I've said, and it will execute the redirection each time. What happens is
the output file is erased before the first output is written to it, and then subsequent writes to the same
file don't erase it but append to it. It's important to be aware of this, because
if you're used to doing this in shell scripts, it's not the same behavior. Now this is not
different in any significant way from simply writing the same script where you simply print
dollar one and then, at the end of the line in your shell, you put a greater than and then
the name fruit_names. There's an example of it here. In this particular
case you're using the shell to do the redirection to a file. That's fine; I mean, I would choose the
latter one personally if I needed to do something like this. But things get more complicated if you want to
be writing to multiple files from your script. So I've prepared an example, example one, which is downloadable:
awk14_ex1.awk. That writes to a collection of output files, and I've listed it in the notes.
Again it's using NR greater than one as the trigger for the single rule that exists in
the script. It sets a variable color to column two; column two contains the color of the fruit.
It makes a file name, which is stored in a variable fname, and it does this by concatenating
the string awk14 underscore with the contents of color and then underscore fruit. It prints,
just so you can tell what it's doing, the message Writing percent s to percent s
backslash n (that's in a printf), and in that string it fills in the fruit name and the name
of the file that it's going to write to. Then it actually does it: print dollar one greater than
fname. Now fname in this case, as I sort of alluded to before, is not quoted, because
it is itself a variable containing a string; it's not a string constant. If it's a
string constant you need to quote it. It would have been possible to put that string
concatenation in the place of fname, but if you do that there's great scope for confusion,
and the Awk interpreter will get confused unless you enclose that concatenation
in parentheses. (I'm distracted by a cat in the background that is running about the place.)
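
A minimal sketch of a script along the lines just described, reconstructed from the spoken description rather than copied from the show notes. The data file name sketch_fruit_data.txt and its four rows are made-up stand-ins for the fruit data that ships with the show, and the variable names color and fname are simply how the spoken names were heard.

```shell
#!/bin/sh
# Creates sketch_fruit_data.txt and awk14_*_fruit in the current directory.

# Stand-in data (assumed layout: name, color, amount; first line is a header).
cat > sketch_fruit_data.txt <<'EOF'
name color amount
apple red 4
grape purple 10
plum purple 2
kiwi brown 4
EOF

awk 'NR > 1 {
    color = $2
    # Build the output file name by string concatenation.
    fname = "awk14_" color "_fruit"
    printf "Writing %s to %s\n", $1, fname
    # fname is a variable holding a string, so it is not quoted here.
    print $1 > fname
    # Inline form needs parentheses: print $1 > ("awk14_" color "_fruit")
}' sketch_fruit_data.txt

cat awk14_purple_fruit
```

With this stand-in data the purple file ends up holding grape and plum, one per line, because the file is emptied only when it is first opened and later writes are added to it.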
So running the script writes files called things like awk14_brown_fruit
and similar. You can see it being run in the notes, and see the names of the files. Since the
output file names are generated dynamically and are liable to change with each line read from
the input file, the script is doing what was described earlier: it creates them, or empties them if
they already exist, at the first instance of use, and then appends to them once they're open. So there
will be multiple files open as the script runs. All the files are closed when the script exits.
I've shown that if I ls awk14 underscore asterisk underscore fruit I get back brown fruit, green fruit,
purple fruit, red fruit and yellow fruit. I've catted the purple one and I get back grape and plum. So how would
you then append to an existing file? Well, not too surprisingly, that uses a different type of
redirection, where you use a double greater-than sign. The output file is expected to exist already,
but if it doesn't then it will be created. If it does, then its contents are not erased but
appended to. Now when you redirect stuff in a shell script you will see something like an echo
followed by a greater than and the name of the output file, so that's it writing the first
line to the file. Then you will see later on subsequent lines being
written to it, which are appending, and they will use a double greater-than sign. So it's a similar sort of idea,
but the way it behaves in the shell, whether it be Bash or Bourne shell or pretty much whatever,
is somewhat different from the way that Awk works. That's partly because each
redirection in the shell involves the file being opened and then closed again when it's done in
this sort of way, whereas in Awk the redirection is being done to an open file, where the file is opened by
the first instance of redirection to it, and the file will be closed when the script
exits. But there's also a close command which will do this, and we'll look at that in a minute.
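
The difference between the two behaviours can be seen side by side in a small sketch. The demo_*.txt file names and the three input words are made up for illustration.

```shell
#!/bin/sh
# Creates demo_*.txt files in the current directory.
printf 'old contents\n' > demo_awk.txt
printf 'old contents\n' > demo_append.txt

# Awk: ">" empties the file once, when it is first opened, then keeps adding.
printf 'one\ntwo\nthree\n' | awk '{ print $0 > "demo_awk.txt" }'

# Awk: ">>" keeps whatever was already in the file.
printf 'one\ntwo\nthree\n' | awk '{ print $0 >> "demo_append.txt" }'

# Shell: each ">" reopens and empties the file, so only the last line survives.
for word in one two three; do
    echo "$word" > demo_shell.txt
done

wc -l demo_awk.txt demo_append.txt demo_shell.txt
```

The first file ends up with three lines, the second with four (the old line plus three new ones) and the third with just one.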
So the next topic is redirecting to another program. This type of redirection uses a pipe symbol,
and to the right of the pipe symbol is the command, which is a string: either a string literal
or a variable containing a string. So print, space, items, space, vertical bar, space, command
will do the job. There's an example here using the famous fruit data: awk, open single
quote, NR greater than one, then in curly brackets print dollar one, space, vertical bar, then in double
quotes sort space minus u space vertical bar space nl, close double quotes, close curly brackets, close
quote, then the name of the file. So this time you get a list of the fruit but with a few added
extras. The command which is being redirected to is actually a double command; it's a pipeline in its own
right. It starts with a sort command, which uses the option minus u, or hyphen u,
and that causes it to make sure that all the things in its output are unique. Then
that's piped to nl, which is a thing that just numbers lines. So as this script is run, when the
first thing is written to this command a subprocess is started up with these two commands in it, and
they are sitting waiting for input. The first name is sent to this process, and then it repeats with each
successive name, and the subprocess finishes when the script finishes. The way that
sort works is that it accumulates all of the stuff that it's fed, and then
when it stops, because of the termination of the stream of data, it will do the sort and carry on.
In this case it passes the results of the sort as a bunch of lines to nl. So you'll see
in the demonstration of how it works that everything is sorted alphabetically and then
numbered one through eight. Now as I mentioned before, there's a close command in Awk which will close
the redirection to a file or to a command. The argument to close needs to be the exact command
or file name which defined the process or file. It needs to match exactly in
every respect. It's worse with a command, because you might have added extra spaces
at various points. So that's why it's a good idea to store the commands or file names in an Awk
variable if you need to do an explicit close. Example two shows the variable cmd being used
to hold the shell command, and in this case the connection is closed to show how it would be done,
though there's no actual need to close the channel. This is essentially the same script,
except it's a bit more involved. The first rule is a BEGIN rule, where it sets up this cmd variable,
which is sort hyphen u vertical bar nl. Then the second rule is triggered by every line that
doesn't have a record number of one (everything greater than one, in other words), and it prints dollar one
to this command. The final rule is an END rule which closes the command held in cmd. So it just does exactly
the same in a more involved way, but it proves the point. So I thought I'd throw in a more what I
refer to as real-world example. At least the scenario here is real in my world; you may wonder why.
But when I'm preparing an HPR show like this, which involves a number of example
scripts, I need to run them for testing purposes, to prove they really work and are not nonsense. So I
have a main directory for my HPR shows, and I work in that directory,
and then have subdirectories per show. I like to make soft links to the examples in the subdirectory
so I can run tests without having to hop around between directories. In general I use the ln
command with hyphen s as one of its arguments, which makes a soft link, and I use hyphen f, which
forces it to make the link even if it exists. Normally if the link already exists it won't make it, but
sometimes I mess things up and want to overwrite a link with the real thing, and so I use
hyphen f to force it. Then the arguments are the path to the actual file I want to make a link
to and then the name of the link. So I use the path of the file relative to where I am,
and then use the base name of that to make a link. So if I'm pointing to where
example one of awk14 is, awk14_ex1.awk, I'm going to make a link called that. So I wrote
a little Awk script to help me do this. It takes path names as input and constructs shell commands,
which it pipes to sh running as a subprocess, and it's here as example three. The script
expects to be given one or more path names on standard input. It takes the path and splits it up
based on the slash character, and it uses split to do this. split returns the number of elements
that it finds, so that number, which I save in a variable called n, will index the last element. We
check that this makes sense: if it wasn't a path but consisted just of the file name, then
there's something a bit silly going on. So we check n, and if it's less than two then there's an error.
One of the things I do here, as a sort of look ahead, is print an error message: Error in path
and then dollar zero, the actual path string I've just read in. I send that to the file
which is a string, slash dev slash stderr, standard error, and I'm going to talk a
little bit more about this later. Then the command next causes this particular input line to be skipped.
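
The split-and-check step described so far might look something like the following sketch. The two input lines and the split_out.txt and split_err.txt file names are invented for illustration; the final print just shows the last path element instead of building a command.

```shell
#!/bin/sh
# Feed two lines to awk: a bare file name and a path with directories.
printf '%s\n' 'lonely.awk' 'some/dir/script.awk' |
awk '{
    # split returns the number of pieces; a[n] is the last one.
    n = split($0, a, "/")
    if (n < 2) {
        print "Error in path", $0 > "/dev/stderr"
        next    # skip the rest of the rule for this line
    }
    print a[n]
}' > split_out.txt 2> split_err.txt

cat split_out.txt split_err.txt
```

The bare file name produces the error message on standard error and is skipped, while the real path yields its last element, script.awk, on standard output.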
Next I build the shell command, and this is partly because I want to demonstrate it being printed,
but it's sometimes convenient to do things this way. The variable holding the command is called cmd,
and it's constructed by using sprintf, s-print-f, which is a formatted print thing
except it doesn't write anything out; it simply returns the result as a value which can be stored in a
variable. So what I'm doing here is building the string which will be a shell command. In
square brackets inside this string I've got hyphen e, then percent s, which is a substitution point,
close square bracket, double ampersand. So that's checking for the existence of the file,
which covers the case where I was daft enough to feed in the path to a file that doesn't exist. The double
ampersand is followed by ln space hyphen s space hyphen f (for force), percent s and then
percent s, close quotes. There's no newline on the end of this one. Then the arguments that are
fed to this are dollar zero (the path name), dollar zero again (same path name), and a square bracket
n. I've stored the results from the split in an array called a, and the nth element is the last
one, as I said already. Then I simply print this out preceded by a double chevron, and finally
in this rule there's a print percent s (printf, I should say), the format being percent s backslash n, and I feed
cmd into that. That puts a newline on the end of it, which is necessary for the shell
to receive it as a separate line. So it will send the command that I've just printed out for
demonstration purposes to the shell, whatever the shell is. The shell, I think, by default is the
Bourne shell, or whatever the operating system pretends is the Bourne shell. I think it's dash, for example,
on Debian and derivatives, as opposed to Bash. Then the final rule in this script is an END rule which
closes the sh command pipeline. It's not strictly necessary, but it's good practice
to do. I've got an example of how I do run it, and my command on the
command line is a printf (this is me in Bash, actually) which sends percent s backslash
n, in double quotes, to a pipeline. The argument to that printf is the path to the examples, and I
won't read this path out. Basically the last bit of it consists of awk14 underscore ex question mark
dot awk. What that will do is (what do they call it?) filename expansion, so it will
return all of the matching files. There'll be three because there are three examples for this particular show.
It returns them to printf, and the way that printf works is that it'll just keep printing
the arguments that you give it. The format only caters for one in this particular case.
This is the way that the Bash version of printf works anyway: if you give it more than one it will
just keep repeating that format over and over again. So the result is that it prints out the strings
that are returned from the expansion one after the other, each with a newline on the end, and these are
piped to an invocation of this particular script, dot slash awk14 underscore ex3 dot awk.
What we get back is three commands, which are the test to see that the files exist
(it's a bit superfluous because they do, otherwise they wouldn't have expanded,
but it depends how you create them in the first place): check it exists, and if it does exist
then run an ln command to make the link. That's all; it shows the three instances.
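
Putting the pieces together, a sketch of the whole idea might look like this. It is reconstructed from the spoken description, not copied from the show's example three, and the demo_show directory and demo_ex1.awk file are hypothetical stand-ins for a real show directory. It also illustrates the pipe form of redirection and an explicit close on the same command string.

```shell
#!/bin/sh
# Set up a hypothetical show directory with one example file in it.
mkdir -p demo_show
echo 'BEGIN { print "hello" }' > demo_show/demo_ex1.awk

# One real path and one that does not exist.
printf '%s\n' demo_show/demo_ex1.awk demo_show/missing.awk |
awk '{
    n = split($0, a, "/")
    if (n < 2) {
        print "Error in path", $0 > "/dev/stderr"
        next
    }
    # Build the shell command as a string; sprintf returns it instead of printing.
    cmd = sprintf("[ -e %s ] && ln -s -f %s %s", $0, $0, a[n])
    print ">>", cmd            # show what is about to be run
    printf "%s\n", cmd | "sh"  # send it to the shell subprocess
}
END { close("sh") }'

ls -l demo_ex1.awk
```

Only the existing file gets a soft link in the current directory. Note that this sketch does no quoting of the paths, so a path containing spaces or shell special characters would produce a broken command.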
So this is actually really useful. I did it as a demo here and then realised this is actually
much more elegant than the way we've been doing things before. It probably needs to be a bit more
foolproof; it needs to have more error checks in it and stuff. In particular, when you're
using an Awk script to generate commands to send to your shell, there are potential pitfalls
using quotes, because shells are often fussy about quotes and you're doing it in another language
that's also sort of fussy about quotes. There's a particular thing in the GNU Awk manual,
section 10.2.9, which I've referenced here. But it's a useful thing to be able to do.
You might prefer to just write the whole of such a thing in a Bash script, but it's entirely
up to you. The final bit of the redirection I want to talk about is redirecting to a co-process.
This uses a pipe symbol and an ampersand to send output to a string containing a command
or commands for the shell. Now this is a GNU Awk extension and quite advanced.
Unlike the previous redirection, which sends to a program, this form sends to a program and allows the
program's output to be read back. So it's a two-directional connection to a running process, and that's
the definition of this thing called a co-process. By the time this show comes out you should have
heard clacke talk about co-processes in Bash, so hopefully it will make a
bit more sense as a consequence. We're going to talk a bit more about co-processes in the next
of this pair of shows, because it really makes sense to talk about it in the context of getline,
which is the way of reading stuff back again. So I'm not going to say anything more about it, but
the basic idea is you would do print, space, items, space, vertical bar ampersand, space, and then
some command that you send stuff to and receive stuff back from. The final point then is redirecting
to special files. As you know, within Unix there are three standard channels called standard input,
standard output and standard error (standard error output is the other way of expressing it). These are
connected to the keyboard for standard input and the screen for standard output, and standard error
usually goes to the screen as well. So normally a Unix program or script reads from standard input,
writes to standard output and generates any error messages on standard error.
There's a lot more to be said about this, and I think I'm going to go into it in a bit more detail
in the Bash Tips series a bit later. But the way GNU Awk deals with these three special
channels is that it has these special file names, which are slash dev slash
stdin, which is standard input, slash dev slash stdout, which is standard output, and slash dev slash stderr, which is standard error output.
I used it in an earlier script, but you might want to send a message with print, space, in
double quotes Invalid number, close double quotes, greater than, and then in double quotes slash dev slash
stderr, which will send the message to standard error. So if you're running
a script that did this, then you can use your shell's redirection capability to
send the standard error stuff somewhere special, log it or something, or do something
clever if you wish. Otherwise it will just look like output from the script. There's a lot more to be said
about this and I haven't gone into detail; see section 5.7 in the GNU Awk guide. There are
other special names that you can use (see section 5.8 or thereabouts), but I'm not going to go into more depth.
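
As a rough sketch of how this plays out with the shell's own redirection, the following sends good results to one file and the error messages to another. The input values, the numeric test and the numbers.out and numbers.log file names are assumptions made up for the example; only the print to "/dev/stderr" comes from the description above.

```shell
#!/bin/sh
# Non-numeric lines are reported on standard error; the rest go to standard output.
printf '%s\n' 12 abc 7 |
awk '{
    if ($1 ~ /^[0-9]+$/) {
        print $1 * 2
    } else {
        print "Invalid number", $1 > "/dev/stderr"
    }
}' > numbers.out 2> numbers.log

cat numbers.out
cat numbers.log
```

Without the 2> on the command line the error message would simply appear on the terminal mixed in with the normal output.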
So I'll be continuing with the second half of this episode, which is pretty much written.
I'll be doing that fairly soon; in the next few weeks is the plan. So, all right, that's it then. Bye now.
You've been listening to Hacker Public Radio at HackerPublicRadio.org. We are a community podcast
network that releases shows every weekday, Monday through Friday. Today's show, like all our shows,
was contributed by an HPR listener like yourself. If you ever thought of recording a podcast,
then click on our contribute link to find out how easy it really is. Hacker Public Radio was founded by
the Digital Dog Pound and the Infonomicon Computer Club, and it's part of the binary revolution
at binrev.com. If you have comments on today's show, please email the host directly, leave a comment
on the website or record a follow-up episode yourself. Unless otherwise stated, today's show is
released under the Creative Commons Attribution-ShareAlike 3.0 license.