Episode: 2816
Title: HPR2816: Gnu Awk - Part 14
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2816/hpr2816.mp3
Transcribed: 2025-10-19 17:14:49

---

This is HPR episode 2816 entitled Gnu Awk, Part 14, and is part of the series Learning Awk. It is hosted by Dave Morris, is about 23 minutes long, and carries an explicit flag. The summary is: Redirection of input and output, Part 1.

This episode of HPR is brought to you by archive.org. Support universal access to all knowledge by heading over to archive.org/donate.

Hello everybody, welcome to Hacker Public Radio. This is episode number 14 in the Learning Awk series that b-yeezi and myself are doing. I wanted to talk about the subject of redirection in Awk programs. I originally thought, yes, I can fit that into one episode, but as I started to write it I realised there was just too much, so I'm going to do it as two episodes, and this is the first of the pair, not surprisingly. This time I'm going to be looking at output redirection, and in the next one I'll be looking at the getline command, which is used for explicit input (I think that's how they put it), which can include redirection.
So far in our Awk programs, pretty much all of them anyway, we have seen that when the script prints using print or printf, the output is written to the standard output channel, which is pretty much the screen if you're running things from a terminal. The redirection feature in Awk allows output to be written somewhere else.

The first thing you might want to do is to redirect to a file. You would use print or printf, and there's a sort of syntax diagram: print, space, items (a list of items, often separated by commas), a greater than sign, and then the name of an output file. There's a simple example that I've shown here. It uses the infamous file of fruit data that we invented; it was actually b-yeezi that came up with it in episode number two. I've included the data file with this show just in case you find it useful to have it around.

So this is a very simple Awk script, just a one-liner, and I've demonstrated how it would be used. You would write as your program, after the command awk, in single quotes: capital NR greater than one, that's the number of records greater than one, so skipping the first line, which is a header. The rule that is triggered by that particular test simply consists of print dollar one, the first field of the data, greater than, and then in double quotes fruit_names, all of this in curly brackets. Then close single quote, and then the name of the file, awk14_fruit_data.txt, which as I say is included with the show. What that's doing is taking the first field of every line after the header and writing it to a file called fruit_names. I've shown the file being catted, and you can see its contents are the names of the fruit: apple, banana, strawberry, etc.

Now the things to note are that the name of the file is enclosed in double quotes, and that's because it's a string. This has to be a string; you'll get into trouble if you try to use anything other than a string there. So the script will loop once per line of the input file, as I've
said, and it will execute the redirection each time. What happens is that the output file is erased before the first output is written to it, and then subsequent writes to the same file don't erase it but append to it. It's important to be aware of this, because if you're used to doing this in shell scripts then it's not the same behaviour.

Now this is not different in any significant way from simply writing the same script where you just print dollar one, and then at the end of the line in your shell you put a greater than and then the name fruit_names. There's an example of it here. In this particular case you're using the shell to do the redirection to a file. That's fine; I would choose the latter one personally if I needed to do something like this. But things get more complicated if you want to be writing to multiple files from your script.

So I've prepared an example, example one, which is downloadable as awk14_ex1.awk, and that writes to a collection of output files. I've listed it in the notes. Again it's using NR greater than one as the trigger for the single rule that exists in the script. It sets a variable color to column two (column two contains the color of the fruit). It makes a file name, which is stored in a variable fname, and it does this by concatenating the string awk14 underscore with the contents of color and then underscore fruit. Just so you can tell what it's doing, it prints the message 'writing percent s to percent s backslash n'; that's in a printf, and in that string it fills in the fruit name and the name of the file that it's going to write to. Then it actually does it: print dollar one greater than fname.

Now fname in this case, as I sort of alluded to before, is not quoted because it is itself a string: it's a variable containing a string, not a string constant. If it's a string constant you need to quote it. It would have been possible to put
that string concatenation in the place of fname, but if you do that there's great scope for confusion, and the Awk interpreter will get confused unless you enclose that concatenation in parentheses. (I'm distracted by a cat in the background that's running about the place.)

So running the script writes files called things like awk14_brown_fruit and similar. You can see it being run in the notes and see the names of the files. Since the output file names are generated dynamically and are liable to change with each line read from the input file, the script is doing what was described earlier: it creates them, or empties them if they already exist, at the first instance of use, and then appends to them once they're open. So there will be multiple files open as the script runs, and all the files are closed when the script exits. I've shown that if I ls awk14_*_fruit I get back the brown, green, purple, red and yellow fruit files. I catted the purple one and I get back grape and plum.

So how would you append to an existing file? Well, not too surprisingly, you use a different type of redirection, where you use a double greater than sign. The output file is expected to exist already, but if it doesn't then it will be created. If it does, then its contents are not erased but appended to.

Now when you redirect stuff in a shell script you will see something like an echo followed by a greater than and the name of the output file, so that's writing the first line to the file, and then you will see later on the subsequent lines being written to it, which are appended using a double greater than sign. So it's a similar sort of idea, but the way it behaves in the shell, whether it be Bash or Bourne shell or pretty much whatever, is somewhat different from the way that Awk works. That's partly because each redirection in the shell involves the file being opened and then closed again when it's
done in this sort of way, whereas in Awk the redirection is being done to an open file, where the file is opened by the first instance of redirection to it and is closed when the script exits. But there's also a close function which will do this, and we'll look at that in a minute.

The next topic is redirecting to another program. This type of redirection uses a pipe symbol, and to the right of the pipe symbol is the command, which is a string, so either a string literal or a variable containing a string. So print, space, items, space, vertical bar, space, command will do the job. There's an example here using the famous fruit data: awk, open single quote, NR greater than one, then in curly brackets print dollar one, space, vertical bar, then in double quotes sort, space, hyphen u, space, vertical bar, space, nl, close double quotes, close curly brackets, close single quote, then the name of the file.

So this time you get a list of the fruit but with a few added extras. The command which is being redirected to is actually a double command, a pipeline in its own right. It starts with a sort command, which uses the option hyphen u, and that causes it to make sure that all the things that it sorts are unique, and then the output from the sort is piped to nl, which is a thing that just numbers lines.

As this script is run, when the first thing is written to this command a sub-process is started up with these two commands in it, and they are sitting waiting for input. The first name is sent to this process, and then it repeats with each successive name, and the sub-process finishes when the script finishes. The way that sort works is that it accumulates all of the stuff that it's fed, and then when the input stops, because of the termination of the stream of data, it will do the sort and carry on; in this case it passes the results of the sort as a bunch of lines to nl. So, in the demonstration of how it works,
you see everything sorted alphabetically and then numbered one through eight.

Now, as I mentioned before, there's a close function in Awk which will close the redirection to a file or to a command. The argument to close needs to be the exact command or file name which defined the redirection; it needs to match completely in every respect. It's worse with a command because you might have added extra spaces at various points. That's why it's a good idea to store the commands or file names in an Awk variable if you need to do an explicit close. Example two shows the variable cmd being used to hold the shell command, and in this case the connection is closed to show how it would be done, though there's no actual need to close the channel here. This is essentially the same script as before except it's a bit more involved. The first rule is a BEGIN rule where it sets up this cmd variable, which is the sort hyphen u vertical bar nl. The second rule is triggered by every line that doesn't have a record number of one, everything greater than one in other words, and it prints dollar one to this command. The final rule is an END rule which closes cmd. So it does exactly the same thing in a more long-winded way, but it proves the point.

So I thought I'd throw in what I refer to as a more real-world example; at least, the setup here is real in my world, and you may wonder why, or not. When I'm preparing an HPR show like this, which involves a number of example scripts, I need to run them for testing purposes, to prove they really work and are not nonsense. I have a main directory for my HPR shows and I work in that directory, and I then have sub-directories per show. I like to make soft links to the examples in the sub-directories so I can run tests without having to hop around between directories.

In general I use the ln command with hyphen s as one of its arguments, which makes a soft link, and I use hyphen f, which forces it to make the
link even if it exists. Normally, if the link already exists, ln won't make it, but sometimes I mess things up and want to overwrite a link with the real thing, and so I use hyphen f to force it. Then the arguments are the path to the actual file I want to make a link to, and then the name of the link. So I use the path of the file relative to where I am, and then use the base name of that to make the link. So if I'm pointing to where example one of awk14 is, that's awk14_ex1.awk, and I'm going to make a link called that.

So I wrote a little Awk script to help me do this. It takes path names as input and constructs shell commands, which it pipes to sh running as a sub-process, and it's here as example three. The script expects to be given one or more path names on standard input. It takes the path and splits it up based on the slash character, and it uses split to do this. split returns the number of elements that it finds, so that number, which I save in a variable called n, will index the last element. We check that this makes sense: if it wasn't a path but consisted of just the file name then there's something a bit silly going on, so we check n, and if it's less than two then there's an error. One of the things I do here is a sort of look ahead, which is to print an error message, 'Error in path' and then dollar zero, the actual path string I've just read in, and I send that to the file named by the string /dev/stderr, standard error. I'm going to talk a little bit more about this later. Then the command next causes this particular input line to be skipped.

Next I build the shell command, and this is partly because I want to demonstrate it being printed, but it's also sometimes convenient to do things this way. The command is called cmd, and it's constructed by using sprintf, which is a formatted print thing except it doesn't write anything out; it simply returns the result as a value which can be stored in a variable. So what I'm doing here is building
the string which will be a shell command. In square brackets inside this string I've got hyphen e, then percent s, which is a substitution point, then close square bracket and double ampersand. That's checking for the existence of the file, which is in case I was daft enough to feed it the path to a file that doesn't exist. The double ampersand is followed by ln, space, hyphen s, space, hyphen f for force, percent s and then percent s, close quotes; there's no newline on the end of this one. The arguments that I feed to this are dollar zero, the path name, dollar zero again, the same path name, and a, square bracket, n. I've stored the results from the split in an array called a, and the nth element is the last one, as I said already. Then I simply print this out preceded by a double chevron, and finally in this rule there's a printf, with the format being percent s backslash n, and I feed cmd into that, piping it to sh. That puts a newline on the end of it, which is necessary for the shell to receive it as a separate command. So it will send the command that I've just printed out for demonstration purposes to the shell, whatever the shell is. The shell I think by default is the Bourne shell, or whatever the operating system provides as the Bourne shell; I think it's dash, for example, on Debian and derivatives, as opposed to Bash. Then the final rule in this script is an END rule which closes the sh command pipeline; not strictly necessary, but it's good practice to do.

I've got an example of how I do run it. My command on the command line is a printf (this is me in Bash, actually) which sends percent s backslash n, in double quotes, to a pipeline, and the argument to that printf is the path to the examples. I won't read this path out; basically the last bit of it consists of awk14_ex?.awk, so what that will do is what they call filename expansion: it will return all of the matching file
names. There'll be three, because there are three examples for this particular show, and it returns them to printf. The way that printf works is that it will just keep printing the arguments that you give it; the format only caters for one in this particular case. This is the way that the Bash version of printf works anyway: if you give it more arguments than the format uses, it will just keep repeating that format over and over again. So the result is that it prints out the strings that are returned from the expansion one after the other, each with a newline on the end, and these are piped to an invocation of this particular script, ./awk14_ex3.awk. What we get back is three commands, which are the tests to see that the files exist. It's a bit superfluous because they do, otherwise they wouldn't have expanded, but it depends how you create the paths in the first place. So it checks the file exists, and if it does exist then it runs an ln command to make the link. That's all; it shows the three instances.

So this is actually really useful. I did it as a demo here and then realised that actually this is much more elegant than the way I'd been doing things before. It probably needs to be a bit more foolproof; it needs to have more error checks in it and stuff. In particular, when you're using an Awk script to generate commands to send to your shell, there are potential pitfalls with quotes, because shells are often fussy about quotes and you're doing it in another language which is also fussy about quotes. There's a particular section in the GNU Awk manual about this, section 10.2.9, which I've referenced here. But it's a useful thing to be able to do. You might prefer to just write the whole of such a thing in a Bash script, but it's entirely up to you.

The final bit of redirection I want to talk about is redirecting to a co-process. This uses a pipe symbol and an ampersand to send output to a string containing a command or commands for the shell.
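As a minimal sketch of that pipe-and-ampersand syntax (this is not taken from the show notes): the snippet below assumes GNU Awk is installed under the name gawk, and the choice of sort as the co-process and the variable names cmd and line are purely illustrative. Reading the reply back relies on getline, which is the subject of the next episode, so treat that part as a preview.

```shell
# Sketch only: "|&" needs GNU Awk, so skip politely if gawk is not on the PATH.
if command -v gawk >/dev/null 2>&1; then
    out=$(gawk 'BEGIN {
        cmd = "sort"                       # co-process command (illustrative choice)
        print "pear"  |& cmd               # send two lines to the co-process
        print "apple" |& cmd
        close(cmd, "to")                   # close the sending side so sort sees end of input
        while ((cmd |& getline line) > 0)  # read the sorted lines back
            print "got: " line
        close(cmd)                         # shut the co-process down completely
    }')
else
    out="gawk not found"
fi
printf '%s\n' "$out"
```

The command is held in a variable so that the same string is used for every print, getline and close, which matters because close has to be given exactly the string that opened the connection.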
Now this is a GNU Awk extension and quite advanced. Unlike the previous redirection, which sends to a program, this form sends to a program and allows the program's output to be read back. So it's a two-directional connection to a running process, and that's the definition of this thing called a co-process. By the time this show comes out you should have heard clacke talk about co-processes in Bash, so hopefully it will make a bit more sense as a consequence. We're going to talk a bit more about co-processes in the next of this pair of shows, because it really makes sense to talk about it in the context of getline, which is the way of reading stuff back again. So I'm not going to say anything more about it, but the basic idea is you would do print, space, items, space, vertical bar ampersand, space, and then some command that you send output to and receive stuff back from.

The final point then is redirecting to special files. As you know, within Unix there are three standard channels called standard input, standard output and standard error (standard error output is the other way of expressing it). These are connected to the keyboard for standard input and the screen for standard output, and standard error usually goes to the screen too. So normally a Unix program or script reads from standard input, writes to standard output, and generates any error messages on standard error. There's a lot more to be said about this, and I think I'm going to go into it in a bit more detail in the Bash Tips series a bit later. But the way GNU Awk deals with these three special channels is that it has special file names, which are /dev/stdin, which is standard input, /dev/stdout, which is standard output, and /dev/stderr, which is standard error output. I used one in an earlier script, but you might want to send a message like this: print, space, in double quotes invalid number, greater than, and then in double quotes /dev/stderr, and
that will send the message to standard error. So if you're running a script that did this, then you can use your shell's redirection capability to send the standard error stuff somewhere special, log it or something, or do something clever if you wish. Otherwise it will just look like output from the script. There's a lot more to be said about this and I haven't gone into detail; see section 5.7 in the GNU Awk guide. There are other special names that you can use, see section 5.8 or thereabouts, but I'm not going to go into more depth.

So I'll be continuing with the second half of this episode, which is pretty much written, and I'll be doing that fairly soon; in the next few weeks is the plan. So all right, that's it then. Bye now.

You've been listening to Hacker Public Radio at hackerpublicradio.org. We are a community podcast network that releases shows every weekday, Monday through Friday. Today's show, like all our shows, was contributed by an HPR listener like yourself. If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is. Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club, and it's part of the binary revolution at binrev.com. If you have comments on today's show, please email the host directly, leave a comment on the website or record a follow-up episode yourself. Unless otherwise stated, today's show is released under the Creative Commons Attribution-ShareAlike 3.0 license.
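The output redirection forms covered in this episode can be pulled together in one small sketch. It is not the code from the show notes: the file names (fruit.txt, the demo_..._fruit files, errors.log), the sample rows and the "no color given" check are hypothetical stand-ins, and it assumes an awk that understands /dev/stderr, as GNU Awk does.

```shell
# Work in a scratch directory so nothing real gets overwritten.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Hypothetical stand-in for the fruit data: a header line, then name and color.
printf '%s\n' 'name color' 'apple red' 'grape purple' 'plum purple' 'apple red' 'kiwi' > fruit.txt

# 1. Redirect to files whose names are built per line; each file is emptied on
#    its first write in this run and appended to after that.
#    Lines with no color are reported on standard error instead.
awk 'NR > 1 {
    if (NF < 2) { print "no color given: " $0 > "/dev/stderr"; next }
    fname = "demo_" $2 "_fruit"
    print $1 > fname
}' fruit.txt 2> errors.log

# 2. Append to a file that already exists using ">>".
awk 'BEGIN { print "damson" >> "demo_purple_fruit" }'

# 3. Pipe to a command held in a variable, then close it explicitly.
numbered=$(awk 'BEGIN { cmd = "sort -u | nl" }
                NR > 1 { print $1 | cmd }
                END { close(cmd) }' fruit.txt)

cat demo_purple_fruit   # grape, plum, damson
cat errors.log          # no color given: kiwi
printf '%s\n' "$numbered"
```

Because the error message goes to /dev/stderr, the shell's 2> redirection can log it separately from the normal output, which is the use described for standard error above.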