Episode: 2804
Title: HPR2804: Awk Part 13: Fix-Width Field Processing
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2804/hpr2804.mp3
Transcribed: 2025-10-19 17:01:59

---

This is HPR episode 2,804, entitled Awk Part 13: Fix-Width Field Processing, and is part of the series Learning Awk. It is hosted by b-yeezi, is about 6 minutes long, and carries an explicit flag. The summary is: In this episode, I discuss how to deal with fixed-width field text files using awk.

This episode of HPR is brought to you by archive.org. Support universal access to all knowledge by heading over to archive.org/donate.

Hello, Hacker Public Radio fans, this is b-yeezi once again, coming at you with another episode for the awk series. This time focusing on fixed field width, or, how would you say it, fixed-width fields and text processing.

So what is a fixed-width field? It is a field where, instead of using a delimiter such as a comma or a pipe or a colon to delimit the different fields in a line or a record, you declare ahead of time how wide a field is allowed to be, and then fill in any space between the end of the value and the beginning of the next field with white space, or in this case just spaces. The advantage of this is that it is easily human readable, because it looks just like a table output in a text file. This is going out of fashion nowadays as we try to make more data formats machine readable. But if you work in health care like I do, or in some other industries where there's still a lot of legacy hardware, a lot of times, if you ask for a file output from that hardware, it might give you a fixed-width formatted data structure.

So how do we process that in awk? Well, actually, it's really simple, and this is going to be a really short episode because of that.
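To make that layout concrete, here is a minimal sketch that writes a small fixed-width table with awk's printf. It assumes the three field widths used later in this episode (20, 10 and 12 characters). The function name make_fixed_width and the second data row are made up for illustration.

```shell
# Write a tiny fixed-width table: every record is 20 + 10 + 12 = 42 characters.
# make_fixed_width is a hypothetical helper name, not something from the episode.
make_fixed_width() {
  awk 'BEGIN {
    # %-20s left-justifies the value and pads it with spaces to 20 characters
    fmt = "%-20s%-10s%-12s\n"
    printf fmt, "name", "state", "phone"
    printf fmt, "John Smith", "Washington", "418 311 4111"
    printf fmt, "Jane Doe", "Oregon", "503 555 0100"
  }'
}

make_fixed_width
```

Note that "Washington" fills its 10-character column completely, so it runs straight into the phone number with no space in between. That is why splitting a file like this on white space does not work, and why the widths have to be declared instead.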
In the BEGIN statement of an awk program (and we've discussed this in the earlier episodes, if you don't remember: you can use a BEGIN statement before you go into the middle part, and then you can do an END at the end), just use the variable FIELDWIDTHS, all in capital letters. So F-I-E-L-D-W-I-D-T-H-S, all one word, capital, FIELDWIDTHS equals, and inside of double quotes, space delimited, the width of every field. So I have an example here where I have three fields per record and the field widths are 20, 10, and 12, so it'll say FIELDWIDTHS equals and, inside of double quotes, 20, 10, and 12 separated by spaces. And then after that you just process the file just like you would any other awk file where you have any other type of delimiter that you've specified at the beginning.

Now that makes it really easy, but one thing that happens, because fixed width is such a simple file format, is that when you say, for instance, that the first column is going to have 20 characters, it's going to allocate 20 characters for that field for every single record, and any position that is not taken up by a non-white-space character is going to be filled in with a space. So if you try to use it in an expression where you do any analysis on it, you're going to have a whole bunch of spaces at the end. Now most of the time, when you're just looking at numbers, that's okay, because inside of that file format they'll already keep a number of digits to take up that white space and fill the entire width of the column. But sometimes, especially when you're dealing with textual data, or if you don't have an int format, you have to do a little bit of preprocessing before you can use it in downstream processes. By preprocessing, I really just mean stripping the white space at the end.
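Here is a minimal sketch of those two points: setting FIELDWIDTHS in the BEGIN block, and seeing (then stripping) the padding that comes along with each field. FIELDWIDTHS is a GNU awk extension, so the sketch calls gawk explicitly. The helper names sample and strip_demo and the data row are assumptions made for illustration.

```shell
# Two records laid out as 20 + 10 + 12 characters (made-up sample data).
sample() {
  printf '%-20s%-10s%-12s\n' "name" "state" "phone"
  printf '%-20s%-10s%-12s\n' "Jane Doe" "Oregon" "503 555 0100"
}

# strip_demo is a hypothetical helper that shows a field before and after stripping.
strip_demo() {
  sample | gawk '
    BEGIN { FIELDWIDTHS = "20 10 12" }   # split each record by width, not by delimiter
    NR > 1 {                              # skip the header record
      name = $1
      printf "before: [%s] (%d chars)\n", name, length(name)
      sub(/ *$/, "", name)                # remove the spaces before the end of the string
      printf "after:  [%s] (%d chars)\n", name, length(name)
    }'
}

strip_demo
```

The brackets in the output make the padding visible: the name field arrives as 20 characters and shrinks to 8 once the trailing spaces are removed.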
So in my example I have a BEGIN statement, which has my FIELDWIDTHS phrase in it, and then in the body I say NR is greater than 1, which means don't include the header. I define name for column 1, state for column 2, phone for column 3, and then I use the sub command, which I went over in my string manipulation episode a while back. I use the sub function to substitute out any amount of spaces, zero or more, that come before the end of the line, and I replace them with nothing, just an empty string. And so what it looks like is sub, and then the regular expression inside of forward slashes, comma, and then the empty string with double quote double quote, comma, and then name, which is the first variable name. And then I do the same thing with state and the same with phone, and then do a printf statement that says: blank lives in blank, period; the phone number is blank, period; and a newline character. And then I fill in those blanks. When I'm saying blank, I really mean percent s, that is the placeholder in a printf statement, and then I'm filling those placeholders in with name, state, and phone number. So when you read it out for the first line of the example, you would see: John Smith lives in Washington, period, phone number is 418 311 4111.

And that's pretty much it. That's how you manipulate a file of fixed-width records to use with awk. So with no further ado, I bid you farewell, and keep hacking.

You've been listening to Hacker Public Radio at HackerPublicRadio.org. We are a community podcast network that releases shows every weekday, Monday through Friday. Today's show, like all our shows, was contributed by an HPR listener like yourself. If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is. Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club, and is part of the binary revolution at binrev.com. If you have comments on
today's show, please email the host directly, leave a comment on the website, or record a follow-up episode. Unless otherwise stated, today's show is released under the Creative Commons Attribution-ShareAlike 3.0 license.