Episode: 2804
Title: HPR2804: Awk Part 13: Fix-Width Field Processing
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2804/hpr2804.mp3
Transcribed: 2025-10-19 17:01:59

---

This is HPR episode 2,804, entitled "Awk Part 13: Fix-Width Field Processing", and is part
of the series Learning Awk. It is hosted by b-yeezi, is about 6 minutes long, and carries
an explicit flag.
The summary is: In this episode I discuss how to deal with fixed-width field text files using
Awk.
This episode of HPR is brought to you by archive.org.
Support universal access to all knowledge by heading over to archive.org/donate.
Hello Hacker Public Radio fans, this is b-yeezi once again, coming at you with another
episode for the Awk series.
This time focusing on fixed field width, or, how would you say it, fixed-width fields and
text processing.
So what is a fixed-width field?
It is a field where, instead of using a delimiter such as a comma or a pipe or a colon
to delimit the different fields in a line or a record,
you declare ahead of time how wide a field is allowed to be,
and then fill in any space between that and the beginning of the next field with white space,
or in this case just spaces.
The advantage of this is that it is easily human readable, because it looks just like a table
that was output to a text file.
This is going out of fashion nowadays as we try to make more data formats machine readable.
But if you work in health care like I do, or in some other industries where there's still
a lot of legacy hardware, a lot of times, if you ask for a file output from
that hardware, it might give you a fixed-width formatted data structure.
So how do we process that in Awk?
Well, actually it's really simple, and this is going to be a really short episode because
of that.
As we've discussed in earlier episodes, if you don't remember, an Awk program can have a BEGIN
statement before you go into the middle part, and then you can do an END at the end.
In the BEGIN statement, just use the variable FIELDWIDTHS, all in capital letters.
So F-I-E-L-D-W-I-D-T-H-S, all one word, capitals: FIELDWIDTHS equals, and inside of double
quotes and space delimited, the width of every field.
So I have an example here where I have three fields per record and the field widths are
20, 10, and 12, so it'll say FIELDWIDTHS equals and, inside of double quotes, 20, 10, and
12 separated by spaces.
And then after that you just process the file just like you would any other Awk input where
you have any other type of delimiter that you've specified at the beginning.
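To make the splitting rule concrete, here is a minimal Python sketch of what a setting like `FIELDWIDTHS = "20 10 12"` amounts to: each record is cut at fixed character offsets rather than at a delimiter. This is an illustration in Python, not awk itself. The helper name `split_fixed_width` and the padded sample record are assumptions, built from the widths and the John Smith example mentioned in the episode.

```python
def split_fixed_width(line, widths):
    """Cut `line` into consecutive fields of the given character widths."""
    fields = []
    start = 0
    for width in widths:
        fields.append(line[start:start + width])
        start += width
    return fields


# Hypothetical sample record, padded with spaces to widths 20, 10 and 12.
record = "John Smith".ljust(20) + "Washington".ljust(10) + "418 311 4111".ljust(12)

# Same widths as in the episode: 20, 10 and 12.
name, state, phone = split_fixed_width(record, [20, 10, 12])
print(repr(name))   # the name keeps its trailing padding
print(repr(state))
print(repr(phone))
```

Note that the name field comes back 20 characters long, padding included, which is the issue the next part of the episode deals with.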
Now that makes it really easy, but one thing that happens, because it is such a
simple file format, is that when you say that, for instance, the first column is going to
have 20 characters, it's going to allocate 20 characters for that field for every single
record, and any position that is not taken up by a non-white-space character is going to be
filled in with a space. So if you try to use it in an expression where you try to do any
analysis on it, you're going to have a whole bunch of spaces at the end.
Now, most of the time, when you're just looking at numbers, that's
okay, because inside of that file format they'll already keep a number of digits to hold
that white space and to fill the entire width of the column. But sometimes, especially when
you're dealing with textual data, or if you don't have
an int format, you have to do a little bit of preprocessing before you can use it in
the downstream processes.
What I mean by preprocessing, I just really mean stripping the white space at the end.
So in my example I have a BEGIN statement with my FIELDWIDTHS phrase
in it, and then in the body I say NR is greater than 1, which means don't include the header.
I define name for column 1, state for column 2, phone for column 3, and then I use the sub
command, which I went over in my string manipulation episode a while back.
I use the sub function to substitute out any run of zero or more spaces
that are before the end of the line, and I replace it with nothing, with just an empty string.
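The substitution described here, replacing zero or more spaces before the end of the string with an empty string, can be sketched in Python with `re.sub`. The regular expression ` *$` is reconstructed from the spoken description, so treat it as an assumption rather than the exact pattern from the show notes. `strip_trailing_spaces` is a hypothetical helper name.

```python
import re


def strip_trailing_spaces(field):
    """Remove the run of spaces (possibly empty) at the end of a field."""
    # count=1 mirrors awk's sub(), which replaces only the first match.
    return re.sub(r" *$", "", field, count=1)


padded_name = "John Smith" + " " * 10   # a 20-character fixed-width field

print(padded_name == "John Smith")                         # False
print(strip_trailing_spaces(padded_name) == "John Smith")  # True
```

The comparison is the point of the preprocessing step: until the padding is removed, a fixed-width text field does not compare equal to the bare value.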
And so what it looks like is sub, and then the regular expression inside of forward slashes,
comma, and then the empty string with double quote double quote, comma, and then name,
which is the first variable name. And then I do the same thing with state and the same with
phone, and then do a printf statement that says blank lives in blank, period, the phone number
is blank, period, and a newline character. And then I fill in those blanks, and
when I'm saying blank I really mean percent s, that is the placeholder
in a printf statement, and then I'm filling those placeholders in with name, state and phone
number. So when you read it out for the first line of the example you would see: John Smith
lives in Washington. The phone number is 418 311 4111. And that's pretty much it. That's how
you manipulate a fixed-width record to use with awk. And that's it. So with no further ado,
I bid you farewell, and keep hacking.
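Putting the pieces of the walkthrough together, the following Python sketch mirrors the described flow: skip the header record, cut each line into three fields of widths 20, 10 and 12, strip the trailing spaces, and fill a format string with three `%s` placeholders. It is a Python rendering under assumptions, not the awk script from the show notes. The header text and the function name `describe_records` are hypothetical; only the John Smith record and the sentence template come from the episode.

```python
WIDTHS = (20, 10, 12)  # field widths quoted in the episode


def describe_records(lines, widths=WIDTHS):
    """Return one sentence per data record, skipping the header line."""
    sentences = []
    for record_number, line in enumerate(lines, start=1):
        if record_number == 1:
            continue  # same effect as the NR > 1 condition
        fields = []
        start = 0
        for width in widths:
            # rstrip(" ") drops only trailing spaces, like the sub() calls
            fields.append(line[start:start + width].rstrip(" "))
            start += width
        name, state, phone = fields
        sentences.append(
            "%s lives in %s. The phone number is %s." % (name, state, phone)
        )
    return sentences


# Hypothetical input: a header row plus the record read out in the episode.
sample = [
    "Name".ljust(20) + "State".ljust(10) + "Phone".ljust(12),
    "John Smith".ljust(20) + "Washington".ljust(10) + "418 311 4111".ljust(12),
]

for sentence in describe_records(sample):
    print(sentence)
# prints: John Smith lives in Washington. The phone number is 418 311 4111.
```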
You've been listening to Hacker Public Radio at HackerPublicRadio.org.
We are a community podcast network that releases shows every weekday, Monday through Friday.
Today's show, like all our shows, was contributed by an HPR listener like yourself.
If you ever thought of recording a podcast, then click on our contribute link to find out how
easy it really is. Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon
Computer Club, and is part of the binary revolution at binrev.com. If you have comments on
today's show, please email the host directly, leave a comment on the website or record a
follow-up episode yourself.
Unless otherwise stated, today's show is released under a Creative Commons
Attribution-ShareAlike 3.0 license.