Episode: 2804
Title: HPR2804: Awk Part 13: Fix-Width Field Processing
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2804/hpr2804.mp3
Transcribed: 2025-10-19 17:01:59
---
This is HPR episode 2,804 entitled Awk Part 13: Fix-Width Field Processing, and is part
of the series Learning Awk. It is hosted by b-yeezi, is about 6 minutes long, and carries
an explicit flag.
The summary is: in this episode I discuss how to deal with fixed-width field text files using
awk.
This episode of HPR is brought to you by archive.org.
Support universal access to all knowledge by heading over to archive.org forward slash donate.
Hello Hacker Public Radio fans, this is b-yeezi once again, coming at you with another
episode for the awk series.
This time focusing on fixed field width, or, how would you say it, fixed-width fields and
text processing.
So what is a fixed-width field?
It is a field where, instead of using a delimiter such as a comma or a pipe or a colon
to delimit the different fields in a line or a record,
you declare ahead of time how wide a field is allowed to
be, and then fill in any space between the end of the value and the beginning of the next
field with white space, or in this case just spaces.
The advantage of this is that it is easily human readable, because it looks just like a table
that was output into a text file.
This is going out of fashion nowadays as we try to make more data formats machine readable.
But if you work in health care like I do, or in some other industries where there's still
a lot of legacy hardware, a lot of times if you ask for a file output from
that hardware it might give you a fixed-width formatted data structure.
So how do we process that in awk?
Well, actually it's really simple, and this is going to be a really short episode because
of that.
You do it in the BEGIN statement of the awk program. We've discussed this in the earlier
episodes, if you don't remember, but you can use a BEGIN statement before you go into the
middle part, and then you can do an END at the end.
In the BEGIN statement just use the phrase FIELDWIDTHS, all in capital letters.
So F-I-E-L-D-W-I-D-T-H-S, all one word, capitals: FIELDWIDTHS equals, and inside of double
quotes and space delimited, the width of every field.
So I have an example here where I have three fields per record and the field widths are
20, 10, and 12, so it'll say FIELDWIDTHS equals and, inside of double quotes, 20, 10, and
12 separated by spaces.
And then after that you just process the file just like you would any other awk file where
you have any other type of delimiter that you've specified at the beginning.
Now that makes it really easy, but one thing that happens, because this is such a
simple file format, is that when you say, for instance, that the first column is going to
have 20 characters, it's going to allocate 20 characters for that field for every single
record, and any position that is not taken up by a non-whitespace character is going to be
filled in with a space. So if you try to use it in an expression where you do any analysis
on it, you're going to have a whole bunch of spaces at the end.
Now most of the time, when you're just looking at numbers, that's
okay, because inside of that file format they'll already keep a number of digits to hold
that white space and to fill the entire width of the column. But sometimes, especially when
you're dealing with textual data, and you don't have
an int format, you have to do a little bit of preprocessing before you can use it in
the downstream processes.
By preprocessing I just really mean stripping the white space at the end.
So in my example I have a BEGIN statement with my FIELDWIDTHS phrase
in it, and then in the body I say NR is greater than 1, which means don't include the header.
I define name for column 1, state for column 2, phone for column 3, and then I use the sub
command, which I went over in my string manipulation episode a while back.
I use the sub function to substitute out any number of spaces, zero or more,
that come right before the end of the line, and I replace them with nothing, with just the
empty string.
And so what it looks like is sub, and then the regular expression inside of forward slashes,
comma, and then the empty string with double quote double quote, comma, and then name,
which is the first variable name. And then I do the same thing with state and the same with
phone, and then do a printf statement that says: blank lives in blank, period, the phone number
is blank, period, and a newline character. And then I fill in those blanks.
When I'm saying blank I really mean percent s, that is the placeholder
in a printf statement, and then I'm filling those placeholders in with name, state, and phone
number. So when you read it out for the first line of the example you would see: John Smith
lives in Washington, period, the phone number is 418 311 4111.
And that's pretty much it. That's how you
manipulate a file of fixed-width records to use with awk. So with no further ado
I bid you farewell, and keep hacking.
You've been listening to Hacker Public Radio at HackerPublicRadio.org.
We are a community podcast network that releases shows every weekday, Monday through Friday.
Today's show, like all our shows, was contributed by an HPR listener like yourself.
If you ever thought of recording a podcast, then click on our contribute link to find out how easy
it really is. Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon
Computer Club, and is part of the binary revolution at binrev.com. If you have comments on today's
show, please email the host directly, leave a comment on the website, or record a follow-up episode
yourself. Unless otherwise stated, today's show is released under the Creative Commons
Attribution-ShareAlike 3.0 license.