Episode: 768
Title: HPR0768: Sort
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr0768/hpr0768.mp3
Transcribed: 2025-10-08 02:07:02
---
Today on Hacker Public Radio: the sort command.
Hi everybody, my name is Ken Fallon and welcome to Hacker Public Radio.
On today's episode, we're going to be going through the sort command.
It's one of the most common commands used on the command line on Unix systems.
It's probably guaranteed to be on every system.
And I want to bring you through some of the options that are available in the GNU coreutils 8.5 version, which is on my system here.
If you want to read along, you can open a bash prompt and type in man, space, sort.
The usage is sort, then some options, and then the file name.
First of all, what it does is write a sorted concatenation of all the files to standard output.
What that means is if you have a text file, for instance, with IP addresses and you want to sort it, you could just type sort, space, ipaddress.txt and it will sort all those lines for you.
Some of the options that you can use are -b, which ignores leading blanks in the lines.
-d or --dictionary-order considers only blanks and alphanumeric characters.
-f or --ignore-case will fold lowercase to uppercase characters.
So if you have a combination of upper and lowercase and you're not that worried about the case, say you want to sort by street address, then that will be an interesting one to use.
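As a small illustration of the case folding just described, here is a sketch with made-up sample words. LC_ALL=C is set only so that the byte-order behaviour is the same on every machine; that setting is an assumption of this example, not something from the episode.

```shell
# Hypothetical sample data, one word per line.
words=$(printf '%s\n' banana apple Cherry)

# Plain byte-order sort: uppercase letters sort before lowercase ones.
plain=$(printf '%s\n' "$words" | LC_ALL=C sort)

# With -f, lowercase is folded to uppercase before comparing.
folded=$(printf '%s\n' "$words" | LC_ALL=C sort -f)

printf 'plain:\n%s\n\nfolded:\n%s\n' "$plain" "$folded"
```

Without -f, Cherry jumps to the top because of its capital letter; with -f it lands after banana.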
-g or --general-numeric-sort will compare according to general numerical value.
-i or --ignore-nonprinting will consider only printable characters.
-M, that's uppercase M, or --month-sort will compare months: JAN, FEB, MAR, right through to DEC.
-h or --human-numeric-sort will compare human-readable numbers, so, for example, 2K or 1G.
-n or --numeric-sort will compare according to string numerical value.
-R, uppercase R, or --random-sort will sort by a random hash of keys.
And the option --random-source=FILE will get random bytes from a file. I have no idea why you would use that, but there it is.
-r or the long form --reverse will reverse the results of the comparisons.
You also have the long form --sort=WORD, where the word can be general-numeric, which is a shortcut for -g; human-numeric, which is a shortcut for -h; month, which is a shortcut for -M; numeric, which is a shortcut for -n; random, which is a shortcut for -R; and version, which is a shortcut for -V.
And -V is a natural sort of version numbers within text, which is quite handy.
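To make the difference between the comparison modes concrete, here is a minimal sketch. The sample values and the pkg- names are invented for illustration, and -h and -V are GNU extensions, so this assumes GNU sort.

```shell
# Plain sort compares text, so "10" and "100" come before "9".
text_order=$(printf '%s\n' 10 9 100 | LC_ALL=C sort)

# -n compares by numeric value.
numeric_order=$(printf '%s\n' 10 9 100 | LC_ALL=C sort -n)

# -h understands size suffixes such as K, M and G.
human_order=$(printf '%s\n' 1G 2K 512M | LC_ALL=C sort -h)

# -V sorts version numbers embedded in text in their natural order.
version_order=$(printf '%s\n' pkg-1.10 pkg-1.9 pkg-1.2 | LC_ALL=C sort -V)

printf '%s\n\n' "$text_order" "$numeric_order" "$human_order" "$version_order"
```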
Other options include --batch-size=NMERGE, which will merge at most NMERGE inputs at once; for more, use temp files. I have no idea what that means.
However, -c, long form --check, and also --check=diagnose-first, will check for sorted input and not sort, which is, I guess, more efficient.
-C, capital C, or --check=quiet or --check=silent is like -c but does not report the first bad line.
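One way the check options might be used in a script is through the exit status. This is a sketch with invented sample data, not a command from the episode.

```shell
# Already sorted input: sort -c prints nothing and exits with status 0.
if printf '%s\n' apple banana cherry | LC_ALL=C sort -c; then
  first_check=sorted
else
  first_check=unsorted
fi

# Out-of-order input: -C exits non-zero without printing the first bad line
# (with -c the same input would also produce a message on standard error).
if printf '%s\n' banana apple cherry | LC_ALL=C sort -C; then
  second_check=sorted
else
  second_check=unsorted
fi

echo "$first_check $second_check"
```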
The long form --compress-program=PROG will compress temporaries with the program, whichever one you choose, and decompress them with PROG -d.
--files0-from=F will read input from the files specified by NUL-terminated names in file F.
If F is - then it will read the names from standard input.
Then we have -k, long form --key=POS1 with an optional comma POS2, which will start a key at position one, with an origin of one, and end it at position two; the default is the end of the line.
I'm coming back to this in a minute, because this one is the reason that I'm recording this for you. However, I'll continue reading the page.
The next option is -m or long form --merge, which will merge already sorted files and not sort.
-o or --output=FILE will write the result to the file instead of standard output.
-s or --stable will stabilize the sort by disabling the last-resort comparison. No idea what that means.
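For readers curious about the last-resort comparison: when two lines have equal keys, GNU sort falls back to comparing the whole lines, and -s switches that fallback off so that lines with equal keys keep their input order. A minimal sketch with invented data:

```shell
# Hypothetical input: a letter and a number per line.
rows=$(printf '%s\n' 'b 2' 'a 2' 'c 1')

# Sorting on the number only: the two lines with key 2 tie, so the
# last-resort comparison of the whole line puts "a 2" before "b 2".
default_order=$(printf '%s\n' "$rows" | LC_ALL=C sort -k2,2n)

# With -s the tie is left alone and the input order ("b 2", then "a 2") is kept.
stable_order=$(printf '%s\n' "$rows" | LC_ALL=C sort -s -k2,2n)

printf 'default:\n%s\n\nstable:\n%s\n' "$default_order" "$stable_order"
```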
-S, capital S, or long form --buffer-size=SIZE will use the size for the main memory buffer.
-t or long form --field-separator=SEP will use the separator instead of non-blank to blank transitions.
Then we have -T, capital T, or --temporary-directory=DIR, which will use the directory for temporaries, not $TMPDIR or /tmp; multiple options specify multiple directories.
We have -u or long form --unique: with -c, check for strict ordering; without -c, output only the first of an equal run.
I'll be coming back to this in a moment as well. And -z or --zero-terminated will end lines with a zero byte, not a newline.
Then we have the default --help, which will display the help and exit, and --version, which will output the version information.
Now, the reason I wanted to talk to you about this is that sort is a very useful command and lots of people use it. So say you have a log file of IP addresses and you want to get the unique IP addresses.
You would cat ipaddress.txt and use the pipe character, which is a vertical line going from top to bottom, usually above the Enter key on most keyboards.
And you would pipe that to sort, which will group all the identical IP addresses together.
And then you can pipe that again into uniq, U-N-I-Q, which is another command, which will give you the first of each run of identical lines, skip the repeats, and then give you the next one.
The thing, though, is if you had a series of names like Klaatu, Finux, Code Cruncher, Code Cruncher, Klaatu and you ran uniq on that without sorting it, you'd still have Klaatu, Finux, Code Cruncher and then you'd have Klaatu again. But with the sort command you would have all the Klaatus together, all the Finuxes together and all the Code Crunchers together, to give you a list of HPR hosts who happen to be on IRC.
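The point about uniq only collapsing adjacent repeats can be sketched as follows. The addresses are invented, taken from the 192.0.2.0/24 documentation range, and stand in for the log file mentioned in the episode.

```shell
# Hypothetical log of IP addresses with one address repeated non-adjacently.
log=$(printf '%s\n' 192.0.2.10 192.0.2.10 192.0.2.7 192.0.2.10)

# uniq on its own only drops repeats that sit next to each other,
# so 192.0.2.10 shows up twice.
uniq_only=$(printf '%s\n' "$log" | uniq)

# Sorting first groups identical lines together, so uniq can remove them all.
sorted_uniq=$(printf '%s\n' "$log" | LC_ALL=C sort | uniq)

printf 'uniq only:\n%s\n\nsort | uniq:\n%s\n' "$uniq_only" "$sorted_uniq"
```

Note that the plain sort here is a text comparison, which is why 192.0.2.10 comes before 192.0.2.7.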
So there you are. However, I'm going to go back to what is really, really cool, and that is the -k option and the -u option.
So what that will allow you to do: in my case I've got a text file which has the name of the camera model that took the picture and the paths to all the pictures that are on my hard disk.
So there are images from something like 20 or 30 different cameras, 20 or 30 different sources. What I want to do is take one sample image from each of those cameras so I can test another program that I'm working on.
I have a large file with a list of all those images, but they're all mixed up together, so I have a file with Cyber-shot, a file with MX-12000, and so forth and so on.
So what I want to do is just take one file from each of these, and my file is: camera model, path to file.
I can do that by simply going sort --key=1,1, which means sort the file based on the first field and ignore the second field.
So say, for example, Cyber-shot picture one, Cyber-shot picture two, Cyber-shot picture three: it'll just sort on the first column and ignore the second column.
But what's kind of cool is if you tag -u onto the end of that, then it will pick just the first picture from each camera.
So Cyber-shot picture one, camera two picture one, camera three picture one, giving you a nice unique text file.
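The key-plus-unique trick described above might look like this in practice. This is a sketch: the model names and paths are invented stand-ins for the real listing, and it assumes the camera model contains no spaces so that it is exactly the first field.

```shell
# Hypothetical listing: camera model, a space, then the path to the picture.
listing=$(printf '%s\n' \
  'CyberShot /photos/holiday/0001.jpg' \
  'MX-12000 /photos/garden/0002.jpg' \
  'CyberShot /photos/holiday/0003.jpg' \
  'MX-12000 /photos/garden/0004.jpg')

# --key=1,1 limits the comparison to the first field; -u then keeps one line
# per distinct key, so each camera model appears exactly once.
samples=$(printf '%s\n' "$listing" | LC_ALL=C sort --key=1,1 -u)

printf '%s\n' "$samples"
```

The result is one line per camera model, each with a path to a sample picture.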
Okay, with that I'll end the show. I hope you found this rundown of the sort command useful, and if you feel like sponsoring a man page, feel free to do that and give us some examples of what you might use it for.
One other one, just looking through the file here, that I do use quite a lot is -t, which specifies the field separator.
So if you're using comma-separated files you can use -t, or --field-separator= and then the comma, and you will then be able to tell sort: I'm using a comma as a field separator.
Instead of white spaces.
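A short sketch of the comma separator in use, with an invented three-line CSV. The -k2,2n part is an addition for the example, so that the separator visibly matters.

```shell
# Hypothetical comma-separated data: name,number.
csv=$(printf '%s\n' 'alice,30' 'bob,25' 'carol,27')

# -t ',' tells sort that fields are split on commas; -k2,2n then sorts
# numerically on the second field only.
by_number=$(printf '%s\n' "$csv" | LC_ALL=C sort -t ',' -k2,2n)

printf '%s\n' "$by_number"
```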
Okay, with that I'll thank you very much for listening. I'd also again ask everybody if they wouldn't mind sending us in a show; I'd appreciate it.
Thank you very much and see you all on the other side.
Thank you for listening to HPR, sponsored by caro.net, so head on over to C-A-R-O dot N-E-T for all of us.
Thank you.