Initial commit: HPR Knowledge Base MCP Server
- MCP server with stdio transport for local use
- Search episodes, transcripts, hosts, and series
- 4,511 episodes with metadata and transcripts
- Data loader with in-memory JSON storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
127
hpr_transcripts/hpr1172.txt
Normal file
@@ -0,0 +1,127 @@
Episode: 1172
Title: HPR1172: LiTS 022: Sort
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr1172/hpr1172.mp3
Transcribed: 2025-10-17 20:56:41

---

Welcome to Linux in the Shell episode 22. My name is Dann Washko, I'll be your host
today, and I would like to thank Hacker Public Radio for hosting the website and these
audio files. Please consider contributing to Hacker Public Radio by doing your own
episode or, at the very least, listening to the fantastic shows that are offered every
weekday. Episode 22 of Linux in the Shell is going to talk about the sort command. If you
have not gone over to the website, linuxintheshell.org, to read up on this entry, I suggest
you do so to get a full understanding of the sort command either before or soon after you
listen to the audio component. The sort command is an extremely handy utility and it does basically
what it says. It sorts input from standard in or from a file that you provide, and it
sorts it in many different ways. I have used sort a lot of times in the past in many different
projects that I've had to do. One that comes to mind is when I used to manage a lot of user
accounts or user names. I'd get lists of data that I had to put into a format that I
could easily put into a database or whatever. I'd get lists in spreadsheets, and pulling this
information out, sorting it, and using it in conjunction with the likes of uniq and cut really saved
time on manipulating that data into the format that I needed. Sort by itself does just a standard
alphanumeric sorting of whatever data you throw at it, line by line. So if you had a list,
like a shopping list for instance, and you ran sort on that, sort shopping list, what that would
do is essentially put it in alphanumeric order. The way that it orders stuff by default
is that symbols have the highest priority in the hierarchy. So anything that starts with a
symbol, like a pound sign, a plus, a minus, any kind of symbol, takes precedence, followed by
numbers and then letters, with capital letters taking precedence, that is, uppercase letters
taking precedence over lowercase letters. So if you had a list that was like #fish,
2 pounds sinker, hooks, and bobbers, and you sorted that list, the first thing would be the #fish,
then 2 pounds sinker, and then it would change the order of hooks, which would come after bobbers. So it
would be #fish, 2 pounds sinker, bobbers, hooks, like that. So it's as simple as that.
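
As a rough illustration of the default ordering described above, here is a minimal sketch. It assumes GNU sort and forces the C locale with LC_ALL=C so that lines are compared in plain byte order; under other locales the order can differ. The "Worms" entry is made up for the example to show uppercase sorting ahead of lowercase.

```shell
# Shopping list from the episode, plus a hypothetical capitalized entry.
# LC_ALL=C is an assumption that makes the comparison plain byte order.
default_sorted=$(printf '%s\n' 'hooks' 'bobbers' '2 pounds sinker' '#fish' 'Worms' | LC_ALL=C sort)
printf '%s\n' "$default_sorted"
# #fish
# 2 pounds sinker
# Worms
# bobbers
# hooks
```
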
And based on that you can then apply different options to sort. Oh, and just so you know, a space
is considered the highest priority symbol. So if you had a space at the beginning of any of those
lines, that would come before any of the symbols. So space, then symbols, then numbers,
then letters starting with uppercase, then lowercase. And it does the comparison logically:
it looks at the first character in each line and orders by that. And if there are two lines
that have the same character, then of course it proceeds to the second character and so on. So
that's how sorting works. Now you can ignore leading blank spaces with the -b
or --ignore-leading-blanks option. That takes any line that starts with a
blank or a space, ignores those blanks, and goes to the first non-space character.
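
The effect of -b can be sketched as follows. This assumes GNU sort in the C locale; the list items are hypothetical.

```shell
# Hypothetical list in which some lines start with spaces.
items=$(printf '%s\n' '  pears' 'apples' ' bananas')

# Default: leading spaces are significant and sort ahead of letters.
plain=$(printf '%s\n' "$items" | LC_ALL=C sort)

# With -b, leading blanks are skipped before lines are compared.
ignore_blanks=$(printf '%s\n' "$items" | LC_ALL=C sort -b)

printf 'default:\n%s\n\nwith -b:\n%s\n' "$plain" "$ignore_blanks"
```

The lines themselves are not modified; only the comparison skips the leading blanks.
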
And you can also ignore case with the -f or --ignore-case option. And then
it just does an alphanumeric sort as you would expect, but it doesn't look at upper or lower case.
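
A small sketch of the difference -f makes, assuming GNU sort in the C locale and a made-up word list:

```shell
# Hypothetical word list with mixed case.
words=$(printf '%s\n' 'banana' 'Cherry' 'apple')

# Byte-order sort: the uppercase "Cherry" comes before the lowercase words.
case_sensitive=$(printf '%s\n' "$words" | LC_ALL=C sort)

# -f folds case for the comparison, so "Cherry" lands after "banana".
ignore_case=$(printf '%s\n' "$words" | LC_ALL=C sort -f)

printf 'default:\n%s\n\nwith -f:\n%s\n' "$case_sensitive" "$ignore_case"
```
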
It doesn't differentiate. It treats everything essentially as if it were lowercase and sorts
on that basis. Now sort has a couple of other functions that do more than look at the characters
and order them alphanumerically. For instance, if you had a
list of dates you can do a month sort, which is -M or --month-sort.
It looks at a list of months, whether they're full names like April, December, October or
they're the abbreviated name of the month like JUN for June or MAR for March or AUG for August.
And it would sort on those, put them in the proper order, and it does that ignoring case.
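
Month sorting can be sketched like this. The example assumes GNU sort with LC_ALL=C, so that English month names are recognized; the input mixes full and abbreviated names in different cases.

```shell
# Month names from the episode, in mixed case and mixed length.
months=$(printf '%s\n' 'December' 'MAR' 'april' 'JUN' 'October' 'aug')

# -M orders by calendar month rather than alphabetically.
month_sorted=$(printf '%s\n' "$months" | LC_ALL=C sort -M)
printf '%s\n' "$month_sorted"
# MAR
# april
# JUN
# aug
# October
# December
```
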
It also ignores whether it's abbreviated or not. So if you had a mixture of
full month names and abbreviations, it would sort them as you would expect, in the proper
order for the date: proper monthly order, where January would be first, followed by
February, March, April and so on through December. So that's pretty handy. Now there are a
couple of other options more specific to numeric values. There's a general numeric sort,
-g, which is --general-numeric-sort. And that is what you would expect:
it does a standard numeric sort. As for what takes precedence, any non-numeric
characters are all considered the same and treated as in a regular sort. So if you had a list of numbers
and they're integers, it would sort them as you would expect, from negative numbers
up through positive numbers. So if you had like 0, 5.88, +12, -32,
15, it would sort those as you would expect: -32, 0, 5.88, +12, 15. So it looks at
the numbers and their signs. But if there were any letters in there, like you had fish and corn,
it would sort those first, before the numeric list, alphabetically as you would expect, and then
it would do the numeric sort. Now that's a general numeric sort, whereas a numeric sort, -n or
--numeric-sort, produces a list by slightly different rules. It looks at
symbols first and treats all non-number characters as the same, like the general numeric sort does.
But it gives preference to alphanumeric values. So
what it does is, if there's a symbol character in there, it kind of
puts those symbol characters first. So in the previous example that I used, where it was
the -32 and the +12, those would come first. So it treats the symbols first, -32,
+12, then it gets to 0, and then any alphabetic characters would all be treated like 0.
So you'd have any alpha characters in there, and then the numeric characters follow logically,
going from 0 on up through the numbers. So a -n numeric sort may not give you the
output that you expected, so just test that first. The way that it behaves can be a little
jarring. A regular numeric sort, which is -n, has different rules than the general numeric sort.
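
The difference between the two numeric modes can be sketched with the values from the episode. This assumes GNU sort in the C locale, where -n does not recognize a leading plus sign and so treats +12, like any non-numeric line, as zero.

```shell
# Values from the episode, including two non-numeric lines.
values=$(printf '%s\n' '0' '5.88' '+12' '-32' '15' 'fish' 'corn')

# -g: non-numeric lines first, then true numeric order (+12 parses as 12).
general=$(printf '%s\n' "$values" | LC_ALL=C sort -g)

# -n: "+12", "corn" and "fish" all count as 0 and tie with "0";
# ties are then broken by plain byte order.
numeric=$(printf '%s\n' "$values" | LC_ALL=C sort -n)

printf -- '-g:\n%s\n\n-n:\n%s\n' "$general" "$numeric"
# -g: corn fish -32 0 5.88 +12 15
# -n: -32 +12 0 corn fish 5.88 15
```
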
So be aware of that. Chances are, if you're really looking to do something on a numerical
basis, you want to do a general numeric sort, that's with the -g, as opposed to just a
regular numeric sort. There's a -h, which is the human numeric sort,
or --human-numeric-sort. And what that does is it first determines whether
the number's sign is positive, negative, or zero. And then it looks at whether there's a suffix.
The suffix can be any one of the following: it could be a k for kilobytes, lowercase or capital K.
And then the other options are M, G, T, P, E, Z, Y, which we're all familiar with: megabyte, gigabyte,
terabyte, petabyte, exabyte, zettabyte, yottabyte. Those all have to be capital letters. M, G, T, P, E, Z, Y
all have to be capitalized. It's case-sensitive, so you won't get a proper sort if you don't capitalize
those, if that's what you're looking for with a human numeric sort. So it looks at the prefix
first, the number, and the suffix. And it orders on both of those. So for example,
if you had 1M and 1G, that would be the order. But
if you had, say, 1042M and 1G, it would not put 1G first and then 1042M
second, even though 1042M is greater than 1G, because 1024 mega is equal to
one giga. It's not smart enough to do that, so just be aware of that. There are some limitations there.
So it's primarily looking at the numbers and the suffixes and ordering on them, not necessarily
on the value of the number in conjunction with the suffix. Just be aware of that.
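
The suffix-first behavior described above can be sketched like this, assuming GNU sort in the C locale:

```shell
# Sizes from the episode: 1042M is larger than 1G in absolute terms.
sizes=$(printf '%s\n' '1G' '1042M' '1K' '1M')

# -h compares the suffix before the number, so every M sorts before any G.
human_sorted=$(printf '%s\n' "$sizes" | LC_ALL=C sort -h)
printf '%s\n' "$human_sorted"
# 1K
# 1M
# 1042M
# 1G
```

This works well for output where values are already normalized to the largest fitting unit, which is the case the option is aimed at.
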
Sort has an option to randomize the sort order. If you really need to take a list and randomize it,
you can use -R or --random-sort. And that'll sort by a random hash
of the list. So if you ran it three or four times, you'd get different results each time
that you run it. So it does a pretty good job of randomizing it. And you can randomize that
based on a file: use --random-source= some file and you'll get a random
sort based on the contents of that file. And it should be fairly consistent if you're doing a
random sort of the same list over and over again. The last sorting option I want to talk about
is version sort, which is a smarter way of looking at prefixes and suffixes for versioned
files like source code files or something like that. The way it operates is it
tries to break each name into a prefix and suffix logically, with the suffix being the version number,
so to speak. And it looks at that and orders it logically through a standard regular expression
that's outlined in the info file. But what it does is, if it encounters leading zeros in the
version numbers, it ignores those. So it does a good job of being able to sort out things like 012,
012b, 013, 0013b, and sort those in proper order where necessary.
A normal sort would put the 0013b first when you probably want that to be last.
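
A sketch of that comparison, using the values from the episode. The episode does not name the flag; the example assumes GNU sort, where version sort is -V (--version-sort), and the C locale.

```shell
# Version-like strings from the episode, some with extra leading zeros.
versions=$(printf '%s\n' '013' '0013b' '012b' '012')

# Byte-order sort: "0013b" comes first because of its second zero.
plain_order=$(printf '%s\n' "$versions" | LC_ALL=C sort)

# Version sort: leading zeros in the number are ignored, so 0013b sorts last.
version_order=$(printf '%s\n' "$versions" | LC_ALL=C sort -V)

printf 'default:\n%s\n\nwith -V:\n%s\n' "$plain_order" "$version_order"
```
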
And so it would order based upon the non-zero suffix value, in standard numeric order on
those, to give you a proper version listing. And that can be handy if you're sorting through
a list of versioned software. Now finally, for all those options that I've talked about, when you're
passing them to sort you can do --sort=WORD, where WORD would be one of the values that
I talked about before, which would be general-numeric, human-numeric, month, numeric, random, or
version. Instead of specifying the other options, you can do --sort=numeric, spell
it out, and it'll do a numeric sort on the list. That is the basics of sort in a nutshell. There
are other options which I may cover in a future show. They're pretty unique options that,
in 90% of the cases people use sort for, they will probably never use, but they are there.
Head over to the website for the full write-up and to watch the video of using the sort command.
Again, I want to thank Hacker Public Radio for hosting the files, and you for listening. Have a great day.
You have been listening to Hacker Public Radio at HackerPublicRadio.org.
We are a community podcast network that releases shows every weekday, Monday through Friday.
Today's show, like all our shows, was contributed by an HPR listener like yourself.
If you ever considered recording a podcast, then visit our website to find out how easy it really is.
Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club.
HPR is funded by the Binary Revolution at binrev.com. All binrev projects are proudly sponsored
by Lunarpages. From shared hosting to custom private clouds, go to lunarpages.com for all your
hosting needs. Unless otherwise stated, today's show is released under a Creative Commons
Attribution-ShareAlike 3.0 license.