Initial commit: HPR Knowledge Base MCP Server
- MCP server with stdio transport for local use
- Search episodes, transcripts, hosts, and series
- 4,511 episodes with metadata and transcripts
- Data loader with in-memory JSON storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
445
hpr_transcripts/hpr2060.txt
Normal file
@@ -0,0 +1,445 @@
Episode: 2060
Title: HPR2060: Introduction to sed - part 5
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2060/hpr2060.mp3
Transcribed: 2025-10-18 13:51:08

---

This is HPR episode 2060 entitled "Introduction to sed - part 5", and it is part of the series "Learning sed".
It is hosted by Dave Morris and is about 48 minutes long.
The summary is: finishing covering sed commands, looking at some example scripts.
This episode of HPR is brought to you by AnHonestHost.com.
Get 15% discount on all shared hosting with the offer code HPR15, that's HPR15.
Better web hosting that's honest and fair at AnHonestHost.com.
Hello everyone, this is Dave Morris.
Now today I'm finishing the series on sed.
It's taken 5 episodes to cover it in some depth.
I get the impression that quite a few people have tuned out by now.
But there are some quite interesting features in here, and I thought there would be at least a small number of listeners who would be interested in delving into them.
So in the last episode we looked at pretty much everything about the way that sed works.
And we looked at the hold and the pattern buffers.
We looked at some of the commands that we hadn't seen at that particular point and started to see what could be done with them.
Now there are a few more remaining commands; we're not going to cover them all.
Some of them are, well, I think they're all pretty obscure, possibly even very obscure, as I put in the notes here.
But I wanted to cover them because I'd like to go over a few of the example sed scripts that are in the GNU sed manual.
The reason for doing this is that I went looking at this manual when I was trying to work out how to use sed in a slightly more advanced way.
And looking at the examples, I could not make head or tail of the majority of them.
One or two are fairly straightforward, but the majority of them are just so, so obscure.
And I thought, well, if I can shed a little bit of light on them that might be helpful to others, then it's worth having a go.
There's quite a number of them in the GNU manual, but I'm only covering three in this show.
So if you want to get deeper then you're on your own, or you can call on me and I'll do another show.
But I doubt that's going to happen.
So we looked at some of the less frequently used commands last time.
We skipped a few, so I'm going to fill in those gaps today, or some of them anyway.
There are a bunch of commands for inserting text in sed.
So far we've just been modifying stuff and printing it out and so forth.
But we haven't got anything for inserting stuff.
The first command is the c command, and it's rather weird.
It's normally written out as if you're typing your sed commands into a file.
So you see the letter c on the first line, and it can have addresses associated with it, all of the address types that we saw in episode two, and the c is then followed by a backslash.
Then on the next line is the first line that you want to insert, followed by a backslash if there are more lines.
And then it carries on to the last line, which doesn't have a backslash on the end.
So all the lines, including the c line, have to have backslashes on the ends of them except for the last line.
What it does is delete the lines from the input stream, or the file that you're working on, that match the addresses, and replace them with the lines that follow the c command.
Since it deletes the pattern space, when you issue a c command a new cycle begins.
So another line is brought in, and all the things that we talked about in earlier episodes happen.
Now the c command can be used directly on the command line, but not all that usefully.
If you use it you can't then follow c with any other sed commands, because there's no easy way to signal that that's the end of the insertion and here is the next command.
But you can do it by using multiple minus e quoted strings on the command line.
So I've got a demonstration here; I'll just read out the command.
It says: sed, space, minus e, space, open quote, 1c, backslash, line removed, close quote, minus e, quote, 3q (lowercase q), close quote, and then the file sed_demo1.txt.
What that's saying is: on line 1, replace that line with the words "line removed", and then on line 3 (that's the second minus e expression) quit.
So what you see is the first line replaced by "line removed", then the next two lines are printed out normally.
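The command just read out can be tried as follows. This is a minimal sketch assuming GNU sed; the four-line demo file is a made-up stand-in for sed_demo1.txt, and the variable names are only illustrative.

```shell
# Made-up stand-in for the episode's demo file (not the real sed_demo1.txt).
demo=$(mktemp)
printf 'line one\nline two\nline three\nline four\n' > "$demo"

# GNU sed one-line form of c: replace line 1, then quit after line 3.
out=$(sed -e '1c\line removed' -e '3q' "$demo")
printf '%s\n' "$out"

rm -f "$demo"
```

With this input the first line comes out as "line removed", followed by the second and third lines unchanged.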
But you can only generate one line with the c command in this way.
You can't do this in standard sed.
I've got an example of trying to generate more than one line.
I've got a 1c in my expression, backslash, four asterisks, backslash, "Do not read".
It doesn't generate two lines.
It's just one line containing all that text with the backslashes removed.
But you can insert escape characters, so backslash n is valid, and that will generate a line break, so you can generate multiple lines.
The third example in the long notes shows it being used to put a line saying "Censored", with asterisks round it, and "Do not read" in place of line 1 of sed_demo1.txt.
But the c command is best used in a file of sed commands, and I've put one together which is called demo5.sed (s-e-d), which is available and linked from the notes on the HPR website.
What I've done in the notes is show the sed file demo5.sed, and I've listed it using the nl command with various arguments to control the way that the numbers are put out.
You'll see there are seven lines.
The first five lines consist of 1c, which means on line 1 operate the c command and replace the line, and on the end of the c command is a backslash; then a line of a few hyphens, backslash; the next line, line 3, is "This line has been censored", backslash; and so on to line 5, which is the last line and is a bunch of hyphens.
Then there is a blank line, then 3q.
So basically this is a similar sed program to the ones we've already seen, and you would run it as: sed, space, minus f, space, demo5.sed, space, sed_demo1.txt.
Instead of line 1 of the file you get "This line has been censored" with lines of hyphens around it, and so on and so forth.
You could have done all that on one command line using backslash n newline things, but that's very much a GNU sed extension.
This particular example would work with standard versions of sed on BSD or Unix systems or whatever.
I'm not sure how many systems don't use GNU sed these days, but they do exist, I'm sure.
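A script file along the lines described might look like the sketch below. The file names, the demo text and the exact wording of the inserted lines are assumptions for illustration, not a copy of the episode's demo5.sed.

```shell
work=$(mktemp -d)
# Made-up stand-in for the episode's demo file.
printf 'line one\nline two\nline three\nline four\n' > "$work/demo.txt"

# Classic multi-line form of c: every text line except the last ends in a backslash.
cat > "$work/censor.sed" <<'EOF'
1c\
----------\
This line has been censored\
----------

3q
EOF

out=$(sed -f "$work/censor.sed" "$work/demo.txt")
printf '%s\n' "$out"

rm -rf "$work"
```

Because this uses the backslash-newline form rather than backslash n, it should not depend on GNU extensions.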
After the c command, let's look at the a command.
This is similar: it consists of the letter a followed by a backslash, followed by multiple lines, each followed by a backslash except the last one.
It'll take any of the address types that we normally find.
What happens here is that the line that's currently been read in is processed as normal, and then it's followed by the lines associated with the a command; this happens at the end of the cycle.
So basically it's appending whatever text you're providing after this particular line.
If you put addresses on the a command then it will only do this for the lines in question, or if you don't put an address at all it will apply to every line of the file.
I've got an example of using it on the command line, using the one-line form as we looked at for the c command, and this one consists of the command: sed, space, minus e, space, quote, 1a, backslash, chickens, quote, space, minus e, space, quote, 1q, quote, space, sed_demo1.txt.
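Written out, the command just dictated looks like this. It is a sketch assuming GNU sed (the one-line form of a), run against a made-up demo file rather than the real sed_demo1.txt.

```shell
# Made-up stand-in for the episode's demo file.
demo=$(mktemp)
printf 'line one\nline two\nline three\n' > "$demo"

# Append the word "chickens" after line 1, then quit once line 1 is done.
out=$(sed -e '1a\chickens' -e '1q' "$demo")
printf '%s\n' "$out"

rm -f "$demo"
```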
So what this does is print out the first line, "Hacker Public Radio", and then follow it with the word "chickens".
It only applies to the first line, and the 1q says after processing line 1, stop, so it just does it once and you only see one line of the file.
The next example shows how you could add lines containing just a single hyphen after every line of a file.
I don't know why you'd want to do that, but it's effectively double spacing, so maybe it's a bit easier to read or something.
For this one the command is: sed, space, minus e, space, quote, a, backslash, hyphen, quote, space, minus e, space, quote, 3q, quote, space, sed_demo1.txt.
This prints out the first, second and third lines of the file, and after each one is a line that only contains a hyphen in the first column.
So you can see that the a command is being applied to each line and the hyphen is being written out after the line.
Finally in this group of insertion commands we have the i command.
I should have said, as far as the a command goes, that you won't find this one-line form in other versions of sed, only in GNU sed, and similarly the one-line form of the i command is only available in GNU sed.
It's got the same structure as the c and a commands, so it's i, backslash, line one, backslash, line two and so forth, the last line not ending with a backslash.
It takes addresses in the same way, and what happens is that the lines covered by the addresses, or all lines of the file if there's no address, have the text which is supplied to the i command added in front of them.
Using the one-line form, as we looked at with the c and the a commands, you can put in escape characters like backslash n for newline.
So I've got an example that simply adds a line containing just a hyphen in front of every line printed out from the file.
This is very similar to the example I gave for the a command, except that the hyphenated line precedes each line, so I won't enlarge on this one.
I thought I'd finish off with something that might be useful.
I don't know; personally I've never found a use for these particular commands.
But what I've done is to construct an example that inserts an open square bracket before every line and a close square bracket after every line, and it just does it to the first three lines of sed_demo1.txt.
It's simply done by the command: sed, space, minus e, space, quote, i, backslash, open square bracket, quote, space, minus e, space, quote, a, backslash, close square bracket, quote, space, minus e, space, quote, 3q, quote, space, sed_demo1.txt.
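The a and i one-liners described in this part can be sketched as below, assuming GNU sed and a made-up demo file. The first command is the hyphen example for a; the second is the square bracket example combining i and a.

```shell
# Made-up stand-in for the episode's demo file.
demo=$(mktemp)
printf 'line one\nline two\nline three\nline four\n' > "$demo"

# a: write a line holding a single hyphen after each of the first three lines.
hyphens=$(sed -e 'a\-' -e '3q' "$demo")
printf '%s\n' "$hyphens"

# i and a together: "[" on a line before, "]" on a line after, first three lines only.
brackets=$(sed -e 'i\[' -e 'a\]' -e '3q' "$demo")
printf '%s\n' "$brackets"

rm -f "$demo"
```

Note that the inserted text always lands on lines of its own; these commands do not add characters to the existing line.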
That's going to print out the first three lines of the file, then quit.
In front of each one it puts an open square bracket, and after each one it puts a close square bracket, each on a line of its own.
So that's all there is to say about this really.
It's a bit weird, but it could be useful in some contexts.
Since the addressing you can use there is any of the address types that we talked about in episode two, you could use it as a way of highlighting a given line in a file by putting some sort of highlighting sequence in front of it and after it, or just in front of it, or just after it, whichever takes your fancy.
So I guess it could be useful from that point of view.
I'm going to go on to three commands within GNU sed which are classified as guru level.
These are pretty obscure.
The section in the GNU sed manual is called "Commands for sed gurus", and its introduction says: "In most cases, use of these commands indicates that you are probably better off programming in something like awk or Perl. But occasionally one is committed to sticking with sed, and these commands can enable one to write quite convoluted scripts."
So basically it's warning you off using them, I think.
But I'm including them here because quite a number of the examples in the manual use them, so I thought it would be useful at least to skim over them so you have some sort of understanding when I go on to explain some of the examples.
sed commands are really a sort of programming language, a very basic and rather odd programming language, and as such it's possible to use labels and conditional and unconditional branching within a sed script.
Defining a label consists of a colon: you just put a colon on a line by itself, followed by a character sequence, and the character sequence is the label.
So colon x is an example, or colon hello; anything that takes your fancy can be used as a label.
There's no addressing associated with it; it makes no sense to have any addresses.
It's simply a way of marking a point in the script for branching.
Speaking of which, the b command, which consists of b followed by an optional label, causes an unconditional branch to that label.
So it makes the flow of execution jump backwards, or indeed forwards, to the label in the script.
The b command can be used without a label, in which case it branches to the end of the script, so it just finishes the current cycle and starts a new one.
The third example we'll look at in this particular episode, "Reverse characters of lines" it's called, uses this.
Then there's the t command, and this takes pretty much the same form as the b command: the letter t, lowercase t.
It was lowercase b as well, I should have said.
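Labels and branches are easier to follow with something to run. The three commands below are illustrative sketches, not examples from the episode or the manual; they assume GNU sed, and the input text and the label names end and x are made up. The last one uses the t command that is being introduced at this point.

```shell
# b with no label: comment lines skip the substitution and are printed unchanged.
skipped=$(printf '# foo\nfoo\n' | sed -e '/^#/b' -e 's/foo/bar/')
printf '%s\n' "$skipped"

# b with a label: lines containing "keep" jump forwards over the s command.
jumped=$(printf 'keep me\nother\n' | sed -e '/keep/b end' -e 's/.*/dropped/' -e ':end')
printf '%s\n' "$jumped"

# Label plus t: go round again for as long as the s command keeps succeeding.
grouped=$(printf '1234567\n' | sed -e ':x' -e 's/^\([0-9]*[0-9]\)\([0-9]\{3\}\)/\1,\2/' -e 'tx')
printf '%s\n' "$grouped"
```

Each label and each branch is given its own minus e expression here so that the label name ends at the end of the expression.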
It's lowercase t followed by a label, so it causes a conditional branch to the label, and it happens only if there's been a successful substitution with an s command since the last input line was read or a conditional branch was taken.
If you omit the label, which you're quite at liberty to do, then it causes the next cycle to start.
Again, this is used in the third example towards the end of the notes.
The final category is commands which are specific to GNU sed.
There's actually quite a number of these.
I've skipped them all but one, you're probably relieved to hear.
You can check them out; I've given pointers to them in the GNU manual in the notes.
The command I'm going to talk about quite briefly is the F command.
This is a capital F, and what it does is print out the file name of the current input file, with a newline at the end of it.
So I've got an example here which consists of: sed, space, minus e, quote, 1, open curly bracket, capital F, semicolon, q (lowercase q), close curly bracket, quote, space, sed_demo1.txt.
What that's actually saying is: on line 1 there's a group that is to be obeyed (the curly brackets), and the group of commands consists of the capital F command, which prints the name of the file, and then a q command, which stops processing.
Because in this example sed is running in its standard read-and-print mode, with auto-print on, the first line of the file is then printed.
So what you see as output from that command is sed_demo1.txt, which is the name of the file being processed, written to the output, followed by the first line of the file.
It's as simple as that.
It is quite useful.
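The F example can be sketched as follows. It assumes a GNU sed recent enough to have the F command, and uses a temporary file with made-up contents, so the name printed is whatever mktemp returns rather than sed_demo1.txt.

```shell
# Temporary file with made-up contents standing in for the demo file.
demo=$(mktemp)
printf 'line one\nline two\n' > "$demo"

# On line 1: print the current input file name, then quit (auto-print shows line 1).
out=$(sed -e '1{F;q}' "$demo")
printf '%s\n' "$out"

rm -f "$demo"
```

The file name is printed exactly as it was given on the command line.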
There is a show in the queue talking about some of the features of the Bash scripting language where I've demonstrated its usefulness.
Okay, let's get on with the examples from the GNU manual.
There's a bunch of these, and they are on the whole pretty obscure, I would say.
I've chosen three which I think are probably a little bit more understandable and seem to make a bit more sense.
It's entirely my arbitrary choice, I guess.
The first one is to do with centering lines, and it centers all the lines of a file in a width of 80 columns.
So it just pads them with spaces on the front to make sure each line fits into the center of a given line.
Now what I've done here is I've slightly modified some of these to make them a little bit more readable, and also to reflect that most of my audience are going to be using Linux, and things like paths and so on are going to be different in Linux.
In this particular case the file begins with the usual hash mark, exclamation mark (often called crunch bang or hash bang) followed by the path to sed, which in my case is slash bin slash sed, then a space, then a minus f.
You need the minus f because you need to tell sed to read its commands from the file that it's actually been invoked from.
This particular script is available as part of this episode and it's called centre.sed (c-e-n-t-r-e, spelt the British way), and I've linked to it from the notes.
I've listed it out and I've used a feature of Markdown and Pandoc which lets me do a numbered listing so I can talk about the lines in the notes.
So the script begins with a group of commands which span lines 4 to 9.
I apologize that you really do need to have this in front of you when you're working your way through this.
If you're sitting on the bus or driving home or something, then it's not going to be very easy to deal with, but I can't think of a better way of achieving this.
In the group, the first line, line 5 of the script, is an x command, and you remember that this exchanges the pattern space and the hold space.
The group operates on line 1, which means one line will have been read from the file, and the x will place that line into the hold space.
The hold space, because it's line 1, will be empty, so now the pattern space is empty.
Then the next line consists of a substitute command, which is s, slash, circumflex, dollar, slash, then 10 spaces and a closing slash.
What that's saying is that an empty pattern space, matched by the start-of-line anchor (the circumflex) followed immediately by the end-of-line anchor (the dollar), is to be replaced by 10 spaces.
So basically it's a way of adding 10 spaces to the buffer.
Line 7 is another substitute.
Now there are 10 spaces in the buffer, so this command is s, slash, circumflex, dot, asterisk, dollar, slash, then there are eight ampersands and then a closing slash.
What that does is replace the 10 spaces in the pattern space buffer by itself (the ampersand) eight times, so it creates 80 spaces.
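The two substitutions that build the 80 spaces can be tried on their own. In this sketch an empty input line stands in for the empty pattern space that the script gets from the x command, which is a simplification of what the script does.

```shell
# An empty line becomes 10 spaces, and then 8 copies of itself: 80 spaces.
spaces=$(printf '\n' | sed -e 's/^$/          /' -e 's/^.*$/&&&&&&&&/')

# Show how many characters were produced.
printf '%s\n' "${#spaces}"
```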
Line 8 is another x command, so you won't be surprised to know that it swaps the 80 spaces into the hold space, and what was in the hold space, the first line, is swapped back into the pattern space.
So that was a brief process of creating 80 spaces which are stored in the hold space.
We move on to line 12, which simply contains a y command.
The y command, if you remember, is the way in which you can transliterate (that was the term used) or translate one type of character to another.
The original contained the word TAB: y, slash, TAB, slash, space, slash.
But that was just a convention to signify a tab, so I've replaced that in the copy I've handed out to you with backslash t, the escape sequence.
In other words it's just going to go through the whole line and replace any occurrences of tabs by spaces.
I suppose it's a bit of a cop-out in a way, because it's going to destroy the effect of any tabulation that you happen to have on that line, but there you go.
Line 13 is a substitute command which is s, slash, circumflex, space, asterisk, slash, slash.
What that's doing is removing all leading spaces from the pattern space buffer.
Line 14 is the equivalent for taking off trailing spaces: it's s, slash, space, asterisk, dollar, slash, slash, so any number of spaces preceding the end of line is removed.
So lines 12 to 14 are going to be executed on every line.
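The clean-up performed by these three commands can be sketched in isolation, assuming GNU sed (for the backslash t in the y command). The input string is made up.

```shell
# Turn tabs into spaces, then strip leading and trailing spaces.
trimmed=$(printf '\t  hello world \t\n' | sed -e 'y/\t/ /' -e 's/^ *//' -e 's/ *$//')
printf '[%s]\n' "$trimmed"
```

The square brackets in the printf are only there to make any surviving whitespace visible.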
For the moment we're just imagining it happening to the first line.
We're now on line 17.
This is a capital G command, which appends the contents of the hold space to the pattern space, preceded by a newline.
It just makes a copy of the hold space and appends it to the pattern space.
Well, the hold space has got 80 spaces in it, so it's going to add to the pattern space a newline and 80 spaces.
Line 20 is an s command, a substitute, and it looks like: s, slash, circumflex, backslash open parenthesis, dot, backslash open curly bracket, 81, backslash close curly bracket, backslash close parenthesis, dot, asterisk, dollar.
Now that regular expression is a grouped one (that's the backslash parentheses), and inside these parentheses it's saying dot, which is any character you will recall, 81 of them, and it's anchoring them at the start of the line.
So the first 81 characters are to be matched at the start of the line.
It's actually matching the whole line, but the grouped characters are the first 81.
The rest of the s command is slash, backslash 1, slash, so it's replacing the whole line by the first 81 characters.
It's basically trimming off anything beyond character 81.
That's line 20; I don't know if I said that.
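The truncation idiom is easier to see at a smaller scale. This sketch keeps the first 4 characters instead of 81; the count and the input are chosen only for illustration.

```shell
# Keep the first 4 characters of the line and throw the rest away.
kept=$(printf 'abcdefghij\n' | sed -e 's/^\(.\{4\}\).*$/\1/')
printf '%s\n' "$kept"
```

A line shorter than the count would not match and would be left alone; in the centering script the G command has already made every line long enough.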
So finally we've got line 23, and this one is quite tricky.
It took me a while to understand it, and I have to really think hard about what it's doing.
This is another substitute command and it goes like this: s, slash, circumflex, backslash open parenthesis, dot, asterisk, backslash close parenthesis, backslash n, backslash open parenthesis, dot, asterisk, backslash close parenthesis, backslash 2, and then the slash.
So let's talk about that regular expression.
It's grouping all of the characters from the start of the line up to the newline; that's the first group.
Then it's matching the newline, and then here's the really tricky bit.
Remember there's a bunch of spaces after the newline, and that bunch of spaces is such that the whole line is now 81 characters long: that's 80 characters plus the newline.
So the remainder of the line after the newline will be spaces.
The way the regular expression has been formed is to say that an arbitrary number of characters is to be placed into a group, and then that group is repeated.
That forces the regular expression engine to select only as many spaces as can represent the group and a reiteration of the group.
In other words, it chops the spaces after the newline into two equal amounts.
Then the rest of the substitution expression consists of backslash 2, backslash 1.
So it's taken backslash 2, which is half of the spaces, and stuck it in front of backslash 1, which is the stuff before the newline.
The effect of that is to place half of the spaces in front of the actual text, thereby centering it.
I still find this hard to explain.
I can sort of visualize it, but it's really hard to turn into words.
I hope I've made it clear enough that you've got some understanding from this.
My text perhaps might explain it a bit better than I've just tried to do.
So this script is written to center stuff in 80 columns.
If you wanted to use a wider number of columns then you'd have to change that s command that generated the 80 spaces, and you'd also need to change the expression on line 20 where it uses 81 to compensate for that.
It's not very flexible; however, it's clever and it's a useful demonstration of what can be done with sed.
If it were me I would not be using sed to do something like this.
I'm sure I would come up with something better, as the manual itself says, using awk or Perl or something.
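Putting the steps described above together gives a script along these lines. It is a sketch reconstructed from the description in this episode, so the comments and layout are not those of the GNU manual or of centre.sed, and the line numbers will not match the ones read out; it assumes GNU sed, and the file name and input word are made up.

```shell
work=$(mktemp -d)

cat > "$work/centre_sketch.sed" <<'EOF'
# On line 1 only: build 80 spaces and leave them in the hold space
1 {
  x
  s/^$/          /
  s/^.*$/&&&&&&&&/
  x
}
# Tabs to spaces, then strip leading and trailing spaces
y/\t/ /
s/^ *//
s/ *$//
# Append a newline and the 80 spaces from the hold space
G
# Keep the first 81 characters (80 plus the newline)
s/^\(.\{81\}\).*$/\1/
# \2 is half of the trailing spaces; move it in front of the text
s/^\(.*\)\n\(.*\)\2/\2\1/
EOF

out=$(printf 'centre\n' | sed -f "$work/centre_sketch.sed")
printf '[%s]\n' "$out"

rm -rf "$work"
```

For the six-character word used here there are 74 spaces left after truncation, so 37 of them end up in front of the text.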
So the second example from the GNU manual is something which emulates the Unix command tac, which is a reverse version of the cat command, and it simply lists a file backwards.
The section is named "Reverse lines of files".
This is quite well documented, actually.
Some of the examples are really, really cryptic, I find, which is why I've explained this one; it's pretty well done.
However, I thought it would be useful to just drill down into it a little bit more than is in the manual.
This one's available in a file called tac.sed and it's part of the bundle of stuff for this show.
The first line is the hash bang line and it's using slash bin slash sed, and in this case it's not only using minus f, it's also using minus n, and that turns off auto-printing; we'll see why shortly.
It's pretty small: there are only three commands, though there's quite a number of comments.
On line 7 is an address, which is 1, and a single command, but the 1 is followed by an exclamation mark, so it means all lines other than line 1.
The G command (the capital G, that is) appends a newline to the contents of the pattern space and then appends the contents of the hold space to the pattern space.
So for all lines that are not line 1, take whatever is in the hold space and stick it on the end of the pattern space.
Then line 10 consists of a dollar and a p, and that means on the last line print whatever is in the pattern space.
Remember we've switched off auto-printing, so that will only trigger on the last line.
Line 13 simply consists of the command lowercase h, and what that does is replace the contents of the hold space with the contents of the pattern space.
It has no address on it, so it's done for every input line.
I thought it's worth just explaining the algorithm in a bit more detail.
The first line that's read doesn't do anything other than trigger the h command on line 13, and this means that the line is simply stored in the hold space.
Then the second and subsequent input lines trigger the capital G command on line 7.
So if we were dealing with the second line, for example, we would append a newline to the pattern space and then append line 1, which was stored previously in the hold space, to it.
Then the h on line 13 is invoked and the pattern space, which is now in the order line 2 then line 1, is stored in the hold space again.
That way every line gets the reversed hold space stuck on the end of it and is then stored in the hold space.
When the last line is read, the G command on line 7 will be triggered as before, and it will append the hold space contents again, so that the pattern space now holds the entire file in reverse order.
Then, because it's the last line, the p command will be invoked for the only time and it'll print everything out.
So what you'll see is the entirety of the file, which has been stashed away inside buffers in sed, printed out.
I said that it bothered me slightly that after doing that p command it will also invoke the h command again, which is just a slightly obsessive thing, and I'd have been tempted to write line 10 with a q on it so it would stop it doing that.
But it really makes no difference at all; I'm ashamed to have even written this now.
Okay, here's the third and last example.
If you're finding this a bit much, I'm not surprised, because it's not easy.
It's not even easy to read this stuff.
When I was writing these notes I was thinking, oh yeah, of course, it's so obvious, it's so clear now.
It was a month or so ago I wrote this, and now that I'm going through it and reading it again: my god, it's really, really obscure.
sed is brilliant, sed is a fantastic thing, but I don't ever see me using it in this way, other than to say to somebody, "Look what I did, isn't this clever?", and everybody says "Whoa, that's amazing" but they're asking under their breath.
Anyway, I'm saying too much.
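Going back to the tac emulation walked through above, its three commands fit on one command line. This is a sketch using minus e expressions instead of a script file, with made-up input.

```shell
# 1!G : on every line but the first, append the hold space to the pattern space
# $p  : on the last line, print the accumulated (reversed) text
# h   : on every line, copy the pattern space to the hold space
reversed=$(printf 'one\ntwo\nthree\n' | sed -n -e '1!G' -e '$p' -e 'h')
printf '%s\n' "$reversed"
```

Because the whole file accumulates in the buffers, this approach is only sensible for files of modest size.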
So "Reverse characters of lines" is the last one, and this emulates the standard Unix command called rev, which takes each line and reverses the characters.
Now I've put the script into a file called reverse_characters.sed and it's available with the bundle for this episode.
I have obviously changed the path so you can actually run it.
The other thing I did: in the original example there are actual newlines used, that is, the line is broken by typing a newline in it, whereas I've used backslash n.
So in doing this I've made it GNU-specific.
So let's dive into this one.
The first line is the hash bang line, which has just got a minus f on it, no suppression of auto-printing.
The first command is on line 3 and it consists of an address and a single lowercase b command.
The address consists of a regular expression which simply contains two dots inside slashes, and then it's negated: it's got an exclamation mark after it.
So what it means is any line that doesn't contain two characters.
The b command, if you remember, is an unconditional branch to a label, except there's no label here.
Basically it means ignore, or just carry on to the next one.
So whenever a line contains fewer than two characters, the unconditional branch simply triggers a new cycle, which will print the line, ignore the rest of the script and carry on with the next line.
A line with only one character in it: there's not much point in reversing it, is there?
That's the logic.
Wow, that ended up being quite a lot to say about a line with just a small number of characters.
sed is like machine code when you get to this level.
Anyway, let's move on.
Line 6 is a substitute command in which the regular expression part is circumflex, dot, star (or asterisk), dollar; it means the entirety of the line.
So we know it's going to have two characters in it or more, and it's replaced by backslash n, ampersand, backslash n, slash.
What that's doing is embedding the line in between two newlines: there's a newline placed at the front and one placed at the end.
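The first two steps of this script can be made visible with a small sketch, assuming GNU sed (for backslash n in the replacement). The final substitution is not part of the script being described; it is added here only to show the embedded newlines as vertical bars, and the input is made up.

```shell
# Lines shorter than two characters skip the rest; longer lines are wrapped in newlines.
marked=$(printf 'a\nabcd\n' | sed -e '/../!b' -e 's/^.*$/\n&\n/' -e 's/\n/|/g')
printf '%s\n' "$marked"
```

The single-character line passes through untouched, while the longer one comes out with a marker at each end.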
|
||||
now line ten is a weird one it's it contains the t command which is a conditional branch
|
||||
and the label it's branching to is x well line eleven contains colon x so line ten simply
|
||||
says jump to the next line and this is documented in the GNU manual with the text this is often
|
||||
needed to reset the flag that is tested by the t command this is a thing simply to make the t
|
||||
set itself to a known state and I think that's probably because there was a substitution that
|
||||
happened just before that t command and remember the t command's condition is that there
|
||||
has been a successful substitution prior to it being invoked so I guess that makes sense so I
|
||||
tried taking it out to see if it made any difference and it didn't seem to so anyway line eleven is
|
||||
this label colon x they keep their labels pretty short in these examples and line twelve is a
|
||||
substitution s slash backslash open parenthesis backslash n dot backslash close parenthesis backslash
|
||||
open parenthesis dot star that's asterisk I should say backslash close parenthesis backslash open
|
||||
parenthesis dot backslash n backslash close parenthesis so that expression has three groups the
|
||||
first group was the new line and the character after it the second group was all of the characters
|
||||
other than the ones that match the other groups so basically the characters between the new lines
|
||||
but excluding the one we already picked up after the new line and then the group three is like group
|
||||
one in reverse it's the character before the second new line three groups so we simply output
|
||||
them in the second half of the s as backslash three backslash two backslash one so whatever
|
||||
the character was just after the left most new line becomes the character beside the right
|
||||
most new line when they get swapped over so these new lines are quite important as markers for which
|
||||
characters are to be swapped and in swapping them the character to the right of the first new line
|
||||
becomes the character to the right of the second new line so it becomes outside the new
|
||||
lines and the same for group three the characters that were just inside the two
|
||||
new lines become the characters just outside the two new lines and swapped over does that help
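
A sketch of a single pass of that swap, assuming GNU sed. The function name one_swap is hypothetical; it embeds the line in new lines, runs the three-group substitution exactly once and lists the pattern space so the markers can be seen.

```shell
# Hypothetical helper: one swap only, no loop, listed with 'l'.
one_swap() {
    # Group 1: first newline plus the character after it.
    # Group 2: everything in between.
    # Group 3: the character before the second newline plus that newline.
    # Writing them back as \3\2\1 moves both characters outside the markers.
    sed -n -e 's/^.*$/\n&\n/' \
           -e 's/\(\n.\)\(.*\)\(.\n\)/\3\2\1/' \
           -e 'l'
}

printf 'abcd\n' | one_swap    # prints d\nbc\na$
```

After this one pass the d and the a are outside the markers and already in their final positions; only bc is still waiting between the new lines.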
|
||||
the rest of the line is left alone so you'd say okay okay well it's just swap two characters well
|
||||
yeah but it's using this labeling thing this jump to label thing because line 13 consists of
|
||||
another tx so this is one of these conditional jumps back to label x and if the substitution
|
||||
was successful it will fire that particular jump so it goes back to label x where it simply
|
||||
executes that s command on line 12 again so it's going to be looping around
|
||||
obeying this command as many times as is necessary until there are no more
|
||||
characters to swap between the new lines so the last command of this script is on line 16 it's
|
||||
a substitute where it goes s slash backslash n slash slash g so in other words remove all
|
||||
new lines or both new lines and then because we're in auto print mode the thing will be printed
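
Putting the pieces together, the script as described can be sketched like this. It is a reconstruction from the spoken walk-through rather than a copy of the file in the notes, so the layout and line numbers will differ, and it is wrapped in a hypothetical shell function called reverse_chars instead of being an executable sed file. GNU sed is assumed because of the backslash n in the replacement.

```shell
# Hypothetical wrapper around a reconstruction of the reverse-characters script.
reverse_chars() {
    sed '
        # Lines with fewer than two characters: nothing to reverse.
        /../! b

        # Embed the line between two newline markers.
        s/^.*$/\n&\n/

        # Reset the flag tested by t, then loop: swap the characters just
        # inside the markers until the substitution no longer matches.
        tx
        :x
        s/\(\n.\)\(.*\)\(.\n\)/\3\2\1/
        tx

        # Remove the newline markers; auto-print shows the result.
        s/\n//g
    '
}

printf 'hello\n' | reverse_chars    # prints olleh
```
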
|
||||
now it took me a while to get my head around what was happening here and part of the process of
|
||||
finding out was to put some debug stuff into the script just to be certain that it was doing what
|
||||
I thought it was doing and I've offered you an alternative version of this script which I've
|
||||
called reverse underscore characters underscore debug dot sed which contains lowercase l commands remember
|
||||
the l command lists the current contents of the pattern space and it does it in a way that makes
|
||||
it easier to spot what's actually going on it shows new lines as backslash n and puts a dollar at the end of each line and so on and
|
||||
so forth so I've listed it out but I won't go through it basically I've done it so that I put
|
||||
the l's in so that you see the line after it's had the initial new lines attached to it remember we
|
||||
put a new line at the front and the back of the line then I've got an l command in this loop with
|
||||
the tx doing the branching back to the x label so every time it iterates through that
|
||||
loop it prints out the result when you run that what I did was I fed it echoed to it the alphabet
|
||||
in lowercase and piped that to reverse underscore characters underscore debug dot sed which is executable and
|
||||
it demonstrates you can actually see the characters inside of the new lines being flipped such that
|
||||
they're outside the new lines and they've been reversed so that a b c etc gets reversed to z y x w I hope
|
||||
you find that useful it certainly made it clearer to me as to what was going on maybe you
|
||||
can visualize these things better than I can and the last yeah the last line in the block of
|
||||
lines in the notes is the auto-printed result of doing all of the swaps so that's it
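
The debug file itself is not read out in the episode, so what follows is only a guess at its shape: the same reconstruction as before with lowercase l commands added after the embedding step and inside the loop. The function name reverse_chars_debug is hypothetical and GNU sed is assumed.

```shell
# Hypothetical debug variant: list the pattern space at each stage.
reverse_chars_debug() {
    sed '
        /../! b
        s/^.*$/\n&\n/
        # Show the line with its two newline markers attached.
        l
        tx
        :x
        s/\(\n.\)\(.*\)\(.\n\)/\3\2\1/
        # Show the pattern space after every attempt at a swap.
        l
        tx
        s/\n//g
    '
}

# Each listed line shows the markers as \n and ends in $;
# the final, auto-printed line is the reversed alphabet.
echo abcdefghijklmnopqrstuvwxyz | reverse_chars_debug
```

The l command does not change the flag that t tests, so adding it inside the loop leaves the branching behaviour as it was.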
|
||||
we've done it we've covered three at least of these very very hairy sed examples actually
|
||||
there's a lot more hairy ones than that but I'm sure I'm sure you would be switching off very very
|
||||
fast if I were to do any more and to be honest I have not looked at any more of them and understood
|
||||
them so last item of business is I set a quiz in the last episode and I said I'd give you the answer
|
||||
well the quiz was to take the famous sed_demo1.txt only concentrate on the first line
|
||||
and take the first letter of each word place it at the end and follow that with the letters a y
|
||||
so that pig p i g becomes igpay i g p a y this is called pig latin and latin becomes atinlay
|
||||
I said skip one and two letter words I think I said skip three as well in the in the last episode
|
||||
but then my example of manipulating the word pig would be silly because it would be skipped
|
||||
so one and two makes more sense and the reason you you don't want to do anything with with one
|
||||
letter words is pretty obvious because it would turn the single letter a to a a y which doesn't
|
||||
make a lot of sense plus also a number of the words in that file the first line of that file
|
||||
are capitalized and I said don't bother about them the fact that the capitals end up in the wrong
|
||||
place is not a big deal so here's my answer what I did was sed space minus n e so we're not
|
||||
auto printing quote 1 s slash backslash open parenthesis backslash b backslash w backslash
|
||||
close parenthesis backslash open parenthesis backslash w that's lowercase w I should have said
|
||||
back slash open curly bracket two comma back slash close curly bracket back slash close
|
||||
parenthesis slash so that's the whole regular expression let's see if we can explain this
|
||||
the first group in parenthesis contains back slash b back slash w so back slash b is the thing
|
||||
that's the strange boundary marker meaning whenever there's a word the backslash b marks
|
||||
the the edge of it the transition from non word to word and the back slash w is simply the first
|
||||
character of a word or a character of a word so we're looking for in this first group the first
|
||||
character of any word the second group contains back slash w and that's followed by a quantifier
|
||||
in curly brackets which is two comma and you'll recall that that means I want word characters and
|
||||
I want two or more of them don't care how many there are in total I have no upper limit but two
|
||||
has to be at least two so that's how we're skipping the one and two character words so in doing
|
||||
this we're matching a sequence of the start of a word and the rest of the word and as soon as
|
||||
the space is encountered the scan for the multiple word characters stops so in the groups
|
||||
we have the first letter of a word followed by the rest of the word and then the rest of the s
|
||||
expression consists of backslash two backslash one a y close slash so that means output
|
||||
the second group which is the bit of the word after the first character follow that with the
|
||||
first character backslash one and then follow that with a y and then the qualifiers or the flags
|
||||
I should say for the s command are g and p that's to do this repeatedly across the line
|
||||
and when it's finished print it and the rest of the command is space sed_demo1.txt so the result
|
||||
is that instead of hacker public radio you get ackerhay ublicpay adioray then
|
||||
it starts getting a bit silly because it changes hpr to prhay which is just plain silly
|
||||
but you can't do much about that with this it's quite simple I mean the sed solution is simple
|
||||
and then is and an are not touched because they're two character words then internet is turned
|
||||
into nternetiay which is awful I know that there are rules in pig Latin
|
||||
that shouldn't allow that to happen you can't really write that sort of stuff in sed radio turns
|
||||
into adioray which I think is quite nice but shows turns into howssay which is silly
|
||||
and podcast I do quite like podcast turns into odcastpay that turns into hattay which is silly
|
||||
releases turns into eleasesray anyway I know at least one other hpr listener quite
|
||||
likes pig Latin but these are not the best examples but you know it answers the question
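
The quiz answer as read out above can be sketched as follows, assuming GNU sed (backslash b and backslash w are GNU extensions). To keep the example self-contained it reads standard input instead of the demo file; the function name pig_latin and the sample sentence are illustrative only.

```shell
# Hypothetical wrapper around the one-liner described in the episode.
pig_latin() {
    # -n          : no auto-printing, only the p flag prints.
    # 1           : only act on the first line of input.
    # \(\b\w\)    : group 1, the first character of a word.
    # \(\w\{2,\}\): group 2, two or more further word characters, so one and
    #               two letter words never match and are left alone.
    # \2\1ay      : rest of the word, then the first letter, then "ay".
    # g, p        : repeat across the line, then print the result.
    sed -ne '1s/\(\b\w\)\(\w\{2,\}\)/\2\1ay/gp'
}

printf 'Hacker Public Radio is an internet\n' | pig_latin
# prints: ackerHay ublicPay adioRay is an nternetiay
```

Against the real file the same expression would be run as sed -ne followed by the quoted command and the file name, as described in the episode.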
|
||||
so who won well sadly there were no winners because there were no entries come on guys have you
|
||||
actually gone to sleep that's what I said in my notes here probably just as well I'm finishing
|
||||
this series here because I think I probably sent everyone to sleep several episodes back so
|
||||
if you've been snoring through this one well good luck to you and hope you had a good sleep
|
||||
so that's it I've finished that's it no more sed and I hope you did enjoy it hope somebody got
|
||||
something out of it anyway okay that's it bye bye
|
||||
you've been listening to hacker public radio at hackerpublicradio.org we are a community podcast
|
||||
network that releases shows every weekday Monday through Friday today's show like all our shows
|
||||
was contributed by an hpr listener like yourself if you ever thought of recording a podcast
|
||||
then click on our contribute link to find out how easy it really is hacker public radio was founded
|
||||
by the digital dog pound and the Infonomicon computer club and it's part of the binary revolution
|
||||
at binrev.com if you have comments on today's show please email the host directly leave a comment
|
||||
on the website or record a follow up episode yourself unless otherwise stated today's show is
|
||||
released under the Creative Commons Attribution-ShareAlike 3.0 license
|
||||