Episode: 2729
Title: HPR2729: Bash Tips - 18
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2729/hpr2729.mp3
Transcribed: 2025-10-19 15:48:48
---
This is HPR episode 2729 entitled Bash Tips - 18, and is part of the series Bash Scripting.
It is hosted by Dave Morris, is about 32 minutes long, and carries an explicit flag.
The summary is: Arrays in Bash, Part 3.
This episode of HPR is brought to you by AnHonestHost.com.
Get 15% discount on all shared hosting with the offer code HPR15, that's HPR15.
Better web hosting that's honest and fair at AnHonestHost.com.
Hello everybody, this is Dave Morris, welcome to Hacker Public Radio.
I'm doing show Bash tips number 18 today.
This is the third of a group of shows on the subject of Arrays in Bash.
In the last show we looked at ways of accessing elements with negative indices and how to concatenate Arrays.
Then we launched into parameter expansion in the context of Arrays.
This was a bit of a loopback to an earlier show where I talked about parameter expansion,
but I never really concentrated much on Arrays in that context.
In fact, I've discovered quite a lot more about what you can do with this stuff and Arrays,
so I thought I would share this with you.
I've got a few more of these parameter expansion things to look at in this episode.
I was going to go on to some further stuff after that,
but I'm going to put it in the next episode, which will talk about the declare built-in. We've touched on it before,
but we haven't looked at it in detail.
Then there are commands which assist with loading data into arrays,
and do it in a better way than the way I've been doing so far.
The next parameter expansion operation is string replacement,
and it allows you to perform a single replacement within a parameter string
or to repeat the replacement throughout the entire string.
And of course, it can do the same sort of thing with Arrays.
The syntax is dollar, open curly bracket parameter, whatever the parameter is,
then a slash followed by a pattern, followed by a further slash, and then a string.
So the pattern is a glob or extglob pattern.
The parameter is expanded,
and then a search is carried out for the longest match with the pattern.
And when that's found, it's replaced with string.
If there's no string in the syntax,
then this operation deletes the thing that matches.
So in other words, replace the pattern with nothing.
So you would write that as dollar, open curly bracket,
parameter slash pattern, close curly bracket.
So there's no slash string in there.
Now, the first character of pattern has a special meaning if it's one of the following three.
If it's a slash, so you've got a double slash in that case,
all matches of the pattern are replaced by string.
So this is the one where it repeats the replacement throughout the string.
If it's a hash mark, then it must match at the beginning of the expanded value of the parameter.
If it's a percent sign, then it must match at the end of the expanded value of the parameter.
These are the same characters we saw earlier on, where they were operators in their own right.
If the pattern you're trying to match actually starts with a hash or a percent sign,
or indeed a slash, you put a backslash in front of the character to tell Bash
that it's not meant to be one of these characters that have special meaning.
So I've got a bunch of examples in this one.
I haven't got many downloadable examples for this stuff until we get to the end
because it didn't seem worth it, really.
So what I've done is to show some examples of what you can type on the command line.
The first one was setting a variable called phrase to a string,
and it's using the old typing exercise, certainly the thing that I was taught when I was a kid:
Now is the time for all good men to come to the aid of the party.
I demonstrate here how you would take that and change the word men into people.
So you echo, in double quotes, dollar, open curly bracket, then phrase
(phrase is the name of the variable), slash men, slash people, close curly bracket, close double quotes,
and that will perform the operation.
So when it's printed out, which is what the echo is doing, you see:
Now is the time for all good people to come to the aid of the party.
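As a minimal sketch of the single replacement and the deletion form described above: the phrase is assumed to be the usual typing exercise ending "to the aid of the party", and the variable names replaced and deleted, as well as the deletion of "good ", are illustrations that are not in the show.

```bash
#!/usr/bin/env bash
phrase="Now is the time for all good men to come to the aid of the party"

# ${parameter/pattern/string}: replace the first match of 'men' with 'people'
replaced="${phrase/men/people}"
echo "$replaced"
# Now is the time for all good people to come to the aid of the party

# ${parameter/pattern}: with no string, the first match is simply deleted
deleted="${phrase/good }"
echo "$deleted"
# Now is the time for all men to come to the aid of the party
```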
Now if we use a slash as the first character of the pattern,
we can do things like replacing all the occurrences of the, T-H-E in lowercase, by THE in uppercase.
So you would do that with an expression: dollar, open curly bracket,
phrase, two slashes, the in lowercase, slash, THE in uppercase, close curly bracket, close quotes,
and you'll see that you get the strange shouty THEs in the sentence.
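Written out, the global replacement just described might look like this. It is a sketch under the same assumption about the phrase; the variable name shouty is made up for the illustration.

```bash
#!/usr/bin/env bash
phrase="Now is the time for all good men to come to the aid of the party"

# A double slash replaces every match, not just the first one
shouty="${phrase//the/THE}"
echo "$shouty"
# Now is THE time for all good men to come to THE aid of THE party
```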
It's possible to use an extglob pattern, as long as you've got extglob,
or whatever you want to call it, the extglob setting, switched on.
And I've made an expression which begins with an at sign
and then in parentheses the words the, T-H-E, and to, T-O, separated by a vertical bar.
So that's an extglob pattern which matches either the or to.
Then after that I put a slash and capital X, just to demonstrate it.
I've not spoken out the entire line here but hopefully you'll get the gist.
So it says: Now is X time for all good men X, etc.
So it's replaced the instances of the two words (well, actually not words,
but sequences of letters, we should say) by the value X.
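The line that was not spoken out in full is probably close to the following sketch. It assumes the double-slash (global) form, since several matches are replaced in the output that was read out; the variable name result is hypothetical.

```bash
#!/usr/bin/env bash
# extglob has to be switched on for the @(...) pattern to be recognised
shopt -s extglob

phrase="Now is the time for all good men to come to the aid of the party"

# @(the|to) matches exactly one of the alternatives 'the' or 'to'
result="${phrase//@(the|to)/X}"
echo "$result"
# Now is X time for all good men X come X X aid of X party
```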
And I said unfortunately it's not possible to vary the replacement string depending on the match.
You need to write something more complex to do this.
And I've got an example script later on which demonstrates the sort of thing that you could do
if you wanted to go through and do selective replacements in a sentence.
You can't do it in one of these expressions anyway.
So suppose you wanted to replace the capital N
and the two letters that follow it at the start of the string in the variable phrase
by a sequence of three capital Xs.
Then you would use (let's spell this one out, at least the variable expression)
dollar, open curly bracket, phrase, slash, then a hash mark, meaning at the start of the string,
capital N, question mark, question mark, slash, then three capital Xs, close curly bracket,
close quote.
So you'll then see XXX is the time.
Now what this is doing is it is making a change at the start of the string.
And you're giving it a string which is in the variable phrase.
And there's only one start.
You can replace whatever's at the start.
So I've replaced the N and the two letters that follow it.
But you can't do that multiple times because there's only one start.
So this was not immediately obvious to me.
Hopefully it is to you, but I want you to benefit from my mistakes.
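A short sketch of the anchored replacement, under the same assumption about the phrase. The second expression is an addition for illustration: it shows that a pattern anchored at the start which does not match there leaves the string alone, even though "is" occurs later on.

```bash
#!/usr/bin/env bash
phrase="Now is the time for all good men to come to the aid of the party"

# The # anchors the pattern to the start; N?? is capital N plus any two characters
anchored="${phrase/#N??/XXX}"
echo "$anchored"
# XXX is the time for all good men to come to the aid of the party

# 'is' appears in the string, but not at the start, so nothing is replaced
unchanged="${phrase/#is/XX}"
echo "$unchanged"
```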
I tried to do a bit of an experiment or a demonstration which shows this business of trimming things
off the front and the end of a string.
So I've got a different variable, p2 I've called it, and it's equal to the string in single quotes:
hash mark, ABC, hash mark, close single quote.
And then I echo that as: in double quotes, dollar, open curly bracket,
p2, slash, then backslash, hash, close curly bracket, close double quotes.
So that's matching a case where the string itself begins with one of these magical characters.
And you get back ABC and the trailing hash, without the leading one.
And just for the sake of completeness, I guess,
I did a similar thing where I won't read the whole thing.
I've got P2 slash percent hash inside the curly brackets.
And what that does is to remove the trailing hash.
Because there we are using one of these controlling characters,
special characters, after the slash.
And it's the percent which means do trimming at the end of the string
and the thing to remove is a hash.
We don't need to back slash it because it's not a special,
it's not a meta character in this particular case.
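The two trimming experiments can be sketched as follows. The lowercase spelling of p2 and of its contents is an assumption (the case cannot be heard), and the names lead_gone and trail_gone are invented for the illustration.

```bash
#!/usr/bin/env bash
p2='#abc#'

# A backslash stops the leading # from meaning "anchor at the start",
# so the pattern is a literal hash and its first occurrence is deleted
lead_gone="${p2/\#}"
echo "$lead_gone"     # abc#

# Here % means "anchor at the end" and the pattern is a plain hash,
# which needs no backslash because it is not in first position
trail_gone="${p2/%#}"
echo "$trail_gone"    # #abc
```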
So let's look at examples of all this stuff using arrays.
If the parameter is an array expression using an at sign or an asterisk as an index,
then the substitution is applied to each member of the array.
So in my demonstration here I declare an indexed array
whose elements are the words from the example phrase.
And the way I do that is to use declare space,
hyphen a, space, and I've called the array words equals,
and then open parenthesis, dollar phrase, close parenthesis.
I've actually put spaces around phrase,
but you don't need to, it just makes it a little easier to read.
That's the reason I put them there.
So if you echo the expression in double quotes,
dollar open curly bracket, words, open square brackets,
at sign, close square brackets slash, question mark, slash x.
So what that's going to do is it's going to replace the first letter
of each element with the capital X that's part of the expression.
So instead of Now is, we get Xow, Xs, and so on and so forth.
You could do a similar thing, replacing the last letter of each element with an X,
by putting inside the double quotes: dollar, open curly bracket, words,
open square bracket, at sign, close square bracket, slash, percent
(this is the meta character), question mark, slash,
capital X, close curly bracket, close double quotes.
So here we see that instead of Now you get NoX, is
turns into iX, and so forth.
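Here is a sketch of the array versions, assuming the same phrase as before. The show uses the at sign subscript with echo; the asterisk subscript is used below only so that the result can be kept in a single string variable (first_x and last_x are hypothetical names).

```bash
#!/usr/bin/env bash
phrase="Now is the time for all good men to come to the aid of the party"

# Unquoted $phrase is split on whitespace, giving one element per word
declare -a words=( $phrase )

# Replace the first character of every element with X
first_x="${words[*]/?/X}"
echo "$first_x"
# Xow Xs Xhe Xime Xor Xll Xood Xen Xo Xome Xo Xhe Xid Xf Xhe Xarty

# Replace the last character of every element with X (% anchors at the end)
last_x="${words[*]/%?/X}"
echo "$last_x"
# NoX iX thX timX foX alX gooX meX tX comX tX thX aiX oX thX partX
```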
Now here's an interesting one that was not obvious to me at the start.
If the pattern part of the expression consists of a hash
or a percent on its own, you can add text to each element at the start or the end.
So I demonstrate this by an expression which is in double quotes,
dollar, open curly bracket, words, open square bracket,
at sign, close square bracket, slash, hash mark, slash,
then an equals, a greater than sign and a space,
close curly bracket, close double quotes.
And then what you find is that in front of each word is an equals sign,
a greater than sign and a space, which looks like a little arrow type thing.
And you can do a similar thing by adding them to the end,
but I won't spell that one out because I think it should be pretty obvious.
It's in the notes here anyway.
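A sketch of adding text to each element, using a shortened four-word array to keep the output small. The suffix string " <=" is an assumption; the show only says that the same thing works at the end.

```bash
#!/usr/bin/env bash
# Shortened array, assumed for the illustration
words=( Now is the time )

# An empty pattern anchored at the start: the string is inserted before each element
prefixed="${words[*]/#/=> }"
echo "$prefixed"
# => Now => is => the => time

# An empty pattern anchored at the end: the string is appended to each element
suffixed="${words[*]/%/ <=}"
echo "$suffixed"
# Now <= is <= the <= time <=
```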
Now it is possible for the string part of this expression to be a reference to another variable.
And you can even put a command substitution in there.
It's not a very powerful feature,
but I've got a demonstration here of, in the double quotes after an echo,
dollar, open curly bracket, words, square brackets, at sign, slash, hash mark, slash.
And then there's the string part: dollar, open curly bracket, words,
square brackets, one, close curly bracket, space, close curly bracket, double quotes.
So it's taking the second word of words (remember they are indexed from zero)
and just sticking it in front of every word.
So that word is is, so you're seeing: is Now is is is the is time, and so on.
So you can do that, no idea why you would want to, but you could.
But you can't change that value as you're going along.
The value is derived once before the multiple substitutions begin.
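The same idea as a sketch, again with an assumed four-word array; with_second is a hypothetical variable name.

```bash
#!/usr/bin/env bash
words=( Now is the time )

# The replacement string is itself an expansion: element 1 ('is') plus a space
with_second="${words[*]/#/${words[1]} }"
echo "$with_second"
# is Now is is is the is time
```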
I've written down here that this statement is not a script and is executed internally by bash.
So you can't do stuff that you would use in a bash script that would change each time it was invoked.
So my final example in this particular subsection is to use dollar random,
that's capital random written in capitals, which is a variable I've mentioned before,
which when you put a dollar in front of it, you get back a random number.
If you do that and you put it in an expression which prepends it to each element of the array,
what you get is that random is invoked once and the value that you get back is then stuck in front of each word.
So in this particular test I did, the number came back as 9559.
So that's just simply put in front of every element of the array.
So you'd think, that would be cool, I could do something with random numbers in there, but you can't.
Not by this method.
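A sketch that makes the "invoked once" behaviour visible by counting the distinct prefixes. The array is shortened, and the names prefixed, seen and varied are invented here; the explicit loop at the end is one possible workaround, not something taken from the show.

```bash
#!/usr/bin/env bash
words=( Now is the time )

# $RANDOM is expanded once, then the same value is prepended to every element
prefixed=( "${words[@]/#/$RANDOM }" )
printf '%s\n' "${prefixed[@]}"

# Collect the numeric prefixes; only one distinct value should turn up
declare -A seen=()
for item in "${prefixed[@]}"; do
    seen["${item%% *}"]=1
done
echo "distinct prefixes: ${#seen[@]}"

# To get a fresh number per element, an explicit loop is needed
varied=()
for w in "${words[@]}"; do
    varied+=( "$RANDOM $w" )
done
printf '%s\n' "${varied[@]}"
```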
So the next and final parameter expansion we're going to look at today is the one where it allows you to change the case of the array.
So I've just taken the syntax diagram from the GNU bash manual and put them into the notes here.
So you've got dollar, open curly bracket, and then the name of the parameter, then you've got a caret, or an up arrow
if you like to call it that, followed by the pattern and a close curly bracket.
That is the one instance, and the other one is where you put two of these characters, two circumflexes or carets or however you like to call them.
So one or two of these things.
Those two change to uppercase.
The other two use a comma in between the parameter and the pattern.
You get one comma or two commas and they force the matching characters to lowercase.
Now it's important to understand what it says in the manual.
It's one of these cases where you really need to read every word of the description in the manual.
I'm not sure whether this is the case in the man page or I was actually going from the GNU bash manual.
But it's important to understand what they're trying to say because this is quite dense information.
And it took me a while for it to sink in.
Let me read it to you.
Each character in the expanded value of parameter is tested against pattern.
If it matches the pattern, its case is converted.
The pattern should not attempt to match more than one character.
So the pattern matches only one character, not a word or something similar.
There's also another quote from the manual.
Part of which I put down here, and it says the caret and comma expansions match and convert
only the first character in the expanded value.
So if you're using single ones of these things, they only match the first character.
So if you're expecting as I did things to work differently, you'll be caught off guard by this.
So that's why I'm making a point of the actual description.
Hopefully it'll be clear after my examples.
Now if the pattern is omitted, it's treated as if it's a question mark, which will match every character.
If the parameter is an array variable with an at or asterisk subscript, then it's going to do things on the entire array, one element at a time.
Let's look at the simple variable case.
If we take the phrase that we were playing with earlier, the Now is the time thing,
then we echo in double quotes: dollar, open curly bracket, phrase, then two caret symbols, open square bracket, A, E, I, O, U, close square bracket, close curly bracket, close double quotes.
What that's saying is to find every vowel and change its case to uppercase.
So what you get back is a phrase which says Now is the time, where each vowel, the O and the I and the E and so on, they're all changed to uppercase.
That is because we're using the double caret, so it's going to find all the matches.
So it's going to go to each part of the string, each instance of a vowel in the string, and it's going to change it to uppercase.
But if you use the next example, which is the same expression with a single caret, and apply that to the phrase, nothing changes.
The first vowel is not converted. The first vowel is the O in Now.
That's because the pattern is compared with the first letter, capital N, which isn't a vowel, obviously. So it does nothing.
And that's because of the matching with the first character in the parameter. I hope that's clear.
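The simple variable case as a sketch, with the phrase assumed as before. The last expression, a single comma with no pattern, is an addition to show the lowercase direction; the variable names are hypothetical.

```bash
#!/usr/bin/env bash
phrase="Now is the time for all good men to come to the aid of the party"

# Double caret: every character matching [aeiou] is uppercased
all_vowels="${phrase^^[aeiou]}"
echo "$all_vowels"
# NOw Is thE tImE fOr All gOOd mEn tO cOmE tO thE AId Of thE pArty

# Single caret: only the first character is tested; N is not a vowel, so no change
first_only="${phrase^[aeiou]}"
echo "$first_only"

# Single comma, pattern omitted: the first character is lowercased
lower_first="${phrase,}"
echo "$lower_first"
# now is the time for all good men to come to the aid of the party
```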
So let's look at the array case, and we'll use the array called words, which we built earlier, and we're going to try using the vowel pattern again, because that seems like a good thing to play with.
So we echo in double quotes: dollar, open curly bracket, words, then square brackets and at sign, then we follow that with a single caret and square brackets containing the vowels, close curly bracket, close double quotes.
What that's going to do is to go down the array, and for each word it will replace the first character with its capital form, if it's a vowel.
So when you look at it, you see: Now is the time for all. Now doesn't start with a vowel; is begins with a vowel,
so that's capitalized. The and time have vowels, but not as the first character. For contains a vowel, but the vowel is not at the start of the word.
Then we come to all. Well, all is capitalized because it begins with a vowel. I've written it down in probably a better way than I just said it:
it's matched any array element that starts with a vowel and has made the leading vowel uppercase.
So if we do the same thing again, but use two caret signs, then that operates on all vowels in each element and gives the same result as when we used the single variable called phrase.
You see Now is the time, where every vowel in every word has been upcased.
Now I did one which does the equivalent, but operates on all non-vowels, and you'll recall that you do that by simply beginning your square bracketed list of characters with a caret, which reverses the effect.
So it's looking for all letters which are not vowels. So in this case, in Now it changes the W, which is not a vowel, to uppercase; is gets an uppercase S, and so on.
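The three array cases can be sketched like this, assuming the words array is built from the same phrase. The asterisk subscript is used only to capture each result as one string; the variable names are made up.

```bash
#!/usr/bin/env bash
phrase="Now is the time for all good men to come to the aid of the party"
declare -a words=( $phrase )

# Single caret: uppercase the first character of each element if it is a vowel
leading="${words[*]^[aeiou]}"
echo "$leading"
# Now Is the time for All good men to come to the Aid Of the party

# Double caret: uppercase every vowel in every element
every="${words[*]^^[aeiou]}"
echo "$every"
# NOw Is thE tImE fOr All gOOd mEn tO cOmE tO thE AId Of thE pArty

# A leading caret inside the brackets negates the set: uppercase the non-vowels
non_vowels="${words[*]^^[^aeiou]}"
echo "$non_vowels"
# NoW iS THe TiMe FoR aLL GooD MeN To CoMe To THe aiD oF THe PaRTY
```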
But the final point is to say: don't try an extglob type of match when you're working in this sort of mode, because it doesn't work.
So my example is, after the two caret signs: an at sign, open parenthesis, good, then vertical bar, men, close parenthesis.
So in other words, match either the word good or the word men. I ran that and nothing happened, because that doesn't match with the requirement in the manual, which says the pattern should not attempt to match more than one character.
And here we're trying to match two words, and it doesn't work.
So let's get on finally to my two examples, which are very, very similar, both doing pretty much the same thing, but using two different methods.
I was thinking about how you would make a script to change specific words into other words. I was thinking more in terms of find and replace type operations.
How would you do that using loops and stuff? And I've said in the notes here that it's contrived and overly complex: changing selected words in a phrase to different words.
It's not a trivial exercise, but my first attempt at solving it is way, way over the top in terms of complexity. And after I'd written it, it suddenly hit me like a bolt of lightning that it was a really daft way of doing it.
So I wrote a better way and put that in the examples too. I'll skim over this one fairly quickly. This script switches extglob on, as the first point.
It creates a variable called phrase with this wonderful Now is the time business. It creates an empty variable called new phrase. That's going to contain the transformed thing.
Then it declares two arrays. The first one is an associative array, which I call transform. And it consists of a bunch of elements.
Remember, an associative array takes words or character strings as indices. So I'm making elements like one where the key, or the index, is the word good,
and the value relating to that is the word bad. Then we've got men and people, and party and community, as the different keys and elements.
We also declare an indexed array called keys. So what I then try and do is to make a pattern, which is created by collecting together all the keys out of the array
and turning them into a string of keys separated by vertical bars. And that's done by using the thing we talked about earlier on
(last episode, I can't remember): dollar, open curly bracket, exclamation mark, name of array, then in square brackets an at sign or an asterisk, close curly bracket.
What that returns is the list of all the keys, and I put that in the array called keys. Then I make a variable called targets, which is made by adding a vertical bar after each word.
Then I take off the last one, because there was an extra one stuck on, then remove all spaces, because some spaces had crept in.
And then finally I save that in a variable called pattern (I didn't need to have done it this way), which puts that list together after an at sign and inside parentheses.
So we would have a string that consisted of an at sign, open parenthesis, the list of words, good, men, party, each separated by a vertical bar, close parenthesis. That's our pattern; it's an extglob pattern.
So then there's a loop, which goes: for word in dollar phrase, semicolon, space, do. So that's going to take the phrase and then work through each word within it, each space-delimited word.
And then there's a test: if, then an extended test with the double square brackets, dollar word, equals equals, dollar pattern.
That's comparing the word that we just got with the extglob pattern. Then semicolon, then, word equals dollar, open curly bracket, transform, open square bracket, dollar word, close square bracket, close curly bracket.
So in other words, make the variable word equal to whatever the element of the transform array contains for the key of that particular word.
Then the word that comes out of all this, which is either the word as the loop presented it or one that has been transformed, is added to new phrase with a space after it. Then at the end, echo new phrase.
So that actually works. It goes through the phrase, Now is the time for all good men, and you get back: Now is the time for all bad people, blah blah blah.
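What follows is a reconstruction of the first example from the spoken description only; it is not the downloadable script itself. The spelling newphrase and the exact expansions used to build targets are assumptions.

```bash
#!/usr/bin/env bash
shopt -s extglob

phrase="Now is the time for all good men to come to the aid of the party"
newphrase=

# Words to look for (keys) and what to turn them into (values)
declare -A transform=( [good]=bad [men]=people [party]=community )
declare -a keys

# Collect the keys, then build a pattern such as '@(good|men|party)'
keys=( "${!transform[@]}" )
targets="${keys[*]/%/|}"     # add a vertical bar after each key
targets="${targets%'|'}"     # take off the extra bar at the end
targets="${targets// /}"     # remove the spaces between the keys
pattern="@($targets)"

for word in $phrase; do
    # The right-hand side is unquoted so that it is treated as a pattern
    if [[ $word == $pattern ]]; then
        word="${transform[$word]}"
    fi
    newphrase+="$word "
done

echo "$newphrase"
# Now is the time for all bad people to come to the aid of the community
```

The order in which Bash returns the keys of an associative array is not defined, but that does not matter here because the alternatives in the pattern can be in any order.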
But quite a lot of work went into that. And as I say, for example 2, there are other ways of doing this, and it struck me that the loop could simply check if the current word is in the transform array and replace the word if so, or leave it alone if not.
There's no explicit does-this-key-exist-in-this-associative-array feature in Bash. So it's not entirely obvious how you do that until you think about it or, in my case, experiment with it a bit.
So I've made a second example. The first one was called bash 18 ex1, that is ex1.sh; this one is ex2.sh, and it's using a simpler way of transforming the individual words in the text.
And the key to it is that we use the conditional expression we looked at in show 2659, which is hyphen v, space, then the name of a variable.
Remember: the name of a variable, not the contents, the name of it. I think I mentioned it as we were looking at this great long list of conditional things.
And I said that it could be useful, but it wasn't entirely clear; I didn't go into details about how you would use it, as I recall.
So looking at example 2, it's pretty much the same: we declare phrase, we declare new phrase. This time we just declare the associative array transform.
I've added another word to it, so we've got good to be transformed into bad, men become people, aid will become assistance, party will become community.
The idea was that you could just add words to this array as the mood took you, and that would change the behavior of the script and it would perform different transformations.
So again, we've got a loop: for word in dollar phrase, semicolon, space, do. But the test is: if, and then in double square brackets, hyphen v, space, transform (there's no dollar), open square bracket, dollar word, close square bracket, close the double square brackets, semicolon, space, then.
So that's a test to see whether the transform array has an element with a key of whatever word is at this particular time. So in the first case it would be the word Now: is there an element of transform with the key Now?
Well, the answer will be no, because there isn't one, and so that particular assignment would be skipped. But if it does match, then we set the variable word to whatever the value of that particular element is in the transform array, and then that gets saved in the variable new phrase by appending to the end with plus equals.
As before, you simply print that out at the end. So the important aspect of this is that it's possible to test if an element exists in an associative array by using hyphen v.
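Again as a reconstruction from the description rather than the script itself, the second example could look like the sketch below. The spelling newphrase is an assumption, and the hyphen v test on an array element needs a reasonably recent Bash (4.3 or later, as far as I know).

```bash
#!/usr/bin/env bash
phrase="Now is the time for all good men to come to the aid of the party"
newphrase=

declare -A transform=( [good]=bad [men]=people [aid]=assistance [party]=community )

for word in $phrase; do
    # -v takes the NAME of the element, so there is no dollar before transform
    if [[ -v transform[$word] ]]; then
        word="${transform[$word]}"
    fi
    newphrase+="$word "
done

echo "$newphrase"
# Now is the time for all bad people to come to the assistance of the community
```

Adding another key and value to transform is all it takes to change what the loop replaces.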
It's not obvious from the description that this is possible. The description says: hyphen v, varname, true if the shell variable varname is set (has been assigned a value). I copied that into the notes here. That's not entirely clear to me as meaning does this array contain a key of this type.
Well, it doesn't strictly mean that; it just means does it have an element which looks like this, and it will actually return true even if the array element with that particular key is empty,
if you haven't set it to any value. I experimented with this (I didn't put the experiment in the notes), and it will return true in that case, and false if the key doesn't exist in that array.
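The experiment that did not make it into the notes might have looked something like this sketch. The keys empty and missing and the helper function check are invented for the illustration.

```bash
#!/usr/bin/env bash
# One element with a value, and one that exists but holds an empty string
declare -A transform=( [good]=bad [empty]= )

# Hypothetical helper: report whether transform has an element with key $1
check() {
    if [[ -v transform[$1] ]]; then
        echo "$1: set"
    else
        echo "$1: not set"
    fi
}

check good      # good: set
check empty     # empty: set      (the element exists, even though it is empty)
check missing   # missing: not set
```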
So that's really a much, much better way. It's not as clever, but then cleverness is sometimes the downfall of me anyway, trying to be too smart.
So that works, and the other thing is that the hyphen v is followed by the name of the variable, so the name of the array with the subscript amounts to the same thing; it's the name of the element, I guess. But anyway, that's what's acceptable.
So if you run it (and of course I had to run this) you get: Now is the time for all bad people to come to the assistance of the community.
Hope you find that useful, bye now.
You've been listening to Hacker Public Radio at HackerPublicRadio.org.
We are a community podcast network that releases shows every weekday, Monday through Friday.
Today's show, like all our shows, was contributed by an HPR listener like yourself.
If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is.
Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club, and is part of the binary revolution at binrev.com.
If you have comments on today's show, please email the host directly, leave a comment on the website or record a follow-up episode yourself.
Unless otherwise stated, today's show is released under the Creative Commons Attribution-ShareAlike 3.0 license.