Episode: 2719
Title: HPR2719: Bash Tips - 17
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2719/hpr2719.mp3
Transcribed: 2025-10-19 15:30:37
---

This episode of HPR is brought to you by AnHonestHost.com. Get 15% discount on all shared hosting with the offer code HPR15. Better web hosting that's honest and fair at AnHonestHost.com.

Hello everybody, this is Dave Morris. Welcome to Hacker Public Radio. I'm doing another show in the Bash Tips series. This is the 17th in this whole sub-series, and this one is continuing the material about arrays in Bash, which is really rather good, I think. It's nice to have arrays in a scripting language like this.

In the last show we saw the two types of arrays and learned how you can create them in a whole variety of ways, and we saw a little bit about how you can populate them. We also looked at how you can access array elements and entire arrays. So I want to continue on that theme, looking at array access and some hints and tips about doing that, and at how you can then manipulate the contents of arrays using parameter expansion operations, which I'll talk about in a short while.

The first topic is how you can use negative indices with indexed arrays. We only looked at positive numbers, and of course the asterisk and at sign subscripts, in the last episode. It's also possible to use negative numbers, and what they do is index relative to the end of the array. So the index minus one means the last element, minus two the one before that (the penultimate, if you like), and so on.

Rather than going into a lengthy explanation here, I wrote a little demo script to show what you can do with it. Nothing very exciting, but I don't know about you, I quite like to see worked examples, especially if there's a bit of an explanation about what they're doing. So I thought I would add that one here, and it's downloadable if you want to grab it and play with it.
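As a quick illustration of the negative indices just described, here is a minimal sketch. The array name nums and its contents are made up for this example and are not taken from the show's demo script, which is described next. It assumes a reasonably recent Bash (4.3 or later should be safe), since older versions reject negative subscripts.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: 'nums' is a hypothetical array, not the
# show's demo script. Assumes Bash 4.3+ for negative subscripts.

declare -a nums=(10 20 30 40 50)

last=${nums[-1]}          # last element
penultimate=${nums[-2]}   # one before the last

echo "last:        $last"
echo "penultimate: $penultimate"

# Walk backwards over the final three elements.
for ((i = -1; i >= -3; i--)); do
    echo "nums[$i] = ${nums[i]}"
done
```

Inside the square brackets the subscript is evaluated arithmetically, which is why a plain i works there without a dollar sign.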
It's called bash17_ex1.sh, and what it does is fill an indexed array with numbers. Just for a bit of fun, I thought I would use the Fibonacci sequence to populate it. That's actually quite a simple sequence, and as a one-time biologist it's something that we had drummed into us, because it turns up a lot in the world of nature: things like plant growth, the spiral of a snail shell, and so on.

Anyway, the script starts by declaring and then seeding an indexed array, which I called fib. It puts in there the numbers 0, 1 and 1, which is the start of the Fibonacci series. Then there's a for loop which starts with an index of 3 and goes up to an index of 20, and in each iteration it creates a new element by adding the previous two.

Now, the addition is being done in an arithmetic expansion, which we've covered many, many times. It's possible, in fact it's recommended, that if you refer to variables in such an expression you omit the dollar signs, and you don't need the curly brackets around the array element references either. So it actually looks a lot cleaner. I wish Bash as a whole was able to use that format.

Having populated the array, the script simply prints it out with the message "Fibonacci series" and then the whole contents of the array, which is listed in the notes. The final thing the script does is to run a for loop which starts from minus 1 and goes to minus 4, and for each of these numbers it prints out the relevant indexed element of the array fib. So you'll see that it's just the last four elements off the end of the array, going backwards down the array.

The second thing I wanted to talk about is how you take two arrays and concatenate them. It's possible to concatenate arrays of both types, though my example here only uses an indexed array.
Looking at it now, I think maybe I should have done an example using an associative array, but I didn't. So maybe I'll leave that as an exercise for the listener. There's a syntax that you have to use to do this, which is maybe not immediately obvious.

As a generic example, I've got an array which I've called array1, and I'm setting it with one of these compound assignments, which is an equals sign followed by an open parenthesis, some stuff, and a close parenthesis. In the parentheses I've got, in double quotes, dollar, open curly bracket, array2, then an at sign in square brackets, close curly bracket. That's followed by another one in quotes for array3. It's using the at sign because that returns the entirety of the array as individual items, individual words as the terminology has it. So what that does is effectively fill the parentheses with the contents of array2 and then the contents of array3 in sequence, and that's what's actually placed in array1.

You can also append one array to another in a similar sort of way: array1, plus equals, open parenthesis, and then one of those sorts of expressions in double quotes would do it.

Again, I made an example to demonstrate this, which is bash17_ex2.sh. In it I declare three indexed arrays, which I called a1, a2 and a3. I'm going to put random numbers in these, and in order to do that I'm using a Bash feature which is really a variable, though you could regard it as a function, I suppose. It's a variable called RANDOM, in capital letters, and it's a pseudo-random number generator. In order to seed it so that you get different sequences each time, you need to set it to a different number each time. What I've done is to set it to the fractional-second part of the current date and time, which is given in nanoseconds.
So that's date, plus, percent, capital N. You'll see that in the script. Then there's a loop which simply goes around 10 times, setting a variable i from 1 to 10. The two statements in the for loop are setting elements of a1 and a2 to the result of an expression. This is an arithmetic expansion, with the dollar and two sets of parentheses, and in it we've got RANDOM percent 100. That's the variable RANDOM, which you don't put a dollar in front of because it's inside an arithmetic expression. That part is in parentheses, and then we add one to the result.

So what we're doing is asking for a random number, and the numbers come back between 0 and 32767. That's what RANDOM does. We then use the mod operation, which is percent, which we talked about in an earlier show, and that forces it to be a number between 0 and 99. Since we don't actually want 0, in this case anyway, I have added one to it, giving 1 to 100. I've put the modulo part in parentheses before the plus 1. I'm not sure that's necessary, since the priority of the modulo operator is higher than that of the plus. Anyway, it looks cleaner. So that will just fill these two arrays with a bunch of random numbers.

Then I simply echo them: a1, colon, and then the contents of a1. The way I do it is dollar, open curly bracket, a1, open square bracket, asterisk, close square bracket, close curly bracket, close quote, and the same for a2. The reason I use the asterisk version is because that's the one that concatenates all of the individual elements of the array into a single string. So when you're substituting it into a string, that's going to be the best way to work. Otherwise you get individual words, which I think can cause problems. I'm not quite clear what the problems are, to be honest with you. I get nagged by the ShellCheck tool that I use to check my scripts, and it says don't do that. So, to be honest, I haven't really gone to find out why.
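The random fill, the concatenation and append syntax, and the difference between the asterisk and at sign subscripts can be sketched as follows. This is not the show's bash17_ex2.sh: the array a4, the seed fallback and the count_args helper are assumptions added for illustration, and the seeding line assumes GNU date, where %N prints nanoseconds.

```bash
#!/usr/bin/env bash
# Illustrative sketch only, not the show's bash17_ex2.sh.

declare -a a1 a2 a3

# Seed RANDOM from the sub-second part of the clock.
# Assumption: GNU date, where %N prints nanoseconds; fall back to the
# shell's PID if %N is not supported.
seed=$(date +%N)
[[ $seed =~ ^[0-9]+$ ]] || seed=$$
RANDOM=$(( 10#$seed % 32768 ))   # 10# avoids octal trouble with leading zeros

# Fill a1 and a2 with ten numbers each in the range 1..100.
for ((i = 1; i <= 10; i++)); do
    a1+=( $(( (RANDOM % 100) + 1 )) )
    a2+=( $(( (RANDOM % 100) + 1 )) )
done

# Concatenate: each "${name[@]}" expands to one word per element.
a3=( "${a1[@]}" "${a2[@]}" )

# Append: a4 (hypothetical) starts as a copy of a1, then a2 is added.
a4=( "${a1[@]}" )
a4+=( "${a2[@]}" )

# Hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

echo "a1: ${a1[*]}"
echo "a2: ${a2[*]}"
echo "a3: ${a3[*]}"

star_words=$(count_args "a1: ${a1[*]}")   # [*] joins into a single word
at_words=$(count_args "a1: ${a1[@]}")     # [@] yields one word per element
echo "words with [*]: $star_words, words with [@]: $at_words"
```

The last two lines hint at what the checker is likely objecting to: with the at sign inside a larger quoted string, the label is glued onto the first element and the remaining elements arrive as separate arguments, whereas the asterisk form stays a single string.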
The next step is to take these two arrays, which are populated with stuff, and concatenate them together into the array a3. So we've got that sort of expression, which is a3, equals, open parenthesis, then in double quotes: dollar, open curly bracket, a1, open square bracket, at sign, close square bracket, close curly bracket, close quotes. That will result in all of the 10 elements of a1 being listed as 10 different words. Then the same for a2, and we simply echo it out to show what happened. When you look at the output which follows this in the notes, we see a1 gets some numbers, a2 gets some numbers, and a3 gets a1's numbers and a2's numbers, just to prove that that's the way it works. Not earth-shatteringly interesting, but maybe it'll be useful as something to refer to in the future.

Now I want to talk about parameter expansion. Back in episode 1648, which was way back in 2014 and was the first of these Bash tips (so it wasn't actually called Bash Tips), I described most of the Bash parameter expansion operations that are available. I did mention their use with arrays in passing, but I didn't go into great detail. I want to visit them again now, the ones which are specifically useful in the context of arrays, and talk a little bit more about what they can do. That's going to result in a moderate number of things to look at, so this show is going to be followed by another one which will tie the whole thing up, I think.

So let's look first at what was called substring expansion. I think I gave them these names; they don't really have proper names in the documentation, which is slightly frustrating. Anyway, the substring expansion can perform two different functions. It can select substrings from strings, so pieces of a string from wherever inside it, but it can also subset an array by picking out individual elements, or multiple elements at a time.
So the general syntax is dollar, open curly bracket, the name of the parameter, a colon and an offset; or the offset can be followed by another colon and a length, both numbers of course. As the notes say, the offset and the length are arithmetic expressions, and they can be negative in some cases, which means to count backwards from the end of the string, or indeed from the end of an indexed array. With a negative offset you have to put a space after the colon, because if you didn't, Bash would confuse it with another of the expansions that it offers. So it's colon, space, minus, number. That's a slightly unpleasant thing, but that's the way it is. You can only use a negative length when you're working with strings, not with arrays, and if you don't have a length at all then what's meant is that everything from the offset to the end of the string or the array is to be returned. So that's a quick summary of what these things do.

Let's look at a few examples. What I've got here isn't a downloadable example this time, it's just one in the text, in which I work on individual array elements as strings. I declare an indexed array which I'm calling planets, and it's being set to the planets Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus and Neptune. Then the expression is echo, in double quotes, dollar, open curly bracket, planets, square brackets 4 (so that's the element with index 4, which is Jupiter), then colon 2, colon 3, close curly bracket, close double quotes. That will pick out the middle letters of the word Jupiter. It's starting at letter 2, and these are zero-relative, they begin with zero. So J is 0, u is 1, p is 2, and then the 3 means take three letters. So the answer comes back "pit". We do the same with planets 5, which is Saturn, and this time we want to use a negative offset, so after the array element spec that has to be colon, space, minus 3, colon 2.
So what that means is take the last three letters of Saturn, which is "urn", and return the first two letters of that, so you get back "ur". The last one, just to hammer this point home, is the same as the previous one except there's no colon 2 on the end, so that just returns the last three letters of Saturn.

If you use this with the entirety of an indexed array, using a subscript of the at sign or an asterisk, then it will extract individual array elements or numbers of array elements. So if we echo planets with an at sign in the square brackets, colon 1, colon 3, that means go to element 1 and return three elements from there, so we get back Venus, Earth and Mars. If we do the same thing but use the offset of minus 3, which has to have a space in front of it as you remember, and a length of 2, then minus 3 counts backwards from the end of the array to Saturn and displays two elements from there, so we get back Saturn and Uranus. This is just the same arguments as the earlier string example, except applied to an array. And then if we simply put an offset of minus 3 and no length, we get back Saturn, Uranus and Neptune, the last three planets in the list. You can't use negative lengths when you're dealing with arrays, but I didn't put in a negative length example. Sorry about that.
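The substring expansion examples read out above can be written down as the following sketch. The expressions follow the spoken description; the variable names s1 to s4 and e1 to e3 are made up here, and the negative-length line is an extra illustration that is not from the show (it assumes Bash 4.2 or later).

```bash
#!/usr/bin/env bash
# Sketch of the substring expansion examples; result variable names
# are hypothetical.

planets=(Mercury Venus Earth Mars Jupiter Saturn Uranus Neptune)

# On a single element (a string): offset and length count characters.
s1=${planets[4]:2:3}      # pit
s2=${planets[5]: -3:2}    # ur   (note the space before the minus)
s3=${planets[5]: -3}      # urn

# Negative length on a string: stop one character before the end.
# Not from the show; assumes Bash 4.2+.
s4=${planets[4]:1:-1}     # upite

# On the whole array: offset and length count elements.
e1=( "${planets[@]:1:3}" )     # Venus Earth Mars
e2=( "${planets[@]: -3:2}" )   # Saturn Uranus
e3=( "${planets[@]: -3}" )     # Saturn Uranus Neptune

printf '%s\n' "$s1" "$s2" "$s3" "$s4"
printf '%s\n' "${e1[*]}" "${e2[*]}" "${e3[*]}"
```

Without the space, a colon followed directly by a minus sign would be read as the "use a default value" expansion, which is why the space is required.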
Now, one of the questions that popped into my mind was: could you do this stuff with associative arrays? I mean, what is the third element, or whatever, of an associative array? Because they're defined as not having an order to the elements. Maybe they do have an order on the particular machine that you're using, or on the particular operating system, or something like that, but between systems they might not be the same at all, or they might be different on separate days. I don't know what factors determine the order. But I went off experimenting with this to see whether I could actually do it, and I created two examples, which are called bash17_ex3.sh and bash17_ex4.sh, where I actually did this, and it works. What I've done, though, is mark off this section of the notes and put a little note at the front saying you might want to skip this, since it's a non-documented feature which you should not use in production. So I'm going to skim over it fairly quickly, because perhaps I shouldn't have even gone there, but it seemed interesting to me. I'm not sure how you would ever use this in a reliable and sensible way, but maybe the scripts are interesting in their own right. So I'll zoom through the first one and leave the other one for you to look at if you're interested.

This is another case of making a couple of indexed arrays. This time I put 10 letters into each, and I do it by a command substitution inside the parentheses, which consists of an echo followed by one of those curly bracket brace expansions that we looked at way back in this series. The first one returns the letters a to j and the second one k to t, so you've got two arrays which contain those series of 10 letters. Then we declare an associative array, which I've called hash, and in it I place the elements of the two indexed arrays. The first indexed array, a1, is used as the keys to the associative array and the second one is used as the values within it. So you'll see a rather complicated
looking assignment where they're being set. I won't read it out letter by letter. I then run a loop which goes through the hash, that is the associative array, and prints out what's in each element, just so you have some sort of confirmation of what it is. It's a loop which prints out for each element the name of the array and the subscript that we're using, the key, and then the contents. Hopefully that's a useful way of being able to view it. In order to do this I use an expression which consists of, in double quotes, dollar, open curly bracket, exclamation mark, hash, open square bracket, at sign, close square bracket, close curly bracket, close quotes. This is something I'm going to talk about in a few minutes. What it does is return a list of the subscripts for the associative array. It's being used as the list in a for loop, and a variable key is being set to each element. You get back a list of words, which is the thing that for loops like, unless they're the numeric type.

Finally, the script prints out the values from the hash. What I've done here is I've got a variable i which is set to 1 in the for loop and goes up to 10, but in steps of 2. I first of all echo the contents of i, and then I echo the contents of hash using the asterisk subscript, and the offset is dollar i, so it'll start with 1, and I want it to give me back two elements. When this is run, for i equal to 1 you get back k and l. So it's actually printing out the first two, well, how can you say "first two" in a thing that's not in any specific order, except that it apparently is? But you get back k and l for 1, m and n for 3, o and p for 5, etc. I'm not quite sure why that is so, but I think, other than as a curiosity, it's something that we should probably walk away from. So, moving on from
that, let's just talk briefly about the thing that lists the keys of an associative array, also called indices or subscripts depending on the context, by me and various other people. It consists of dollar, open curly bracket, exclamation mark, the name of the array, open square bracket, an at sign, close square bracket, close curly bracket, or you can use an asterisk in place of the at sign. It expands to the list of array indices or keys which are part of that array. When you use an at sign and it's in double quotes, you get a list of separate words; with the asterisk it's just a single string containing all of the keys. I used this, as we saw, in bash17_ex3.sh, and for the record in ex4 as well. I didn't really talk about ex4, but that contains a slightly more complex version of ex3, using randomly chosen words instead of sequential letters, just in case there was something about the sequential letters that made it work in ex3. But it works as well in ex4. So that's really all there is to say about the list of keys. It's a very useful feature.

Next we're looking at the length of a string or an array. We saw this in show 1648 before, and there's probably not a huge lot to say about it. The main thing is that you use dollar, open curly bracket, then a hash mark and the name of the variable. If it's just a plain variable then you close the curly bracket at that point, but if it's an array then you put the name of your array followed by either an at sign or an asterisk in square brackets. When it's a simple variable, you get the length of the string. So I've got a simple example where I set a variable, which I called veggie, to the string "kohlrabi", echo one of those expressions, and the answer comes back 8: it's eight characters long. When you use an array with an index of an asterisk or an at sign, it returns the number of elements in the array, and there's an example here where
it sets an array called veggies to three vegetables, celeriac, artichoke and asparagus (I was obviously cooking around the time I was thinking of this), and it comes back with the answer 3; that's how many there are. I also did the thing where I used the name of the array on its own instead of an indexed version. Remember, we looked at this in the last episode: Bash interprets that as having an index of zero. So when we ask for the contents by name we get back the first element, celeriac, from the array, and when we ask for its length we get back the length of that particular string, which is 8. That's weird, but I thought again that you should know about it, so you don't do it by mistake and wonder how you got an answer when you would have expected it to blow up.

The last of these parameter substitution thingies is really two bundled together here: it's where you can remove leading or trailing parts of a string, or indeed, well, we'll come on to the array context in a minute. You can do that on a simple variable, let's say. The syntax is dollar, open curly bracket, the name of the variable (or indeed an array, but we'll come to that), followed by a hash mark and what's noted here as just "word". The word is a glob pattern, or an extglob pattern if you've got extglob enabled, and that's used to perform the operation on the variable. It's hard to talk about this in a generic way, because we really need to drill down, but the other version of it uses two hash marks, and we'll talk about why and what the relevance of this is. So the hash version removes leading characters that match the glob pattern, the word. Then we come on to what's effectively equivalent, but it uses a percent sign instead, and this is for trimming things off the end that match the glob or extglob pattern. The single percent or the single hash means do the shortest match; the double hash and
the double percent mean the longest match. Once we get on to some examples, hopefully you will see how this is done. I've written a script, ex5 in this particular series, which demonstrates this. It just builds an array and fiddles around with it in various ways to give you some idea of the results. I obviously had root vegetables on my mind when I was doing this, so I declare an indexed array called vegs, which is set to a list of root vegetables which I will read for you: celeriac, artichoke, asparagus, parsnip, mangelwurzel, daikon and turnip. The script prints out the contents of this array so you've got something to refer to.

The first trimming exercise is removing the first character. That consists of an expression, in double quotes, which is dollar, open curly bracket, vegs, open square bracket, at sign, close square bracket (so that's the whole array), then hash, question mark. That pattern will be applied to each of the elements of the array. The question mark means any character, so the "any character" will be the first one of each word, and it will just remove that. When we run it we get back "eleriac", "rtichoke" and so on. I won't read the rest because it sounds silly.

In the second example we're removing characters up to and including the first vowel. How do we do that? Well, in double quotes again, use dollar, open curly bracket, vegs (I don't know why I don't choose names you can actually pronounce), then in square brackets an at sign, then hash, asterisk, and then in square brackets a, e, i, o, u, close curly bracket, close quotes. You will recognize that as a glob pattern which says match anything you like up to and including a vowel. If you're not looking at the script itself just now, note that for each example the output prints out a description of what it's doing, but it doesn't print out the expression,
which is why I'm reading them out to you. For the first one it takes celeriac, which is spelled c-e-l and so on, and it removes the c and the e, because the e is a vowel, and it prints "leriac". Then artichoke begins with an a, so it prints "rtichoke"; asparagus begins with an a, so it prints "sparagus", and so on and so forth.

In the third one we are removing characters up to and including the last vowel. In this case the expression is the same as before except that we've got two hashes in front of the pattern, so it will keep going until it has found the longest match. In this case it reduces celeriac to just the last c, it reduces artichoke to nothing at all, and it reduces asparagus to an s, the last s. I used printf here, and it puts square brackets around each thing that it prints. That was just so you could see that an empty string was returned when the whole string was removed.

The fourth example is using extglob, and I set it on explicitly in the script. Remember we did this some while back: shopt, space, hyphen s, space, extglob. This time we are echoing the array, using an at sign as the index, with a single hash sign after it, and then the pattern we're using is an at sign followed by, in parentheses, c-e-l-e, vertical bar, a-r-t-i, vertical bar, a-s-p-a, vertical bar, m-a-n-g-e-l, close parenthesis, then close curly bracket, close quotes. You'll remember that the at sign with a parenthesized list after it matches any one of the listed alternatives, here a list of prefixes that you want to match in a list of words or filenames or whatever. We've talked about all this stuff in the recent past. So what this does is remove each of these prefixes from the relevant word. It takes celeriac, in which it matches c-e-l-e, which is removed, and you end up with r-i-a-c; artichoke gets "arti" removed and turns into "choke", and so on. But it doesn't match some of the words, which are just left alone: parsnip is not touched, and daikon and turnip don't get touched.

OK, the fifth example is like the first one, actually, but this time we're using the
percent sign: we're removing the last character. So we've got, in the curly brackets after the dollar, v-e-g-s, open square bracket, at sign, close square bracket, percent, question mark. That matches the last character, whatever it is, of each word, so in example five celeriac turns into "celeria", and so on.

In the sixth example we remove from the last vowel to the end. The expression here is the name of the array with an at sign, then percent, then in square brackets a, e, i, o, u, the vowels, and an asterisk. So we're looking for the last vowel, because we're working from the end of the string, and matching from there to the end. When you look at example six in the printout, celeriac is reduced to "celeri", so the "ac" on the end is removed; artichoke has the final e removed; and asparagus loses the "us" on the end. So it's the last vowel plus anything that follows it to the right.

In the seventh I went for removing from the first vowel to the end (that's where I got confused). This one uses the name of the array with an at sign, a double percent, and then in square brackets a, e, i, o, u, close square bracket, asterisk. That one will match from the first vowel that it finds in the word, because it's looking backwards up the word and carrying on until it has the longest possible match. So the match starts at the first vowel, and it strips out that vowel and everything after it. Celeriac is left with just the first c, and artichoke and asparagus begin with a vowel, so there's nothing left, and so on and so forth.

The last example, eight, uses the same sort of extglob pattern to remove several different trailing patterns. I just put in some trailing parts of these words: i-a-c, o-k-e, g-u-s, n-i-p and z-e-l. So celeriac becomes "celer"; artichoke, because we're matching "oke", gets turned into "artich"; asparagus loses its "gus"; and parsnip gets turned into "pars", as there's a "nip" in the list. So that's a bit tedious, but hopefully it gives you some idea of what you can do with this, if you wish.
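The eight trimming examples walked through above can be sketched like this. It is a reconstruction from the spoken description rather than the show's own ex5 script: the array contents and the two extglob patterns are taken from what is read out, while the result arrays t1 to t8 and the show helper are hypothetical names added here.

```bash
#!/usr/bin/env bash
# Reconstruction from the spoken description, not the show's script.
shopt -s extglob   # needed for the @(...) patterns below

vegs=(celeriac artichoke asparagus parsnip mangelwurzel daikon turnip)

# Hypothetical helper: print a label, then each element in brackets,
# so that empty results remain visible.
show() {
    local label=$1; shift
    printf '%-34s' "$label"
    printf '[%s] ' "$@"
    echo
}

t1=( "${vegs[@]#?}" )                          # first character
t2=( "${vegs[@]#*[aeiou]}" )                   # up to first vowel (shortest)
t3=( "${vegs[@]##*[aeiou]}" )                  # up to last vowel (longest)
t4=( "${vegs[@]#@(cele|arti|aspa|mangel)}" )   # any one listed prefix
t5=( "${vegs[@]%?}" )                          # last character
t6=( "${vegs[@]%[aeiou]*}" )                   # last vowel to end (shortest)
t7=( "${vegs[@]%%[aeiou]*}" )                  # first vowel to end (longest)
t8=( "${vegs[@]%@(iac|oke|gus|nip|zel)}" )     # any one listed suffix

show "1 remove first character:"      "${t1[@]}"
show "2 up to and incl. first vowel:" "${t2[@]}"
show "3 up to and incl. last vowel:"  "${t3[@]}"
show "4 listed prefixes:"             "${t4[@]}"
show "5 remove last character:"       "${t5[@]}"
show "6 last vowel to end:"           "${t6[@]}"
show "7 first vowel to end:"          "${t7[@]}"
show "8 listed suffixes:"             "${t8[@]}"
```

Quoting the expansions when building t1 to t8 matters here: it keeps the empty strings produced in examples three and seven as real (empty) elements instead of letting them vanish.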
That is, if you happen to have an array and you want to do stuff to it. That's really all there is to say. So that's it; I think we've had enough of this just now. There are a few more of these, not very many left: there are one or two more of these parameter substitution things to do in the next episode, and then I want to go on to some of the commands you can use to do cool stuff with arrays, and that'll be that. Thanks very much for listening. I hope you found it useful. Bye bye.

You've been listening to Hacker Public Radio at HackerPublicRadio.org. We are a community podcast network that releases shows every weekday, Monday through Friday. Today's show, like all our shows, was contributed by an HPR listener like yourself. If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is. Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club, and is part of the binary revolution at binrev.com. If you have comments on today's show, please email the host directly, leave a comment on the website or record a follow-up episode yourself. Unless otherwise stated, today's show is released under a Creative Commons Attribution-ShareAlike 3.0 license.