Bash snippet - some possibly helpful hints (HPR Show 3551)

Using ‘eval’, ‘mapfile’ and environment variables

Dave Morriss



Overview

I write a moderate number of Bash scripts these days. Bash is not a programming language as such, but it’s quite powerful in what it can do by itself, and with other tools it’s capable of many things.

I have enjoyed writing such scripts for many years on a variety of hardware and operating systems, and Bash is my favourite - partly because Linux itself is so flexible.

This is just a short show describing three things I tend to do in Bash scripts to assist with some tasks I find I need to undertake.

  1. Generate Bash variables from a text file - usually output from a program
  2. Fill Bash arrays with data from a file or other source
  3. Use environment variables to control the Bash script’s execution

Tasks

Generating Bash variables

There’s a Bash command 'eval' that can be used to evaluate a string as a command (or series of commands). The evaluation takes place in the current shell, so anything returned - Bash variables in this case - is available to the current script.

This is different from setting variables in a subshell (child process): such variables are local to the subshell and disappear when it finishes.

The eval command takes a list of arguments which are concatenated into a string and the resulting string is evaluated.
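The difference between eval and a subshell can be seen in a minimal sketch like the following. The variable names (first, second, inner) are hypothetical and chosen only for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical variable names, for illustration only.
unset first second inner

# eval joins its arguments with spaces and runs the resulting string
# ("first=1; second=2") in the current shell, so both variables persist.
eval "first=1;" "second=2"
echo "first=$first second=$second"

# Parentheses start a subshell (a child process); the assignment is lost
# as soon as the subshell exits.
( inner='set in subshell' )
echo "inner is '${inner:-}'"
```

The first echo should show both values, while the second should show an empty string because the subshell's variable never reached the current shell.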

The eval command is seen as potentially dangerous in that it will execute any command it is given. Thus scripts should take precautions that the command (or commands) are predictable. Do not write a Bash script that executes whatever is given to it!

One particular case I use eval for is to set variables from a text file. The file is generated from the HPR show upload process and I want to grab the title, summary and host name so I can generate an index file for any supplementary files uploaded with a show.

The file contains text like:

Host_Name:      Dave Morriss
Title:  Battling with English - part 4
Summary:        Some confusion with English plurals; strange language changes

In my script the file name is in a variable RAWFILE, and I run the following command:

eval "$(sed -n "/^\(Title\|Summary\|Host_Name\):/{s/^\([^:]\+\):\t/\1='/;s/$/'/;p}" "$RAWFILE")"

Taking the sed command by itself, we get:

$ sed -n "/^\(Title\|Summary\|Host_Name\):/{s/^\([^:]\+\):\t/\1='/;s/$/'/;p}" "$RAWFILE"
Host_Name='Dave Morriss'
Title='Battling with English - part 4'
Summary='Some confusion with English plurals; strange language changes'

The sed commands find any line beginning with one of the three keywords and generate output consisting of that keyword, an equals sign and the rest of the line enclosed in single quotes. Thus the matched lines are turned into VAR='string' sequences.

So, eval executes these and sets the relevant variables, which the script can access.

This method is not foolproof. If a string contains a single quote the eval will fail, or may even run part of the line as a command. For the moment I haven’t guarded against this.
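One possible way to guard against the single quote problem is to avoid eval altogether and assign the values with printf -v. This is only a sketch and not the method used in the script described here. The temporary RAWFILE and its contents are stand-ins made up for the example, and the regular expression accepts any whitespace after the colon rather than exactly one tab.

```bash
#!/usr/bin/env bash
# Stand-in for the real upload file (hypothetical contents, tab separated).
RAWFILE=$(mktemp)
printf 'Host_Name:\tDave Morriss\n' > "$RAWFILE"
printf "Title:\tIt's a test\n" >> "$RAWFILE"
printf 'Summary:\tQuotes are harmless here\n' >> "$RAWFILE"
printf 'Other:\tignored\n' >> "$RAWFILE"

# Only the three expected keywords may become variable names.
re='^(Title|Summary|Host_Name):[[:space:]]*(.*)$'

while IFS= read -r line; do
    if [[ $line =~ $re ]]; then
        # printf -v assigns to the named variable; the value is never
        # parsed as shell code, so quotes in it are just characters.
        printf -v "${BASH_REMATCH[1]}" '%s' "${BASH_REMATCH[2]}"
    fi
done < "$RAWFILE"
rm -f "$RAWFILE"

printf '%s\n' "$Host_Name" "$Title" "$Summary"
```

Because the keyword list is fixed in the regular expression, a line in the file cannot create or overwrite any variable other than these three.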

Filling a Bash array

I have a need to fill a Bash indexed array with sorted filenames in a script that deals with pictures sent in with HPR shows.

I want to use the find command to find the pictures, searching for files which end in jpg, JPG, png or PNG. I don’t want to visit sub-directories. The command I want is:

find "$SHOWDIR" -maxdepth 1 -regextype egrep -regex '.*\.(jpg|JPG|png|PNG)'

The variable SHOWDIR contains the directory holding the files uploaded for the show.

Given that I want to sort these files alphabetically, I can populate an array by this method:

declare -a pix
pix=( $(find "$SHOWDIR" -maxdepth 1 -regextype egrep -regex '.*\.(jpg|JPG|png|PNG)' | sort) )

The output from the find and sort commands in the command substitution expression will consist of a number of newline separated lines. Because the substitution is unquoted, Bash splits the result into words at spaces, tabs and newlines (the default IFS), so each filename becomes one word in the array-defining parenthesised list and is placed in the array.
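The weakness of this method can be shown with a small experiment. This is a sketch which assumes GNU find (for -regextype) and a temporary directory whose path contains no whitespace. The two file names are invented for the test.

```bash
#!/usr/bin/env bash
# Scratch directory with two hypothetical picture files.
SHOWDIR=$(mktemp -d)
touch "$SHOWDIR/a.jpg" "$SHOWDIR/my holiday.png"

# Unquoted command substitution: the output is split at spaces as well
# as newlines, so the second file name is broken in two.
pix=( $(find "$SHOWDIR" -maxdepth 1 -regextype egrep -regex '.*\.(jpg|JPG|png|PNG)' | sort) )

echo "files: 2, array elements: ${#pix[@]}"
rm -rf "$SHOWDIR"
```

With two files on disk the array ends up with three elements, because 'my holiday.png' is split at the space.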

However, what if a file name contains a space? It’s bad practice, but it’s permitted, so it might happen.

This is where the mapfile command might help. This was introduced in episode 2739 where its options were described. Typing help mapfile in a terminal will show a similar description.

declare -a pix
mapfile -t pix < \
    <(find "$SHOWDIR" -maxdepth 1 -regextype egrep -regex '.*\.(jpg|JPG|png|PNG)' | sort)

We use a process substitution here, and mapfile reads its output line by line, so the newlines are kept as separators. One of the features of mapfile that is useful in this context is the -t option, which removes the trailing delimiter (a newline by default) from each line read. The delimiter can be changed with the -d DELIM option. The text between delimiters is what is written to the array elements, so as long as there are no filenames with newlines in them this will be better than the previous method.
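The effect of mapfile and its -t option can be tried without any files at all. In this sketch printf stands in for the find and sort pipeline, and the array names (lines, raw) are hypothetical.

```bash
#!/usr/bin/env bash
# printf stands in for 'find ... | sort'; one name contains a space.
mapfile -t lines < <(printf '%s\n' 'a.jpg' 'my holiday.png' 'z.PNG')
echo "elements: ${#lines[@]}"
printf '[%s]\n' "${lines[@]}"

# Without -t each element keeps its trailing newline.
mapfile raw < <(printf '%s\n' 'a.jpg')
echo "length with -t: ${#lines[0]}, without -t: ${#raw[0]}"
```

The name with a space stays in one element, and the length comparison shows the extra newline character that -t strips.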

To be 100% safe the find command should use the -print0 option, which terminates each filename with a null character instead of the default newline, and mapfile should be changed to use this delimiter. We also need to tell sort to use null as a line delimiter, which is done by adding the -z option [1].

declare -a pix
mapfile -d '' -t pix < \
    <(find "$SHOWDIR" -maxdepth 1 -regextype egrep -regex '.*\.(jpg|JPG|png|PNG)' -print0 | sort -z)

What it doesn’t tell you in 'help mapfile' is that an empty string as a delimiter (-d '') causes mapfile to use a null delimiter. There doesn’t seem to be any other way to do this - or nothing I could find anyway. You can read this information in the Bash manpage.
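The null delimiter can be tested with a name containing a newline. In the sketch below printf '%s\0' stands in for find with -print0, the names are invented, and it is assumed that Bash is version 4.4 or later (where mapfile gained the -d option) and that sort is the GNU version.

```bash
#!/usr/bin/env bash
# One hypothetical name contains an embedded newline.
odd=$'two\nlines.png'

# Newline-delimited: the odd name is read as two separate lines.
mapfile -t by_line < <(printf '%s\n' "$odd" 'a.jpg' | sort)

# Null-delimited: printf '%s\0' imitates 'find ... -print0'.
mapfile -d '' -t by_null < <(printf '%s\0' "$odd" 'a.jpg' | sort -z)

echo "newline-delimited elements: ${#by_line[@]}"
echo "null-delimited elements:    ${#by_null[@]}"
```

Two names go in each time, but only the null-delimited version gives back two array elements.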

Having discovered this information while preparing this show I shall certainly update my script to use it!

Turning debugging on

I find I need to add debugging statements to the more complicated scripts I write, and to help with this I usually define a fairly simple function to do it. Here’s what I often use:

_DEBUG () {
    [ "$DEBUG" == 0 ] && return
    for msg in "$@"; do
        printf 'D> %s\n' "$msg"
    done
}

This uses a global variable 'DEBUG' and returns without doing anything if the variable contains zero. If it contains anything else the function prints each of its arguments on a separate line, preceded by 'D> ' to show it is debug output.
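A short sketch of the function in use follows. The function is repeated so that the example runs by itself, and the messages and the variables quiet and loud are made up for the demonstration.

```bash
#!/usr/bin/env bash
_DEBUG () {
    [ "$DEBUG" == 0 ] && return
    for msg in "$@"; do
        printf 'D> %s\n' "$msg"
    done
}

# Debugging off: the function returns at once and prints nothing.
DEBUG=0
quiet=$(_DEBUG "not shown")

# Debugging on: each argument is printed on its own line.
DEBUG=1
loud=$(_DEBUG "count=3" "name=test")
printf '%s\n' "$loud"
```

Capturing the output in variables is only done here to make the two cases easy to compare. In a real script the calls would normally be made directly.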

I add calls to this function throughout my script if I want to check that values are what I expect them to be.

The issue is how to turn debugging mode on and off. There are several ways, from the simplest (least elegant) to the most complicated in terms of coding.

  1. Edit the script to set the DEBUG variable to 1 or zero
  2. Set it through an external variable visible to the script
  3. Add option processing to the script and use an option to enable or disable debug mode

I tend to use choice 3 when I’m already dealing with options, but if not then I use choice 2. This is the one I’ll explain now.

I use Vim as my editor, and in Vim I use a plugin ('BashSupport') with which I can define boilerplate text to be added to scripts. I have configured this to generate various definitions and declarations whenever I create a new Bash script. One of the lines I add to all of my scripts is:

SCRIPT=${0##*/}

This takes the special parameter $0 (the name by which the script was called) and strips off everything up to and including the last '/' character, thus leaving just the name of the script. I talked about these capabilities in show 1648.
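The expansion can be tried on an ordinary variable. In this sketch the variable path and the paths given to it are hypothetical stand-ins for $0.

```bash
#!/usr/bin/env bash
# Hypothetical values standing in for $0.
path='/home/user/bin/testscript'
from_full=${path##*/}      # longest match of '*/' removed from the front

path='./testscript'
from_relative=${path##*/}

echo "$from_full $from_relative"
```

Both forms reduce to the bare script name, which is why the result does not depend on how the script was called.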

I have recently started adding the following lines:

DEBUGVAR="${SCRIPT}_DEBUG"
DEBUG="${!DEBUGVAR:-0}"

This defines a variable 'DEBUGVAR' which contains the name of the script concatenated with '_DEBUG'. Then, assuming the script name is testscript, the 'DEBUG' variable is defined to contain the contents of a variable called 'testscript_DEBUG'. The exclamation mark ('!') in front of the variable name causes Bash to treat the contents of 'DEBUGVAR' as the name of another variable and to use that variable’s value, which is a form of indirection. If the indirected variable is unset or empty a default of zero is used.
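The indirection can be tried in isolation with a sketch like this one, where SCRIPT is set by hand instead of being taken from $0. The last part is an assumption that goes beyond the notes above: a script name containing characters such as '.' or '-' is not a valid variable name, so one possible workaround is to replace those characters with underscores first. The names d_unset, d_set and SAFEVAR are hypothetical.

```bash
#!/usr/bin/env bash
SCRIPT='testscript'              # set by hand; normally ${0##*/}
DEBUGVAR="${SCRIPT}_DEBUG"       # holds the name 'testscript_DEBUG'

unset testscript_DEBUG
d_unset="${!DEBUGVAR:-0}"        # variable missing, so the default is used

testscript_DEBUG=1
d_set="${!DEBUGVAR:-0}"          # variable found, so its value is used

echo "unset gives $d_unset, set gives $d_set"

# Assumption, not part of the original method: make a name such as
# 'my-script.sh' safe by mapping every other character to '_'.
SCRIPT='my-script.sh'
SAFEVAR="${SCRIPT//[^A-Za-z0-9_]/_}_DEBUG"
echo "$SAFEVAR"
```

With the substitution in place a script called my-script.sh would be controlled by a variable called my_script_sh_DEBUG.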

This means that debugging can be turned on by calling the script thus:

testscript_DEBUG=1 ./testscript

A variable assignment placed in front of a command on the command line is added to that command’s environment, so it is visible to the script. It only lasts while the script (or command) is executing.

You could set it as an exported (environment) variable:

export testscript_DEBUG=1
./testscript

and then it would remain set in the current shell after the script had run, and be passed to every other command started from it. I prefer not to do this.
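The difference between the two ways of setting the variable can be checked with the sketch below. A child bash started with bash -c stands in for ./testscript, and the variable name demo_DEBUG is hypothetical.

```bash
#!/usr/bin/env bash
unset demo_DEBUG
# The single-quoted command is run by the child, which reports what it sees.
show='echo "${demo_DEBUG:-unset}"'

# Assignment in front of the command: seen by the child only.
inside_prefix=$(demo_DEBUG=1 bash -c "$show")
after_prefix=${demo_DEBUG:-unset}

# Exported: seen by the child and still set afterwards.
export demo_DEBUG=1
inside_export=$(bash -c "$show")
after_export=${demo_DEBUG:-unset}
unset demo_DEBUG

echo "prefix: child saw $inside_prefix, afterwards $after_prefix"
echo "export: child saw $inside_export, afterwards $after_export"
```

In the first case the current shell never sees the variable, while in the second it stays set until it is explicitly unset.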

I name my debug variables as I do so that there’s less chance of them affecting scripts other than the one I’m currently debugging!

Conclusion

These are just three things I have found myself using in recent Bash scripts, which I hope might prove to be useful to you.

If you have hints like this which you could share, please make an HPR show about them. We are always in need of shows, and at the time of writing (2022-02-26) we are particularly in need!


  1. I don’t think I mentioned the need for sort -z in the audio. However later testing showed that this option is needed to sort the output properly.