<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="generator" content="pandoc">
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<meta name="author" content="Dave Morriss">
<title>More supplementary Bash tips (HPR Show 2293)</title>
<style type="text/css">code{white-space: pre;}</style>
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
<link rel="stylesheet" href="http://hackerpublicradio.org/css/hpr.css">
</head>
<body id="home">
<div id="container" class="shadow">
<header>
<h1 class="title">More supplementary Bash tips (HPR Show 2293)</h1>
<h2 class="subtitle">Pathname expansion; part 2 of 2</h2>
<h2 class="author">Dave Morriss</h2>
<hr/>
</header>
<main id="maincontent">
<article>
<header>
<h1>Table of Contents</h1>
<nav id="TOC">
<ul>
<li><a href="#expansion">Expansion</a><ul>
<li><a href="#pathname-expansion---continued">Pathname expansion - continued</a></li>
<li><a href="#notes">Notes</a></li>
<li><a href="#examples">Examples</a><ul>
<li><a href="#example-1---match-zero-or-one-occurrence">Example 1 - “match zero or one occurrence”</a></li>
<li><a href="#example-2---match-zero-or-more-occurrences">Example 2 - “match zero or more occurrences”</a></li>
<li><a href="#example-3---match-one-or-more-occurrences">Example 3 - “match one or more occurrences”</a></li>
<li><a href="#example-4---match-one-of-the-given-patterns">Example 4 - “match one of the given patterns”</a></li>
<li><a href="#example-5---match-anything-but">Example 5 - “match anything but”</a></li>
<li><a href="#example-6---use-of-patterns-elsewhere">Example 6 - use of patterns elsewhere</a></li>
</ul></li>
</ul></li>
<li><a href="#conclusion">Conclusion</a></li>
<li><a href="#links">Links</a></li>
<li><a href="#manual-page-extracts">Manual Page Extracts</a><ul>
<li><a href="#expansion-1">EXPANSION</a><ul>
<li><a href="#brace-expansion">Brace Expansion</a></li>
<li><a href="#tilde-expansion">Tilde Expansion</a></li>
<li><a href="#parameter-expansion">Parameter Expansion</a></li>
<li><a href="#command-substitution">Command Substitution</a></li>
<li><a href="#arithmetic-expansion">Arithmetic Expansion</a></li>
<li><a href="#process-substitution">Process Substitution</a></li>
<li><a href="#word-splitting">Word Splitting</a></li>
<li><a href="#pathname-expansion">Pathname Expansion</a><ul>
<li><a href="#pattern-matching">Pattern Matching</a></li>
</ul></li>
</ul></li>
</ul></li>
</ul>
</nav>
</header>
<h2 id="expansion">Expansion</h2>
<p>As we saw in the last episode <a href="http://hackerpublicradio.org/eps/hpr2278" title="Some supplementary Bash tips">2278</a> (and others in this sub-series) there are eight types of expansion applied to the command line in the following order:</p>
<ul>
<li>Brace expansion (we looked at this subject in episode <a href="http://hackerpublicradio.org/eps/hpr1884" title="Some more Bash tips">1884</a>)</li>
<li>Tilde expansion (seen in episode <a href="http://hackerpublicradio.org/eps/hpr1903" title="Some further Bash tips">1903</a>)</li>
<li>Parameter and variable expansion (this was covered in episode <a href="http://hackerpublicradio.org/eps/hpr1648" title="Bash parameter manipulation">1648</a>)</li>
<li>Command substitution (seen in episode <a href="http://hackerpublicradio.org/eps/hpr1903" title="Some further Bash tips">1903</a>)</li>
<li>Arithmetic expansion (seen in episode <a href="http://hackerpublicradio.org/eps/hpr1951" title="Some additional Bash tips">1951</a>)</li>
<li>Process substitution (seen in episode <a href="http://hackerpublicradio.org/eps/hpr2045" title="Some other Bash tips">2045</a>)</li>
<li>Word splitting (seen in episode <a href="http://hackerpublicradio.org/eps/hpr2045" title="Some other Bash tips">2045</a>)</li>
<li>Pathname expansion (the previous episode <a href="http://hackerpublicradio.org/eps/hpr2278" title="Some supplementary Bash tips">2278</a> and this one)</li>
</ul>
<p>This is the last topic in the (sub-) series about expansion in Bash.</p>
<p>In this episode we will look at extended pattern matching, which is also defined in the “<a href="#manual-page-extracts">Manual Page Extracts</a>” section at the end of the long notes.</p>
<h3 id="pathname-expansion---continued">Pathname expansion - continued</h3>
<p>As we saw in the last episode (<a href="http://hackerpublicradio.org/eps/hpr2278" title="Some supplementary Bash tips">2278</a>), if we enable the option <code>extglob</code> using the <code>shopt</code> command we enable a number of additional <em>extended pattern matching</em> features<a href="#fn1" class="footnoteRef" id="fnref1"><sup>1</sup></a>.</p>
<p>In the following description, a <em>pattern-list</em> is a list of one or more patterns separated by a <b>|</b>. Composite patterns may be formed using one or more of the following sub-patterns:</p>
<dl>
<dt><b>?(</b><em>pattern-list</em><b>)</b></dt>
<dd><p>Matches zero or one occurrence of the given patterns</p>
</dd>
<dt><b>*(</b><em>pattern-list</em><b>)</b></dt>
<dd><p>Matches zero or more occurrences of the given patterns</p>
</dd>
<dt><b>+(</b><em>pattern-list</em><b>)</b></dt>
<dd><p>Matches one or more occurrences of the given patterns</p>
</dd>
<dt><b>@(</b><em>pattern-list</em><b>)</b></dt>
<dd><p>Matches one of the given patterns</p>
</dd>
<dt><b>!(</b><em>pattern-list</em><b>)</b></dt>
<dd><p>Matches anything except one of the given patterns</p>
</dd>
</dl>
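<p>As a rough illustration of the five operators, here is a minimal sketch that tests a few sample strings against each one using <code>[[ ... ]]</code> instead of real files. The function name <code>matching</code> and the sample strings are my own choices for the demonstration, not anything defined by Bash.</p>

```shell
#!/usr/bin/env bash
# Sketch: try each extended pattern operator against a few sample strings.
shopt -s extglob    # enable extended patterns before they are used

# Print (space-separated) the sample strings that match the pattern in $1.
# The function name and the sample strings are illustrative only.
matching() {
    local pattern=$1 s out=()
    for s in ac abc abbc axc; do
        # An unquoted right-hand side of == is treated as a pattern
        if [[ $s == $pattern ]]; then
            out+=("$s")
        fi
    done
    echo "${out[*]}"
}

matching 'a?(b)c'     # zero or one b:         ac abc
matching 'a*(b)c'     # zero or more b:        ac abc abbc
matching 'a+(b)c'     # one or more b:         abc abbc
matching 'a@(b|x)c'   # exactly one of b or x: abc axc
matching 'a!(b)c'     # anything but one b:    ac abbc axc
```

<p>Each pattern has to match the whole string, which is the same rule that applies when the patterns are matched against filenames.</p>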
<h3 id="notes">Notes</h3>
<ol>
<li>This is a fairly new feature</li>
<li>It does not seem to be very well documented</li>
<li>There are some similarities to regular expressions</li>
</ol>
<p><b>Warning!</b> The Bash manpage does not say so explicitly, but each of these patterns is matched against the whole <b>filename</b>. So the pattern:</p>
<pre><code>a?(b)c</code></pre>
<p>matches a file which begins with <code>a</code>, is followed by zero or one instance of letter <code>b</code> and ends with <code>c</code>. This means it can match only the filenames <code>abc</code> and <code>ac</code>. This is explained more completely below.</p>
<p>Some of the confusion this can cause can be seen in the Stack Exchange questions listed in the <a href="#links">Links</a> section below.</p>
<h3 id="examples">Examples</h3>
<p>It turns out that the 33,800 files generated in the last episode are not particularly useful when demonstrating how this feature works. Unfortunately, I had not investigated extended glob patterns when I created them.</p>
<p>Although these files will be used for these examples we will create some more directories and files of a simpler structure, and will turn on <code>extglob</code> (assuming it is not on by default - see the footnote):</p>
<pre><code>$ cd Pathname_expansion
$ mkdir test
$ touch test/{abbc,abc,ac,axc}
$ touch test/{x,xx,xxx}.dat
$ ls -1 test/
abbc
abc
ac
axc
x.dat
xx.dat
xxx.dat
$ shopt -s extglob</code></pre>
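<p>If you are not sure whether <code>extglob</code> is already on, something like the following sketch can check first. The variable name <code>extglob_was_set</code> is just a name chosen for this illustration.</p>

```shell
#!/usr/bin/env bash
# Sketch: test whether extglob is enabled, and turn it on if it is not.

# 'shopt -q NAME' prints nothing; its exit status is 0 when the option is set.
if shopt -q extglob; then
    extglob_was_set=yes
else
    extglob_was_set=no
    shopt -s extglob
fi

echo "extglob initially set: $extglob_was_set"
shopt extglob    # prints the option name and its current state
```

<p>In a script it is usually simplest to run <code>shopt -s extglob</code> near the top, before any line that contains an extended pattern.</p>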
<p>(Some examples here are derived from the Stack Exchange articles mentioned earlier and listed in the <a href="#links">Links</a> section.)</p>
<h4 id="example-1---match-zero-or-one-occurrence">Example 1 - “match zero or one occurrence”</h4>
<p><b>?(</b><em>pattern-list</em><b>)</b></p>
<p>In the first demonstration we are asking for zero or one occurrence of <code>b</code> between the <code>a</code> and <code>c</code>. We get the files <code>abc</code> and <code>ac</code> because they match the zero and one cases.</p>
<pre><code>$ echo test/a?(b)c
test/abc test/ac</code></pre>
<p>Next we have asked for zero or one letter <code>b</code> or letter <code>x</code> in the centre, so in this case we also see <code>axc</code>.</p>
<pre><code>$ echo test/a?(b|x)c
test/abc test/ac test/axc</code></pre>
<p>Note that the <em>pattern list</em> has become a little more complex, since we have an alternative character.</p>
<p>Now we will move to a more complex example using the large collection of test files.</p>
<p>Here we are searching through the directories that start with a vowel for all files that have a or b as the second letter and 01, 10 or 11 as the next two digits, <b>or</b> files whose second letter is a or b followed by the digits 50:</p>
<pre><code>$ ls -w 50 -x [aeiou]/?(?[ab][01][01]*|?[ab]50*)
a/aa01.txt a/aa10.txt a/aa11.txt a/aa50.txt
a/ab01.txt a/ab10.txt a/ab11.txt a/ab50.txt
e/ea01.txt e/ea10.txt e/ea11.txt e/ea50.txt
e/eb01.txt e/eb10.txt e/eb11.txt e/eb50.txt
i/ia01.txt i/ia10.txt i/ia11.txt i/ia50.txt
i/ib01.txt i/ib10.txt i/ib11.txt i/ib50.txt
o/oa01.txt o/oa10.txt o/oa11.txt o/oa50.txt
o/ob01.txt o/ob10.txt o/ob11.txt o/ob50.txt
u/ua01.txt u/ua10.txt u/ua11.txt u/ua50.txt
u/ub01.txt u/ub10.txt u/ub11.txt u/ub50.txt</code></pre>
<p><small><em>The <code>-w 50</code> option to <code>ls</code> limits the output width for better readability in these notes. We also use <code>-x</code> which lists files in row order rather than the default column order so you can read left to right.</em></small></p>
<p>There are some important points to understand in this example:</p>
<ul>
<li><p>Although we are using the “match zero or one occurrence” sub-pattern there are <b>no</b> cases where there are zero matches. The main benefit we are getting from this feature is that we can use alternation (vertical bar).</p></li>
<li><p>Use of the <code>*</code> wildcard in the sub-pattern avoids the need to be explicit about the <code>.txt</code> suffix on the files. The same effect would be achieved with the following:</p>
<pre><code>[aeiou]/?(?[ab][01][01]|?[ab]50).txt</code></pre></li>
<li><p>Adding a <code>*</code> wildcard to the <b>end</b> will result in the sub-expression having no effect, and all files in the directories will be returned. That is because the wildcard matches everything! The difference is shown below:</p>
<pre><code>$ echo [aeiou]/?(?[ab][01][01]*|?[ab]50*) | wc -w
40
$ echo [aeiou]/?(?[ab][01][01]*|?[ab]50*)* | wc -w
6500
$ echo [aeiou]/* | wc -w
6500</code></pre></li>
</ul>
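<p>The effect of a trailing <code>*</code> can be reproduced on a much smaller scale. This is only a sketch: it uses a scratch directory from <code>mktemp</code> and five made-up file names rather than the large collection, and the array names are arbitrary.</p>

```shell
#!/usr/bin/env bash
shopt -s extglob nullglob

# Scratch directory holding a few hypothetical file names
dir=$(mktemp -d)
touch "$dir"/{aa01,aa10,aa50,ac07,zz99}.txt

# Sub-pattern followed by an explicit suffix: only the alternatives match
restricted=("$dir"/?(?[ab][01][01]|?[ab]50).txt)

# Same sub-pattern followed by a bare '*': ?(...) may match nothing at all,
# and '*' then matches the whole name, so every file is returned
unrestricted=("$dir"/?(?[ab][01][01]|?[ab]50)*)

echo "restricted:   ${#restricted[@]}"     # 3 (aa01, aa10, aa50)
echo "unrestricted: ${#unrestricted[@]}"   # 5 (everything)

rm -r "$dir"
```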
<!-- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -->
<h4 id="example-2---match-zero-or-more-occurrences">Example 2 - “match zero or more occurrences”</h4>
<p><b>*(</b><em>pattern-list</em><b>)</b></p>
<p>In the next demonstration we are asking for zero or more occurrences of <code>b</code> between the <code>a</code> and <code>c</code>. We get the files <code>abbc</code>, <code>abc</code> and <code>ac</code> because they match the zero and more than zero cases.</p>
<pre><code>$ echo test/a*(b)c
test/abbc test/abc test/ac</code></pre>
<p>Not surprisingly, adding <code>x</code> to the list in the sub-expression also returns <code>axc</code>.</p>
<pre><code>$ echo test/a*(b|x)c
test/abbc test/abc test/ac test/axc</code></pre>
<p>There are files in the <code>test</code> directory with one to three <code>x</code> characters at the start of their names. We can search for them as follows:</p>
<pre><code>$ echo test/*(x).dat
test/x.dat test/xx.dat test/xxx.dat</code></pre>
<p>There is no file whose name consists of zero <code>x</code>es followed by <code>.dat</code>. A file called <code>.dat</code> would match, though it would only be shown if <code>dotglob</code> was set.</p>
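<p>To see the <code>dotglob</code> case for yourself, a sketch along these lines could be used. It works in a scratch directory and creates the hypothetical <code>.dat</code> file explicitly. I have not annotated the count printed with <code>dotglob</code> unset, so try that case on your own version of Bash.</p>

```shell
#!/usr/bin/env bash
shopt -s extglob nullglob

dir=$(mktemp -d)
touch "$dir"/{x,xx,xxx}.dat "$dir"/.dat   # .dat is the "zero x" case

shopt -u dotglob
without_dotglob=("$dir"/*(x).dat)

shopt -s dotglob
with_dotglob=("$dir"/*(x).dat)            # hidden files may now match too
shopt -u dotglob

echo "dotglob unset: ${#without_dotglob[@]} matches"
echo "dotglob set:   ${#with_dotglob[@]} matches"   # 4, including .dat

rm -r "$dir"
```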
<p>Applying this sub-pattern to the large collection of test files from the last episode we might want to find all files in directory <code>a</code> which begin with two <code>a</code>s and numbers in the range 1-3:</p>
<pre><code>$ ls -w 50 -x a/*(a)*([1-3]).txt
a/aa11.txt a/aa12.txt a/aa13.txt a/aa21.txt
a/aa22.txt a/aa23.txt a/aa31.txt a/aa32.txt
a/aa33.txt</code></pre>
<p>You might expect to get back only <code>a/aa11.txt</code>, <code>a/aa22.txt</code> and <code>a/aa33.txt</code> but what is actually returned matches <code>aa</code> followed by two numbers, each in the range 1-3. This is the same as:</p>
<pre><code>$ ls -w 50 -x a/aa[1-3][1-3].txt
a/aa11.txt a/aa12.txt a/aa13.txt a/aa21.txt
a/aa22.txt a/aa23.txt a/aa31.txt a/aa32.txt
a/aa33.txt</code></pre>
<p>Just to demonstrate how these sub-patterns work, the following example returns the three files in the first column above:</p>
<pre><code>$ ls -1 a/?(*(a)*(1)|*(a)*(2)|*(a)*(3)).txt
a/aa11.txt
a/aa22.txt
a/aa33.txt</code></pre>
<p>However, it does not seem very practical!</p>
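<p>If the aim really is to pick out just the three “doubled” names, one more direct option is to list them in an <code>@(</code><em>pattern-list</em><code>)</code>. The sketch below assumes a scratch directory containing only the nine <code>aa11.txt</code> to <code>aa33.txt</code> names. It is an alternative of my own, not the approach used above.</p>

```shell
#!/usr/bin/env bash
shopt -s extglob nullglob

dir=$(mktemp -d)
touch "$dir"/aa{1,2,3}{1,2,3}.txt    # aa11.txt ... aa33.txt (nine files)

# Two independent "zero or more" groups: any mix of the digits 1-3 matches
all=("$dir"/*(a)*([1-3]).txt)

# Listing the wanted digit pairs explicitly selects only the doubles
doubles=("$dir"/aa@(11|22|33).txt)

echo "${#all[@]} files match *(a)*([1-3]).txt"   # 9
printf '%s\n' "${doubles[@]##*/}"                # aa11.txt aa22.txt aa33.txt

rm -r "$dir"
```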
<!-- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -->
<h4 id="example-3---match-one-or-more-occurrences">Example 3 - “match one or more occurrences”</h4>
<p><b>+(</b><em>pattern-list</em><b>)</b></p>
<p>The next demonstration requests one or more instances of the letter <code>b</code> between the other letters and returns the files <code>abbc</code> (two <code>b</code>s) and <code>abc</code> (one <code>b</code>):</p>
<pre><code>$ echo test/a+(b)c
test/abbc test/abc</code></pre>
<p>As before, adding <code>x</code> as an alternative adds file <code>axc</code> to the list:</p>
<pre><code>$ echo test/a+(b|x)c
test/abbc test/abc test/axc</code></pre>
<p>The following example looks in directories <code>a</code> and <code>b</code> for files that begin with an <code>a</code> or a <code>b</code> and end with <code>01.txt</code>:</p>
<pre><code>$ ls -w 50 -x [ab]/+(a|b)*01.txt
a/aa01.txt a/ab01.txt a/ac01.txt a/ad01.txt
a/ae01.txt a/af01.txt a/ag01.txt a/ah01.txt
a/ai01.txt a/aj01.txt a/ak01.txt a/al01.txt
a/am01.txt a/an01.txt a/ao01.txt a/ap01.txt
a/aq01.txt a/ar01.txt a/as01.txt a/at01.txt
a/au01.txt a/av01.txt a/aw01.txt a/ax01.txt
a/ay01.txt a/az01.txt b/ba01.txt b/bb01.txt
b/bc01.txt b/bd01.txt b/be01.txt b/bf01.txt
b/bg01.txt b/bh01.txt b/bi01.txt b/bj01.txt
b/bk01.txt b/bl01.txt b/bm01.txt b/bn01.txt
b/bo01.txt b/bp01.txt b/bq01.txt b/br01.txt
b/bs01.txt b/bt01.txt b/bu01.txt b/bv01.txt
b/bw01.txt b/bx01.txt b/by01.txt b/bz01.txt</code></pre>
<p>This could just as well have been achieved with:</p>
<pre><code>$ ls -w 50 -x [ab]/[ab]*01.txt</code></pre>
<!-- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -->
<h4 id="example-4---match-one-of-the-given-patterns">Example 4 - “match one of the given patterns”</h4>
<p><b>@(</b><em>pattern-list</em><b>)</b></p>
<p>This demonstration requests one instance of the letter <code>b</code> between the other letters and returns one file <code>abc</code>:</p>
<pre><code>$ echo test/a@(b)c
test/abc</code></pre>
<p>Again, adding <code>x</code> as an alternative adds file <code>axc</code> to the list:</p>
<pre><code>$ echo test/a@(b|x)c
test/abc test/axc</code></pre>
<p>To make some better search targets I ran the following commands:</p>
<pre><code>$ mkdir words
$ while read word; do
&gt; word=${word%[^a-zA-Z]*}
&gt; word=${word,,}
&gt; touch &quot;words/$word&quot;
&gt; done &lt; &lt;(shuf -n100 /usr/share/dict/words)</code></pre>
<ul>
<li>A directory <code>words</code> was created</li>
<li>A <code>while</code> loop was started to read data into a variable called <code>word</code> (this starts a multi-line command so the prompt changes to <code>&gt;</code> until the entire loop is typed in)</li>
<li>Everything from the last non-alphabetic character of <code>word</code> to the end is removed, which strips trailing apostrophes or <code>'s</code> sequences.</li>
<li>The <code>word</code> variable is converted to lower case</li>
<li>The <code>touch</code> command makes an empty file named whatever variable <code>word</code> contains</li>
<li>The loop ends with <code>done</code> and the loop is “fed” with data by a process substitution (see show <a href="http://hackerpublicradio.org/eps/hpr2045" title="Some other Bash tips">2045</a>). This runs the <code>shuf</code> command to return 100 random words from <code>/usr/share/dict/words</code>.</li>
</ul>
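<p>The two parameter expansions in the loop body can be tried on their own. In this sketch the function name <code>clean_word</code> and the sample words are hypothetical, and <code>${word,,}</code> assumes Bash 4 or later.</p>

```shell
#!/usr/bin/env bash
# Clean one dictionary entry the same way as the loop body does.
clean_word() {
    local word=$1
    word=${word%[^a-zA-Z]*}   # drop the last non-letter and all that follows
    word=${word,,}            # convert to lower case
    echo "$word"
}

clean_word "Katherine's"   # prints: katherine
clean_word "Ingress"       # prints: ingress
```

<p>Because <code>%</code> removes the shortest matching suffix, only the part from the <em>last</em> non-letter onwards is removed.</p>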
<p>If you try this you will get different words.</p>
<p>In my case I used the following command to return words containing one of <code>ee</code>, <code>oo</code>, <code>th</code> and <code>ss</code>:</p>
<pre><code>$ ls -w 60 words/*@(ee|oo|th|ss)*
words/commandeering words/katherine words/woolly
words/eighteenths words/slathering
words/ingress words/thoughtlessly</code></pre>
<!-- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -->
<h4 id="example-5---match-anything-but">Example 5 - “match anything but”</h4>
<p><b>!(</b><em>pattern-list</em><b>)</b></p>
<p>In the final demonstration we look for file names which do not contain a <code>b</code> between the <code>a</code> and <code>c</code>:</p>
<pre><code>$ echo test/a!(b)c
test/abbc test/ac test/axc</code></pre>
<p>Notice how this list includes <code>abbc</code>: it has two <code>b</code>s between the other letters, whereas the negated pattern specifies exactly one.</p>
<p>If we replace the <code>b</code> in the pattern with a further pattern which means “one or more” then we do not get <code>abbc</code>:</p>
<pre><code>$ echo test/a!(+(b))c
test/ac test/axc</code></pre>
<p>This again demonstrates that patterns can contain patterns!</p>
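<p>The difference between the two negations can also be checked without any files. This is a minimal sketch using <code>[[ ... ]]</code> on the four test names, and the array names are arbitrary.</p>

```shell
#!/usr/bin/env bash
shopt -s extglob

not_one_b=()     # names matching a!(b)c
not_any_bs=()    # names matching a!(+(b))c

for name in ac abc abbc axc; do
    if [[ $name == a!(b)c ]]; then not_one_b+=("$name"); fi
    if [[ $name == a!(+(b))c ]]; then not_any_bs+=("$name"); fi
done

echo "a!(b)c    matches: ${not_one_b[*]}"    # ac abbc axc
echo "a!(+(b))c matches: ${not_any_bs[*]}"   # ac axc
```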
<p>As a more complex example to show how this sub-pattern works we might try searching for files thus:</p>
<pre><code>$ ls -w 50 -x a/a!([c-z]*).txt
a/aa01.txt a/aa02.txt a/aa03.txt a/aa04.txt
a/aa05.txt a/aa06.txt a/aa07.txt a/aa08.txt
...
a/aa49.txt a/aa50.txt a/ab01.txt a/ab02.txt
a/ab03.txt a/ab04.txt a/ab05.txt a/ab06.txt
...
a/ab47.txt a/ab48.txt a/ab49.txt a/ab50.txt</code></pre>
<p>Here we’re looking for files in the directory <code>a</code> where the first letter is <code>a</code> (they all are) and the second letter is <strong>not</strong> in the range <code>[c-z]</code>. The output here shows a subset of what was returned.</p>
<p>Let’s finish with an example searching the directory of words. This time we have a pattern within a pattern. The inner pattern is a <b>@(</b><em>pattern-list</em><b>)</b> which contains a list of pairs of letters, mostly identical. This pattern is surrounded by asterisk wildcards. The effect of this is to select all words that contain one of the letter pairs.</p>
<p>This is enclosed in a <b>!(</b><em>pattern-list</em><b>)</b> pattern which negates the inner selection making it match words which <b>do not</b> contain the pairs of letters.</p>
<pre><code>$ ls -w 70 words/!(*@(bb|cc|dd|ee|gg|ll|oo|pp|tt|th|ss)*)
words/adela words/falconers words/protectively
words/adversest words/frankie words/quits
words/ails words/gnomes words/rashes
words/airline words/haring words/recites
words/alton words/indianapolis words/rescuers
...
words/dickson words/pitchfork words/weightlifting
words/elitist words/pomade words/whales
words/enactment words/prepackaging words/writings
words/épées words/preview words/yens
words/exit words/profusion words/yodel</code></pre>
<p>The result is 81 of the 100 words in the directory.</p>
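<p>Since your random words will differ from mine, here is a small sketch of the same nested pattern applied to a fixed list held in an array. The six words and the shorter list of letter pairs are chosen only for illustration.</p>

```shell
#!/usr/bin/env bash
shopt -s extglob

words=(woolly exit ingress yodel katherine quits)   # sample words
no_pairs=()

for w in "${words[@]}"; do
    # Inner @(...) wrapped in * selects words containing a pair;
    # the outer !(...) inverts that selection
    if [[ $w == !(*@(ee|oo|th|ss)*) ]]; then
        no_pairs+=("$w")
    fi
done

echo "${no_pairs[*]}"   # prints: exit yodel quits
```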
<h4 id="example-6---use-of-patterns-elsewhere">Example 6 - use of patterns elsewhere</h4>
<p>We have seen at various times in this series that <em>glob</em>-style patterns can be used in other contexts. One instance was when manipulating Bash parameters (<a href="http://hackerpublicradio.org/eps/hpr1648" title="Bash parameter manipulation">show 1648</a>):</p>
<pre><code>$ x=&quot;aaabbbccc&quot;
$ echo ${x/a/-}
-aabbbccc</code></pre>
<p>Here we created a variable <code>x</code> and used <em>pattern substitution</em> to replace the first <code>a</code> with a hyphen.</p>
<pre><code>$ echo ${x/+(a)/-}
-bbbccc</code></pre>
<p>This time we have used the <code>+(a)</code> pattern to match one or more <code>a</code>s. Note that the matched group is replaced by <b>one</b> hyphen. If we want to replace each of the letters with a hyphen then we’d use an alternative type of <em>pattern substitution</em> that works through the entire string:</p>
<pre><code>$ echo ${x//a/-}
---bbbccc</code></pre>
<p>This time we didn’t want to match a group of letters, so didn’t use extended pattern matching.</p>
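<p>The two forms can be combined, of course. As a sketch, the global form with an extended pattern replaces <em>every</em> run with a single character, which is handy for squeezing repeated spaces. The variable names and the sample string <code>spaced</code> are just examples.</p>

```shell
#!/usr/bin/env bash
shopt -s extglob

x="aaabbbccc"
first_run=${x/+(a)/-}     # first run of a's becomes one hyphen
b_runs=${x//+(b)/-}       # every run of b's becomes one hyphen

spaced="one   two    three"
squeezed=${spaced//+( )/ }   # every run of spaces becomes one space

echo "$first_run"   # prints: -bbbccc
echo "$b_runs"      # prints: aaa-ccc
echo "$squeezed"    # prints: one two three
```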
<p>Another place where extended pattern matching can be used is in <code>case</code> statements. I will not go into further detail about this here. However, there is a Stack Exchange question about it listed in the <a href="#links">Links</a> section.</p>
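<p>For anyone who wants a quick taste, here is a minimal sketch of extended patterns in a <code>case</code> statement. The function name <code>classify</code> and the categories are invented for the example, and it is not taken from the Stack Exchange question.</p>

```shell
#!/usr/bin/env bash
shopt -s extglob    # must be set before the function is defined

# Classify a string using extended patterns as case labels.
classify() {
    case $1 in
        ?(-)+([0-9]))            echo integer ;;
        ?(-)+([0-9]).+([0-9]))   echo decimal ;;
        @(yes|no))               echo answer ;;
        *)                       echo other ;;
    esac
}

classify 42      # prints: integer
classify -3.14   # prints: decimal
classify yes     # prints: answer
classify maybe   # prints: other
```

<p>Bash parses a whole function in one go, so the <code>shopt -s extglob</code> line needs to come before the function definition rather than inside it.</p>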
<p>To summarise: anywhere a filename-type pattern match is allowed, <em>extended</em> patterns can be used too (assuming <code>extglob</code> is set).</p>
<h2 id="conclusion">Conclusion</h2>
<p>Until I started investigating these extended pattern matching features of Bash I did not think I would find them particularly useful. It also took me quite a while to understand how they worked.</p>
<p>Now I actually find them quite powerful and will use them in future in scripts I write.</p>
<p>Bash extended patterns are similar in concept to Regular Expressions, although they are written totally differently. For example, the Bash pattern: <code>hot*(dog)</code> means the same as the RE: <code>hot(dog)*</code>. They both match the words “hot” and “hotdog”. The difference is that <code>*</code> in a RE means that the preceding expression may match zero or more times, and can follow many sorts of expressions. The extended pattern is not quite so general.</p>
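<p>The comparison can be tried directly, since Bash can test a string against both a pattern (<code>==</code>) and a regular expression (<code>=~</code>). This is only a sketch. The sample words are my own, and the RE is anchored with <code>^</code> and <code>$</code> because a pattern always has to match the whole string whereas an RE does not.</p>

```shell
#!/usr/bin/env bash
shopt -s extglob

# Keeping the RE in a variable avoids quoting problems inside [[ ... ]]
regex='^hot(dog)*$'
summary=()

for word in hot hotdog hotdogdog hotcat; do
    if [[ $word == hot*(dog) ]]; then glob=yes; else glob=no; fi
    if [[ $word =~ $regex ]]; then re=yes; else re=no; fi
    summary+=("$word:$glob:$re")
    printf '%-10s extglob=%-3s regex=%s\n' "$word" "$glob" "$re"
done
```

<p>Both forms accept <code>hot</code>, <code>hotdog</code> and <code>hotdogdog</code>, and both reject <code>hotcat</code>.</p>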
<p>I hope this episode has helped you understand these Bash features and that you also find them useful.</p>
<h2 id="links">Links</h2>
<!-- \_ -->
<ul>
<li>Previous shows in this series
<ol>
<li><a href="http://hackerpublicradio.org/eps/hpr1648">HPR episode 1648 “<em>Bash parameter manipulation</em>”</a></li>
<li><a href="http://hackerpublicradio.org/eps/hpr1843">HPR episode 1843 “<em>Some Bash tips</em>”</a></li>
<li><a href="http://hackerpublicradio.org/eps/hpr1884">HPR episode 1884 “<em>Some more Bash tips</em>”</a></li>
<li><a href="http://hackerpublicradio.org/eps/hpr1903">HPR episode 1903 “<em>Some further Bash tips</em>”</a></li>
<li><a href="http://hackerpublicradio.org/eps/hpr1951">HPR episode 1951 “<em>Some additional Bash tips</em>”</a></li>
<li><a href="http://hackerpublicradio.org/eps/hpr2045">HPR episode 2045 “<em>Some other Bash tips</em>”</a></li>
<li><a href="http://hackerpublicradio.org/eps/hpr2278">HPR episode 2278 “<em>Some supplementary Bash tips</em>”</a></li>
</ol></li>
</ul>
<!-- - -->
<ul>
<li>Other HPR series referenced:
<ul>
<li><a href="http://hackerpublicradio.org/series.php?id=90"><em>Learning sed</em></a> series on HPR</li>
<li><a href="http://hackerpublicradio.org/series.php?id=94"><em>Learning Awk</em></a> series on HPR</li>
</ul></li>
</ul>
<!-- - -->
<ul>
<li>Wikipedia article on <a href="https://en.wikipedia.org/wiki/Glob_%28programming%29"><em>glob patterns</em></a></li>
<li>Advanced Bash Scripting Guide: <a href="http://www.tldp.org/LDP/abs/html/globbingref.html"><em>Globbing</em></a></li>
<li>Article on <a href="http://mywiki.wooledge.org/glob"><em>Greg’s Wiki</em></a> entitled “<em>Globs</em>”</li>
<li>Questions about <em>Bash extended globbing</em> on Stack Exchange:
<ul>
<li>Question 1: <a href="https://unix.stackexchange.com/questions/168769/bash-extended-globbing">How to list just one file with <code>ls</code></a></li>
<li>Question 2: <a href="https://unix.stackexchange.com/questions/203386/extended-glob-what-is-the-difference-in-syntax-between-list-list-list">What is the difference between…</a></li>
<li>Question 3: <a href="https://stackoverflow.com/questions/4554718/patterns-in-case-statement-in-bash-scripting">Patterns in case statements</a></li>
</ul></li>
</ul>
<hr />
<!-- {{{ -->
<h1 id="manual-page-extracts">Manual Page Extracts</h1>
<h2 id="expansion-1">EXPANSION</h2>
<p>Expansion is performed on the command line after it has been split into words. There are seven kinds of expansion performed: <em>brace expansion</em>, <em>tilde expansion</em>, <em>parameter and variable expansion</em>, <em>command substitution</em>, <em>arithmetic expansion</em>, <em>word splitting</em>, and <em>pathname expansion</em>.</p>
<p>The order of expansions is: brace expansion; tilde expansion, parameter and variable expansion, arithmetic expansion, and command substitution (done in a left-to-right fashion); word splitting; and pathname expansion.</p>
<p>On systems that can support it, there is an additional expansion available: <em>process substitution</em>. This is performed at the same time as tilde, parameter, variable, and arithmetic expansion and command substitution.</p>
<p>Only brace expansion, word splitting, and pathname expansion can change the number of words of the expansion; other expansions expand a single word to a single word. The only exceptions to this are the expansions of “<strong>$@</strong>” and “<strong>${name[@]}</strong>” as explained above (see <strong>PARAMETERS</strong>).</p>
<h3 id="brace-expansion">Brace Expansion</h3>
<p>See the notes for HPR show <a href="http://hackerpublicradio.org/eps/hpr1884" title="Some more Bash tips">1884</a>.</p>
<h3 id="tilde-expansion">Tilde Expansion</h3>
<p>See the notes for HPR show <a href="http://hackerpublicradio.org/eps/hpr1903" title="Some further Bash tips">1903</a>.</p>
<h3 id="parameter-expansion">Parameter Expansion</h3>
<p>See the notes for HPR show <a href="http://hackerpublicradio.org/eps/hpr1648" title="Bash parameter manipulation">1648</a>.</p>
<h3 id="command-substitution">Command Substitution</h3>
<p>See the notes for HPR show <a href="http://hackerpublicradio.org/eps/hpr1903" title="Some further Bash tips">1903</a>.</p>
<h3 id="arithmetic-expansion">Arithmetic Expansion</h3>
<p>See the notes for HPR show <a href="http://hackerpublicradio.org/eps/hpr1951" title="Some additional Bash tips">1951</a>.</p>
<h3 id="process-substitution">Process Substitution</h3>
<p>See the notes for HPR show <a href="http://hackerpublicradio.org/eps/hpr2045" title="Some other Bash tips">2045</a>.</p>
<h3 id="word-splitting">Word Splitting</h3>
<p>See the notes for HPR show <a href="http://hackerpublicradio.org/eps/hpr2045" title="Some other Bash tips">2045</a>.</p>
<h3 id="pathname-expansion">Pathname Expansion</h3>
<p>See the notes for HPR show <a href="http://hackerpublicradio.org/eps/hpr2278" title="Some supplementary Bash tips">2278</a> for some of the material in this section.</p>
<p>After word splitting, unless the <strong>-f</strong> option has been set, <strong>bash</strong> scans each word for the characters <b>*</b>, <b>?</b>, and <b>[</b>. If one of these characters appears, then the word is regarded as a <em>pattern</em>, and replaced with an alphabetically sorted list of filenames matching the pattern (see <strong>Pattern Matching</strong> below). If no matching filenames are found, and the shell option <strong>nullglob</strong> is not enabled, the word is left unchanged. If the <strong>nullglob</strong> option is set, and no matches are found, the word is removed. If the <strong>failglob</strong> shell option is set, and no matches are found, an error message is printed and the command is not executed. If the shell option <strong>nocaseglob</strong> is enabled, the match is performed without regard to the case of alphabetic characters. Note that when using range expressions like [a-z] (see below), letters of the other case may be included, depending on the setting of <strong>LC_COLLATE</strong>. When a pattern is used for pathname expansion, the character “.” at the start of a name or immediately following a slash must be matched explicitly, unless the shell option <strong>dotglob</strong> is set. When matching a pathname, the slash character must always be matched explicitly. In other cases, the “.” character is not treated specially. See the description of <strong>shopt</strong> below under <strong>SHELL BUILTIN COMMANDS</strong> for a description of the <strong>nocaseglob</strong>, <strong>nullglob</strong>, <strong>failglob</strong>, and <strong>dotglob</strong> shell options.</p>
<p>The <strong>GLOBIGNORE</strong> shell variable may be used to restrict the set of filenames matching a pattern. If <strong>GLOBIGNORE</strong> is set, each matching filename that also matches one of the patterns in <strong>GLOBIGNORE</strong> is removed from the list of matches. The filenames “.” and “..” are always ignored when <strong>GLOBIGNORE</strong> is set and not null. However, setting <strong>GLOBIGNORE</strong> to a non-null value has the effect of enabling the <strong>dotglob</strong> shell option, so all other file names beginning with a “.” will match. To get the old behavior of ignoring filenames beginning with a “.”, make “.*” one of the patterns in <strong>GLOBIGNORE</strong>. The <strong>dotglob</strong> option is disabled when <strong>GLOBIGNORE</strong> is unset.</p>
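<p>(Not part of the manual page.) The behaviour of an unmatched pattern with and without <code>nullglob</code> can be seen by counting array elements, as in this sketch. It uses an empty scratch directory and a made-up <code>.nomatch</code> suffix.</p>

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)                 # empty scratch directory

shopt -u nullglob failglob
unmatched=("$dir"/*.nomatch)     # no match: the word is left unchanged

shopt -s nullglob
removed=("$dir"/*.nomatch)       # no match: the word is removed
shopt -u nullglob

echo "nullglob unset: ${#unmatched[@]} word"    # 1 (the literal pattern)
echo "nullglob set:   ${#removed[@]} words"     # 0

rmdir "$dir"
```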
<h4 id="pattern-matching">Pattern Matching</h4>
<p>Any character that appears in a pattern, other than the special pattern characters described below, matches itself. The NUL character may not occur in a pattern. A backslash escapes the following character; the escaping backslash is discarded when matching. The special pattern characters must be quoted if they are to be matched literally.</p>
<p>The special pattern characters have the following meanings:</p>
<dl>
<dt><b>*</b></dt>
<dd><p>Matches any string, including the null string. When the <strong>globstar</strong> shell option is enabled, and <b>*</b> is used in a pathname expansion context, two adjacent <b>*</b>s used as a single pattern will match all files and zero or more directories and subdirectories. If followed by a <b>/</b>, two adjacent <b>*</b>s will match only directories and subdirectories.</p>
</dd>
<dt><strong>?</strong></dt>
<dd><p>Matches any single character.</p>
</dd>
<dt><b>[…]</b></dt>
<dd><p>Matches any one of the enclosed characters. A pair of characters separated by a hyphen denotes a <em>range expression</em>; any character that falls between those two characters, inclusive, using the current locale’s collating sequence and character set, is matched. If the first character following the <strong>[</strong> is a <strong>!</strong> or a <strong>^</strong> then any character not enclosed is matched. The sorting order of characters in range expressions is determined by the current locale and the values of the <strong>LC_COLLATE</strong> or <strong>LC_ALL</strong> shell variables, if set. To obtain the traditional interpretation of range expressions, where <strong>[a-d]</strong> is equivalent to <strong>[abcd]</strong>, set value of the <strong>LC_ALL</strong> shell variable to <strong>C</strong>, or enable the <strong>globasciiranges</strong> shell option. A <strong>-</strong> may be matched by including it as the first or last character in the set. A <strong>]</strong> may be matched by including it as the first character in the set.</p>
<p>Within <b>[</b> and <b>]</b>, character classes can be specified using the syntax <b>[:</b><em>class</em><b>:]</b>, where <em>class</em> is one of the following classes defined in the POSIX standard: <b>alnum alpha ascii blank cntrl digit graph lower print punct space upper word xdigit</b> A character class matches any character belonging to that class. The <b>word</b> character class matches letters, digits, and the character _.</p>
<p>Within <b>[</b> and <b>]</b>, an <em>equivalence class</em> can be specified using the syntax <b>[=</b><em>c</em><b>=]</b>, which matches all characters with the same collation weight (as defined by the current locale) as the character <em>c.</em></p>
<p>Within <b>[</b> and <b>]</b>, the syntax <b>[.</b><em>symbol</em><b>.]</b> matches the collating symbol <em>symbol</em>.</p>
</dd>
</dl>
<p>If the <b>extglob</b> shell option is enabled using the <b>shopt</b> builtin, several extended pattern matching operators are recognized. In the following description, a <em>pattern-list</em> is a list of one or more patterns separated by a <b>|</b>. Composite patterns may be formed using one or more of the following sub-patterns:</p>
<dl>
<dt><b>?(</b><em>pattern-list</em><b>)</b></dt>
<dd><p>Matches zero or one occurrence of the given patterns</p>
</dd>
<dt><b>*(</b><em>pattern-list</em><b>)</b></dt>
<dd><p>Matches zero or more occurrences of the given patterns</p>
</dd>
<dt><b>+(</b><em>pattern-list</em><b>)</b></dt>
<dd><p>Matches one or more occurrences of the given patterns</p>
</dd>
<dt><b>@(</b><em>pattern-list</em><b>)</b></dt>
<dd><p>Matches one of the given patterns</p>
</dd>
<dt><b>!(</b><em>pattern-list</em><b>)</b></dt>
<dd><p>Matches anything except one of the given patterns</p>
</dd>
</dl>
<!-- }}} -->
<section class="footnotes">
<hr />
<ol>
<li id="fn1"><p>Note that on the versions of GNU/Linux that I run (Debian, KDE Neon and Raspbian) <code>extglob</code> is on by default. It is actually set in <code>/usr/share/bash-completion/bash_completion</code> which is invoked directly or from <code>/etc/bash_completion</code> which is invoked from the default <code>~/.bashrc</code>. These are all Debian-derived distributions, so I can’t speak for others.<a href="#fnref1">↩</a></p></li>
</ol>
</section>
</article>
</main>
</div>
</body>
</html>