<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="generator" content="pandoc">
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<meta name="author" content="Dave Morriss">
<title>Introduction to sed - part 4 (HPR Show 2011)</title>
<style type="text/css">code{white-space: pre;}</style>
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
<link rel="stylesheet" href="http://hackerpublicradio.org/css/hpr.css">
</head>
<body id="home">
<div id="container" class="shadow">
<header>
<h1 class="title">Introduction to sed - part 4 (HPR Show 2011)</h1>
<h2 class="author">Dave Morriss</h2>
<hr/>
</header>
<main id="maincontent">
<article>
<header>
<h1>Table of Contents</h1>
<nav id="TOC">
<ul>
<li><a href="#introduction">Introduction</a></li>
<li><a href="#how-sed-really-works">How sed REALLY works</a></li>
<li><a href="#commands">Commands</a><ul>
<li><a href="#the-y-command">The <em>y</em> command</a></li>
<li><a href="#the-command">The <em>=</em> command</a></li>
<li><a href="#commands-that-operate-on-the-pattern-space">Commands that operate on the pattern space</a><ul>
<li><a href="#the-d-command">The <em>D</em> command</a></li>
<li><a href="#the-n-command">The <em>N</em> command</a></li>
<li><a href="#the-p-command">The <em>P</em> command</a></li>
<li><a href="#the-l-command">The <em>l</em> command</a></li>
<li><a href="#example-using-the-pattern-space">Example using the pattern space</a></li>
</ul></li>
<li><a href="#commands-to-transfer-to-and-from-the-hold-space">Commands to transfer to and from the hold space</a><ul>
<li><a href="#the-h-command">The <em>h</em> command</a></li>
<li><a href="#the-h-command-1">The <em>H</em> command</a></li>
<li><a href="#the-g-command">The <em>g</em> command</a></li>
<li><a href="#the-g-command-1">The <em>G</em> command</a></li>
<li><a href="#the-x-command">The <em>x</em> command</a></li>
</ul></li>
</ul></li>
<li><a href="#flags-and-modifiers-we-omitted-earlier">Flags and modifiers we omitted earlier</a></li>
<li><a href="#examples">Examples</a><ul>
<li><a href="#example-1">Example 1</a></li>
<li><a href="#example-2">Example 2</a></li>
<li><a href="#example-3">Example 3</a></li>
</ul></li>
<li><a href="#quiz">Quiz</a><ul>
<li><a href="#pig-latin-igpay-atinlay">Pig Latin / Igpay Atinlay</a></li>
</ul></li>
<li><a href="#links">Links</a></li>
</ul>
</nav>
</header>
<h2 id="introduction">Introduction</h2>
<p>In the <a href="http://hackerpublicradio.org/eps/hpr1997" title="Introduction to sed - part 3">last episode</a> we looked at some of the more frequently used <code>sed</code> commands, having spent previous episodes looking at the <strong>s</strong> command, and we also covered the concept of line addressing.</p>
<p>In this episode we will look at how <code>sed</code> really works in all the gory details, examine some of the remaining <code>sed</code> commands and begin to use what we know to build useful <code>sed</code> programs.</p>
<h2 id="how-sed-really-works">How sed REALLY works</h2>
<p>In <a href="http://hackerpublicradio.org/eps/hpr1976" title="Introduction to sed - part 1">Episode 1</a> we looked briefly at the <em>pattern space</em>, where <code>sed</code> holds the incoming data while commands are executed on it. In this episode we will look at this data buffer and its counterpart, the <em>hold space</em>, in more detail.</p>
<p>When considering the <em>pattern space</em> in earlier episodes it was simpler to visualise it as a relatively small storage area, capable of holding one line from the input stream. In fact, it is a <em>buffer</em> which can hold an arbitrarily large amount of data, though it is normally used to hold just the latest input line.</p>
<p>As you know from previous discussions, the <em>pattern space</em> is processed in the following cycle:</p>
<ul>
<li>A line is read from the input stream, the trailing newline is removed and the result is stored in the <em>pattern space</em></li>
<li>The commands making up the <code>sed</code> script are executed (as appropriate regarding addressing, etc.)</li>
<li>When command execution has finished the <em>pattern space</em> is printed to the output stream, after adding the trailing newline (if it was removed). This <em>auto printing</em> does not happen if the <em>-n</em> command line option is in effect.</li>
<li>The cycle then begins again, with the <em>pattern space</em> being cleared before the next line is read. This part of the cycle can be altered by a few special commands, which we will look at later.</li>
</ul>
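<p>The effect of auto-printing can be seen in a minimal sketch like the following, which assumes GNU <code>sed</code> and uses two invented input lines generated with <code>printf</code> rather than the demonstration file. Without <em>-n</em> each line is written once by the explicit <strong>p</strong> command and again by the auto-print at the end of the cycle; with <em>-n</em> only the explicit print remains:</p>
<pre><code># Two illustrative input lines (not taken from sed_demo1.txt)
with_auto=$(printf &#39;one\ntwo\n&#39; | sed -e &#39;p&#39;)
without_auto=$(printf &#39;one\ntwo\n&#39; | sed -ne &#39;p&#39;)
echo &quot;$with_auto&quot;      # one, one, two, two (each on its own line)
echo &quot;$without_auto&quot;   # one, two (each on its own line)</code></pre>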
<p>The <em>hold space</em>, on the other hand, is a storage buffer like the <em>pattern space</em>, but one which is not affected by the cycle described above. Data placed in the <em>hold space</em> remains there until <code>sed</code> exits or it is deleted explicitly. Commands exist which can move data to and from the <em>hold space</em>, as we will see.</p>
<h2 id="commands">Commands</h2>
<p>This episode is following the <a href="https://www.gnu.org/software/sed/manual/sed.html" title="GNU sed manual">GNU sed manual</a>, particularly the section about <a href="https://www.gnu.org/software/sed/manual/sed.html#Other-Commands" title="Less Frequently-Used Commands">less frequently-used commands</a>. Some of the commands in this category in the manual have been omitted in this series though, and some will be held over to the next episode.</p>
<h3 id="the-y-command">The <em>y</em> command</h3>
<p>The <strong>y</strong> command transforms (<em>transliterates</em>) characters. The format is:</p>
<pre><code>y/source-chars/dest-chars/</code></pre>
<p>It operates on the <em>pattern space</em>, replacing every character that appears in <em>source-chars</em> with the character at the same position in <em>dest-chars</em>.</p>
<p>The delimiter used to separate the two parts is normally a slash (<strong>/</strong>), but any other character can be used instead, as we saw with the <strong>s</strong> command. There is no need to precede the first instance of the new delimiter with a backslash.</p>
<p>If the delimiter is used in either of the two lists of characters it must be preceded by a backslash (<strong>\</strong>) to escape it.</p>
<p>The two lists must be the same length (not counting backslash escapes).</p>
<p>The <strong>y</strong> command has no flags.</p>
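<p>As a small sketch of the delimiter rules (the input string is invented for illustration), both of the commands below replace every slash with a hyphen. The first keeps the default delimiter and so has to escape the slash in the source list; the second changes the delimiter to a vertical bar so that no escape is needed:</p>
<pre><code>escaped=$(echo &#39;a/b/c&#39; | sed -e &#39;y/\//-/&#39;)
changed=$(echo &#39;a/b/c&#39; | sed -e &#39;y|/|-|&#39;)
echo &quot;$escaped&quot;   # a-b-c
echo &quot;$changed&quot;   # a-b-c</code></pre>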
<p>In the following example the first two lines of the example file are processed with a <strong>y</strong> command. The lower-case vowels in these two lines are converted to the next vowel in the sequence, so 'a' becomes 'e', 'e' becomes 'i' and so forth:</p>
<pre><code>$ sed -ne &#39;1,2{y/aeiou/eioua/;p}&#39; sed_demo1.txt
Heckir Pabloc Redou (HPR) os en Intirnit Redou shuw (pudcest) thet riliesis
shuws iviry wiikdey Mundey thruagh Frodey. HPR hes e lung loniegi guong beck tu</code></pre>
<p>The next example uses <code>nl</code> as in earlier episodes to number the lines so you can see what has been done to them. The script contains two groups, both of which perform transliterations. The first group is controlled by an address expression which operates on odd lines, and the second group operates on even lines. The <strong>y</strong> commands perform vowel transformations similar to those in the previous example, but they cater for upper-case vowels as well. The vowel sequences are &quot;rotated&quot; differently for the even versus the odd lines. Only the first five lines of output are shown here.</p>
<pre><code>$ nl -w3 -ba sed_demo1.txt | sed -ne &#39;1~2{y/aeiouAEIOU/eiouaEIOUA/;p};2~2{y|aeiouAEIOU|iouaeIOUAE|;p}&#39;
1 Heckir Pabloc Redou (HPR) os en Ontirnit Redou shuw (pudcest) thet riliesis
2 shaws ovory wookdiy Mandiy thraegh Frudiy. HPR his i lang lunoigo gaung bick ta
3 Redou FriiK Emiroce, Bonery Rivulatoun Redou &amp; Onfunumocun, end ot os e dorict
4 cantuneituan af Twitoch ridua. Ploiso luston ta StinkDiwg&#39;s &quot;Untradectuan ta
5 HPR&quot; fur muri onfurmetoun.</code></pre>
<h3 id="the-command">The <em>=</em> command</h3>
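<p>This command causes <code>sed</code> to print out the current line number, followed by a newline. The number represents a count of the lines read so far on the input stream. The command itself is standard; the GNU extension is that it accepts an address range as well as a single address.</p>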
<p>The command can be preceded by any of the address types we saw in <a href="http://hackerpublicradio.org/eps/hpr1986" title="Introduction to sed - part 2">episode 2</a>.</p>
<p>The following example uses the <strong>=</strong> command to print out the number of the last line of the input file:</p>
<pre><code>$ sed -ne &#39;$=&#39; sed_demo1.txt
13</code></pre>
<p>The next example prints out the line number followed by the line. Note how the newline after the number means that it is not on the same line as the text:</p>
<pre><code>$ sed -ne &#39;${=;p}&#39; sed_demo1.txt
13
detail on a topic.</code></pre>
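<p>When more than one input file is given, <code>sed</code> normally treats them as one continuous stream. Using the <em>-s</em> command line option makes it treat each file separately, so that '$' matches the last line of each file, with the following effect:</p>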
<pre><code>$ sed -sne &#39;${=;p}&#39; sed_demo1.txt sed_demo2.txt
13
detail on a topic.
26
contribute one show a year.</code></pre>
<h3 id="commands-that-operate-on-the-pattern-space">Commands that operate on the pattern space</h3>
<p>The following four commands perform actions on the <em>pattern space</em>. Their usefulness can be difficult to appreciate without examples, but we need to know about them, and about the <em>hold space</em> commands that follow, before we can begin building such examples.</p>
<h4 id="the-d-command">The <em>D</em> command</h4>
<p>This command is related to the <strong>d</strong> command, but it deletes from the <em>pattern space</em> only up to the first newline. The cycle is then restarted using the resulting <em>pattern space</em>, without reading any input.</p>
<p>If there is no newline the <em>pattern space</em> is deleted, and a new cycle is begun with a new input line being read. Under these circumstances, the <strong>D</strong> command behaves as the <strong>d</strong> command does.</p>
<p>The command can be preceded by any of the address types we saw in <a href="http://hackerpublicradio.org/eps/hpr1986" title="Introduction to sed - part 2">episode 2</a>.</p>
<h4 id="the-n-command">The <em>N</em> command</h4>
<p>This command adds the next line of input to the <em>pattern space</em>, preceded by a newline. If there is no more input then <code>sed</code> exits without processing any more commands.</p>
<p>The command can be preceded by any of the address types we saw in <a href="http://hackerpublicradio.org/eps/hpr1986" title="Introduction to sed - part 2">episode 2</a>.</p>
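<p>A minimal sketch of <strong>N</strong> in action, assuming GNU <code>sed</code> and using four invented input lines: each cycle reads one line, <strong>N</strong> appends the next one, and an <strong>s</strong> command replaces the embedded newline with a space so that the lines are joined in pairs. An even number of lines is used deliberately so that <strong>N</strong> never runs out of input:</p>
<pre><code>pairs=$(printf &#39;a\nb\nc\nd\n&#39; | sed -e &#39;N;s/\n/ /&#39;)
echo &quot;$pairs&quot;   # prints &quot;a b&quot; then &quot;c d&quot;</code></pre>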
<h4 id="the-p-command">The <em>P</em> command</h4>
<p>This command prints out the contents of the pattern space up to the first newline.</p>
<p>The command can be preceded by any of the address types we saw in <a href="http://hackerpublicradio.org/eps/hpr1986" title="Introduction to sed - part 2">episode 2</a>.</p>
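<p>The <strong>N</strong>, <strong>P</strong> and <strong>D</strong> commands are often used together to slide a two-line window over the input. The following is a sketch of that idiom (a well-known one-liner rather than anything from this series) which drops consecutive duplicate lines. It assumes GNU <code>sed</code> and uses invented input:</p>
<pre><code># $!N                append the next line, unless this is the last line
# /^\(.*\)\n\1$/!P   print the first line, unless both lines are identical
# D                  delete the first line and restart without reading input
uniq_out=$(printf &#39;a\na\nb\nc\nc\n&#39; | sed -e &#39;$!N;/^\(.*\)\n\1$/!P;D&#39;)
echo &quot;$uniq_out&quot;   # a, b, c (each on its own line)</code></pre>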
<h4 id="the-l-command">The <em>l</em> command</h4>
<p>Format: <code>l n</code></p>
<p>This command can be a useful tool for debugging a <code>sed</code> script since it shows what is currently in the <em>pattern space</em>.</p>
<p>The <em>pattern space</em> is &quot;dumped&quot; in fixed-length lines, where the length is controlled by the numeric value of <code>n</code>. There is a command-line option <code>-l N</code> or <code>--line-length=N</code> which provides a value if <code>n</code> is not provided with the command. The default value is 70. A value of 0 prevents line wrapping.</p>
<p>The <code>n</code> option to the command is a GNU <code>sed</code> extension.</p>
<p>The <strong>l</strong> command shows non-printable characters as sequences such as '\n' and '\t'. Each wrapped line ends with a '\' and the end of each line is shown by a '$' character.</p>
<p>The command can be preceded by any of the address types we saw in <a href="http://hackerpublicradio.org/eps/hpr1986" title="Introduction to sed - part 2">episode 2</a>.</p>
<p>Running the <strong>l</strong> command on lines 1 and 2 of <code>sed_demo1.txt</code> with a width of 80 we see:</p>
<pre><code>$ sed -ne &#39;1,2l80&#39; sed_demo1.txt
Hacker Public Radio (HPR) is an Internet Radio show (podcast) that releases$
shows every weekday Monday through Friday. HPR has a long lineage going back to$</code></pre>
<p>Using the <strong>N</strong> command to accumulate the two lines in the <em>pattern space</em> before dumping it (using the default width) we see:</p>
<pre><code>$ sed -ne &#39;1,2{N;l}&#39; sed_demo1.txt
Hacker Public Radio (HPR) is an Internet Radio show (podcast) that re\
leases\nshows every weekday Monday through Friday. HPR has a long lin\
eage going back to$</code></pre>
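<p>The <strong>l</strong> command is particularly helpful when the data contains characters that cannot normally be seen. In this small sketch (invented input, GNU <code>sed</code> assumed) a tab is shown as '\t' and the end of the line is marked with '$':</p>
<pre><code>dump=$(printf &#39;a\tb\n&#39; | sed -ne &#39;l&#39;)
printf &#39;%s\n&#39; &quot;$dump&quot;   # a\tb$</code></pre>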
<h4 id="example-using-the-pattern-space">Example using the pattern space</h4>
<p>This example demonstrates the use of the <strong>N</strong> and <strong>D</strong> commands.</p>
<pre><code>for i in {1..10}; do
w=$(shuf -n1 /usr/share/dict/words)
w=${w%\&#39;s}
echo &quot;$i: $w&quot;
done | tee /tmp/$$ | sed -e &#39;N;1,5D&#39;</code></pre>
<p>The loop iterates 10 times, using variable <em>i</em>. For each iteration variable <em>w</em> is set to a random word from the system dictionary <code>/usr/share/dict/words</code>. Since many of these words end in &quot;'s&quot; we remove such endings. The result is printed, with the iteration number in front.</p>
<p>The stream of 10 words is sent to the <code>tee</code><a href="#fn1" class="footnoteRef" id="fnref1"><sup>1</sup></a> command which saves a copy (this command writes to a file and also copies its input to STDOUT). The file chosen is a temporary file <code>/tmp/$$</code> where Bash replaces the <code>$$</code> symbol with the process id number of the current process.<a href="#fn2" class="footnoteRef" id="fnref2"><sup>2</sup></a></p>
<p>The stream of numbered words is also sent to <code>sed</code> and each line is appended to the <em>pattern space</em>. For input lines 1 to 5 inclusive the <strong>D</strong> command deletes a line from the accumulated lines in the <em>pattern space</em>, and the result is the last 5 lines remain at the end and are auto-printed.</p>
<p>During testing, this type of pipeline can be written to a file and run as a Bash script, or it can be written out on one line, as I normally do:</p>
<pre><code>for i in {1..10}; do w=$(shuf -n1 /usr/share/dict/words); w=${w%\&#39;s}; echo &quot;$i: $w&quot;; done | tee /tmp/$$ | sed -e &#39;N;1,5D&#39;</code></pre>
<p>The temporary file is useful to check the before and after states.</p>
<p>The Bash script discussed here is available as <a href="hpr2011_demo3.sh" title="hpr2011_demo3.sh"><code>demo3.sh</code></a> on the HPR website.</p>
<h3 id="commands-to-transfer-to-and-from-the-hold-space">Commands to transfer to and from the hold space</h3>
<p>The next five commands move lines to and from the hold space.</p>
<h4 id="the-h-command">The <em>h</em> command</h4>
<p>This command replaces the contents of the <em>hold space</em> with the contents of the <em>pattern space</em>. After executing the command the original contents of the <em>hold space</em> will be lost, and the contents of the <em>pattern space</em> will be in both the <em>hold space</em> and the <em>pattern space</em>.</p>
<p>The command can be preceded by any of the address types we saw in <a href="http://hackerpublicradio.org/eps/hpr1986" title="Introduction to sed - part 2">episode 2</a>.</p>
<h4 id="the-h-command-1">The <em>H</em> command</h4>
<p>This command appends the contents of the <em>pattern space</em> to the <em>hold space</em> preceded by a newline. The contents of the <em>pattern space</em> will not be affected by this process.</p>
<p>The command can be preceded by any of the address types we saw in <a href="http://hackerpublicradio.org/eps/hpr1986" title="Introduction to sed - part 2">episode 2</a>.</p>
<h4 id="the-g-command">The <em>g</em> command</h4>
<p>This command replaces the contents of the <em>pattern space</em> with the contents of the <em>hold space</em>. After executing the command the original contents of the <em>pattern space</em> will be lost, and the two buffers will have the same contents.</p>
<p>The command can be preceded by any of the address types we saw in <a href="http://hackerpublicradio.org/eps/hpr1986" title="Introduction to sed - part 2">episode 2</a>.</p>
<h4 id="the-g-command-1">The <em>G</em> command</h4>
<p>This command appends the contents of the <em>hold space</em> to the <em>pattern space</em> preceded by a newline. The contents of the <em>hold space</em> will not be affected by this process.</p>
<p>The command can be preceded by any of the address types we saw in <a href="http://hackerpublicradio.org/eps/hpr1986" title="Introduction to sed - part 2">episode 2</a>.</p>
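<p>As a sketch of how <strong>h</strong> and <strong>G</strong> can work together (a well-known idiom, shown here with invented input and assuming GNU <code>sed</code>), the following reverses the order of the lines. On every line except the first, <strong>G</strong> appends what has been saved so far to the current line; <strong>h</strong> then saves the result, and the accumulated buffer is printed when the last line is reached:</p>
<pre><code>reversed=$(printf &#39;a\nb\nc\n&#39; | sed -ne &#39;1!G;h;$p&#39;)
echo &quot;$reversed&quot;   # c, b, a (each on its own line)</code></pre>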
<h4 id="the-x-command">The <em>x</em> command</h4>
<p>This command exchanges the contents of the <em>hold</em> and <em>pattern spaces</em>.</p>
<p>The command can be preceded by any of the address types we saw in <a href="http://hackerpublicradio.org/eps/hpr1986" title="Introduction to sed - part 2">episode 2</a>.</p>
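<p>One possible use of <strong>x</strong> is sketched below, with invented input and assuming GNU <code>sed</code>. Every line is saved in the <em>hold space</em> by the <strong>h</strong> command at the end of the script, so when a line matching 'three' arrives the <em>hold space</em> still contains the previous line. The group swaps the buffers, prints the previous line, and swaps them back:</p>
<pre><code>previous=$(printf &#39;one\ntwo\nthree\nfour\n&#39; | sed -ne &#39;/three/{x;p;x};h&#39;)
echo &quot;$previous&quot;   # two</code></pre>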
<h2 id="flags-and-modifiers-we-omitted-earlier">Flags and modifiers we omitted earlier</h2>
<p>When we looked at the <strong>s</strong> command in episodes <a href="http://hackerpublicradio.org/eps/hpr1976" title="Introduction to sed - part 1">1</a> and <a href="http://hackerpublicradio.org/eps/hpr1986" title="Introduction to sed - part 2">2</a> we encountered a subset of the flags, and when we were looking at line addresses in episode <a href="http://hackerpublicradio.org/eps/hpr1997" title="Introduction to sed - part 3">3</a> we missed out one of the modifiers.</p>
<p>One of the missing flags to <strong>s</strong> was '<em>M</em>' (and '<em>m</em>' which is a synonym, just as '<em>I</em>' and '<em>i</em>' are) and the missing modifier was '<em>M</em>', and they all affect regular expression matching in the same way.</p>
<p>The '<em>M</em>' modifier/flag stands for <em>multi-line</em> and is useful in the case where the <em>pattern space</em> contains more than one line. It is a GNU <code>sed</code> extension.</p>
<p>The modifier causes '^' to match the empty string after a newline and '$' to match the empty string before a newline, in addition to their usual meanings. There are also special metacharacters which always match the beginning and end of the buffer. These are: '\`' for the beginning and &quot;\'&quot; for the end.</p>
<p>The following brief examples demonstrate the features of the '<em>M</em>' modifier.</p>
<p>Here we have accumulated two lines in the <em>hold space</em>, which have then been transferred to the <em>pattern space</em>. We use <strong>s</strong> commands (with a '<em>g</em>' modifier, which is superfluous in this example, but useful later<a href="#fn3" class="footnoteRef" id="fnref3"><sup>3</sup></a>) to add square brackets at the beginning and end:</p>
<pre><code>$ sed -ne &#39;1,2H;2{g;s/^/[/g;s/$/]/g;p}&#39; sed_demo1.txt
[
Hacker Public Radio (HPR) is an Internet Radio show (podcast) that releases
shows every weekday Monday through Friday. HPR has a long lineage going back to]</code></pre>
<p>Remember that there's an extra newline at the start of the <em>pattern space</em> due to the way the <strong>H</strong> command works. This example shows that there is only one &quot;beginning&quot; and &quot;end&quot; in this buffer.</p>
<p>If we then modify both of the <strong>s</strong> commands with the '<em>M</em>' flag/modifier we get:</p>
<pre><code>$ sed -ne &#39;1,2H;2{g;s/^/[/gM;s/$/]/gM;p}&#39; sed_demo1.txt
[]
[Hacker Public Radio (HPR) is an Internet Radio show (podcast) that releases]
[shows every weekday Monday through Friday. HPR has a long lineage going back to]</code></pre>
<p>Now '^' and '$' relate to newlines and surround each of the lines.</p>
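<p>The same difference can be sketched on a smaller scale without the demonstration file (invented input, GNU <code>sed</code> assumed). Two lines are joined in the <em>pattern space</em> with <strong>N</strong> and a prefix is added wherever '^' matches. Without '<em>M</em>' only the start of the buffer matches; with it, the start of each line does:</p>
<pre><code>single=$(printf &#39;a\nb\n&#39; | sed -e &#39;N;s/^/&gt; /g&#39;)
multi=$(printf &#39;a\nb\n&#39; | sed -e &#39;N;s/^/&gt; /gM&#39;)
echo &quot;$single&quot;   # prints &quot;&gt; a&quot; then &quot;b&quot;
echo &quot;$multi&quot;    # prints &quot;&gt; a&quot; then &quot;&gt; b&quot;</code></pre>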
<p>Now, to indicate the start and end of the buffer we need to use '\`' and &quot;\'&quot;. However, we have a problem since these characters are significant to the Bash shell, so we move to placing these commands in a file called <code>demo4.sed</code>:</p>
<pre><code>$ cat demo4.sed
1,2H
2{
g
s/\`/[/gM
s/\&#39;/]/gM
p
}
$ sed -nf demo4.sed sed_demo1.txt
[
Hacker Public Radio (HPR) is an Internet Radio show (podcast) that releases
shows every weekday Monday through Friday. HPR has a long lineage going back to]</code></pre>
<p>The <a href="hpr2011_demo4.sed" title="hpr2011_demo4.sed"><code>demo4.sed</code></a> file is available on the HPR website.</p>
<h2 id="examples">Examples</h2>
<h3 id="example-1">Example 1</h3>
<p>This example mainly demonstrates the use of the <strong>P</strong> and <strong>y</strong> commands:</p>
<pre><code>$ sed -ne &#39;1,2{s/$/\n-/;P;y/aeiou/eioua/;p}&#39; sed_demo1.txt
Hacker Public Radio (HPR) is an Internet Radio show (podcast) that releases
Heckir Pabloc Redou (HPR) os en Intirnit Redou shuw (pudcest) thet riliesis
-
shows every weekday Monday through Friday. HPR has a long lineage going back to
shuws iviry wiikdey Mundey thruagh Frodey. HPR hes e lung loniegi guong beck tu
-</code></pre>
<p>Auto-printing is turned off, and the <code>sed</code> commands are all grouped together and controlled by an address range that covers the first two lines of the file.</p>
<p>First an <strong>s</strong> command adds a newline followed by a hyphen to the current line in the <em>pattern space</em>. The following <strong>P</strong> command prints out the line that has just been edited in the <em>pattern space</em>, up to the newline that we just added (so we don't see the hyphen).</p>
<p>Then a <strong>y</strong> command operates on the line, which is still in the <em>pattern space</em>. It changes all the vowels by shifting them to the next in the alphabetic order - 'a' becomes 'e' and so forth. A final <strong>p</strong> command prints the edited line, which now generates two lines because of the newline and hyphen we added at the start.</p>
<h3 id="example-2">Example 2</h3>
<p>Here we use <strong>H</strong> and <strong>G</strong> to make use of the <em>hold space</em> and <em>pattern space</em>:</p>
<pre><code>$ sed -e &#39;1,/^$/{H;d};${G;s/\n$//}&#39; sed_demo1.txt
What differentiates HPR from other podcasts is that the shows are
produced by the community - fellow listeners like you. There is no
restrictions on how long the show can be, nor on the topic you can
cover as long as they &quot;are of interest to Hackers&quot;. If you want to see
what topics have been covered so far just have a look at our Archive.
We also allow for a series of shows so that host(s) can go into more
detail on a topic.

Hacker Public Radio (HPR) is an Internet Radio show (podcast) that releases
shows every weekday Monday through Friday. HPR has a long lineage going back to
Radio FreeK America, Binary Revolution Radio &amp; Infonomicon, and it is a direct
continuation of Twatech radio. Please listen to StankDawg&#39;s &quot;Introduction to
HPR&quot; for more information.</code></pre>
<p>On every line from number 1 to the first blank line the first group of commands is run. The <strong>H</strong> command appends the input line to the <em>hold space</em> with a newline on the front of it. The <strong>d</strong> command deletes the line from the <em>pattern space</em>, preventing auto-printing of it. All other lines outside this address range are printed automatically.</p>
<p>When the last line is encountered another command group is run. The <strong>G</strong> command appends the <em>hold space</em> to the <em>pattern space</em>, but an extra blank line will have been generated on the front by the first <strong>H</strong> command. The <strong>s</strong> command removes the last newline from the <em>pattern space</em>, balancing this addition. The <em>pattern space</em> will then be auto-printed.</p>
<p>The effect will be to take the first paragraph from the text and move it to the end.</p>
<h3 id="example-3">Example 3</h3>
<p>In <a href="http://hackerpublicradio.org/eps/hpr1986" title="Introduction to sed - part 2">Episode 2</a> I speculated on a solution to the problem of joining all of the lines in a text file to make one long line. The following example offers a solution to this:</p>
<pre><code>$ x=$(sed -ne &#39;H;${g;s/^\n//;s/\n/ /g;p}&#39; sed_demo1.txt)
$ echo ${#x}
768</code></pre>
<p>The example runs the <code>sed</code> command in a command substitution expression such that the variable <code>x</code> contains the result (this subject was covered in my episode entitled <a href="http://hackerpublicradio.org/eps/hpr1903_full_shownotes.html#command-substitution" title="Command Substitution">&quot;Some further Bash tips&quot;</a>). The length of the variable is then reported.</p>
<p>The <code>sed</code> command turns off auto-printing. Within the <code>sed</code> script itself the <strong>H</strong> command is run on every line, and this causes every input line to be appended to the <em>hold space</em> with a newline on the front.</p>
<p>When the last line of the file is encountered a group of commands is run. The <strong>g</strong> command replaces the <em>pattern space</em> with the <em>hold space</em>. The <em>pattern space</em> now begins with the newline that the first <strong>H</strong> command placed in front of the first input line. The first <strong>s</strong> command removes this from the front of the <em>pattern space</em>. The second <strong>s</strong> command replaces all the remaining newlines in the <em>pattern space</em> with spaces, thereby making one continuous string. This is then printed with the <strong>p</strong> command.</p>
<p>As a point of interest, the resulting text is the same length as the original, since each newline has simply been replaced by a space. This can be shown by the following:</p>
<pre><code>$ y=$(cat sed_demo1.txt)
$ echo ${#y}
768</code></pre>
<h2 id="quiz">Quiz</h2>
<h3 id="pig-latin-igpay-atinlay">Pig Latin / Igpay Atinlay</h3>
<p>Use the test data in <code>sed_demo1.txt</code> from <a href="http://hackerpublicradio.org/eps/hpr1976" title="Introduction to sed - part 1">Episode 1</a> and, using a single invocation of <code>sed</code>, convert the first line to <em>Pig Latin</em>. The rules of how to generate this are simple in essence, though there are some exceptions (see the <a href="https://en.wikipedia.org/wiki/Pig_Latin" title="Pig Latin">Wikipedia entry</a> for the full details). We will just go for the simplest solution in this quiz, though if you want to be more advanced in your submission please go ahead.</p>
<p>In brief the rules are:</p>
<ul>
<li>Take the first letter of each word and place it at the end, followed by 'ay'. Thus 'pig' becomes 'igpay' and 'latin' becomes 'atinlay'.</li>
<li>Skip 1-, 2- and 3-letter words, since 'a' -&gt; 'aay' is not wanted.</li>
<li>Do not bother about capitals. Ideally 'Latin' should become 'Atinlay', but <code>sed</code> may not be the best tool to use to do that!</li>
</ul>
<p>I will include <em>my</em> solution to this problem in the next episode. I hope you will be able to come up with a much better answer than I do!</p>
<blockquote>
<p><strong>Note:</strong> If you submit a working solution you may be eligible for a prize of some HPR stickers. Send your submission to me. My email address is available <a href="http://hackerpublicradio.org/correspondents.php?hostid=225" title="Dave Morriss">here</a> after removing the anti-spam measures. The competition will close after I have posted episode 5 in this series to the HPR site.</p>
</blockquote>
<h2 id="links">Links</h2>
<ul>
<li><em>Introduction to sed - part 1</em>: <a href="http://hackerpublicradio.org/eps/hpr1976" class="uri">http://hackerpublicradio.org/eps/hpr1976</a></li>
<li><em>Introduction to sed - part 2</em>: <a href="http://hackerpublicradio.org/eps/hpr1986" class="uri">http://hackerpublicradio.org/eps/hpr1986</a></li>
<li><em>Introduction to sed - part 3</em>: <a href="http://hackerpublicradio.org/eps/hpr1997" class="uri">http://hackerpublicradio.org/eps/hpr1997</a></li>
<li><em>Some further Bash tips</em>: <a href="http://hackerpublicradio.org/eps/hpr1903" class="uri">http://hackerpublicradio.org/eps/hpr1903</a></li>
<li>GNU <code>sed</code> manual: <a href="https://www.gnu.org/software/sed/manual/sed.html" class="uri">https://www.gnu.org/software/sed/manual/sed.html</a></li>
<li>Wikipedia entry for <code>sed</code>: <a href="https://en.wikipedia.org/wiki/Sed" class="uri">https://en.wikipedia.org/wiki/Sed</a></li>
<li>&quot;<em>Sed - An Introduction and Tutorial</em>&quot; by Bruce Barnett: <a href="http://www.grymoire.com/Unix/Sed.html" class="uri">http://www.grymoire.com/Unix/Sed.html</a></li>
<li>Wikibooks sed wiki: <a href="https://en.wikibooks.org/wiki/Sed" class="uri">https://en.wikibooks.org/wiki/Sed</a></li>
<li>Example files:
<ul>
<li>Demonstration Bash script: <a href="hpr2011_demo3.sh" class="uri">hpr2011_demo3.sh</a></li>
<li>Demonstration of '<em>M</em>' modifier: <a href="hpr2011_demo4.sed" class="uri">hpr2011_demo4.sed</a></li>
</ul></li>
<li>Wikipedia entry for &quot;<em>Pig Latin</em>&quot;: <a href="https://en.wikipedia.org/wiki/Pig_Latin" class="uri">https://en.wikipedia.org/wiki/Pig_Latin</a></li>
</ul>
<!--
vim: syntax=markdown:ts=8:sw=4:ai:et:tw=78:fo=tcqn:fdm=marker
-->
<section class="footnotes">
<hr />
<ol>
<li id="fn1"><p>My explanation of <code>tee</code> in the audio was less than clear. I should have said that <strong>everything</strong> sent through the command is written both to the file and to STDOUT. I think the text explains it though.<a href="#fnref1">↩</a></p></li>
<li id="fn2"><p>If you run the script <a href="hpr2011_demo3.sh" title="hpr2011_demo3.sh"><code>demo3.sh</code></a> and try to look at the temporary file you will not see it. This is because the <em>PID</em> generated by <code>$$</code> is local to the process running the script. I have modified the script to report the name of the file to allow you to examine it once it has run.<a href="#fnref2">↩</a></p></li>
<li id="fn3"><p>I used the '<em>g</em>' flags here just because I used them in the next example; they don't actually do anything. With hindsight, it might have been better if I had removed them in this one.<a href="#fnref3">↩</a></p></li>
</ol>
</section>
</article>
</main>
</div>
</body>
</html>