Files
hpr_website/www/eps/hpr2736/hpr2736_full_shownotes.html

125 lines
11 KiB
HTML
Raw Normal View History

2025-10-28 18:39:57 +01:00
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="generator" content="pandoc">
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<meta name="author" content="Dave Morriss">
<title>Response to show 2720 (HPR Show 2736)</title>
<style type="text/css">code{white-space: pre;}</style>
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
<link rel="stylesheet" href="http://hackerpublicradio.org/css/hpr.css">
</head>
<body id="home">
<div id="container" class="shadow">
<header>
<h1 class="title">Response to show 2720 (HPR Show 2736)</h1>
<h2 class="subtitle">Some suggestions on how to improve a Bash script</h2>
<h2 class="author">Dave Morriss</h2>
<hr/>
</header>
<main id="maincontent">
<article>
<header>
<h1>Table of Contents</h1>
<nav id="TOC">
<ul>
<li><a href="#introduction">Introduction</a></li>
<li><a href="#general-issues">General issues</a></li>
<li><a href="#specifics">Specifics</a><ul>
<li><a href="#line-50">Line 50</a></li>
<li><a href="#line-68">Line 68</a></li>
<li><a href="#line-88">Line 88</a></li>
</ul></li>
<li><a href="#conclusions">Conclusions</a></li>
<li><a href="#links">Links</a></li>
</ul>
</nav>
</header>
<h2 id="introduction">Introduction</h2>
<p>On January 4<sup>th</sup> 2019 Ken Fallon did a great show entitled <a href="http://hackerpublicradio.org/eps/hpr2720" title="Download youtube channels using the rss feeds"><em>hpr2720 :: Download youtube channels using the rss feeds</em></a> where he presented a Bash script called <a href="/eps/hpr2720/hpr2720_youtube-rss.bash" title="youtube-rss.bash"><code>youtube-rss.bash</code></a> for managing YouTube downloads through RSS feeds.</p>
<p>Ken said he welcomed constructive feedback <font size="+3"></font></p>
<p>When I see a Bash script these days I usually find myself looking for ways to rewrite it to make it fit in with what I have been learning while doing my <em>Bash Tips</em> sub-series. Either that or I find its got some better ideas than Ive been using which I have to find out about.</p>
<p>I also spend time going over my own old scripts (I was writing them in the 1990s in some cases) and trying to incorporate newer Bash features.</p>
<p>Suffice it to say that I spotted some areas for improvement in Kens script and thought this might be the way to share my thoughts about them. Were low on shows as I write this, so that gave me more motivation to make a show rather than add a comment or send Ken an email.</p>
<p><small><b>Apology:</b> Im still suffering from the aftermath of some flu-like illness so have had to edit coughing fits out of the audio at various points. If you detect any remnants then Im sorry!</small></p>
<h2 id="general-issues">General issues</h2>
<p>There are a few uses of <code>while</code> loops in the script that I would be inclined to rewrite, but as it stands the script does what is wanted without these changes.</p>
<p>I loaded the script into Vim where I have the <code>ShellCheck</code> tool configured to examine Bash scripts. It found many issues, some fairly trivial, but a few were a bit more serious.</p>
<p>Click on the thumbnail to get an idea of what <code>ShellCheck</code> reported.</p>
<p><a href="hpr2736_vim_with_shellcheck.png"><img src="hpr2736_vim_with_shellcheck_thumb.png" alt="Vim with ShellCheck" /></a></p>
<p>Here are some of the reported issues:</p>
<ul>
<li><p>[34] <code>ShellCheck</code> doesnt like <code>&quot;${logfile}&quot;.$(/bin/date +%Y%m%d%H%M%S)</code> but would be happy with <code>&quot;${logfile}.$(/bin/date +%Y%m%d%H%M%S)&quot;</code>. Its not clever enough to know that the <code>date</code> call will not generate spaces.</p></li>
<li><p>[39,48] Using <code>read</code> without the <code>-r</code> option means that any backslash characters in the input will not be handled properly so <code>ShellCheck</code> reports this.</p></li>
<li><p>[39,48] Personally, as alluded to above, I now try to avoid loops of the form:</p>
<pre><code> command_generating_data | while read -r var; do
do_things
done</code></pre>
<p>That is because the loop body is run in a sub-process which cannot communicate back to the main script.</p>
<p>I would much rather write it as:</p>
<pre><code> while read -r var; do
do_things
done &lt; &lt;(command_generating_data)</code></pre>
<p>because the loop body <em>can</em> affect variables in the main script if needed.</p></li>
</ul>
<p>In Kens script these are not serious issues since these are <code>ShellCheck</code> warnings and no loop needs to pass variables to the main script. However, as far as the loops are concerned, it would be all too easy to enhance them in the future to pass back values forgetting that it will not work. I have made this mistake myself on more than one occasion!</p>
<h2 id="specifics">Specifics</h2>
<p>I want to highlight a few instances which I would be <b>strongly</b> inclined to rewrite. I have referred to them by line within the <code>youtube-rss.bash</code> downloadable from show 2720.</p>
<h3 id="line-50">Line 50</h3>
<pre><code> if [ &quot;$( grep &quot;${thisvideo}&quot; &quot;${logfile}&quot; | wc -l )&quot; -eq 0 ]</code></pre>
<p>This <code>if</code> command uses <code>grep</code> to search <code>$logfile</code> for <code>$thisvideo</code> and wants to know if there is (at least) one match, so the output is passed to <code>wc</code> to count the lines and the resulting number from the command substitution is checked to see if it is zero i.e. there was no match.</p>
<p>However, since <code>grep</code> returns a <code>true/false</code> answer for cases such as this, the following would do the same but simpler:</p>
<pre><code> if ! grep -q &quot;${thisvideo}&quot; &quot;${logfile}&quot;</code></pre>
<p>This time <code>grep</code> performs the same test but its result is reversed by the <code>'!'</code>. The <code>-q</code> option tells it not to produce any output.</p>
<h3 id="line-68">Line 68</h3>
<pre><code> if [[ ! -z &quot;${skipcrap}&quot; &amp;&amp; $( echo ${title} | egrep -i &quot;${skipcrap}&quot; | wc -l ) -ne 0 ]]</code></pre>
<p>In the script shown in the notes for 2720 theres a <code>skipcrap</code> variable, but this has been commented out, so <code>ShellCheck</code> objects to this of course. I re-enabled it for testing. In general though this <code>if</code> command is unnecessarily convoluted. It can be rewritten thus:</p>
<pre><code> if [[ -n &quot;${skipcrap}&quot; ]] &amp;&amp; echo &quot;${title}&quot; | grep -E -q -i &quot;${skipcrap}&quot;</code></pre>
<p>The changes are:</p>
<ul>
<li><p>Rather than checking if the length of string <code>skipcrap</code> is zero then negating it, better to use the non-zero test from the start <code>-n</code>.</p></li>
<li><p>Since this <code>-n</code> test needs to be in <code>[]</code> or <code>[[]]</code> but the rest does not we organise things that way.</p></li>
<li><p>The second half of the test after <code>&amp;&amp;</code> can be a pipeline since the result of the final part is what is considered in the test. In this case non-standard <code>egrep</code> is replaced by <code>grep -E</code> and the test is performed quietly with only the true/false result being of importance.</p></li>
</ul>
<p>A further change might be to do away with <code>echo &quot;${title}&quot; | grep -E -q -i &quot;${skipcrap}&quot;</code> and use a Bash regular expression. Here <code>grep</code> is being used as a regular expression tool, comparing <code>title</code> with <code>skipcrap</code>. The latter is a regular expression itself (since thats what <code>grep -E</code> needs), so everything is ready to go:</p>
<pre><code> if [[ -n &quot;${skipcrap}&quot; &amp;&amp; $title =~ $skipcrap ]]</code></pre>
<p>This passes the <code>ShellCheck</code> test and when I run it from the command line it seems to work:</p>
<pre><code>$ skipcrap=&quot;fail |react |live |Best Pets|BLOOPERS|Kids Try&quot;
$ title=&#39;The Best Pets in the World&#39;
$ if [[ -n &quot;${skipcrap}&quot; &amp;&amp; $title =~ $skipcrap ]]; then echo &quot;Crap&quot;; else echo &quot;Non-crap&quot;; fi
Crap</code></pre>
<p>The downside is that the test is case-sensitive whereas the <code>grep</code> version used <code>-i</code> which made it case-insensitive. This can be overcome in a slightly less elegant way perhaps by forcing everything to lowercase for the test:</p>
<pre><code> if [[ -n &quot;${skipcrap}&quot; &amp;&amp; ${title,,} =~ ${skipcrap,,} ]]</code></pre>
<h3 id="line-88">Line 88</h3>
<pre><code> cat &quot;${logfile}_todo&quot; | ${youtubedl} --batch-file - --ignore-errors --no-mtime --restrict-filenames --format mp4 -o &quot;${savepath}&quot;&#39;/%(uploader)s/%(upload_date)s-%(title)s⋄%(id)s.%(ext)s&#39;</code></pre>
<p><code>Shellcheck's</code> complaint about this is that its a <em>useless cat</em>. Heres the full message:</p>
<pre><code>SC2002: Useless cat. Consider &#39;cmd &lt; file | ..&#39; or &#39;cmd file | ..&#39; instead.</code></pre>
<p>The commands can be rewritten as one of the following:</p>
<pre><code> ${youtubedl} --batch-file - --ignore-errors --no-mtime --restrict-filenames --format mp4 -o &quot;${savepath}&quot;&#39;/%(uploader)s/%(upload_date)s-%(title)s⋄%(id)s.%(ext)s&#39; &lt; &quot;${logfile}_todo&quot;
&lt; &quot;${logfile}_todo&quot; ${youtubedl} --batch-file - --ignore-errors --no-mtime --restrict-filenames --format mp4 -o &quot;${savepath}&quot;&#39;/%(uploader)s/%(upload_date)s-%(title)s⋄%(id)s.%(ext)s&#39;</code></pre>
<p>See the <a href="https://en.wikipedia.org/wiki/Cat_(Unix)" title="Wikipedia article on `cat`">Wikipedia article on <code>cat</code></a> for more information (also <a href="https://en.wikipedia.org/wiki/Cat_(Unix)#Useless_use_of_cat" title="Useless Use Of Cat">UUOC</a> and <em>demoggification</em>).</p>
<h2 id="conclusions">Conclusions</h2>
<p>Bash is powerful but does some things in an obscure way.</p>
<p><code>Shellcheck</code> is both a very helpful tool, helping to catch scripting errors, but can also be something of an irritant when it nags about things. I am preparing another HPR episode about its use with Vim as well as the possible use of other checkers.</p>
<p>Apologies to Ken if it seemed as if I was making excessive criticisms of his scripts. What I intended was <em>constructive criticism</em> of course.</p>
<h2 id="links">Links</h2>
<ul>
<li><a href="http://hackerpublicradio.org/eps/hpr2720">HPR episode 2720 “<em>hpr2720 :: Download youtube channels using the rss feeds</em></a></li>
<li><a href="/eps/hpr2720/hpr2720_youtube-rss.bash"><code>youtube-rss.bash</code></a></li>
<li><a href="https://en.wikipedia.org/wiki/Cat_(Unix)">Wikipedia article on <code>cat</code></a>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Cat_(Unix)#Useless_use_of_cat">Useless use of <code>cat</code></a></li>
</ul></li>
</ul>
</article>
</main>
</div>
</body>
</html>