Episode: 4417
Title: HPR4417: Newest matching file
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr4417/hpr4417.mp3
Transcribed: 2025-10-26 00:29:10
---
This is Hacker Public Radio Episode 4417 for Tuesday 8 July 2025.
Today's show is entitled, Newest Matching File.
It is hosted by Dave Morris and is about 22 minutes long.
It carries an explicit flag, the summary is, writing a function or script to find a given
file.
Hello, welcome to Hacker Public Radio.
This is Dave Morris with a somewhat delayed show.
Today I'm talking about a script, actually several scripts that I've written that do pretty
much the same thing, but in different ways, and the show is called Newest Matching File,
which is about writing a function or script to find a given file.
So to start with, several years ago, I wrote a script to perform a task I needed to perform
every day: pretty much, find the newest file in a series of files.
At this point I was running a camera on a Raspberry Pi, which was attached to a window and viewed
my back garden.
I was taking a picture every 15 minutes, giving them names containing the date and time,
and storing them in a directory.
And it was useful to be able to display the latest picture.
Since then I've found that searching for newest files is a useful thing to do in many
contexts.
I run a random recipe chooser to decide what we're going to eat each week, and I've done a show
on this in the distant past, and so I get the thing to run and make an image, put it in the
clipboard, and then send it to the Telegram channel that we use.
Another use is generating a weather report from wttr.in and sending it to Matrix.
I also find that when I've made a screenshot I want to find it, as they tend
to be created with dates and times in their names, and then put it in the clipboard so I can send
it to whoever I was making it for or, indeed, put it into a show or something of that sort.
Of course, I could just use the same name every time I make these files, rather than
accumulating a whole pile of them with date stamps and so on.
But I often want to look back through such collections.
If I'm concerned about the files accumulating in an unwanted way, and I usually am,
I write cron scripts which run every day and delete the oldest ones, say material older
than three months, or older than a week, or whatever.
So the original script was written as a Bash function, which was loaded at login time. The function
is called newest_matching_file, and it takes two arguments.
The first is a file glob expression to match the file I'm looking for, and if you've listened
to others of my HPR shows on Bash stuff you'll know what I'm talking about: it's the
thing you use in ls or whatever to say I want the files that begin with this, but I don't
mind how they end, where you use a star, an asterisk I mean, or you can also use question
marks.
There are other bits of glob syntax that I have covered in other shows.
The other argument is an optional directory to look for the file, and if you omit it,
then the current directory will be used.
So the first version of this function was a bit awkward since it used a for loop to scan
the directory using the glob pattern to find the file, but I didn't like it because when
you do a glob search in Bash commands, in echo or whatever, then if there's no match with
the glob, it will just return the thing you typed in, the actual argument to whatever
it is you did.
So if you did echo and then some glob thing, the answer you'd get back would be the very
same thing you just typed, meaning it found nothing, and I didn't like that; I always thought that
was a bad piece of design.
You can make it behave differently by using the option nullglob, which I've mentioned in the
notes, and I did a show, maybe one or two shows, on the subject a few years ago, so you can
use that to turn this feature on or off.
The only trouble is that if your script crashes, it will leave nullglob on because it's not
local to the script, it's global across your whole shell session.
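
The unmatched-glob behaviour described above can be seen in a minimal Bash sketch like the
following. It is an illustration only, not code from the show notes; the temporary directory and
the pattern *.nomatch are made up for the demonstration. Setting nullglob inside a command
substitution is one way to stop the option leaking into the rest of the shell session.

```bash
#!/usr/bin/env bash
# Demonstrates what an unmatched glob expands to, with and without nullglob.
# The directory is a fresh temporary one, so '*.nomatch' cannot match anything.
workdir=$(mktemp -d)

# Default behaviour: the unmatched pattern is passed through unchanged.
without_nullglob=$(cd "$workdir" && echo *.nomatch)

# With nullglob set the pattern expands to nothing. The command substitution
# runs in a subshell, so the option is not left switched on in the caller.
with_nullglob=$(cd "$workdir" && shopt -s nullglob && echo *.nomatch)

printf 'default:  [%s]\n' "$without_nullglob"
printf 'nullglob: [%s]\n' "$with_nullglob"

rmdir "$workdir"
```

The square brackets in the output are only there to make an empty result visible.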
So later on I upgraded this Bash version by using a pipeline with the find command. I quite
like using find; I've done it a lot in other scripts, and once you get your head
around what you can do with find it's very powerful.
So the script is in the notes here, the improved one using find, and it's also on GitLab,
which I'll refer to at the end of the notes. As I say, it's in the notes
here, and you can read through it; I'm not going to read it to you because there's nothing
more boring than hearing scripts read out, with all the characters and stuff, I find, but
I will do a quick explanation of each of the chunks. It's a function, and it's
actually in a file called newest underscore matching underscore file underscore one dot
sh, because there's another one that I've been playing with, and that's on GitLab too.
In order to use the function, you need to load it or source it, or declare it, I guess
you could say, and you can do dot, that's a full stop, then a space, then newest_matching_file,
blah blah blah, the name of the file. The full stop or dot is a shorthand version of the command
source, S-O-U-R-C-E, so use whichever you prefer.
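
A minimal sketch of loading a function definition with the dot command, as just described. The
temporary file and the greet function are hypothetical stand-ins for newest_matching_file_1.sh and
the function it defines, used so that the example runs anywhere.

```bash
#!/usr/bin/env bash
# Write a tiny function definition to a temporary file (hypothetical example).
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
function greet {
    printf 'hello from %s\n' "${1:-nobody}"
}
EOF

# Load the definition into the current shell. In Bash, '.' and 'source'
# do the same job, so 'source "$tmpfile"' would work equally well here.
. "$tmpfile"

# The function is now defined in this shell and can be called like a command.
greeting=$(greet HPR)
echo "$greeting"

rm -f "$tmpfile"
```

Putting the same dot or source line in a login file such as ~/.bashrc is what makes a function
available in every new shell.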
The other version of the Bash function uses find regular expressions, which is quite a nice
feature and powerful, but not powerful enough for my liking, though you might think differently,
and I actually prefer the one I'm talking about here. So here's an explanation. The function
begins with the usual sort of declaration, in this case using the word function, though there are
alternative ways of doing this in Bash. Following that function line, there are two lines
beginning local; this is where it's defining variables which are local to the function, so they only
last as long as the function is running, and these are to hold the arguments.
The first one is called glob underscore pattern, and it's expected to receive a glob pattern
like the example in the notes, Screenshot underscore 2025 hyphen zero four hyphen
asterisk dot png, so pretty obvious, I hope.
The second variable is called dir, and it will hold the directory that you're going to
search. Next in the function, there's an if statement which checks that there's the right
number of arguments. So, it must have one argument, and it can also have two, but no more, no less.
If that's not the case, there's an echo command there with an error message, and it writes to
standard error, STDERR; errors tend to be written to standard error, and the way it's done
is by appending greater than, ampersand, 2 to the end of the echo, so that it goes to the error
channel. That's in order that the message isn't interpreted as being the answer, which is a file
name, and the function also exits with the value 1; I'll talk about that later. Then the next bit
is another if, which checks that the directory exists and is a directory, and it aborts with an
error message in the same sort of way if it doesn't. Then another local variable called newest_file
is defined.
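
The argument handling described so far could look roughly like this. It is a sketch under
assumptions rather than the script from the notes: the name check_args_demo, the wording of the
messages and the final printf are invented for illustration, and the search itself is left out.

```bash
#!/usr/bin/env bash
# Hypothetical skeleton showing only the argument handling of such a function.
function check_args_demo {
    # Local variables hold the arguments and vanish when the function returns
    local glob_pattern=${1-}
    local dir=${2:-$PWD}      # a missing second argument means the current directory

    # One or two arguments are accepted, nothing else
    if [[ $# -lt 1 || $# -gt 2 ]]; then
        # >&2 sends the message to standard error, keeping standard output clean
        echo 'Usage: check_args_demo GLOB_PATTERN [DIR]' >&2
        return 1
    fi

    # The second argument, if given, must be an existing directory
    if [[ ! -d $dir ]]; then
        echo "Unable to find directory $dir" >&2
        return 1
    fi

    # Stand-in for the real work: show what would be searched for, and where
    printf 'pattern=%s dir=%s\n' "$glob_pattern" "$dir"
}
```

Because the complaints go to standard error, a caller that captures the output with a command
substitution gets an empty string on failure and never mistakes an error message for a file name.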
It's good practice not to create global variables in functions, unless you need to do that,
because they leak into the calling environment and can confuse things; you might overwrite
something that already exists there, for example. So, the variable newest_file is then set to the
result of a command substitution, and inside the command substitution there's a pipeline, which is
wrapped over two lines with a backslash on the end of the first line so that it's continued. So the
find command is the first thing in the pipeline, and it's searching the target directory, basically,
and then it gets the argument hyphen maxdepth 1, which limits the search to that directory; it
doesn't descend into any subdirectories. The search pattern is defined by the option hyphen name,
then in double quotes, dollar glob_pattern; the quotes are in case there are spaces in your pattern,
and you can do that. So, the next option is hyphen type f, meaning that it's searching for files,
and the last option to find is hyphen printf. printf lets you print things about the current
file, the one that's being examined. Following printf, there's a double-quoted string, which contains
percent capital T at. Now, what that means is, having found a file, report its modification time
as a number of seconds since the Unix epoch, that's the sort of start time in the Unix world,
1970 or something. Anyway, it's just a number, and every file will have a different one,
and they represent their ages. So, it's a larger number if the file is
newer, as I've just read in the notes here, and then this is followed by a space, and then the full
path to the file, which is obtained with percent p, and then backslash n to get a newline after
each one, which should be a file path. So, then we pipe that into sort, and because each file path
is preceded by a number, it's going to sort by the numbers in ascending order. Then sed is the next
thing in the pipeline, and it uses hyphen n as its first option, and what that means is don't print
anything unless you're asked to explicitly; we'll see what that means in a minute. So, this is then
followed by hyphen e and an expression, which people call a sed program, I think; I do anyway.
So, what that contains is a dollar sign, the first character in the quoted string that tells sed
what to do, and in a sed expression what that means is: at the point at which you reach
the last line of the file, or the stream of data, do the following. So it ignores all lines except
the last one, and then when it finds the last line, the leading modification time is removed, and
that's done with the expression s for substitute, slash, dot, backslash plus, space, slash, slash,
which means everything, all characters up to and including the first space, just delete, and then
that's followed by semicolon and p, which means print the line you've got. So, no other lines are
printed but that last one. I forgot to say that after the dollar sign there are curly braces, which
says trigger on the last line, and then do the stuff in the braces. Okay, so that's the end of the
pipeline. So, the intention is that
it will generate a file name, which is the newest file, because it's sorting them in that order,
or it will return nothing, because the search failed to match anything. So the penultimate bit of
the function is the test to see if newest_file is empty. So, the hyphen n in double square brackets
is checking to see if it's not empty, and if it is not empty, or isn't empty, if you like,
then the code there will print the contents of newest_file using a printf command.
So if there is something to print, then print it, and then the whole thing ends with a return 0,
which means true, success. So, success is reported by the function at this point, whether it
found a file or not, but, as I said, we return 1 where there is an error; that's a
pretty common convention. So, the function is designed to be used in commands or other scripts,
and I've given an example, which is the definition of an alias, and the alias is called
copy_screenshot. In order to create one, you do this in a file that gets run at login time: alias,
space, copy_screenshot, equals, and then a quoted string. The quoted string contains the commands,
or command, really, I guess, that the alias stands in for, and in this case it runs
xclip, which is the thing for manipulating the clipboard, and it says hyphen selection, space,
clipboard, then hyphen t image slash png, so it's expecting an image of that type, then
hyphen i will be followed by a file. The file name is generated by running, in a command
substitution, newest_matching_file. You'll notice that there's a backslash in front of the dollar,
before the dollar parenthesis; that's because this is an alias definition, and if you didn't do
that, then Bash would assume that you were trying to substitute something in there at the time
you were defining the alias, whereas actually what you want to do is to put in the result of
newest_matching_file at the point at which the alias is called. newest_matching_file is given an
argument, a glob pattern, Screenshot, underscore, asterisk, dot png, and it's going to look
in tilde slash Pictures slash Screenshots; tilde, as you know, is your home directory, so there's a
Pictures directory there and one within it called Screenshots. That loads the image into the
clipboard, and then you can paste it into whatever you need, putting the screenshot into a social
media client or whatever.
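
Pulling the walkthrough together, the core of the function and the alias might be sketched as
follows. This is a reconstruction from the spoken description, not the script from the show notes,
and it assumes GNU find (for -maxdepth and -printf) and GNU sed. The argument checks are left out to
keep it short. Two details are choices made for this sketch rather than things stated in the
episode: sort is given -n, and the sed substitution is written as s/^[^ ]* // so that it stops at
the first space even when the file name itself contains spaces (the dot, backslash plus form is
greedy and would run on to the last space in the line).

```bash
#!/usr/bin/env bash
# Sketch of the search described in the episode; requires GNU find and GNU sed.
function newest_matching_file {
    local glob_pattern=${1-}
    local dir=${2:-$PWD}
    local newest_file

    # find prints "modification-time-in-epoch-seconds path" for each match,
    # sort puts the largest (newest) time last, and sed prints only that
    # last line after stripping everything up to the first space.
    newest_file=$(find "$dir" -maxdepth 1 -name "$glob_pattern" -type f \
        -printf '%T@ %p\n' | sort -n | sed -n -e '${s/^[^ ]* //;p}')

    # Print only when something was found; printf copes with odd file names
    [[ -n $newest_file ]] && printf '%s\n' "$newest_file"

    # Success is reported whether or not a file was found
    return 0
}

# The alias from the episode. The backslash delays the command substitution
# until the alias is used. It is only defined here, not run, since xclip and
# the screenshot directory may not exist on the machine trying this out.
alias copy_screenshot="xclip -selection clipboard -t image/png -i \$(newest_matching_file 'Screenshot_*.png' ~/Pictures/Screenshots)"
```

Called directly, newest_matching_file 'Screenshot_*.png' ~/Pictures/Screenshots prints the full
path of the most recently modified matching file, or prints nothing at all if there is no match.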
This doesn't cater for newest_matching_file returning an error or nothing, but it's
difficult to make aliases really resilient. You would turn to a function or script to do that,
but it works for me, and it was really just a demonstration of how to use it. So I wrote a
Perl alternative to this script because, you know, I'm okay with the
Bash version, but it caused me problems when I was trying to use it in conjunction with pdmenu,
the thing which gives you menu features and runs commands, shell commands, for you, according to
which menu item you choose. pdmenu uses the Bourne shell, and rather than
trying to hack it to stop it doing that and use Bash instead, I decided I would use Perl. It's just
entertaining to learn how to do that. So that script is available in the notes, and on GitLab,
and I will do a quick summary of it for you right now. This is a pretty straightforward Perl script.
It's run out of an executable file: you need to put it in a file and then set the execute bit.
At the beginning of this is the so-called shebang: hash, exclamation mark, and in this case it's
the path to env, and then it's followed by perl, and that's using env to run the thing you want to
run, wherever it happens to be. Then the first few lines of the script. The first one is to define
which version of Perl to use. 5.40 is the latest version, or near enough at least, but it doesn't
have to be 5.40; if you wanted to run this and you don't have that installed, then you can drop the
number to what you do have. The thing that says use open is all about making sure that all I/O
the script performs is using UTF-8 characters, so that allows you to cope with characters with
accents on them, and so forth. Then the use commands define which modules to use, and what will
be, in effect, libraries: Cwd provides functions for determining the path name of the
current working directory, and File::Find::Rule is the module which provides the tools
associated with searching the file system, which is similar to the find command used in the Bash
version, but with more features.
I define a variable called prog, dollar prog, which contains the name of the script,
the terminal name, not the full path of the script; that's just useful for error messages, to say
which script has had a problem. Then it comes to a bit that defines the variable dollar regex, and
it is set using shift, and shift is the usual way in which Perl grabs an argument from the command
line, and it stores it in the variable dollar regex. I should have said, things with dollars in
front, in Perl code, are scalars: simple variables, simple strings, simple numbers, and so forth.
Then the second argument is optional, and if omitted it is set to the current working directory.
So we use shift to load a variable, dollar dir, but if shift returns nothing, because there is no
second argument, the double slash operator in Perl will invoke the thing that follows it, and in
this case that is a call to getcwd, which returns the current working directory. There's a test to
see that dollar regex is defined, and the command die is used, which terminates the script with an
error message; that's the way things are done, or one of the ways things are done, in Perl. Then
we run the search itself using File::Find::Rule, and the results from that search are added to the
array at files. Now, the at is a sigil, I think it's called: a character put on the front of the
name to indicate that it is an array, or list. The call looks weird in that there are multiple
methods being called in a chain; the hyphen greater than sign is a sort of joiner for each of the
methods. So the method file sets up a file search, then file passes on its
results to name, and name is called with the regular expression. I won't go into what that
precisely means, but basically it's using a regular expression in the Perl regular expression
format, and of course that's the thing that does the search and checks every file to see which
ones match. maxdepth follows that, set to one, so that it's going one level only and doesn't
descend below the top of the tree; we don't go into subdirectories, the same as in the Bash find.
The last method is in, which defines which directory to search. At that point the search runs,
and if the search is unproductive, if the array is empty, the script ends with a die reporting an
unsuccessful search. If it was successful, then the files array is sorted,
and this is done by comparing the modification times of the files, and the array is reordered such
that the youngest or newest file is sorted first. The strange less than, equals, greater than
operator compares the value of the left operand with the right operand and returns less than,
equal, or greater than, so it's one of those things you don't often see, and I've only really
seen it in the sort function, but maybe it has other uses. Then finally, the script reports the
file which is the newest file; that's the first element of the array, which is returned with the
expression dollar files, the name of the array, square bracket zero, that's the first element of
it. There are other ways of doing that, but that's the one I chose. So this script can be used in
almost the same way as the Bash variant, and the only difference is that the pattern used to find
the file, or the newest file, is a Perl regular expression, so it's a lot more powerful,
potentially, if you want to get into that sort of stuff, which I do of course. I keep this script
in my bin directory, and that is where I've got things that can be invoked just by typing the name
alone, as long as they've got executable settings. It's a convention: tilde slash bin is added to
the path in Linux, in most cases, so it's good. Because I'm lazy and I didn't want to type the
full name every time I call this, I also created a symlink to it called nmf. A symlink is a link,
a directory entry that says I'm pretending to be a file, but the real file that I'm
representing is over there, and it gives the path of the file that's represented. So my example
shows the same as the previous one, except that this time I'm using the symlink nmf to look up the
file, and the expression that follows nmf is a Perl regular expression, so it differs in as much
as the underscore is followed by dot asterisk, which means any character zero or more times,
then that's followed by backslash dot png, and backslash dot says that although the dot is a
metacharacter meaning any character, we escape it so that it just represents that particular
character, the dot. Now is not the time to be digging deeply into Perl regular expressions;
all right, I may well need to move on to that, but I'm not going to talk about it a lot.
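
The difference between the glob and the regular expression, and the reason for escaping the dot,
can be tried out with a small Bash sketch. It uses Bash's own pattern test and its =~ operator,
which takes POSIX extended regular expressions rather than Perl ones, though an expression this
simple behaves the same in both. The file names, the ^ and $ anchors and the deliberately
unescaped "loose" expression are assumptions added for the demonstration.

```bash
#!/usr/bin/env bash
# Compare a glob, the matching regular expression, and the same expression
# with the dot left unescaped, against some hypothetical file names.
glob='Screenshot_*.png'
regex='^Screenshot_.*\.png$'
loose='^Screenshot_.*.png$'     # unescaped dot: matches any character

function matches {
    local name=$1 by_glob=no by_regex=no by_loose=no
    # An unquoted right-hand side of == is treated as a glob pattern
    [[ $name == $glob ]] && by_glob=yes
    # An unquoted right-hand side of =~ is treated as an extended regex
    [[ $name =~ $regex ]] && by_regex=yes
    [[ $name =~ $loose ]] && by_loose=yes
    printf '%s glob=%s regex=%s loose=%s\n' \
        "$name" "$by_glob" "$by_regex" "$by_loose"
}

matches 'Screenshot_2025-07-08.png'
matches 'Screenshot_2025-07-08_png'
```

The second name has an underscore where the dot should be, so only the expression with the
unescaped dot accepts it.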
So the conclusion is that in both examples, both cases, it's simply, you know, the files
matching the pattern: you accumulate them, and in the Bash case you attach a modification time to
them; the files are sorted by modification time, by one means or another, and the one with the
latest time is the answer, so the Bash version just has to remove the modification time before
printing. So this algorithm could be written in many ways, probably cleverer ways than I've done,
and other languages might do a better job, and so I'm going to try rewriting it in other languages
as a sort of useful yardstick to get to grips with those I don't use very much. Just see if you
can do the same, and what you come up with. There are some references to Wikipedia here about glob
patterns, and a couple of my shows from way back that talk about globs and stuff, and also a link
to my GitLab repo, where I keep all the various scripts that I'm working on for the purposes of
making HPR shows; hopefully you'll find that useful. Okay, that's it then, I hope you found it
interesting, and I'll catch you later.
You have been listening to Hacker Public Radio at HackerPublicRadio.org.
Today's show was contributed by a HPR listener like yourself. If you ever thought of recording
a podcast, then click on our contribute link to find out how easy it really is. Hosting for HPR has
been kindly provided by AnHonestHost.com, the Internet Archive and rsync.net.
Unless otherwise stated, today's show is released under a Creative Commons Attribution 4.0
International License.