Episode: 1430
Title: HPR1430: thebestofyoutube.com download script
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr1430/hpr1430.mp3
Transcribed: 2025-10-18 02:13:07

---

And that's something I'm calling right now.

Hi everybody, my name is Tempa, and you're listening to another episode of Hacker Public Radio.

In today's episode, we are going to be talking about what I hope will be a series: useful bash scripts, or indeed not useful bash scripts, or just bash scripts that you put together to make your life a little bit easier. We've had a few of these on before, so maybe this is something that people can take and run with.

Anyway, why did I write this bash script in the first place? Well, I like to keep abreast of what's going on on YouTube in the way of videos and internet memes and that sort of thing, and to do that I've traditionally used the site bestofyoutube.com, which is a curated list of YouTube videos that are popular. Basically every day a few videos might go up, probably about 20 a week, something like that, and the videos are linked in.

There are no more than, you know, five on a page, each with an embedded iframe link to YouTube, and yeah, I find it's a good place to get videos. Be warned if you're going to that website: it's kind of addictive, and they've got quite a lot of videos there.

But anyway, I normally watch those videos, there are only a few, so I tend to just view them in my browser, more or less. But recently I've been confined to bed quite a lot, and holding the laptop isn't conducive to resting, so I wanted to put these videos onto a laptop. Actually, I wanted to put them on the central NAS so I could watch them on TV if I was lying downstairs, or put them on the tablet if I was upstairs in bed.

So I had a quick look at view source of that page and I found out that there's actually not that much to the page per se. It's got a whole load of links, and if you search there for iframe you'll find that each video, five per page, has an iframe with height and source and frameborder, yeah, yeah, yeah, linking to YouTube. And then, as we all know, if you click on a YouTube link you will get to YouTube, and it will be something like, oops, pausing, the video started there.

You will get a URL like www.youtube.com/watch?v= (v for video) and then a 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 character ID in what appears to be ASCII: A to Z, uppercase and lowercase, with dashes and underscores. So my script is supposed to go to bestofyoutube, find out what the videos are, and then download them directly from YouTube. Seems simple enough.

Okay, but there are a few gotchas. There are only five videos per web page and I wanted to get the last 10 pages. So now we're talking about a loop that's going to have to go to each of the pages. The good news is that bestofyoutube has a very predictable page style: it's bestofyoutube.com/index.php?page= and then 1, 2, 3, 4, 5, 6, 7, 8, going up. New videos are always added as video 1 on page 1, so that was pretty cool. So I was thinking of doing a loop; then all I needed to do was find the YouTube embed, remove that part, and then I would know what the video ID was. Basically I'm trying to find the video IDs, make a list of those, and then use what is a brilliant piece of software called youtube-dl, which is the YouTube downloader,
and that is, let me see who makes this, there's no information about it on the help page, but actually, if I go youtube-dl --version, it might show me the name. Yeah, it just shows me what the version number is: 2013.11.29. Anyway, if you install youtube-dl, I think it's available in most repos, you can actually update it with the -U option; it's got a self-updating capability. It gives you loads of options, you can do playlists, and it's not just limited to YouTube: you can get videos from TED and Khan Academy and that sort of thing as well. So it's a pretty cool downloader, and therefore I didn't need to rewrite all this.

Initially my first attempt at this was just a basic script that looped through the bestofyoutube pages and, as it found each of them, downloaded it. But I wanted to get a little bit better than that. I wanted to be able to separate it out, so that when I ran it, it created a new directory on my NAS server for that particular run, and then each of the videos would be put in there, starting with 00000 and then 00001, the oldest one having the lowest number. So 00000 would be the oldest one, 00001 would be the second from oldest, and so forth all the way down along. Then I wanted it to download with various different
things, so let's just keep running through the script here. The first five lines are basically comments: what it is, what it does, and that it's been released under Creative Commons Zero, because I'm really not that worried about it. I added a few variables here at the top just so that we can do some cool stuff. maxtodownload is equal to 10; if you wanted to download all of them you'd change that to a thousand or something, whatever the number of pages on bestofyoutube is. If you go to bestofyoutube you'll find it goes from page one, next, next, and that'll give you the final number, whatever that is, so you can put that in there to run the first time.

Then the save path, which I have as mount, media, videos, tv, youtube, best youtube, so I've got loads of space on the NAS, and this will take loads of space, because I'm downloading the highest quality video. And then I'm going to have a savedir. The first part of that is the variable savepath, so savepath equals double quote, slash mount blah blah blah, double quote, and now I'm going to reuse that savepath for the savedir for this run. So the savedir for this run is going to be: savedir equals double quote, dollar sign, open curly bracket, savepath, close curly bracket, forward slash, dollar sign, open bracket, backslash date, space, dash u, space, plus sign, percent capital Y, dash, percent lowercase m, dash, percent lowercase d, underscore, percent uppercase H, dash, percent uppercase M, dash, percent uppercase S, uppercase Z, or Zed, underscore, percent A, close bracket, double quote.
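
Written out, the assignment read aloud above might look like the following. This is a sketch reconstructed from the spoken description rather than a copy of the script in the show notes: the variable spellings `savepath` and `savedir` and the exact path are assumptions.

```shell
# Base directory for every run. The spelling of this path is an assumption.
savepath="/mnt/media/videos/tv/youtube/bestofyoutube"

# Three pieces joined into one string:
#   ${savepath}   the base directory
#   /             a literal separator
#   $(...)        the output of date: UTC (-u), formatted by the + string
# The leading backslash makes the shell skip any alias named date.
savedir="${savepath}/$(\date -u +%Y-%m-%d_%H-%M-%SZ_%A)"

echo "${savedir}"
```

The last path component has the shape YYYY-MM-DD_HH-MM-SSZ_Dayname, and the day name follows whatever locale is in effect when the script runs.
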

Now what that does is concatenate three different things together. First the save path, which we had before. The reason for the dollar curly brackets instead of just dollar savepath is that when you're concatenating a string, it separates out the variable more cleanly for bash, so it's not assuming that the characters following it are part of the variable name. It's kind of good practice to do that anyway, so I tend to do that a lot. Then I have the forward slash, so I've got my bestofyoutube path, forward slash, and then I'm going to run a command, and the output of that command will be the directory name. What I'm using is the date command, and the backslash in front of date unaliases it. I actually have an alias for date set up to convert it to ISO 8601, which is year, month, day and then hours, minutes, seconds, blah blah blah, so just to make sure that it's not aliased by anybody else, you run the command that way. The dash u and the rest are all parameters of the date command.

Anything in between the dollar open bracket and the close bracket is run as a separate command outside of this bash thing. It's a bit like the backticks, but the beauty of this format is that you can nest them. So you could have dollar open bracket date, dollar open bracket some other command, close bracket, close bracket, and it would first run the nested one inside, then pass that to date, then pass that out, and so forth. I don't know if you're following; anyway, the links to this will be included in the show notes for this episode.

So basically all I'm doing is formatting the date as year, month, day, underscore, hours, minutes, seconds in Zulu time. The Z is just to denote to myself that it's UTC, so we don't have any of this silly time-changing nonsense, and the percent A just gives the day, so today, for instance, what it enters there is Friday. I noticed today, when using my daughter's laptop, that despite my locale being set to English it picked up her locale from somewhere, so I'm going to need to investigate that, and it was maandag, the actual Dutch name. But that doesn't matter for my purposes, as it is just a signal for me. Okay, that's line 8.
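
Two of the shell features mentioned for line 8, the curly brackets around a variable name and nested `$( )`, can be tried in isolation. The variable names and the path below are made up for illustration and do not come from the script.

```shell
name="video"

# Without braces the shell looks for a variable called name_todo,
# which is unset here, so the result is an empty string.
unbraced="$name_todo"

# With braces the variable name ends at the closing brace.
braced="${name}_todo"

echo "unbraced='${unbraced}' braced='${braced}'"
# prints: unbraced='' braced='video_todo'

# $( ... ) can be nested: the inner command runs first and its output
# becomes the argument of the outer command.
parent="$(basename "$(dirname "/tmp/example/file.txt")")"
echo "${parent}"
# prints: example
```
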

At this speed this is going to be a long show; anyway, it might be time for a cup of coffee soon. The next line, line number nine, is to make the directory: I use mkdir with the -p option and then I just put in savedir, the variable I created a few moments ago. It will make that directory, including the entire path if it doesn't exist. Then I make another variable called logfile, equal to, again, dollar savepath, forward slash, and I'm calling it download.log. I don't want to be re-downloading movies, or rather videos, over and over again.
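
A minimal sketch of these two lines, assuming the variable names `savedir` and `logfile` as heard in the recording. A temporary directory stands in for the real save path so that the fragment can run on any machine.

```shell
# Stand-in for the real save path (assumption made for this sketch only).
savepath="$(mktemp -d)"
savedir="${savepath}/$(\date -u +%Y-%m-%d_%H-%M-%SZ_%A)"

# -p creates any missing parent directories and does not complain
# if the directory already exists.
mkdir -p "${savedir}"

# The download history sits one level up, so every run shares it.
logfile="${savepath}/download.log"

echo "run directory: ${savedir}"
echo "log file:      ${logfile}"
```
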

So this is where I'm going to keep my history. Then I do a check, on lines 11 to 15, to make sure that the log file exists, and if it doesn't, it creates it. There's actually no real point in having that, because later on I'm just going to overwrite that anyway, so I'm going to take that entire block out as we speak, live editing, and there, it's gone. So we now have the path to the log file.

Now we need to gather the list for youtube-dl, and to do that I'm using my favorite command, seq: s-e-q, space, the number 1, space, dollar maxtodownload. That is where I say what I want my script to loop through: it basically generates the numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, so those are the 10 pages that I'm going to try to download. Now many of you will say that bash already includes a way to put in sequences from what to what, but I think it's clearer to use seq, and to be honest I like the ability seq has, with -w, to create a padded list of numbers. If you use seq -w 1 100, it will put in 001, 002, and then 15 will be 015, etc., up to 099, and then 100, so that you get nice spacing. One thing that I wish seq would do, and it doesn't, is the reverse: if you put in seq 99 1, that it would start at 99 and count down. I also think it's missing the ability to do A to Z, uppercase and lowercase; that would be kind of cool. But for the purposes of what we want to do today, it's fine. So seq 1 to maxtodownload, which is 10 in this case, and I'm going to pipe that.
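
The `seq` calls mentioned here can be tried on their own. The sketch below assumes the `seq` from GNU coreutils; as far as I know that version also accepts a negative step as its middle argument, which gives the countdown wished for above, and bash's own brace expansion (`{1..10}`, `{a..z}`) covers letter ranges. The limit of 3 is chosen only to keep the output short.

```shell
maxtodownload=3   # the episode uses 10

# One number per line, from 1 up to the limit.
pages="$(seq 1 "${maxtodownload}")"

# -w pads with leading zeros so that every number has the same width.
padded="$(seq -w 8 10)"

# With three arguments the middle one is the step, and it may be negative.
countdown="$(seq 3 -1 1)"

printf '%s\n' "pages:" "${pages}" "padded:" "${padded}" "countdown:" "${countdown}"
```
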

That's the vertical up-and-down symbol, in or around, above the Enter key, shifted, on US-layout keyboards. And then I'm doing a big while loop: while read videopage. What that's doing is reading in whatever's coming in from the pipe. From its point of view, on the other side of the pipe we have seq pumping in the numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, and it's going to read those and put each of them in turn into a variable called videopage.
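
The loop being described takes roughly this shape. `list_pages` is a wrapper invented for this sketch, and instead of fetching anything it only prints the URL that belongs to each page number; `-r` is added to `read` as a habit and is not mentioned in the episode.

```shell
list_pages() {
  # seq writes one page number per line; read consumes one line per
  # iteration and stores it in videopage.
  seq 1 "$1" | while read -r videopage
  do
    echo "http://bestofyoutube.com/index.php?page=${videopage}"
  done
}

list_pages 3
```

Because the loop sits on the receiving end of a pipe, it runs in a subshell: variables assigned inside it are not visible after `done`.
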

Then we have a new line and the word do. This is required: the structure of the while loop in bash is while read some variable name, do, then the body of the loop, and then to tell bash that it's finished, you type done. So now we're in the meaty part, excuse me, and here we're going to use another massive big command execution thing. I don't know the correct term for that; somebody can correct me, that would be superb. I'm going to run a nested command, the output of which I'm going to put into the variable thisvideolist. So the whole command is thisvideolist equals dollar sign, open bracket, close bracket, and the bit inside of that is: I'm using wget --quiet and then the URL in double quotes, which happens to be http://bestofyoutube.com/index.php?page= followed by dollar sign, open curly bracket, videopage, close curly bracket. So that's the URL: in the first instance that's going to be page equals one, in the second instance page equals two, in the third page equals three, and so forth. Then after the URL we have dash capital O, which sets the output, then space, dash. So that's the wget part of this, you know, program, or script, whatever you want to call it, done. What I'm saying is just get me that web page, and don't save it, just pipe it out to standard output. Now, when you're reading files or doing anything with them, if you're doing a cat of a file or a less of a file or whatever, you can put in a file name, but the general convention for standard in and standard out for most command-line programs is the dash, just a dash symbol. So keep your eye out for that and mentally substitute it.
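
The dash convention can be seen without touching the network. In the sketch below, `cat -` reads standard input, and the `wget` call is wrapped in a function so that nothing is fetched unless it is called; `fetch_page` is a name made up for this example.

```shell
# For cat, a file name of "-" means "read standard input".
echoed="$(printf 'line from a pipe\n' | cat -)"
echo "${echoed}"
# prints: line from a pipe

# For wget, "-O -" sends the downloaded page to standard output instead
# of saving it to a file, so the next command in a pipe can read it.
fetch_page() {
  wget --quiet "http://bestofyoutube.com/index.php?page=${1}" -O -
}
```

Calling `fetch_page 1 | head` would then show the first lines of page one, assuming the site is reachable.
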

Immediately after that we put a pipe symbol, so we're piping it from wget into the next command, and the next command of course is grep, our favorite, and we look for the exact string www.youtube.com/embed/. Now, I know what you are all going to say: this is a web page, and it is possible that the web page would have a w, new line... actually no, it can't. It has to be www, it has to be in the line straight across, it can't be separated; even a new line would be considered a space, and that wouldn't create a valid URL. So grepping is absolutely fine in this case. What grep will do is take all that junk, strip out all the garbage, and give you the lines with the YouTube embed. One thing that could happen is that there's no reason for there to be new lines at all in that, so a web page could simply be one big long line, and grepping it this way would return that one big long line. That is essentially a flaw in this script, but again it's quick and dirty, and it works because they happen to put them on different lines. I could work around that, but I'm not going to bother, haha. Anyway, on the next line grep has done its work, and I don't need to specify standard input and standard output, because grep uses standard input and standard output anyway. Then I have a pipe symbol piping it on to the next command, which is sed, the stream editor, and what I'm going to do there is filter out everything up until youtube.com/embed/. I've got several lines at this stage: I will have five lines actually, because there are five of these per page, so grep has now returned five lines per page.
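
The grep stage can be tried against a stand-in page. The HTML below is invented for this sketch (two embeds instead of five, with made-up video IDs), and `-F` is an addition that makes grep treat the pattern as a literal string; the episode only mentions plain grep.

```shell
# Hypothetical stand-in for one downloaded page.
sample_page() {
  cat <<'EOF'
<html><body>
<div class="title">First video</div>
<iframe width="560" height="315" src="http://www.youtube.com/embed/AAAAAAAAAAA" frameborder="0"></iframe>
<div class="title">Second video</div>
<iframe width="560" height="315" src="http://www.youtube.com/embed/BBBBBBBBBBB?start=10" frameborder="0"></iframe>
</body></html>
EOF
}

# Keep only the lines that contain the embed prefix.
embed_lines="$(sample_page | grep -F "www.youtube.com/embed/")"
echo "${embed_lines}"
```

Only the two iframe lines come out. If the whole page were a single line, that one line would be returned, which is the limitation acknowledged above.
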

I'm piping those into sed. The actual command here is sed, single quote, s, and then I'm using the tic-tac-toe, octothorpe, hash symbol, pound sign, whatever you call that where you happen to be in your jurisdiction, as a delimiter. Rewinding: I'm using the pound sign, no, I'm not using the pound sign, I'm using the octothorpe as a delimiter, because the forward slash is going to be used in the YouTube URL, so I want to use a different delimiter to the traditional forward slash, and sed allows you to do that. Whatever comes straight after the s will, in actual fact, be assumed to be the delimiter you're using. Okay, so: sed, space, single quote, s, octothorpe, the chevron (caret) character, which looks like a little roof, or a greater-than or less-than symbol turned 90 degrees to point up, over the six on the US keyboard, then dot asterisk, www.youtube.com/embed/, then two octothorpes. What that is doing is removing everything up to and including www.youtube.com/embed/, and what that leaves me with is, yes, the identifier that YouTube uses to identify videos. Pretty cool, eh? However, you're going to have a lot of crap after that, which you want to clean out and get rid of. So sed pipes that out to standard output, which is no problem for sed; I don't need to specify the dash again, because sed by default also writes to standard output.
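
As a sketch of the sed step on a single line (the iframe markup and the video ID are invented for illustration):

```shell
line='<iframe width="560" height="315" src="http://www.youtube.com/embed/AAAAAAAAAAA" frameborder="0"></iframe>'

# s#pattern#replacement# uses "#" as the delimiter, so the "/" characters
# in the URL need no escaping. The pattern matches from the start of the
# line through the embed prefix, and the empty replacement deletes it.
trimmed="$(printf '%s\n' "${line}" | sed 's#^.*www.youtube.com/embed/##')"

echo "${trimmed}"
# prints: AAAAAAAAAAA" frameborder="0"></iframe>
```

The video ID is now at the start of the line, followed by the leftover markup that the next stage removes.
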

Then I use my friend awk to strip out the rest: awk, space, dash capital F, space, single quote, double quote, pipe symbol, question mark, single quote (I'll come back to that in a second), space, single quote, open curly bracket, print, space, double quote, http://youtube.com/watch?v=, double quote, dollar sign one, close curly bracket, single quote. Okay, so when grep and sed are finished with their list, there are five lines passed through on standard input which begin with the 11-character unique identifiers for YouTube videos. Now I could use the cut command there, but I don't want to, because older videos have fewer characters and newer videos will have more characters, and we will continue to use those characters. So I want to tell awk that it should split the line that it's receiving based on a delimiter that is a double quote or a question mark. Sometimes a YouTube URL will have a question mark and then a start and a stop time, signifying where the video should cut in from and cut out to. So with awk -F you're able to terminate the YouTube identifier very cleanly, and then awk just prints the correct URL with the unique identifier. At the end of this I will have five URLs. So now I'm already in a while loop with five URLs, and the outer while loop will continue through seq, one, two, three, four, five, six, seven, eight, nine, ten, producing, as it goes, the five new URLs that it gets from each of the pages, so in this case you will have 50 URLs. Okay, now within that main loop what I'm going to do is I'm going to do a
little bit of checking to see if I've downloaded this video before, and to do that I'm going to have a for loop. This for loop is going to be working on the list that is stored in thisvideolist, which will be one big long string of these YouTube URLs, containing five of them put together. The loop goes: for thisvideo in dollar, open bracket, echo dollar thisvideolist, close bracket, semicolon, do, then do the stuff, and then done. So the for loop is very similar to the while loop: for something in something, do, done. As for the while loop and the for loop, I tend to use while loops where I can, but with while loops, especially when converting with ffmpeg and mplayer, there are slight differences in the way they handle variables and how they're executed within bash, so you've kind of got to be a bit careful about that. If your loops are not working the way you think they should be, you might want to switch from a while to a for loop. Just saying. I'm not going to go into any more than that, to be honest, because I've come across it once or twice, as I say usually with ffmpeg or mplayer, that they tend to prefer for loops instead of while loops.

Okay, so what I'm doing here, yes, there are definitely other ways of doing this. This series is not so much about "this is the best script ever"; I'm actually interested in people taking this script and nicifying it up and doing a whole show on it, or on something else. This is one way of doing it; as Perl says, there are multiple ways of doing it, so this is just one way of doing it. It works for me. It was a quick and dirty script and it's pretty clean now. Typically what I tend to do when I'm doing scripts is I'll put a one-liner in, and then the one-liner tends to get more complicated, and then I'll put it in a very basic text file, so it's only a five-liner, and then I'll leave it for a while, and then I'll think, okay, I need to nicify that up a bit, and then I'll run it for a period of time. Sometimes that's it, it's done, and other times I'll find a cleaner way of doing it in other scripts and then I'll go back and refactor the scripts. But anyway, this is what it is. So now within the for loop, what I'm going to
do here is just a little quick check in my log file to see if I've downloaded this video before. If I have, I'll just skip over it; if I haven't, I'll add the URL to a file of videos to download. So the script is: for thisvideo in the output of echo thisvideolist, so it'll be for thisvideo in youtube.com/watch?v= blah blah blah, five of those, and it'll do if, open square bracket, double quote, dollar, open bracket, close bracket, and we're going to run a command in there, and if the result of that command is equal to zero, then do something. The command inside there is grep thisvideo, space, logfile, so what I'm going to do is look in the log file to see if I already have this URL. I'm then going to pipe that into the word count command, wc, with the -l option, so it only lists the number of lines, not any of the other junk. That's really handy.
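
The awk step and the log check described so far can be sketched together, including the append to the to-do file that the script performs on a new video. The sample lines, IDs and temporary log file are invented. Two details differ from what is read out and are swapped in here on purpose: the field separator is written as the bracket expression `["?]` rather than `"|?`, because a bare `?` after `|` is not handled the same way by every awk, and grep is given `-F` so the URL is matched literally.

```shell
logfile="$(mktemp)"   # stand-in for the real download.log
# Pretend this video was fetched on an earlier run.
echo "http://youtube.com/watch?v=AAAAAAAAAAA" > "${logfile}"

# Two lines as they might look after the sed step (made-up IDs).
after_sed='AAAAAAAAAAA" frameborder="0"></iframe>
BBBBBBBBBBB?start=10" frameborder="0"></iframe>'

# Split each line on a double quote or a question mark; field 1 is the ID.
thisvideolist="$(printf '%s\n' "${after_sed}" |
  awk -F '["?]' '{print "http://youtube.com/watch?v=" $1}')"

# Left unquoted so that the list splits into one word per URL.
for thisvideo in $(echo ${thisvideolist})
do
  # Count the log lines that contain this URL; 0 means it is new.
  if [ "$(grep -F "${thisvideo}" "${logfile}" | wc -l)" -eq 0 ]
  then
    echo "Found new video ${thisvideo}"
    echo "${thisvideo}" >> "${logfile}_todo"
  else
    echo "Already downloaded ${thisvideo}"
  fi
done
```

With this sample data the first URL is reported as already downloaded and the second is appended to the to-do file.
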

So that's going to give me the number of hits in the file; more than likely it's either going to be zero or one. If that, in double quotes, is -eq zero, then I know I've found a new video; if it isn't, then I haven't. So if grep thisvideo logfile, word count, is equal to zero, then I echo out the statement "found new file" and thisvideo, which just gives me some feedback, and on the next line I echo thisvideo and append it to a file called dollar, open curly bracket, logfile, close curly bracket, underscore todo. The redirect is the greater-than greater-than symbol, which sends that to the file. If you remember from before, logfile was built from savepath, and savepath was slash mount blah blah blah, forward slash, download.log, and now this is going to be download.log underscore todo. Now if I had just used dollar logfile underscore todo, bash wouldn't have found that variable, so it would have just put it into a file somewhere, probably in my home directory, or whatever directory I happened to be in at the time of running this. It would create this log file there somewhere while you're busy over on the other side, which is a mount on the NAS, yet this file ends up locally on whatever machine you're running on. So that's why enclosing the logfile variable in the dollar curly brackets matters: it tells bash, oh, this is a variable, and I'm going to concatenate something onto it. That's pretty cool.

So essentially that's my if statement. Then I have an else, and I echo out "already downloaded" and thisvideo. So as it's going through, it'll say "found new video youtube.com blah blah blah", or, if it has found it in the log file, it would say "already downloaded" and the YouTube URL. Then I close the if statement with fi, "if" backwards, which I always thought was a bit odd, I close the for loop with done, and I close the while loop with done. So now I have a list of all the URLs that I want to download, but are they in order? Yes, they are: they're from page one, item one, so the first line of the file is page one item one, then item two, item three, item four, item five; line six is page two item one, then two, three, four, five, and so on and so forth. What I want to do now is the actual downloading part, which I'm using youtube-dl to do, but I want to do that in reverse order. So instead of catting my file I'm going to tac my file, which actually cats out the file in reverse order.
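
A quick way to see the difference between the two commands, using a throwaway file with made-up entries (this assumes `tac` from GNU coreutils is installed):

```shell
todo="$(mktemp)"
printf '%s\n' "page1-item1" "page1-item2" "page2-item1" > "${todo}"

echo "cat:"
cat "${todo}"    # same order as the file: newest entry first

echo "tac:"
tac "${todo}"    # last line first, so the oldest entry comes out on top

reversed_first="$(tac "${todo}" | head -n 1)"
```
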

Grand command, and you see what they did there: tac is the reverse of cat. Wow, pretty cool, those Unix guys are so funny. But it is actually a very useful command; once you start using it you tend to use it a lot. Very nice. The first thing I do is check to see if my logfile underscore todo exists, because if there are no new videos there won't be a to-do file, and I can simply finish and not download anything. So that's pretty cool all in all, which means I can put this into a cron job or something, run it every day, and download the videos. The only thing I'm doing is hitting bestofyoutube, but those are pretty static pages anyway, so they're probably cached, and I'm not pulling anything from YouTube except when I'm downloading a video, once. So hey, this is being a good net citizen.
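
The existence check and the reversed hand-off can be sketched as follows. `download_batch` and `run_downloads` are placeholders invented for this example: the first only numbers whatever arrives on standard input, so the sketch runs without the real downloader, and the numbering from 00000 simply mirrors the numbering described in the episode.

```shell
logfile="$(mktemp)"   # stand-in for the real download.log

# Placeholder for the real downloader: numbers each line it receives.
download_batch() {
  n=0
  while read -r url
  do
    printf '%05d %s\n' "${n}" "${url}"
    n=$((n + 1))
  done
}

run_downloads() {
  # Without a to-do file there is nothing new, so nothing is fetched.
  if [ -e "${logfile}_todo" ]
  then
    tac "${logfile}_todo" | download_batch
  else
    echo "Nothing new to download"
  fi
}

first_run="$(run_downloads)"
echo "${first_run}"

printf '%s\n' "newest" "older" "oldest" > "${logfile}_todo"
run_downloads
```

The first call reports that there is nothing to do; after the to-do file is written, the entries come out oldest first and are numbered in that order.
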

But presuming I do find a to-do file, a file of stuff that I want to download, I tac that file out into a pipe, so the last line comes first, then the second last second, the third last third, and so forth, and I pipe that simply into the youtube-dl command. If you type youtube-dl --help you'll get a full list of youtube-dl options. That command changes so quickly that they recommend you run youtube-dl -U regularly, which will update the file directly from their website. I personally wouldn't recommend that per se, normally, for a script, but in this case I think they're doing a very good job, and you can open the file up yourself and verify that it's not doing anything naughty.

So after reading the help, I decided the following were the options that suited me the most, and therefore I will read them off. I'm going to use the long options here as opposed to the short ones; I tend to do that in batch files anyway because it makes them a lot more readable. By long I mean, so instead of -h, --help; that tends to be the long format. Okay, so it's youtube-dl and the first parameter is --batch-file, space, dash. We now know that the dash means accept standard input, and --batch-file tells youtube-dl that instead of expecting the URLs on the command line, they're going to be redirected in from something else. Now the batch file could also be a text file. I could simply have pointed it at the to-do file directly and used that, but the problem with doing it that way is, you guessed it, it wouldn't be in the right order, because I want to number these zero to ninety-nine, or whatever the numbers are, with the highest number being the newest video. Why? Just call me crazy.

Then we have --ignore-errors. If you didn't use that and youtube-dl came across an error message, such that it wasn't able to download a video, it would simply stop and you would be left not having downloaded the remainder of the videos. That happens quite a lot on YouTube, because you're not allowed to view it in this area, or it was temporarily suspended by the user, or whatever, so it's best to put that in. I've found that mostly when I run this there are one or two videos that it can't download. No big deal; I watch those directly on YouTube anyway.

Then I have --no-mtime, no modification time. What that one does is just use today's time, so that when I do a sort by time in that directory I get them not in the order that they were uploaded to YouTube but in the order, the dates, on which I downloaded them, which is fine. The next parameter
|
||
|
|
is --restrict-filenames, which, if the titles use extended Unicode characters, just limits the
character set down to a to z, uppercase and lowercase. While I don't have a problem with people
using Unicode, and yes, crayon, my NAS does support Unicode, as do all my computers, because we're
living this side of the year 2000, it's just simpler all around to restrict the filenames. That
way you don't get unprintable characters on playout devices, which don't necessarily cause
problems; it just leads to a less than optimal user experience. Then the next
switch is --max-quality, which gives me the maximum quality that they have for this one.
youtube-dl also allows you to download all of the versions, which is actually pretty cool as
well. And --format mp4 is my preferred format for downloading, simply because the Xtreamer I have
downstairs came out before the WebM format was standardized, so it can't play those. Yes, I know
I could convert them, but I don't want to. If it can't find an MP4 it will
download the WebM, so you're not going to miss anything. So the next option is --write-auto-sub,
write auto sub directory, and then you have to use the -o parameter. What that does is it allows
you to use youtube-dl variable names to get stuff like title, extension, IDs and so forth. What
I'm using there: the first variable is my own variable, which is $savedir, so that tells it where
to dump things, the base directory. Then I have quote, forward slash,
%(autonumber)s-%(title)s-%(id)s-%(ext)s, and close double quote. What that will do is save into my
savedir. It'll create the autonumber, so the first video that it downloads will be 0000, and you
can specify the number of zeros (I didn't bother, I accepted the default), then a dash, and then
the title as you would see it if you're browsing on YouTube, like "cat does funny thing with dog"
or whatever, a dash, and then YouTube's ID, which is the ID number of the video. That is kind of
handy if you go back to it and think "that video I want to keep": you paste it into Google, or
into YouTube, and you can get the direct YouTube link. And then the extension, webm or mp4 or
whatever. And that basically
creates all of those. Then, once it's finished downloading, I cat the logfile_todo and redirect
it into the log file, and then I remove the logfile_todo, and the fi finishes the download section
of the script. And that's pretty much it: it gets something from a web page and downloads it. I'm
going to put this into cron and I'm never going to have to worry about it again, and I can watch
those videos at my leisure on my laptop, on my tablet, or
on the Xtreamer downstairs. And that was it. I won't say that this is the prettiest of scripts,
but it is nonetheless a script to, as Knightwise so eloquently puts it, getting... er... I'm
ironically now completely lost. But Knightwise normally says, yes, and you're all probably
screaming it, yes, "getting technology to work for you", and that's what it's doing. It means I
don't need to worry about it anymore; it's just automated. One thing of course it will do is fill
up my disk, so I do need to keep track of that. So I'll probably not run it from cron; I'll
probably just
for this period just run it manually. Right, okay, so that was an amazing amount of time to talk
about what I thought I was only going to cover in a few seconds. I thought I would be able to
record more shows tonight, but I doubt that that is going to happen, and now my hour is up, so it
means I can get out of bed for 10 minutes and walk around. Okay folks, tune in tomorrow for
another exciting episode of Hacker Public Radio.
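
For readers following along in text, the download section described in this episode might be sketched roughly as below. This is not the script from the show: the function name download_new, the YTDL override and the two arguments are hypothetical, and the dot before %(ext)s is an assumption (the episode reads the template aloud with a dash). The --max-quality and --write-auto-sub switches are left out because their exact usage depends on the youtube-dl version.

```shell
#!/usr/bin/env bash
# Sketch only: the function and variable names are assumptions,
# not the script discussed in the episode.

# download_new LOGFILE SAVEDIR
# Downloads every URL listed in LOGFILE_todo, then records them in LOGFILE.
download_new() {
    local logfile=$1
    local savedir=$2
    local todo="${logfile}_todo"

    # Only run when the to-do list exists and is not empty.
    if [ -s "$todo" ]; then
        # --ignore-errors       keep going when one video cannot be fetched
        # --no-mtime            leave the file time as the download time
        # --restrict-filenames  keep file names to plain ASCII
        # YTDL is a hypothetical override for the downloader command.
        "${YTDL:-youtube-dl}" \
            --batch-file "$todo" \
            --ignore-errors \
            --no-mtime \
            --restrict-filenames \
            --format mp4 \
            --output "${savedir}/%(autonumber)s-%(title)s-%(id)s.%(ext)s"

        # Remember what has been handled, then clear the to-do list.
        cat "$todo" >> "$logfile"
        rm "$todo"
    fi
}
```

It could be called as download_new followed by a log file path and a save directory of your own choosing. Appending the to-do list to the log only after the download step mirrors the order described in the episode, so an interrupted run leaves the to-do list in place for next time.
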
You have been listening to Hacker Public Radio at HackerPublicRadio.org. We are a community
podcast network that releases shows every weekday, Monday through Friday. Today's show, like all
our shows, was contributed by an HPR listener like yourself. If you ever considered recording a
podcast, then visit our website to find out how easy it really is. Hacker Public Radio was
founded by the Digital Dog Pound and the Infonomicon Computer Club. HPR is funded by the Binary
Revolution at binrev.com. All binrev projects are proudly sponsored by Lunar Pages. From shared
hosting to custom private clouds, go to lunarpages.com for all your hosting needs. Unless
otherwise stated, today's show is released under a Creative Commons Attribution-ShareAlike 3.0
license.