hpr-knowledge-base/hpr_transcripts/hpr1204.txt

Episode: 1204
Title: HPR1204: My Magnatune Downloader
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr1204/hpr1204.mp3
Transcribed: 2025-10-17 21:34:11

---

Hello, this is Dave Morris in Edinburgh, Scotland. Today I want to tell you about a project
I've been working on recently. Let me start by introducing myself a little bit. I'm
a retired IT manager. I took early retirement a few years ago. I spent most of my working
life helping run IT services for university staff and students in two UK universities, one
of them in Northern England and one in Scotland. I worked as a programmer to start with and then
became a manager. Even as a manager I seem to still be writing programs. I've always enjoyed
solving practical problems and still do. I've always felt that writing a well-crafted computer
program is like making a physical object in either wood or metal or whatever. It appeals to
some of the same drives or motivations and can be satisfying if it comes together well.
I guess this is the motivation that keeps me writing programs now.
Right, so here's the problem. I'm a fan of the Magnatune service, which is at Magnatune.com.
And I've been buying music from them for a number of years, seven or eight years maybe.
The Magnatune website has got some good interfaces for exploring and downloading the music
and it's got interfaces for browsing as well in the form of various clients. There are also
facilities available in a number of players on Linux and I have direct experience of
some of these. Amarok is one which allows you to browse and purchase, look at artist information
and album details. There's a plug-in in Rhythmbox, or at least there was; it's currently unavailable
but it's apparently due to return soon. There's also the Gnome Music Player Client which is actually
a front end to the Music Player Daemon, MPD. That also offers a Magnatune browser. I don't think
you can buy music through it. There's a Magnatune web player which you can access off the
Magnatune site. It's a web-based tool that will let you browse and play and buy stuff.
And it's actually very good. There's also an Android player which is a fairly basic browser
and player that's been provided by Magnatune and runs on Android 2.0 and up. Of that collection,
the best in my mind is the Magnatune web 2.0 player and it's good for exploring and listening to
music. But none of these interfaces quite do what I want. So I decided to write something of my own.
What I wanted to be able to do was to download Ogg files, to download the cover art,
to get any artwork that happened to be available, which is artist details and album details and so forth.
And I wanted to be able to store my music indexed by album name and none of them quite do that,
or all of that anyway. Now I currently host my music on an HP ProLiant MicroServer which I'm
sitting right next to. You can hear the noise of it. I hope you can't. And from there I share the
music across the network, the home network that is, and play it with the Music Player
Daemon on my desktop system or wherever I happen to be. I normally keep the album cover image file
and the artwork and any related material in the same directory as the album itself. Some clients
quite like that so that they can display the various elements when they're playing things.
So I wanted to be able to download all the bits and put them all in the same place automatically.
Now Magnatune provides an API which is documented on their website. There's a URL in the show notes
which I won't read out here. And I should say actually that this information is only available
to members which is one of the things I've forgotten to mention. You need to be a subscriber to
get much further into Magnatune though you can browse the music without being a member.
Their API offers data in several formats: in XML, SQLite and MySQL formats.
So having thought about the design of this I concluded that I didn't want to write anything too
fancy. I didn't want a full-blown application especially since really all I was wanting was a
downloader. So I decided that I would end up writing a collection of scripts being a bit of a
command line guy anyway, that appeals. I decided to use the XML data that they offer
and they have it in various formats one of which is organized by album. This gets updated
about once a week or once every two or three weeks. There's a signaling mechanism that they
offer through a downloadable file which contains a CRC code. And when this CRC changes that means
that the data itself has changed and it can be downloaded. So at the time of writing this,
the time of telling you about this, I simply run this by hand when I receive an email alert from
Magnatune. Now Magnatune refers to the albums using a unique key made up from
the artist's name and the album name. And it refers to this as an SKU which I believe stands for
Stock Keeping Unit. They use this as a URL component and it's in XML tags. So I use this to identify
the stuff I download and to keep a simple inventory. So I decided to write some basic scripts.
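
As a rough illustration of the change-detection idea mentioned a moment ago (a small signal file whose CRC value changes whenever the catalog changes), here is a minimal Python sketch. It is not the bash script described in this episode. The local file used to remember the last value and the function names are assumptions made for the example, and downloading the signal file itself is left out.

```python
import tempfile
from pathlib import Path


def needs_update(new_crc: str, crc_file: Path) -> bool:
    """Return True if the catalog should be fetched.

    new_crc is the text of the freshly downloaded signal file.
    crc_file is a hypothetical local file holding the value seen last time.
    """
    if not crc_file.exists():
        return True  # first run: nothing to compare against
    return crc_file.read_text().strip() != new_crc.strip()


def remember(new_crc: str, crc_file: Path) -> None:
    """Store the value, ideally only after the catalog download succeeded."""
    crc_file.write_text(new_crc.strip() + "\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        saved = Path(tmp) / "catalog.crc"
        print(needs_update("1a2b3c4d", saved))  # no saved value yet
        remember("1a2b3c4d", saved)
        print(needs_update("1a2b3c4d", saved))  # same value as before
```

Keeping the comparison separate from the step that stores the value means a failed catalog download does not mark the new version as already fetched.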
I wanted one to download the catalog. I wanted something to search and browse the catalog and report stuff
back. I wanted something to download an album and I wanted something to unpack the downloaded album
plus other stuff into the target directory. Along the way one of the other goals was to learn more
about manipulating XML data. So I decided to use XSL, the Extensible Stylesheet Language,
to manipulate this stuff. And this lets you define style sheets for XML data which includes ways
of identifying XML components with XPath expressions and of transforming XML with XSLT.
Now this is pretty obscure stuff and I've included a number of links in the show notes pointing to
the resources I use to learn about this stuff.
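
To make the idea of picking out XML components by path a little more concrete, here is a small Python sketch using the standard library's ElementTree, which supports a limited subset of XPath. It stands in for the XSLT approach used in this project and is not a translation of it. The element names (AllAlbums, Album, artist, albumname, albumsku) and the sample data are assumptions for illustration, not taken from the real catalog.

```python
import xml.etree.ElementTree as ET

# Invented sample data; the real catalog layout may differ.
SAMPLE = """\
<AllAlbums>
  <Album>
    <artist>Example Artist</artist>
    <albumname>First Album</albumname>
    <albumsku>exampleartist-first</albumsku>
  </Album>
  <Album>
    <artist>Another Band</artist>
    <albumname>Second Album</albumname>
    <albumsku>anotherband-second</albumsku>
  </Album>
</AllAlbums>
"""


def summarise(root):
    """Return one 'sku|artist|album' line per album, easy to search with grep."""
    fields = ("albumsku", "artist", "albumname")
    return [
        "|".join(album.findtext(tag, default="") for tag in fields)
        for album in root.findall("Album")
    ]


def find_by_sku(root, sku):
    """Return the Album element with this SKU, or None.

    The path predicate is built by string formatting, so this is only
    suitable for SKUs that contain no quote characters.
    """
    return root.find(f"Album[albumsku='{sku}']")


if __name__ == "__main__":
    catalog = ET.fromstring(SAMPLE)
    for line in summarise(catalog):
        print(line)
```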
Naturally I'm keeping my scripts and so forth in a version control system. I use Git.
And in order that you can share and follow me in this journey I've put them under the free
Gitorious service which is a hosted Git repository. I chose that because Ken Fallon and I have been
using it for various other things. So it seemed pretty straightforward to continue using it.
In the repository I've also included a set of extended notes as a readme file.
Just as an aside I've written these using one of the common markup languages.
The one I've chosen is called AsciiDoc. It's a fairly simple markup language but a bit
more advanced than the Markdown language which a lot of people use. Along the way I've generated
HTML from it which is in the repository and also PDF. So if you go to the Gitorious site and you can
see the URL in the show notes you'll find that you can browse the code, look at the readme and of
course as with all of these things you can clone the repository and get a local copy if you're so
minded. So next I want to look at the various scripts that I've ended up building.
When I was thinking of this stage of the podcast I was wondering what was the best method of doing
this. I thought I didn't really want to be talking you through scripts line by line or anything
like that. I'm sure people would be switching off in droves if I did that. I ended up writing some
very brief show notes to go along with the podcast on the website which just summarises the details
of the scripts and put more details in the Git repository. When I was learning the ins and outs
of Unix it was back in the late 80s, early 90s. There was a guy who wrote some very good books
in the O'Reilly series, a guy called Jerry Peek, and his technique of explaining shell scripts and
so forth was to just drop the whole lot into a book, annotate it and then go through it
line by line. I certainly found I learnt a hell of a lot from him. I owe him a lot actually.
I got a lot of my knowledge of shell scripting and various other aspects of Unix from him.
So I wondered about doing that but I haven't actually done it. If anybody feels that it would
be helpful, maybe you don't but if anybody did then I'd be prepared to annotate these things,
produce annotated versions of them and put them up in the Git repository. So if you do feel that
would be helpful, let me know. Drop me some feedback, send me an email. My email address is on the
HPR website. Anyway, getting on with the scripts briefly. The first one is called update albums.
It's a bash script and its purpose is to download a new version of the album catalog from
Magnatune. It gets this as a bzipped XML file. It looks to see whether there's any work to do and it
does it through this mechanism of downloading a little file that contains a checksum or CRC
which it compares with the current version of this file. If there's a difference or if the file
didn't exist at all, it's the first time you've ever run the script, it will go and grab the
catalog. Then it generates a summary of the catalog in a format that's easily searched. The
XML isn't that easily searched so I thought that I would generate a summary of it. Now
just to digress briefly, the summarisation of this XML file is done through a piece of
XSLT which is a method of recognising components of the XML file and displaying them or printing them
out in various ways. The XSLT file contains a simple loop which cycles through the whole XML and
outputs the album information. Again, I wondered if there was any scope for talking about XSLT and all
of this good stuff. Probably my audience is not going to be that bothered by it and not really
want to get into it in any big way. I only did this myself as a learning exercise because I've
seen this stuff before and wondered what the hell it meant and thought it would be a good opportunity
to find out more. If you feel the same and feel that maybe I could help you in any way to
get you started in this, then let me know because I'd be prepared to maybe do a podcast on
what I've discovered and on the ways of using XSLT. Which isn't a huge lot but it might get you
started if that's the way you want to go. Next we have a script I've called report album SKU
which is another bash script. It's really just a wrapper around a bit of XSLT. It takes the
stock unit SKU which is this concatenation of the artist and album as a parameter
and looks up the details in the XML catalog. This is really nothing very remarkable
but it was an interesting exercise in how to do that type of stuff in XSLT. So to be honest,
I hardly ever use it but I present it to you as a curiosity. Get album is the name of the next script
which is another bash script whose job is to download an album and all of the related files,
the cover images, artwork and so forth. It takes the SKU as an argument and uses it to make
a URL for an XML file and this points at all the components and has to be downloaded with
authentication because this is the point at which you're actually buying something.
The script parses this file, it's XML so it uses XSLT to pull out the relevant bits
and it uses it to collect the necessary URLs for downloading the components.
Personally, I only use the Ogg format but there are many other formats available and the script
could easily be changed to collect any or all of the formats. The script records the fact that
this particular SKU code has been downloaded so it isn't collected again in error and all downloaded
files are given names beginning with this code and are stored for the installation phase.
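
The bookkeeping described here, recording each SKU once it has been downloaded and naming the downloaded files after it, can be sketched in Python as follows. This is only an illustration under assumptions: the inventory is taken to be a plain text file with one SKU per line, and the underscore separator in the file names is invented for the example. The real get album script is bash and may do this differently.

```python
import tempfile
from pathlib import Path


def already_downloaded(sku: str, inventory: Path) -> bool:
    """True if the SKU is already listed in the (assumed) inventory file."""
    if not inventory.exists():
        return False
    return sku in inventory.read_text().splitlines()


def record_download(sku: str, inventory: Path) -> None:
    """Append the SKU so the album is not collected again in error."""
    with inventory.open("a") as fh:
        fh.write(sku + "\n")


def local_name(sku: str, url: str) -> str:
    """Build a local file name that begins with the SKU.

    The '_' separator is an assumption made for this sketch.
    """
    return f"{sku}_{url.rsplit('/', 1)[-1]}"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        inventory = Path(tmp) / "downloaded.txt"
        sku = "exampleartist-first"  # made-up SKU
        if not already_downloaded(sku, inventory):
            print(local_name(sku, "http://example.com/art/cover.jpg"))
            record_download(sku, inventory)
```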
The final component is called install download. This one is a Perl script
which unpacks the downloaded zip file that came as the album, which is going to be unpacked to
its final destination. Then it adds the cover images and the artwork to the same place.
I used Perl here because it allowed me to look at the zip file and determine the name of the
directory that was going to be created. This directory name is going to be the name of the album.
I couldn't find an easy way to do that in a bash script so I used Perl so I could
work out what was going to happen, make it happen and then drop the other files into the same place
and thereby achieve the goal of getting everything in the same directory.
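
The approach just described (look inside the zip file to learn which directory it will create, unpack it, then drop the extra files into that directory) might look roughly like this in Python. It is a sketch rather than a port of the Perl script. It assumes the archive contains exactly one top-level directory, and the function and parameter names are made up for the example.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path, PurePosixPath


def top_level_dir(archive: Path) -> str:
    """Return the single top-level directory name inside the zip file."""
    with zipfile.ZipFile(archive) as zf:
        tops = {PurePosixPath(name).parts[0] for name in zf.namelist() if name}
    if len(tops) != 1:
        raise ValueError(f"expected one top-level directory, found {sorted(tops)}")
    return tops.pop()


def install(archive: Path, extras: list, music_root: Path) -> Path:
    """Unpack the album under music_root and copy the extras next to it."""
    album_dir = music_root / top_level_dir(archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(music_root)
    for extra in extras:
        shutil.copy2(extra, album_dir / extra.name)  # cover art, artwork, etc.
    return album_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        album_zip = tmp / "album.zip"
        with zipfile.ZipFile(album_zip, "w") as zf:
            zf.writestr("Some Album/01-track.ogg", b"not really audio")
        cover = tmp / "cover.jpg"
        cover.write_bytes(b"not really an image")
        music = tmp / "music"
        music.mkdir()
        print(install(album_zip, [cover], music).name)
```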
So that's the system. There are a few developments happening,
particularly as I said at the beginning, the music that I download is actually stored on my
server so it's available to my home network. I've written scripts that synchronize the music
which has been downloaded to my workstation up to the server and make sure it gets backed up and
so forth. The other thing I've done is to make a sort of queuing system or wishlist system.
I've got a 200 gigabyte download limit per month on my broadband contract so I try not to
download music too often and thereby avoid contention with the rest of the family.
My queuing system is used to keep a list of stuff that I've listened to and like
and would like to buy and I simply take the top element from this queue every so often
and feed it to the download and installation scripts. I do this maybe once a week, once a
fortnight something like that. In the future I expect to be refining these scripts and making
them less vulnerable to errors. For example I found a few cases where Magnatune's XML is not valid
and this causes the xsltproc tool that does the XML parsing to fail. So I'd like to be able to
recover from such errors more elegantly than I'm doing now. I'd also like to be able to
deal with interrupted downloads and that type of thing. The software as it stands
is a bit basic I guess it's not as resilient as I'd like it to be. The other thought is that
at some point I might well want to rewrite the whole thing in a different language. Maybe make
it into a single script. I don't know, we shall see. So let me finish with, I guess, a disclaimer.
I have no links to Magnatune other than being a contented customer so this is a technical
chat rather than an attempt to sell anything. Honestly. Okay, cheers.
You have been listening to Hacker Public Radio at HackerPublicRadio.org. We are a community
podcast network that releases shows every weekday Monday through Friday. Today's show,
like all our shows, was contributed by an HPR listener like yourself. If you ever consider
recording a podcast, then visit our website to find out how easy it really is. Hacker Public Radio
was founded by the Digital Dog Pound and the Infonomicon Computer Club. HPR is funded by the
Binary Revolution at binrev.com. All binrev projects are crowd sponsored by Lunarpages.
From shared hosting to custom private clouds, go to lunarpages.com for all your hosting needs.
Unless otherwise stated, today's show is released under a Creative Commons
Attribution-ShareAlike 3.0 license.