Episode: 2852
|
|
Title: HPR2852: Gnu Awk - Part 16
|
|
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2852/hpr2852.mp3
|
|
Transcribed: 2025-10-24 12:13:19
|
|
|
|
---
|
|
|
|
This is HPR episode 2852 entitled "Gnu Awk - Part 16" and is part of the series Learning Awk.
|
|
It is hosted by Dave Morris and is about 43 minutes long and carries an explicit flag.
|
|
The summary is: winding up the Gnu Awk series.
|
|
This episode of HPR is brought to you by archive.org
|
|
Support universal access to all knowledge by heading over to archive.org
|
|
forward slash donate
|
|
Hello everybody and welcome to Hacker Public Radio. This is the 16th and final episode of
|
|
the Learning Awk series and with me tonight I've got Robert, also known as Be Easy. Hi, yeah.
|
|
Hey, how's it going, Dave? Great to be talking to you in person for once.
|
|
Yeah, yeah. This is so weird isn't it? We've been doing this now for a fair number of years.
|
|
I made a note when we started 2016 so that's a good three years so that we've been doing this.
|
|
We've never spoken to each other just emailed and whatever. It's really nice doing it this way.
|
|
It is, and actually now that I have Mumble set up on my desktop here,
|
|
maybe I'll have an opportunity to join the community meeting every once in a while.
|
|
Oh, that would be lovely. Yes, yes, the more the merrier really and get different opinions on stuff
|
|
rather than just Ken and myself going on and on and boring ourselves or boring the audience or whatever.
|
|
Well, I personally really like the consistency of having you guys do it but every time there is a
|
|
special guest that comes on it is it does break it up and it makes it more interesting so. Yeah,
|
|
yeah, it's nice having just different outlooks and different voices on isn't it? Yeah, I agree.
|
|
So I thought we ought to start by talking about why? Why are we finishing this? Why are we winding up
|
|
this series? We've not reached the end of awk, but I think personally that we've
|
|
probably reached the sort of practical limit on this. What's your feeling? Yeah, well, I think for
|
|
my personal use, we probably reached my practical limit and my knowledge of it maybe two or three
|
|
episodes of mine ago, and maybe two or three episodes that you've done too, just because we started
|
|
going into some of the edge cases and some of the more the non-standard functions, if you will.
|
|
And I think it's great to learn that stuff and for me, if I get to a situation where I need to go
|
|
into one of those rabbit holes, I always have the documentation, but it is good to have a recorded
|
|
message now to be able to go back to. Yeah, yeah, I think so. My thinking was that most people
|
|
use awk, maybe in the same way as they use sed, just as a thing that takes a piece of text
|
|
coming in from one direction, does some stuff to it, filters it in some way and passes it onto
|
|
the next step. And as we're just on the edge of getting into talking about
|
|
awk as a full-blown language, which it is, though I would maintain that it
|
|
shows its age quite a lot and it's maybe not the best thing to be using for a lot of functions,
|
|
but it's brilliant, absolutely brilliant for doing the stuff that we've been talking about
|
|
pretty much up to now. I agree, and when you think about when awk started and the type of power it
|
|
could give before some of these other languages have come around, it really was a really unique tool
|
|
in the toolbox for a really long time. But since I've been using computers really heavily,
|
|
it was great to learn, but yeah, it is I think the way you said it was perfect when you're talking
|
|
about how, you know, if it's in the middle of another command that you're piping around,
|
|
it is great to use awk. And actually I've been in a little mode right now where I've been trying to
|
|
simplify and simplify and simplify and sometimes I've been taking out awk and putting in cut
|
|
when I don't need awk just because, you know. Yeah, yeah, I know. You can do things like
|
|
just feed it into awk and then output field one, whereas you didn't really need to use awk.
|
|
awk is not particularly heavy, but it's heavier than, say, cut, where cut would enable you to just
|
|
chop out that one field, you know. So yeah, I know what you mean. And then the same thing with sed,
|
|
I think and we're going to go into this in a little bit, but some of my favorite projects were
|
|
a combination of sed and awk together for cleaning up files. Yeah, yeah. So should we go on
|
|
to that? I think we've probably made the point that we feel that we've taken it as far as we really
|
|
should. And as you said, if you need more, the manual is actually very, very good and there's lots
|
|
and lots of pointers to it and you can find your way through it quite easily. So I would suggest
|
|
going there. What we wanted to do was talk about how we got into using awk, our personal experiences.
|
|
Do you want to start that off? Yeah, I can. Well, I think it all really starts with my experience
|
|
getting into Linux, which I guess I started on Ubuntu 11.04 a little while ago. I guess
|
|
that's 2011. When I was given a computer that had Ubuntu on it and it was the only computer I had
|
|
at the time. And so I had to learn it. But then it turned into learning some of the things that you
|
|
learn on the internet and, when it came to work, sticking with Linux. And then I found myself
|
|
into some really interesting projects where I had to do data cleaning. And some of my favorite things
|
|
that I had to do were around cleaning up dirty, like, CSV files, other types of text files, where sed
|
|
and awk were really handy. And one thing that you could rely on being on a computer would be
|
|
awk and sed. And so I was able to do that without having to bring on a whole bunch of other
|
|
programming languages, especially when I'm on some projects where I'm using computers that
|
|
are not mine. And so it's been really great for things like that. I was just looking at one of my
|
|
favorite ones. I just did a little bit of history search and I was working for a big pharmaceutical
|
|
company on this project. And I was looking for all of the words in a certain CSV file
|
|
that were uppercase and turning them into lowercase. And you know, I just found one of my
|
|
scripts where it was just several lines of that, like cleaning it up in sed, cleaning
|
|
up in awk, cleaning up in sed, cleaning up in awk, and then go back and back and back and back
|
|
until we got the finalized versions of the files because, you know, sometimes you can't control
|
|
the data input that you get. All you can do is deal with it when you have it. Oh yes,
|
|
yes. So yeah, that's probably my biggest experience with it. But I've been
|
|
using awk on and off, well, I guess almost every day at a certain level, just, you know, just to do
|
|
things here and there since 2011. But I've really been moving to other programming languages to do
|
|
more of it more recently and we'll talk about that later. Yeah, that's your experience. Yeah, okay.
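
The uppercase-to-lowercase CSV cleanup described above was done with sed and awk; the script itself is not shown in the episode. As a rough illustration only, here is a minimal Python sketch of the same idea. The rule for what counts as an "uppercase word" (two or more capital letters), the function name and the sample data are all assumptions made for this example.

```python
import csv
import io
import re

# Assumed rule: an "uppercase word" is a whole word of two or more capitals.
# The episode does not say how the original scripts defined it.
UPPER_WORD = re.compile(r"\b[A-Z]{2,}\b")


def lowercase_upper_words(text):
    """Return CSV text with every all-caps word converted to lowercase."""
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # Rewrite each cell, leaving mixed-case words untouched.
        writer.writerow(
            [UPPER_WORD.sub(lambda m: m.group().lower(), cell) for cell in row]
        )
    return out.getvalue()


# Made-up sample data, purely for illustration.
sample = "id,name,status\n1,ASPIRIN tablet,ACTIVE\n2,Ibuprofen,on HOLD\n"
cleaned = lowercase_upper_words(sample)
print(cleaned)
```

Going through the csv module rather than splitting on commas keeps quoted fields intact, which is one of the things that tends to go wrong with quick one-liners on dirty files.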
|
|
Well, I go way back. I was working at a local university here just to just a mile or so down the
|
|
road from about 1981 through till I retired 10 years ago. And there was one point where we had
|
|
a VAX cluster running the VAX/VMS operating system and that was pretty good. They used to call it a
|
|
mainframe though. Some people would call it a mini, but you know in today's terms it was a
|
|
a number crunching centralized machine that certain people got access to. But we had very little
|
|
in the way of text processing features within it. So you had to write a program to do things. There
|
|
weren't many tools, but there was a move because it was very popular in the university scene across
|
|
the whole of the world really, particularly in the US and in the UK. And so people were actually
|
|
moving Unix tools onto it. And I saw that this was happening and found that GNU AWK was available
|
|
quite different from what we have now, but in the 80s, and sed was also available. And so
|
|
I installed them on this machine and was able to use them for some of the sort of applications
|
|
that I had. A particular thing that I did, and I've really got to be careful I don't go on and on
|
|
about this, because it was a thing I did for many, many years and I could talk you to death with it,
|
|
was that we didn't have a way of creating accounts for students as they came into the university.
|
|
So new students would come along and they'd, oh I need to use the computer for my course and
|
|
requires me to do Fortran or something. So they'd have to walk to our computer center and say
|
|
can I have an account please? And some guy would sit at a terminal and type in all the details
|
|
and stuff. And the number of people who needed this increased as the years went on.
|
|
And it became an enormously laborious process. So I took on the task of automating it by getting
|
|
the data from the student registration computer and using it as the basis of code which would
|
|
generate accounts for them. But the data quality, as you were saying before, was pretty bad.
|
|
And I don't know how they managed to get it quite so bad, but you know you'd get people coming in
|
|
and there would be student records coming in that said they'd just been born this year,
|
|
which is fairly unlikely for university students. Somebody'd forgotten which year to type in.
|
|
They've maybe got the date of birth right, but they've forgotten which year.
|
|
And so spotting things like this, flagging them back and fixing them was a major thing.
|
|
And awk and sed were just fantastic for that, and dealt with that really, really well.
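
The kind of check described above (spotting students whose recorded birth year is the current year) was done with awk and sed. A minimal Python sketch of the same sanity check follows; the column names, the ISO date format, the minimum-age threshold and the sample records are all assumptions for illustration, not details from the episode.

```python
import csv
import io
from datetime import date


def flag_suspect_births(text, intake_year, min_age=15):
    """Return (row, reason) pairs for records whose birth date looks wrong.

    The column name 'date_of_birth', the ISO date format and the min_age
    threshold are hypothetical choices for this sketch.
    """
    flagged = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            born = date.fromisoformat(row["date_of_birth"])
        except ValueError:
            flagged.append((row, "unparseable date"))
            continue
        # Someone "born this year" shows up as an age of zero.
        if intake_year - born.year < min_age:
            flagged.append((row, "too young for intake year"))
    return flagged


# Made-up records: one plausible, one with the current year typed in,
# one in a format the loader does not understand.
records = (
    "student_id,date_of_birth\n"
    "s001,1968-03-14\n"
    "s002,1987-09-02\n"
    "s003,14/03/1969\n"
)
suspects = flag_suspect_births(records, intake_year=1987)
for row, reason in suspects:
    print(row["student_id"], reason)
```

The flagged rows would then be sent back for correction rather than silently fixed, which matches the "flagging them back and fixing them" workflow described.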
|
|
And then the data was passed on to other programs that we developed for doing the
|
|
nitty gritty of account creation. So yeah, and then a bit later on, I won't go on to
|
|
a great length here, because I'm talking about 28 years of doing this sort of stuff.
|
|
Wow. I bet I'm not going too far. But in the late 80s and early 90s, we started moving towards
|
|
Unix. And we had a whole range of Unix versions. This was really before Linux was coming on to
|
|
the scene. And awk was there, of course, and was getting very much used. We were learning how
|
|
to do pipelines. And we weren't using bash at that time. I don't think it existed in the earlier days
|
|
when we were using the Bourne shell and the C shell and tcsh and so on. But you'd still
|
|
use your awk and sed there. So I certainly did a lot of that for all manner of uses.
|
|
And later on, we moved to Linux quite extensively with a lot of servers running
|
|
initially with Fedora and later Red Hat. And of course, awk was very much the thing to use.
|
|
And I ended up in some cases teaching people a bit about how to use them.
|
|
Particularly regular expressions. People didn't seem to grasp the concept of regular expressions,
|
|
which are very much a big thing in awk and sed. And so that's something that actually
|
|
got me into it as well as the regular expressions. I do have a background. I know you do a lot of
|
|
Perl. I have a little bit of Perl experience. And that's what got me really into regular
|
|
expressions before I got into Linux. And so it was a little bit easier for me to be able to go into
|
|
some of that stuff. But yeah, for a lot of people, it really is just a foreign language.
|
|
It's hard to wrap your head around. Oh, yeah. It's really hard to read and understand.
|
|
When we moved to Unix systems, there was a point at which I had an Ultrix workstation on my desk.
|
|
And we ran our mail system. We had a mail system which was private to just us in our department.
|
|
And I used to get a lot of email because I had configured various of the servers to send me
|
|
reports in emails. I would get thousands of messages a day. And so I got into using a thing
|
|
called procmail, which is a very, very clever tool for filtering your mail. You know, checking
|
|
its subject, its from and to, and/or the contents of the body of the mail and stuff,
|
|
all using regular expressions. So I became quite skilled, comparatively anyway, in regular
|
|
expression use by doing that. And of course, that moved on to the other tools that used
|
|
regular expressions big time. So yeah, so I certainly feel pretty comfortable. In fact, I find
|
|
awk and sed a little bit limited in terms of what they can do compared to, say, Perl.
|
|
Perl's gone way ahead of some of those things, hasn't it? Yeah, sometimes, I mean, I pretty much
|
|
stay away from Perl for the most part now. But sometimes I do have a time where I do just pick
|
|
Perl just because it is, especially for Perl 5, I haven't really done too much in Perl 6,
|
|
but getting in Perl 5 and getting really deep into the regular expressions has been something
|
|
I've done for a long time, ever since my bioinformatics master's program.
|
|
Yeah, because Perl used to be very popular in that sort of area, didn't it? It was used in a lot
|
|
of genetic analysis and that type of thing. DNA sequences and stuff were processed with
|
|
Perl in the earlier days. Yeah, and so if you're ever dealing in some old code base of a bioinformatics
|
|
pipeline, you'll see Perl still. A lot of times people are moving to R or to Python now
|
|
in the bioinformatics world, but you still see a lot of Perl out there. Yeah, yeah. I certainly see
|
|
Perl a lot. I still write in Perl quite a lot because I just feel more comfortable in it
|
|
having done it for so long, but trying to move on to Python and stuff like that. But yeah,
|
|
it's still about, it's still about, it's still quite common actually when you look behind the scenes,
|
|
but it's not, it's not a thing that anybody would volunteer to learn and go and write. These days
|
|
it's more legacy stuff, I think. Yeah, I agree with that. So yeah, so yeah, that's pretty,
|
|
I mean, I could go into a lot more detail about how I got to awk, but I'll shut up now, I think.
|
|
But was there anything else you wanted to add to your experiences?
|
|
Not really. I mean, I've kind of gone on about it in some of other episodes where I talked about,
|
|
you know, coming out of a university and going into molecular diagnostics and a couple different
|
|
big companies in here in the US and then moving over and getting into, oh, there's my son. Hey,
|
|
what? Hold on. Yeah, getting into some of the other topics like bioinformatics later
|
|
on. And that's when I started getting really into more computer programming. Before I was more of
|
|
a computer consumer, maybe a super user of certain applications that we used, but not really,
|
|
not really someone who's going to make changes or do customizations. But since, you know,
|
|
getting introduced to Perl and R and then later C really got me going on that side. Then
|
|
moving over and I started consulting and then really getting into a lot of computer programming.
|
|
And oh, my goodness, my son is over here coughing in my ear. He doesn't sound very well.
|
|
Yeah, he's home sick today. So you're going to have to bear with him a little bit. And he
|
|
hears me talking now and he's all excited to hang out with me. But yeah, it's been it's been a really
|
|
long road to where I am now in my career. But it's been, you know, it's still moving more,
|
|
more and more away from some of the bench lab work and moving more and more to automation
|
|
and data analysis and now they call it data science. When I first started, it was just called data
|
|
analysis or bioinformatics. But yeah, that's that's kind of where my career has taken me and
|
|
and I pretty much stay in those areas nowadays. So I do some web development as well. And some
|
|
application enterprise application customization as well. But I try to stay away from those as
|
|
much as I can. So are you a biologist by training, then? Yeah, I'm a molecular biologist by
|
|
training. So, you know, DNA, RNA, that stuff, by training. And then also bioinformatics, which
|
|
is a combination of that with the, you know, automation and computer systems. Yeah, yeah. I have
|
|
a I have a degree in biology, but not the sort of biology that you've done because mine was in the
|
|
late 1960s, early 70s. So, so it was like sort of traditional biology taxonomy and stuff like that.
|
|
So, so yeah, but yeah, it's interesting. I've certainly been to a few talks on bioinformatics.
|
|
I went to some at FOSDEM, the big conference in Belgium. And they are really, really fascinating.
|
|
So yeah, it's the I'd like to understand more about that sort of stuff.
|
|
And actually, that's one area that's it's been really kind of lucky providence for me,
|
|
because a lot of bioinformatics is done on Unix and Linux. And so learning all that while still
|
|
on Windows was kind of difficult. And when I moved over to Linux, I'm like, oh, you know, you
|
|
mean, like, Perl's already installed? Like, okay. Yes, it's great. My daughter recently got her
|
|
degree last year. She got her degree in biology. And she used R quite a lot, I think, you know,
|
|
to do statistics more. So it's for the computation stuff. Yeah, most of my bioinformatics has been
|
|
mostly in R in my program too. Yeah. So yeah, I've only dabbled with it in helping her out.
|
|
It's got a lot of power. I know. Yeah, I think that's actually a pretty good segue to the next
|
|
topic that we wanted to talk about, which is what are some of the tools that we use now instead of
|
|
using awk? Yeah, yeah. We were going to talk about what bits of awk we've left out, but we'll
|
|
come back to that one in a minute, shall we? The, um, I just mentioned that, um, well, I really
|
|
said it. As far as I'm concerned, I moved from, uh, gawk on the VAX cluster that I was managing
|
|
to Perl, which was version 4, and then later to version 5 of Perl. And this was not
|
|
a difficult move because Perl is very much derivative of, uh, the tools like awk and Bash and,
|
|
um, sed and all sorts of other things. I think, uh, the, um, the guy who wrote it, whose name
|
|
suddenly escapes me, uh, who designed it, had been using these tools a lot and sort of came
|
|
up with ideas that effectively glued them together into Perl. Yeah, um, I know the name
|
|
too, but I'm not remembering it either. It's just come back: Larry Wall. Yeah, that's it. He's a really
|
|
interesting guy. Not originally a computer scientist, so this is why Perl is unique,
|
|
because he's come from an entirely different place.
|
|
Well, that's often been the criticism of, uh, Perl, that it is a write-only language.
|
|
Well, yeah, but then any language is, if you, if you wish, I mean, if you want to write something
|
|
that's incredibly obscure, you can do it in any language, uh, it's my, my feeling. Maybe not
|
|
but I don't know. Haskell is pretty hard to understand anyway.
|
|
Yeah, I've, I've taken a couple of times. I've really been enjoying the series on Haskell
|
|
because it is, it has been quite a mystery to me for a long time. Yeah, yeah. I have a friend who's
|
|
a computer scientist and whenever I mention Haskell, he said, oh, you should learn it. You should
|
|
definitely learn it because it's the, it's a wonderful language. But he's coming at it from a PhD
|
|
in computer science, you know, and I think he maybe sees the world somewhat differently from
|
|
the way I do. You know, I, I think that is it, because everyone that I know who really likes Haskell
|
|
comes from a, uh, either a, um, either from pure computer science or some type of other, um,
|
|
you know, philosophical stance on it. It's mathematical proofs or something like that,
|
|
which, which is great. And I'm sure it has a lot of great use cases, but for the work that I've
|
|
done, well, and also just some of the stuff that I do, there are already great ecosystems around
|
|
some of the other languages, which is really why I choose them. Yeah. So what would your, what would
|
|
you be recommending as an alternative to awk? Well, I have a two-part series on xsv if you're
|
|
dealing with, uh, delimited files. And that's been my go-to for, um, recently. It has a lot of power,
|
|
especially I'm sitting here right now on my, um, my desktop computer, uh, which has, uh, 12 cores.
|
|
And so the ability to, like as a 12 core, I say, having the ability to go multi-core and, and go
|
|
through a file that has, I just did one earlier this morning that has a five, over five million
|
|
rows in like three seconds. And, and get a summary, you know, the thing that we did where we said
|
|
for, you know, to get a group by count, it does it in like five seconds on five million rows.
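
The "group by count" mentioned above is the classic awk pattern of counting occurrences of a field in an associative array. Neither the xsv command line nor the awk script is shown in the episode, so the following is only a minimal standard-library Python sketch of the same operation. It is not how xsv or pandas implement it, it makes no claims about speed, and the column names and sample rows are made up.

```python
import csv
import io
from collections import Counter


def group_by_count(text, column):
    """Count how many rows share each value in the named column."""
    reader = csv.DictReader(io.StringIO(text))
    # Counter plays the role of awk's associative array: count[$n]++
    return Counter(row[column] for row in reader)


# Hypothetical sample data for illustration only.
data = (
    "region,amount\n"
    "north,10\n"
    "south,4\n"
    "north,7\n"
    "north,1\n"
)
counts = group_by_count(data, "region")
for key, n in counts.most_common():
    print(key, n)
```

For a real multi-million-row file the text would be read from an open file handle rather than a string; the counting logic stays the same.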
|
|
So it's written in Rust. And it has a really, um, it has a really nice
|
|
syntax for the most part. It's not a full-blown language. It really is just a command line tool.
|
|
But I use that for, for just command line interactive, uh, looking at, at, uh, delimited files.
|
|
And then if it's more involved than that, I either pull out R, and I've been really
|
|
into the tidyverse lately, which, if anyone knows R, is kind of a newer development in R,
|
|
not base R. There's a community around it, I think started by Hadley Wickham,
|
|
a community around making a more fluent interface to dealing with, uh, to dealing with R.
|
|
And, and so, um, on the R side, I've been doing that. And then also using, um, Python and the pandas
|
|
package, um, both NumPy and pandas. And I've, I've, I, I used to be really good with R and I would
|
|
write things that I knew had to be in Python. I would write it in R first because I was just more
|
|
comfortable there. And I, in the last two years, I really made it an effort to really learn pandas
|
|
really well. And it's, it's been great. The things you can do with it. And now that there's another
|
|
program called Dask, which is another library where you can, one of the criticisms of Python is that,
|
|
is that a lot of times you're, you're working with the global interpreter lock and things end up
|
|
being single threaded. But with Dask, it's really easy to take those same pandas
|
|
functions that you're already running, where you're doing a read all this data, group it by and give
|
|
me the mean. It's able to do those things on, um, multiple cores with the same syntax that you're
|
|
already used to using in pandas. So, um, okay. So that's using, uh, multiple processes
|
|
rather than multiple threads? Well, Dask can do it either way. So, um, it depends, you can set it
|
|
up to say, well, for this task, if, if you leave it up to it, it will choose which one it thinks is
|
|
best by looking at the data. But you can say at the beginning, you can declare, I want to use threads
|
|
and then use threads or I want to use cores and use cores. Um, and, and so I've been messing with
|
|
those things a lot and, you know, it's been, it's, it's what I primarily do nowadays. I just gave a
|
|
talk a couple of months ago and I'll be doing that talk again a couple of months from now and I do
|
|
live coding with a Python where I'm turning, uh, I'm turning problems like business problems and a
|
|
data of business problems and turning it into actionable events, event driven process, uh,
|
|
process development and decision making. And so being able to, it's one of the things I think, um,
|
|
um, I think you talked about, um, the creator of Perl, he came from an environment where he really
|
|
wanted to have like feedback and be able to see the results back and, um, and, and the same thing is
|
|
true on the Python side where there's an interpreter where you can just pull it up and it's kind of
|
|
its own shell and you can put in the data, you can see what the results are and you can go back
|
|
and forth with it and, and, and make changes to your code really fast. And so that's where I've
|
|
been, um, hanging out mostly is, is really in Python, but sometimes also in R when I, um,
|
|
when projects require it. That's very cool. Yeah, yeah. I, uh, I, as I said, I did most of my data
|
|
manipulation in Perl, finding it to be amazingly better than anything else
|
|
that was available at the time, but yeah, I can see that I should really be rethinking
|
|
that sort of stuff when I need to, need to do that type of thing in the, in the future. So yeah,
|
|
I've been looking at Python a little bit myself. And, um, in fact, I was doing a tutorial on,
|
|
on Rust the other day just to see what sort of a language it is and, and that in itself is quite good,
|
|
but it's not specifically a data manipulation language, but it's a, it's a pretty amazing
|
|
language. So if you've ever dabbled with stuff like C, C will catch you out very, very easily,
|
|
but Rust is protecting you a lot. And, uh, that makes it good, as far as
|
|
I'm concerned. Yeah, I've seen a lot of, uh, really interesting things going on both in the
|
|
Go and Rust communities. And I think those two languages are going to have more and more of
|
|
influence, uh, on the overall landscape of programming in the, in the near future. But, uh, you
|
|
know what? Python is not going anywhere. I don't know if you pay attention at all to like the,
|
|
either the Stack Exchange or the JetBrains, um, surveys that they do. And Python is one of the
|
|
most popular languages in the world now, um, you know, right up there with C and Java. So,
|
|
and it's because it's being used, it's, it's a general purpose language that, um, although it,
|
|
you can, you can write things in C and have them, and have them, um, be wrapped in a Python wrapper,
|
|
which means that you can bring some of the speed that you, that you lose from having a, an
|
|
interpreted language. Like, for instance, pandas and NumPy, they are written in C. And some
|
|
of, and, and SciPy is written in C and Fortran. So a lot of the power that you get from those
|
|
compiled languages you can still have access to in Python. And also, there's been like a lot of
|
|
community effort around it. Uh, it's, it's the reason why I try to standardize around it is because
|
|
it's one of the languages where there's a huge community around both the data science
|
|
and the web development communities. So I can write a web application and write, uh, data, uh,
|
|
analysis for that, um, for that, uh, web interface in the same language, which makes it, you know,
|
|
really easy to have a team of people all working on the same code base. Yeah. Yeah. I do see this
|
|
happening and, uh, feel like I should be following your lead. That said,
|
|
I'm just, um, reluctant to start learning anything else in some respects. Yeah. So, well, you
|
|
know, without, without a reason to, it's kind of hard to just pick up something just for fun. Yeah.
|
|
Yeah. Yeah. We're redesigning some of, uh, the HPR stuff, um, going to have a different
|
|
database, which I'm putting a prototype together for now. And, uh, we're going to move to a static
|
|
website, um, eventually. And there's quite a lot of tools involved in keeping, uh, the HPR stuff
|
|
going, many of which I've written and, uh, of course are all in Perl. So, uh, when I, uh,
|
|
resign, when I bow out, when I get my pension from HPR, I don't envy the
|
|
poor soul who has to take it over, because if you're not into Perl, then Perl can be a little bit
|
|
scary. I don't know. I try to write it so it's not, but there you go. Yeah. I'll just take it out.
|
|
Maybe, maybe we should get on that sooner rather than later. Yeah. Yeah. Yeah. It's something I'd
|
|
think about at the moment. In fact, um, how would you, uh, how, given a, given a, a concept
|
|
which you would implement in Perl, how would you better implement it in something else? I was looking
|
|
at, um, what's the other language that's equivalent to Perl? I'm
|
|
having brain farts at the moment. Is it Python? Uh, not Python, but the other one, the,
|
|
Smalltalk? No, I'm thinking of the one that was effectively derived from, uh, from Perl.
|
|
Takes quite a lot of its ideas from it. Uh, you got me on that one. Just, uh, it'll come back to me
|
|
just when it's inconvenient to, to say it, but, but yeah, there, there is, there are other languages
|
|
which, uh, is right there, but I think, I think the point you're making is that Python's really, uh,
|
|
right at the front in terms of a lot of this type of stuff. So it makes
|
|
makes sense to, to be moving in that direction. Well, if you don't, if you already know Python,
|
|
and you want to learn, and you want it to do something with, uh, data manipulation, you,
|
|
you don't have to go anywhere else. I think that's definitely clear. One of the things about Perl,
|
|
though, is that, uh, and this is less of an argument now because a lot of the modules and
|
|
libraries that have been available for Perl, and there's vast numbers of them, and they're very easily
|
|
accessible. Um, a lot of them are not being maintained as much as they were. So, but it was the
|
|
argument at one point, if you needed to do a particular thing, there was always a module that
|
|
would do it, you know, um, for example, I did quite a lot of manipulation of, uh, MP3 and
|
|
Ogg, and audio files of various sorts, putting tags on them. Uh, you know, the ID3 tags in
|
|
MP3 and that sort of stuff. And it's not easy to do in Python. I tried to write a Python, uh,
|
|
version to do this using a thing called TagLib, but the interface to TagLib from Python is quite
|
|
basic. Whereas the one from, uh, Perl is a lot more sophisticated. So, um, you know, but, um,
|
|
you know, there was, there was another library that I was using recently for doing that same thing,
|
|
and, you know what, I'll look for it and send you an email.
|
|
All right. That would be good. That would be good. Yeah. Yeah. TagLib is quite popular, though. It's,
|
|
it's embedded in a lot of tools that you'll, you'll find out there. But, um, it's, uh, yeah.
|
|
Yeah. The one I'm thinking of, I think it has ID3 in the name somewhere. I gotta go. Right.
|
|
Right. But this, um, TagLib will handle any audio format. Everything from Opus through to
|
|
FLAC through to MP3 and even Speex. Um, so it's written in C++. So I'd want to be
|
|
writing C++, but I'm not going to do that. Not in my time of life. Yeah. That's understandable.
|
|
Um, shall we just talk about what in awk we've left out? Or did you have more to say on
|
|
them? Yeah, let's do that. Yeah. Okay. So I'd put together a list of things that, that sprang to
|
|
mind when, there's sort of prewritten notes for this episode, which I jotted down,
|
|
which will probably get changed a bit before we release them. But, um, we reached the point where
|
|
we could have been talking about how you write your own functions in awk. But the way it's done
|
|
is so weird and clunky. Even the author says, uh, says something. Not very, not very nice about his
|
|
own implementation. So, uh, I think we, we agreed both of us that we wouldn't go down that
|
|
particular road. Yeah. And I've, and I've, like I said, whenever I had to go down that way,
|
|
I would choose a different language to do it. So I think we've made the right choice. And,
|
|
you know, balancing what we think would be useful to people versus, um, what we had time to do.
|
|
I think we've made a good decision there. Yeah. Yeah. It's a shame because being able to write
|
|
functions in a, in a language is, is a cool thing to be able to do. But they are not nice to,
|
|
to use in awk by my reckoning, anyway, or the author's, for that matter. That's Brian Kernighan,
|
|
who, who's saying that? So, oh, oh, just so, just so it's on the record. The, the file, uh, I just
|
|
found it while we're, while we're talking. Mutagen is the name of the, is the name of the program,
|
|
mutagen. Okay. It's the sound of me writing it down on this paper. Okay. Okay. Oh, and the language,
|
|
I was trying to remember, and isn't memory weird? Uh, is Ruby? Have you looked at Ruby? Oh, yeah,
|
|
yeah. Ruby's, Ruby's great. Um, I'd have done some things nowadays. A lot of the Ruby's,
|
|
like, Ruby on Rails. But I do see a lot of, uh, programs, especially on the Linux side where
|
|
there's a lot of, uh, you, you look at, uh, you go down, you know, apt install something,
|
|
and then if you do a which on where it came from, it's actually a Ruby library. So there is,
|
|
yeah, it's pretty, it's still pretty popular. Yeah. Yeah. There's quite a lot of power in that,
|
|
and I quite like its Perlish nature as well. And it also uses, uh, curly brackets, which,
|
|
which is cool, not indentation. Yeah. Uh, we don't have to, we don't have to get into that argument,
|
|
right? I'm avoiding it. I'm avoiding it. I'm not going to, I'm not going to go that, just,
|
|
just look that way, but I'm not going to go that way. Yeah, that's right.
|
|
The other thing I'd written down here was multi-dimensional arrays. If you want to get into
|
|
some quite complex data manipulation, it's quite nice to be able to build matrices,
|
|
or three or four dimensional arrays, or really complex structures, arrays of hashes,
|
|
or hashes of arrays and that type of thing, which you can do easily in Perl. I think you can do
|
|
something similar in Python as well, can't you? Yeah, that's what, that's what NumPy is all about,
|
|
that library is for multi-dimensional arrays. And also, there's also, if you really want to
|
|
get really big ones, that's what things like TensorFlow are for. Yeah, yeah. I tend to do that a lot,
|
|
I quite like the concept of being able to, to build structures where you start off with a,
|
|
with an array, and then the elements of the array are arrays, or hashes, or whatever,
|
|
or procedures, or whatever else you want to do, which is something you can do in C, and to
|
|
some extent, other languages. Pascal was one I used to use a lot; you can do that type of thing.
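
The nested structures described here (an array whose elements are hashes, a hash whose values are arrays, and an array of arrays used as a matrix) can be sketched in Python roughly as follows. This is only an illustrative sketch: the `tracks` data and all variable names are invented for the example and do not come from the episode.

```python
# An array whose elements are hashes (a list of dicts in Python terms).
# The track data below is hypothetical, made up for illustration only.
tracks = [
    {"title": "Intro", "tags": ["speech", "mono"]},
    {"title": "Outro", "tags": ["music"]},
]

# A hash whose values are arrays (a dict of lists): group titles by tag.
titles_by_tag = {}
for track in tracks:
    for tag in track["tags"]:
        titles_by_tag.setdefault(tag, []).append(track["title"])

# A small matrix as an array of arrays, indexed as matrix[row][column].
matrix = [[1, 2, 3], [4, 5, 6]]

# Transposing swaps rows and columns; zip(*matrix) yields the columns.
transposed = [list(column) for column in zip(*matrix)]
```

Each level is an ordinary value, so a lookup such as `tracks[1]["tags"][0]` simply walks from the outer list into the dict and then into the inner list.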
|
|
But, yeah. Is that even something I knew you could do in Perl? I mean, in, in, in,
|
|
awk? I don't... you can do multi-dimensional arrays, but you can't do anything very sophisticated
|
|
with them. Okay. But, GNU Awk has got a lot better than it was in the early days when I was using
|
|
it as far as I was concerned. So, my notes say other languages can do this better. So, I think I
|
|
would keep away from it personally. As you said, choose another language if you need that. You can
|
|
get into internationalization, which is all about, you know, writing stuff, which will, is, is
|
|
sensitive to what language you're currently using, what environment you're in. But, that's,
|
|
it does exist. Somebody has put a lot of effort into adding that to GNU Awk. So, when you say
|
|
internationalization, does that mean, for instance, when you're doing a print, a printf, using
|
|
commas instead of periods and vice versa? That type of thing. But also, I think you can do things
|
|
like change to different character sets or different languages. So, you can, you can build something
|
|
which has got multiple translations of a given message and switch between them depending on what
|
|
your environmental settings are as regards language. Oh, wow. I think so. But, like you can in
|
|
many languages will let you do that. I think that's available in, in awk. But, I just
|
|
looked at it and ran away. So, I might have misinterpreted that. Yeah. And I think once again,
|
|
it's one of those cases where there's, there's plenty of other languages that you could use to do
|
|
that too. Yeah. There is a debugger in GNU Awk. And it looks pretty good. I've not tried it
|
|
myself. But, it wasn't something that we wanted to get into. So, like I said, I did go and look at it
|
|
and thought, wow, again, somebody's put a lot of effort into writing this. But, who's using it?
|
|
Who's going to use it? I feel that it's, it was maybe a little bit misguided. I don't want to
|
|
criticize awk, because it's, it's a wonderful thing. But, but in these areas, it's maybe, maybe
|
|
overkill. Well, I think the idea is to turn it into a full, a full-blown language. And to do that,
|
|
you have to do things like internationalization, make functions. And I think the next topic as well,
|
|
which is writing extensions in C and C++. Yeah. There's a full API that you can write to.
|
|
And they do come with the installation of GNU Awk. And there's some quite nice things in there. I
|
|
did spend a half an hour looking at them. In fact, I think I installed one or two just to see
|
|
how they would behave. So, you can put quite high-speed additions into the into the language.
|
|
But, again, not really a thing for this series, I feel. Yeah, I think that's right.
|
|
So, that was the end of my list. Was there anything else you wanted to say under this heading?
|
|
I think that's pretty much it for me. I think, like you said, there was a lot of things that we could
|
|
have covered. And we didn't. But I think this is a good place to stop. Yeah. I think we've reached
|
|
the point where we've described, hopefully, enough of awk to make it really useful in the sort of
|
|
command line bash script type context, but not to give you something that you could sit down
|
|
and write some really complex program in. You could do if you really wanted to. You can do it
|
|
in sed if you really want to, but why, why bother? How do I ever want to do that?
|
|
Wasn't there recently a claim that sed is a complete programming language?
|
|
Yeah, I think it is. Yeah, I did a series on sed quite some time ago and did look at some of the
|
|
example programs that come with it. And some of those, they just, you can do some very powerful
|
|
things, but understanding them is really hard to do. I certainly had some trouble working out what
|
|
it did. Yeah, writing a web server in sed, I can just imagine what that looked like.
|
|
It'd be locked up at the end of it. I think you'd need to be.
|
|
So what's next was my last point, and I'm doing all the talking at this point, but one of the things
|
|
I want to say was we had said along the way that we were going to try and turn these notes into
|
|
some sort of a combined document that we could put on the HPR site and archive.org. And I've
|
|
started on this a while back, actually, consolidating all the notes and stuff. And so I'm going to
|
|
carry on with that because I think it'd be a useful resource just to have everything together in
|
|
one place with an index. So you can find stuff if you want to. But there's no time scale at the
|
|
moment for this, but I would imagine within the next sort of year or something like that,
|
|
we should be able to do it. I'm happy to do most of it, but it'd be quite nice if we could
|
|
bounce a few ideas off one another along the way. Hey, yeah, no, whenever you get to a point where
|
|
you want to share it, just, yes, send it over to me. And we can go through it together.
|
|
Yeah, that would be fun. That would be fun. Because it's been an interesting voyage this. I
|
|
never had originally considered doing a joint series, but it just sort of worked out this way.
|
|
And I think it was your idea to do the series, wasn't it? Yeah, it was, but I knew I definitely
|
|
could not do it myself. There's some topics. And actually, I've learned a lot along the way about
|
|
awk that I wouldn't have learned without having this experience. So it's been good.
|
|
Yeah, yeah, it's been a lot of fun. It's been a lot of fun. I definitely, if the opportunity arose
|
|
again, and you know, and somebody who wanted to do a joint show, I'd be very keen to get involved
|
|
because it's been been fun to do. So yeah, it's been been good. So thanks very much for everything
|
|
you've done. It's been been a great time. And thank you, Dave. I mean, you've been kind of a mentor
|
|
to me on HPR. And so thank you for all of your work that you've done on HPR. And I
|
|
hope to continue to listen to you and then we'll continue to have conversations behind the scenes.
|
|
Yeah, that's great. Okay, then we'll say goodbye at this point then. All right, bye everybody.
|
|
Bye, this has been Be Easy signing out, and continue to hack.
|
|
Yep, definitely do that. Bye-bye.
|
|
You've been listening to Hacker Public Radio at HackerPublicRadio.org. We are a community podcast
|
|
network that releases shows every weekday Monday through Friday. Today's show, like all our shows,
|
|
was contributed by an HPR listener like yourself. If you ever thought of recording a podcast,
|
|
then click on our contribute link to find out how easy it really is. Hacker Public Radio was founded
|
|
by the Digital Dog Pound and the Infonomicon Computer Club, and is part of the binary revolution
|
|
at binrev.com. If you have comments on today's show, please email the host directly, leave a comment
|
|
on the website or record a follow-up episode yourself. Unless otherwise stated, today's show is
|
|
released under the Creative Commons Attribution-ShareAlike 3.0 license.
|