Episode: 675
Title: HPR0675: Python Response to Bad Apples Podcast 5x18
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr0675/hpr0675.mp3
Transcribed: 2025-10-08 00:44:09
---

Hi, this is Doug, finally responding to Klaatu's challenge to respond to his season 5 episode 18 podcast, where he prompted me to make my own podcast to explain the Python code that I sent to his show. I've been bugging Klaatu for a while to get into more Python, and I thought this was a perfect opportunity to translate his bash script into a Python script. His original bash script would read the first line of each file in a group of files in a directory and write those first lines to another file, which you could then use later. My Python code does the exact same thing.

Now, before we get started, I want to say that I have nothing against bash. I use it in my work all the time. I just prefer to work in Python if my programs get beyond a few lines.

Well, let's take a look at the Python code. One of the things that you'll notice if you look at the code of episode 18 is that it begins like other scripting languages, with the shebang line naming the Python interpreter under /usr/bin, telling the system that the Python interpreter will read the rest of the code and do the work. Now, the second line is a comment, which is preceded by the pound symbol. This is also like other scripting languages.

Now, the third line, import glob, what does that mean? Well, glob, I don't know why in the world the Python developers named it glob. I think they meant globbing file names. But glob is a library that comes with Python. Now, if you do any research into Python at all, you'll find that they often say things like "batteries included". What they mean by that is that baseline Python comes with a lot of libraries that do other work, and glob is one of those libraries. Now, we're going to use glob in our code to get the list of files that we want to get the first lines from.
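To make the idea of globbing concrete, here is a minimal sketch of what glob does on its own. It is not part of the script discussed in the episode; the scratch directory and the file names (notes.txt, todo.txt, image.png) are made up for the illustration.

```python
import glob
import os
import tempfile

# Hypothetical files, created in a scratch directory just for this demo.
scratch = tempfile.mkdtemp()
for name in ("notes.txt", "todo.txt", "image.png"):
    open(os.path.join(scratch, name), "w").close()

# glob.glob returns a list of the paths that match a shell-style pattern.
# The order of that list is not guaranteed, so it is sorted here.
matches = sorted(glob.glob(os.path.join(scratch, "*.txt")))
names = [os.path.basename(path) for path in matches]
print(names)  # the two .txt names; image.png is not matched
```

Only the files ending in .txt come back, which is the same filtering the shell does when you type *.txt at a prompt.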
Now, the next thing we do: we have another comment that says "create and open the output file for writing". The very next thing is outfile = open("talk.output", "w"). What this does is open an output file into which we're going to write the first lines of the files we're reading. And because it's got the "w", it means we can write to this file. The open function is Python's way of opening a new file in our current directory into which we can write output. And that output file is named outfile.

The next line is another comment, where I say "iterate over the files that match *.txt". And this is where we're going to start using the glob module that we imported on the third line. Now, the next line says for filename in glob.glob("*.txt"): and this line is a loop. In Python, the simplest loop is this for loop, and it loops over a collection of things, or a list. Most of the time when you're writing code in any kind of programming language and you want to loop over something, you're looping over a collection of things. Python simplifies this idea by always saying "I'm going to iterate, or loop, over a collection of things."

Well, what's the collection? If you look at glob.glob("*.txt"), we're using the glob library, and then we're using the glob function of that library to gather all of the text files in the current directory that match the Unix pattern *.txt. This will create a list of all the file names in the current directory that match that pattern. The for filename part loops over that list one at a time, takes each file name in turn, and assigns it to the filename variable of the for loop. And now we can use that file name inside the loop.

The next line, you'll notice, starts with a tab: outfile.write(open(filename).readline()).
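Written out, the lines described so far would look roughly like the following. This is a reconstruction from the spoken description, not the original listing: the variable spelling filename, the quoting style, and the output name talk.output (as heard in the recording) are assumptions. The scratch directory, the two sample files, and the final close() call are additions so the sketch can be run safely anywhere.

```python
import glob
import os
import tempfile

# Setup that is NOT part of the script: work in a scratch directory with
# two hypothetical sample files so no real data is touched.
workdir = tempfile.mkdtemp()
os.chdir(workdir)
for name, text in [("a.txt", "first line of a\nsecond line\n"),
                   ("b.txt", "first line of b\nmore text\n")]:
    with open(name, "w") as sample:
        sample.write(text)

# --- the script as described in the episode ---
# Create and open the output file for writing.
outfile = open("talk.output", "w")

# Iterate over the files that match *.txt.
for filename in glob.glob("*.txt"):
    outfile.write(open(filename).readline())

# Not mentioned in the episode: closing makes sure the lines reach the disk.
outfile.close()
```

After it runs, talk.output holds one line per .txt file, in whatever order glob happened to return the names.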
Now, I know that's kind of hard to follow. I probably should have split that up into multiple lines. I'm using a compound statement there.

The first thing to notice is the indentation. In Python, indentation is significant. Other languages like C and Perl and JavaScript use curly braces to indicate scope. In Python, white space, the tab in this case, is significant and indicates the contents of the body of the for loop. Anything that's indented under the for loop is going to be included in the processing of the for loop.

Now, in this case, what we're doing is passing the file name, which is one item from the list generated by the glob module, to open. If you look at the innermost parentheses, you'll see open(filename). That says: okay, take that file name and open it. The default case is to open in read mode. open returns a file object, and we call a method of that object, .readline(), to read the first line of that open file. So all we're doing inside the innermost parentheses is opening the file and reading the first line. Then we pass that to the write method of our previously opened outfile, which causes the top line of that file to be written to outfile, our output file.

At this point, we're done with the body of our for loop, and we go back for the next item from the list. This will continue until all of the files that were gathered by our glob *.txt call have had their first lines read and written to our outfile, at which point the program ends. Now that file can be used for any other kind of processing, and it will contain the first lines of all the files that were read.

Well, that's it. I hope that was helpful. I'd like to hear back if anybody's interested in getting some more help on Python, and I hope Klaatu takes this as an opportunity to do more Python. Thanks a lot. Bye bye.
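For readers who found the compound statement hard to follow, here is one way it could be split up into multiple lines. This is a sketch, not the code from the episode: the function name first_lines, its parameters, the with blocks that close each file, and the sample files are all assumptions made for the illustration.

```python
import glob
import os
import tempfile


def first_lines(directory, output_path, pattern="*.txt"):
    """Write the first line of each matching file to output_path.

    Returns the number of files read. Hypothetical helper, not from the show.
    """
    count = 0
    with open(output_path, "w") as outfile:
        # Sorted so the output order is predictable; plain glob order is not.
        for filename in sorted(glob.glob(os.path.join(directory, pattern))):
            with open(filename) as infile:    # read mode is the default
                top_line = infile.readline()  # first line, newline included
            outfile.write(top_line)
            count += 1
    return count


# Small demonstration in a scratch directory with made-up files.
scratch = tempfile.mkdtemp()
for name, text in [("alpha.txt", "alpha one\nalpha two\n"),
                   ("beta.txt", "beta one\nbeta two\n")]:
    with open(os.path.join(scratch, name), "w") as sample:
        sample.write(text)

output_path = os.path.join(scratch, "first_lines.out")
count = first_lines(scratch, output_path)
with open(output_path) as result_file:
    result = result_file.read()
print(count)
print(result, end="")
```

Each step of the one-liner now has its own line: open the input file, read its top line, write that line to the output. The indentation still does the scoping work, with one extra level for each with block.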