Episode: 2948
Title: HPR2948: Testing with Haskell
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2948/hpr2948.mp3
Transcribed: 2025-10-24 13:46:15
---
This is HPR episode 2948 entitled Testing with Haskell, and is part of the series Haskell.
It is hosted by tuturto, is about 43 minutes long, and carries a clean flag.
The summary is: Introduction to HSpec and QuickCheck.
This episode of HPR is brought to you by archive.org.
Support universal access to all knowledge by heading over to archive.org forward slash donate.
Hello, you're listening to Hacker Public Radio, and I'm tuturto.
This time I'm going to talk a little bit about testing and Haskell.
So testing, software testing that is, is a really big domain.
There are lots of things to test and lots of different kinds of tests,
and lots of different kinds of approaches too.
Approaches to testing, that is, and it's something near and dear to my heart.
I did my master's thesis about this, for example.
But today I'm going to focus on automated testing, and on the automated testing
usually done by developers, and only a very small part of it.
So, a couple of reasons for testing the way I'm going to explain next.
First of all, you can check that your code works now and at a later point in time.
It also helps you to clarify your thoughts when you're writing those tests.
You have to think about what you're writing from a little bit different point of view.
And it also gives you a nice executable specification.
Anyone reading those tests can try and figure out how the software is supposed to work,
because the tests are there for that reason.
The tools I'm going to use this time are Hspec and QuickCheck.
So Hspec is a testing framework for automated testing.
It has automatic test discovery.
So if you have your tests in the correct directory and named in the correct way,
it will detect them for you, and then you can run those.
And it allows you to specify a hierarchy, so, for example,
you can have a feature, and under that feature you can have tests.
When the tool is run, you get a nice report that shows that for this feature, these tests were run.
You also get the result of each test, and at the end of the test run you get a list of the tests that failed.
So I'm trying to avoid talking about code too much, but if you have access to the show notes,
now would be a good time to have a quick glance at them.
So in Hspec, to specify a pile of tests, you create a function called spec
that has a type of Spec.
And inside that function, you just write, basically: describe "Very important feature", do,
it "Execution should be error free", do,
and then the test code. Then the next test comes:
it "Flux capacitors can be charged", and its testing code.
And then you start the next feature, like describe "Somewhat less important feature".
So you have a nice domain specific language for writing out the test case structure.
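
The describe/it structure read out above can be sketched roughly as follows. This is a minimal sketch, assuming the hspec package is installed; the test bodies and the charge function are hypothetical placeholders, since the episode only names the features and test titles.

```haskell
import Test.Hspec

-- Hypothetical function under test; it exists only to give the examples a body.
charge :: Int -> Int
charge level = level + 1

spec :: Spec
spec = do
  describe "Very important feature" $ do
    it "Execution should be error free" $
      -- placeholder check standing in for real test code
      sum [1, 2, 3 :: Int] `shouldBe` 6
    it "Flux capacitors can be charged" $
      charge 0 `shouldSatisfy` (> 0)
  describe "Somewhat less important feature" $
    it "is reported under its own heading" $
      reverse "abc" `shouldBe` "cba"

-- Running the spec prints the feature/test hierarchy and any failures.
main :: IO ()
main = hspec spec
```

Each describe becomes a heading in the report and each it becomes one line under it, which is the hierarchy the report shows.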
And QuickCheck I'm going to talk about a little bit later.
That's a really, really cool tool.
I'm going to talk about it during this episode, but now we focus on Hspec for a little bit.
Unit tests: these are basically the smallest tests you can write.
As the name implies, they test a unit. A unit is of course a very abstract concept,
but it's something not too big.
When I'm using Hspec, I usually write these as single examples.
So there's a single case: for example, I have some specific function,
and when I give it certain parameters, I get a certain answer, and the test verifies that.
And for a different case, I have a different test.
So these are really small, and each tests one point in the parameter space.
In the show notes, I have an example about testing the Markov chains that I wrote not too long ago.
And there are two tests.
The first is: it, adding new starting element to empty configuration creates item with frequency of one.
So when we have an empty configuration and we add a starting element there,
we should get a configuration where that starting element has a frequency of one.
And this is written as let config equals addStart "AA" emptyConfig.
So we are adding "AA" to the empty config, and that creates a new config.
And then we are just verifying it.
In the code I'm using lenses.
I won't talk about them much, but they are basically just a way of reading and modifying nested data structures.
So then I'm checking the values in the config.
This is the config that we just created.
config, caret question mark, configStartsL dot _head dot itemFreqL, should be Just 1.
So we are taking the config, we are going to the starting elements, we are taking the first element with _head,
and then from that element we are taking the frequency, and it should be one.
That's it.
And then the next one is config, caret question mark, configStartsL dot _head dot itemItemL dot _Just, should be Just "AA".
So, same thing, but instead of the frequency, we are grabbing the actual element.
And that should be "AA", the one we added.
And that's our test.
This verifies that when we have an empty configuration and we add one starting element,
it should be added there, and it should have a frequency of one.
Simple as that.
The second example is a little bit more complicated. It reads: it, adding the same element twice to empty configuration
creates item with frequency of two. Basically the same thing, but now that we are adding it twice,
it should have twice the weight, twice the frequency.
Same thing, we are starting with the empty configuration: let config equals addStart "AA", addStart "AA",
emptyConfig.
So we take an empty configuration, add "AA" there, that creates us a new configuration,
and then we add "AA" there again as a starting element.
And this is the data that we are verifying.
And then there's the same check as earlier: the first item
in the starting elements should be "AA", but this time it should have a frequency of two.
Simple as that.
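
The two unit tests described above might look something like this. It is a sketch, assuming the hspec package is installed. Config, Item, emptyConfig and addStart here are simplified hypothetical stand-ins for the Markov chain code from the show notes, and plain record accessors are used in place of the lenses so that the example stays self-contained.

```haskell
import Test.Hspec

-- Hypothetical, simplified model of the Markov chain configuration.
data Item = Item { itemFreq :: Int, itemItem :: String } deriving (Show, Eq)
newtype Config = Config { configStarts :: [Item] } deriving (Show, Eq)

emptyConfig :: Config
emptyConfig = Config []

-- Add a starting element, or bump its frequency if it is already present.
addStart :: String -> Config -> Config
addStart x (Config items)
  | any ((== x) . itemItem) items = Config (map bump items)
  | otherwise = Config (items ++ [Item 1 x])
  where
    bump item
      | itemItem item == x = item { itemFreq = itemFreq item + 1 }
      | otherwise = item

spec :: Spec
spec = describe "Markov chain configuration" $ do
  it "adding new starting element to empty configuration creates item with frequency of 1" $ do
    let config = addStart "AA" emptyConfig
    map itemFreq (configStarts config) `shouldBe` [1]
    map itemItem (configStarts config) `shouldBe` ["AA"]
  it "adding same element twice to empty configuration creates item with frequency of 2" $ do
    let config = addStart "AA" (addStart "AA" emptyConfig)
    configStarts config `shouldBe` [Item 2 "AA"]

main :: IO ()
main = hspec spec
```

Each example pins down exactly one input and one expected output, which is what makes it a single point in the parameter space.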
Okay, and a really brief look at those lenses.
So, for example, let me read it out.
Where are we?
If you have this: config, caret question mark, open paren, configStartsL, dot, underscore head,
dot, itemItemL, dot, underscore Just, close paren,
shouldBe Just "AA".
So what this actually means is that we are starting with the config,
and then we are applying a lens that produces a Maybe,
that's why we are using the caret question mark to apply it.
We are going to read the value that this lens produces.
We could also modify that value.
We could just say that we are going to change this value, but now we are reading.
Hence, caret question mark.
configStartsL is a lens that focuses on the starting elements in the config, which is a list.
Underscore head we are composing with it using the dot operator,
so we are taking one lens and putting another lens after it.
Underscore head takes the first element of the list.
itemItemL is a lens that takes the item part from that element.
The elements in those starting elements have a frequency and then the actual element.
So this lens focuses on the item part.
And it is a Maybe type.
So then we use underscore Just to focus on the Just side.
So now we have Maybe Text as a result.
When we use this lens that we just built out of smaller lenses,
and use that caret question mark to read the value, we are going to get Just "AA" back.
Okay, so that's how the lenses work in this example.
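
To make the lens composition concrete, here is a small sketch, assuming the lens package is installed. The record types and the hand-written lenses configStartsL, itemFreqL and itemItemL are hypothetical reconstructions based on the names read out in the episode (the real code may generate them differently), and String is used instead of Text to keep the imports short.

```haskell
import Control.Lens (Lens', lens, (^?), _head, _Just)

-- Hypothetical data model: a config holds starting items,
-- and each item has a frequency and an optional element.
data Item = Item { _itemFreq :: Int, _itemItem :: Maybe String } deriving (Show, Eq)
newtype Config = Config { _configStarts :: [Item] } deriving (Show, Eq)

-- Lenses written by hand with 'lens getter setter'.
configStartsL :: Lens' Config [Item]
configStartsL = lens _configStarts (\c xs -> c { _configStarts = xs })

itemFreqL :: Lens' Item Int
itemFreqL = lens _itemFreq (\i f -> i { _itemFreq = f })

itemItemL :: Lens' Item (Maybe String)
itemItemL = lens _itemItem (\i x -> i { _itemItem = x })

config :: Config
config = Config [Item 1 (Just "AA")]

main :: IO ()
main = do
  -- Compose smaller optics with (.) and read through them with (^?).
  print (config ^? (configStartsL . _head . itemFreqL))          -- Just 1
  print (config ^? (configStartsL . _head . itemItemL . _Just))  -- Just "AA"
  -- An empty list has no head, so the read yields Nothing instead of failing.
  print (Config [] ^? (configStartsL . _head . itemFreqL))       -- Nothing
```

The Maybe in the result comes from _head and _Just, either of which may have nothing to focus on.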
There's a lot more that lenses can do, but that's enough for us for now.
So, back to the tests.
So we had these two unit tests that checked that adding starting elements works as intended.
Or that some specific cases work as intended.
If we really wanted to check the code more thoroughly,
we would have a lot more tests. We would also check that if we had "AA" and "BB" as starting elements,
both of those would be present as starting elements with a frequency of one.
And adding "AA" and "BB" and "BB" would result in "AA" being present with a frequency of one and "BB" being present with a frequency of two.
And so on.
Because these tests only focus on single points in the parameter space.
So, I like unit tests.
They are usually easy enough to write.
And if they are not easy to write, that might mean that I don't really understand what I'm doing.
If I cannot write down what the code should be doing, how can I write code that actually does that well?
So, as I said, they verify one specific case.
They are usually very fast to run because they are not reaching out and touching the database or talking over the network or doing things like that.
They are usually really fast to run.
And I don't always write these before writing the code, but sometimes I do.
It's really helpful to just sit and think about what kind of function I would want, and how I would want to use this specific thing that I'm going to write next.
So you're starting from the point of view of the user of the function: you write down how you're going to use it, and then you write the implementation.
So, as I said, these test only a very specific case, a single point.
And if you wanted to cover more, you would have to write a lot of tests.
Or you can do property based tests, which is what I'm going to talk about next.
This is, as the name implies, testing that a property holds for a function with generated parameters. Property meaning that some specific feature of the output of the function is always present.
And you are calling it with some random set of parameters.
So in the code these look a little bit shorter.
Again, in the show notes, there are two tests.
These are from my game. These are about food production.
So it says: describe planets, describe food, it:
food requirement for positive amount of population is more than zero.
And then forAll positivePopulation, lambda x, foodRequirement x shouldSatisfy greater than zero.
So this means that if you have any population on a planet, one or more, then they are going to require some food.
That's basically what we're saying.
So if you have any population present, the food requirement is going to be greater than zero.
Apparently, I don't have robots in my game.
And the next one is: it, food base production for farms is equal or greater than their amount.
And that is forAll someFarms, lambda x, sum of fmap foodBaseProduction x shouldSatisfy greater than or equal to length x, meaning that if you have some amount of farms,
and you calculate the total food base production, that is going to be equal to or greater than the amount of farms.
So, essentially, every farm is going to produce at least one unit of biological resources on average.
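
The two properties can be sketched along these lines, assuming the hspec and QuickCheck packages are installed. Farm, foodRequirement and foodBaseProduction are hypothetical stand-ins for the game code, the population is simplified to a list of Ints, and the generators use stock QuickCheck combinators rather than the ones from the show notes. The properties are written as plain Bool results instead of shouldSatisfy to keep the sketch simple.

```haskell
import Test.Hspec
import Test.QuickCheck

-- Hypothetical game model.
newtype Farm = Farm { farmLevel :: Int } deriving (Show, Eq)

-- Every inhabitant needs two units of food (an assumed rule).
foodRequirement :: [Int] -> Int
foodRequirement pops = 2 * sum pops

-- Every farm produces one unit plus its level (an assumed rule).
foodBaseProduction :: Farm -> Int
foodBaseProduction farm = 1 + farmLevel farm

-- A non-empty list of strictly positive populations.
positivePopulation :: Gen [Int]
positivePopulation = listOf1 (choose (1, 1000))

-- A possibly empty list of farms with levels 0 to 5.
someFarms :: Gen [Farm]
someFarms = listOf (Farm <$> choose (0, 5))

spec :: Spec
spec = describe "planets" $
  describe "food" $ do
    it "food requirement for positive amount of population is more than zero" $
      forAll positivePopulation $ \x -> foodRequirement x > 0
    it "food base production for farms is equal or greater than their amount" $
      forAll someFarms $ \x -> sum (fmap foodBaseProduction x) >= length x

main :: IO ()
main = hspec spec
```

Instead of one fixed input per test, each property is checked against many generated inputs.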
So, how does this work?
I'll focus on the first case: forAll positivePopulation, lambda x, foodRequirement x is greater than zero.
So positivePopulation here is a generator.
I have written it down in the show notes.
positivePopulation has a type of Gen, list of PlanetPopulation, so it can generate a list of planet populations.
And it's written as: positivePopulation equals do, k, left arrow, arbitrary suchThat lambda x, x is greater than zero.
So k is something that is greater than zero, it's a number.
And then vectorOf k singlePopulation. That's the whole generator, two lines.
So first we are picking an arbitrary number that is greater than zero, and then we are going to create a vector, or a list, that has k elements, and each element is a single population.
And that singlePopulation is another generator.
singlePopulation has a type of Gen PlanetPopulation, so positivePopulation generates a list of populations, and singlePopulation generates, as the name implies, just a single population.
And it's written as: singlePopulation equals do, let the planet ID equal toSqlKey 0, let the race ID equal toSqlKey 0.
So we are creating database primary keys that are zeroes for the planet ID and race ID.
This is not touching the database, we are just creating those primary key values in the code.
And then we are saying: population, left arrow, arbitrary suchThat lambda x, x is greater than zero.
So, again, we are picking an arbitrary number that is more than zero.
And then we are returning PlanetPopulation, planet ID, race ID, population.
So we are constructing PlanetPopulation data that has three parameters: the planet and race IDs and the population.
Those planet and race IDs are primary keys, sorry, foreign keys, meaning they refer to something, but we are not really dealing with the database here. They just have to have values, and that's why I'm setting them to zero.
And the population is that arbitrary number that is greater than zero.
And these two functions work together.
So when we are saying forAll positivePopulation, then QuickCheck, which is the testing tool we are using here, understands that it's going to call that positivePopulation
generator and generate some arbitrary positive population that has some arbitrary amount of planet populations, which have an arbitrary amount of population, I mean inhabitants.
And then it's going to call foodRequirement and check that the result is greater than zero for the arbitrary data that we just generated.
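
The two generators described above can be sketched like this, assuming the QuickCheck package is installed. The PlanetPopulation record here is a hypothetical simplification: the IDs are plain Ints set to 0 instead of database keys built with toSqlKey, so the sketch runs without Persistent, and foodRequirement is an assumed rule.

```haskell
import Test.QuickCheck

-- Hypothetical simplification of the game's PlanetPopulation record.
data PlanetPopulation = PlanetPopulation
  { planetId :: Int
  , raceId :: Int
  , population :: Int
  } deriving (Show, Eq)

-- One population with fixed IDs and an arbitrary positive head count.
singlePopulation :: Gen PlanetPopulation
singlePopulation = do
  let aPlanetId = 0
      aRaceId = 0
  aPopulation <- arbitrary `suchThat` (> 0)
  return (PlanetPopulation aPlanetId aRaceId aPopulation)

-- A list of k populations, where k is an arbitrary positive number.
positivePopulation :: Gen [PlanetPopulation]
positivePopulation = do
  k <- arbitrary `suchThat` (> 0)
  vectorOf k singlePopulation

-- Assumed rule: every inhabitant needs two units of food.
foodRequirement :: [PlanetPopulation] -> Int
foodRequirement pops = 2 * sum (map population pops)

main :: IO ()
main = do
  -- Draw one sample to see what the generator produces (output varies per run).
  sample <- generate positivePopulation
  print (length sample)
  -- Check the property against generated inputs.
  quickCheck (forAll positivePopulation $ \pops -> foodRequirement pops > 0)
```

positivePopulation reuses singlePopulation, which is the same layering the episode describes.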
And generators can be really simple. They can be just a single line, if you want to generate some arbitrary number that has some specific characteristics, like it's greater than zero, or it's between 1 and 10, or whatever you want.
Or they can be really complex. If you had a space game and you wanted to test that a galaxy has some specific properties, then you would have a generator that creates a galaxy that has stars, that have planets, that have people and whatnot.
And yeah, more complex generators are often defined in a nested fashion, so you build smaller generators that can generate some smaller amounts of data, and then you combine them to generate bigger, more complex data.
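
As an illustration of nesting generators, here is a sketch assuming the QuickCheck package is installed. The Galaxy, Star and Planet types and every generator name are hypothetical; they only show how small generators combine into bigger ones.

```haskell
import Test.QuickCheck

-- Hypothetical space-game data model.
data Planet = Planet { planetName :: String, inhabitants :: Int } deriving Show
data Star = Star { starName :: String, planets :: [Planet] } deriving Show
newtype Galaxy = Galaxy { stars :: [Star] } deriving Show

-- A simple one-line generator: a number between 1 and 10.
smallNumber :: Gen Int
smallNumber = choose (1, 10)

-- A non-empty lowercase name.
aName :: Gen String
aName = listOf1 (elements ['a' .. 'z'])

-- Bigger generators are built from the smaller ones.
aPlanet :: Gen Planet
aPlanet = Planet <$> aName <*> choose (0, 1000000)

aStar :: Gen Star
aStar = Star <$> aName <*> resize 5 (listOf aPlanet)

aGalaxy :: Gen Galaxy
aGalaxy = Galaxy <$> resize 10 (listOf1 aStar)

main :: IO ()
main = do
  -- Every generated galaxy has at least one star.
  quickCheck (forAll aGalaxy (not . null . stars))
  -- No planet has a negative number of inhabitants.
  quickCheck (forAll aGalaxy $ \g ->
    all (>= 0) [inhabitants p | s <- stars g, p <- planets s])
```

The resize calls keep the generated lists small, so a whole galaxy stays cheap to build for every test case.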
I like these a lot. They are quite a bit harder to write, because you need to find that general property that you are going to test, and it's not always easy.
Sometimes it's easy and sometimes it's hard, and sometimes you feel that this is such an obvious thing that it doesn't make sense to test it at all.
But why I like these is that one test can check multiple cases of the same kind, they might detect edge cases, and they can detect performance problems. I actually once made a tiny change in my code, and suddenly all the tests relating to that part of the code started running really, really slow.
Well, all the QuickCheck tests, because they were generating big data structures, and the change I had made decreased the performance a lot with big data structures. So these ones helped me a lot.
They take longer to run, depending on your computer and depending on the tests. For example, that food requirement test that I have been talking about: with the default settings, QuickCheck is going to run it with 100 different inputs, so it's going to call that positivePopulation generator 100 times, and it's going to check each of those times that the property holds true.
So it's going to generate 100 different sets of data and verify that in all these different cases the property still holds. 100 doesn't sound like much, but when you have 100 tests, and every one of them is run 100 times, that's what, 10,000 test cases, suddenly.
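
The number of generated inputs per property is configurable. The sketch below assumes the hspec and QuickCheck packages are installed and uses modifyMaxSuccess from Test.Hspec.QuickCheck; foodRequirement and the generator are hypothetical stand-ins, and the figure of 1000 is just an example value.

```haskell
import Test.Hspec
import Test.Hspec.QuickCheck (modifyMaxSuccess)
import Test.QuickCheck

-- Hypothetical function under test.
foodRequirement :: [Int] -> Int
foodRequirement = sum . map (* 2)

positivePopulation :: Gen [Int]
positivePopulation = listOf1 (choose (1, 1000))

-- 100 properties at the default of 100 inputs each.
totalCases :: Int
totalCases = 100 * 100

spec :: Spec
spec = describe "food" $ do
  -- Runs with the default of 100 generated inputs.
  it "requirement is positive (default number of cases)" $
    forAll positivePopulation $ \pops -> foodRequirement pops > 0
  -- Runs the same property with 1000 generated inputs instead.
  modifyMaxSuccess (const 1000) $
    it "requirement is positive (1000 cases)" $
      forAll positivePopulation $ \pops -> foodRequirement pops > 0

main :: IO ()
main = do
  print totalCases  -- 10000
  hspec spec
```

Raising the count finds rarer edge cases at the cost of a slower test run, which is the trade-off mentioned above.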
But that's how you might find some edge cases that are lurking there.
And, well, we basically just need something to test, then we need something to generate our data and something to verify the result, so these aren't actually that complex in the end.
They're just a bit more difficult to write than unit tests. So if you are starting to write tests, it might be a good idea to start with unit tests, and when you get a feel for how that works and how you can do things there, then you can start doing property tests and see how they compare to the unit tests.
Or you can start directly with property tests, and then try to write unit tests for the specific cases that you want to cover.
And, of course, these have been just tests that work in isolation. They have a single function that they call with some data, and then they verify the output.
A bigger, more complex, different point of view is looking at tests that work with the database. I have a database in the game, and there's some data, and I would like to verify that things work with that too.
So, for those, again, it's a spec, but this time, at the beginning of the spec, we are going to write withApp, and then describe status handling, describe planet statuses, and then it: expired planet statuses are removed and news created.
withApp is a function that comes with Yesod, and it does quite a bit of things: it initializes a test database, it automatically cleans the test database for you, and it even reads the configuration, so you basically have a running system.
At that point you have a Yesod application. Yesod is the web framework I'm using, and Persistent is the database layer that I'm using.
So it's really cool that you basically have a fully running Yesod application that you can test. This one I'm not going to read completely, but this test is: expired planet statuses are removed and news created.
So first we are using runDB to insert some data. We are inserting a new star system, we are inserting a faction, which is a group of people working together, and then we are inserting a planet. The faction we are inserting because the planet has an owner, and that's a faction.
And then we are inserting a planet status. We are going to say that this planet has a good harvest going on, and we are setting the status expiration to 2020 January.
And then we are setting the status of the simulation. We are inserting that, and it's just saying that the time of the simulation is 2020 January.
Okay, so we have a status that should expire today.
And then we are going to say: news, left arrow, runDB, removeExpiredStatuses, simulationCurrentTime status. So we are calling the removeExpiredStatuses function with the current star date.
And it will produce a list of news that we are going to grab hold of.
Then, after getting the news variable, we are just saying: statuses, left arrow, runDB, selectList, PlanetStatus planet ID equals the planet ID that we created earlier. So we are loading the statuses of that planet. And then we are loading all the news from the database: loadedNews, left arrow, runDB, selectList.
And then we are going to ascend by the news date, so we are ordering by the news date.
So now we have the news that was created when we removed the expired statuses, then we have the planet statuses after the expired one has hopefully been removed, and then we have the news that we loaded from the database after we removed the expired statuses.
So we are just checking that statuses shouldSatisfy lambda x, length x equals 0, so statuses should have a length of 0, news should have a length of 1, and loadedNews should have a length of 1.
But this is a bit of a loose test. We are just checking that when we removed the expired statuses, we got one news article, which probably is about the fact that the good harvest boost has ended, and we are checking that the news that we loaded from the database also has the news about
that good harvest boost ending, and then we check that there are no statuses left on the planet, so that the one was removed.
So we could make this more specific by checking the actual values of those loaded items, but I am not doing that. I am sort of trusting that this is okay.
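
The real test runs against Yesod and Persistent, which is too much setup for a short example. The sketch below is therefore a database-free model of the same check, assuming only the hspec package is installed. Everything in it is hypothetical: the PlanetStatus and News types, the pure removeExpiredStatuses working on a list instead of a database, and the star date 20201 standing in for "2020 January". It shows the loose length checks from the episode followed by the stricter value check that the episode says could be added.

```haskell
import Data.List (partition)
import Test.Hspec

-- Hypothetical in-memory stand-ins for the database records.
data PlanetStatus = PlanetStatus
  { statusPlanetId :: Int
  , statusName :: String
  , statusExpiration :: Maybe Int  -- star date; Nothing means no expiry
  } deriving (Show, Eq)

newtype News = News String deriving (Show, Eq)

-- Pure model: split statuses into expired and remaining,
-- and create one news entry for every expired status.
removeExpiredStatuses :: Int -> [PlanetStatus] -> ([PlanetStatus], [News])
removeExpiredStatuses now statuses = (remaining, map toNews expired)
  where
    (expired, remaining) = partition isExpired statuses
    isExpired s = maybe False (<= now) (statusExpiration s)
    toNews s = News (statusName s ++ " has ended")

spec :: Spec
spec = describe "planet statuses" $
  it "expired planet statuses are removed and news created" $ do
    let harvest = PlanetStatus 1 "GoodHarvest" (Just 20201)
        (statuses, news) = removeExpiredStatuses 20201 [harvest]
    -- Loose checks, as in the episode: only the lengths are verified.
    statuses `shouldSatisfy` (\x -> length x == 0)
    news `shouldSatisfy` (\x -> length x == 1)
    -- Stricter check: verify the actual value as well.
    news `shouldBe` [News "GoodHarvest has ended"]

main :: IO ()
main = hspec spec
```

The arrange, act, assert shape is the same as in the database version; only the inserts and selects are replaced by plain values.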
Okay, so the characteristics of these database tests are that there are loads of things to write. I have to set up the system, I have to create the star system and the faction and the planet status, and insert them into the database, and set the simulation time to what I want it to be, so there is quite a bit to set up.
They are slower, even if an in-memory database is used. And of course if you are reaching over the network to a database and then writing and reading there, it's even slower, so it's a good idea to keep the database that you are using for this kind of test
as close to your machine as possible. Preferably use an in-memory database that doesn't write to the disk, but otherwise works exactly the same as the real database.
They are more error-prone because we are doing input and output here, so there are loads of things that can go wrong. The network can be down, there might be a communication error, there might be some glitch in the database,
the setup might leave the database in a funny state, or there might be some conflict somewhere, and all those things can break this test. And when the test breaks, you have to analyze it.
Is the problem in the code that I am testing, or is the problem somewhere else? Is this a real problem, or is it that somebody just accidentally turned off the database?
That has happened too.
But these can test parts that cannot be tested by unit tests, so these are important things to test too.
And the last thing I am covering, it's already half an hour, this is starting to get a bit long again, so the last thing I am talking about is testing an API.
So in my game there is a REST API, so that you can just make HTTP requests to load information and modify that information, and tell the game what you want to do, and stuff like that.
So I want to test those things too, and again, this spec starts with that withApp function, so we have a full Yesod application that we can test.
And so this one says: describe message handling, it: unauthenticated user can't access messages. And then it has underscore, left arrow, get ApiMessageR. This means we are going to do an HTTP GET to the API messages.
ApiMessageR is our REST resource. And the next line is statusIs 401. So we are going to do an HTTP request, and we are going to verify that we are getting a 401. We are not authenticated, so we shouldn't be getting any data back.
The next one is it: pending messages are loaded. So this one has a bit more. First we are going to call setupPerson, which creates a person and a faction for us. Then we are going to runDB insert: researchCompleted, 25250, faction ID, HighSensitivitySensors.
So we are making a news entry about the faction that we created completing research of high sensitivity sensors in the year 2525.
Then we create a user called Pete: createUser Pete, Just person ID. So this creates a user named Pete, who is the actual user of the game, or player of the game, and also associates them with that person we created. That's the avatar they are playing as.
Then authenticateAs user, so we authenticate as Pete. authenticateAs is a helper function provided by Yesod that works in development mode. You cannot use it in production code. There is a flag so that when the application is compiled in development mode it enables this,
because this is basically a back door.
So in development mode it's available, in production code it's not available. And usually, of course, trust but verify: you would verify that after deployment it's not there.
So we authenticate as a user, then we are calling get ApiMessageR, so we are going to do an HTTP GET to the ApiMessageR resource, and then resp, left arrow, getResponse.
So we are going to take the response and hold on to it, and then we are going to do some parsing.
jsonM equals join, decode, fmap, simpleBody, fmap, resp. So from our response we are going to grab the body, and we are going to decode it as JSON, and because now we have nested Maybes, we are going to use join to squash them into one Maybe.
So now we have a Maybe value, where the value is a JSON value. And then we are going to use some asserts: assertEq message tag, and then this lens again: jsonM, caret question mark, _Just dot _Array dot _head dot key tag dot _String. So
of that JSON that we got as a body, we are expecting it to be an array, we are going to take the first element of it, we are expecting that to be an object, and then we are going to take the value of the key tag, and that should have a value of type string.
And then we are comparing that to Just ResearchCompleted, so in our JSON structure, in this specific spot, there will be a ResearchCompleted string. And then we are going to do the same with the star date, doing the assertEq star date:
jsonM, caret question mark, _Just, _Array, _head, key starDate, _Integer. So, again, we have an array, we are going to grab the first element of the array, which we expect to be a
JSON object, we are going to grab the starDate value of it, and it should be an integer, and it should be Just 25250. That's the time we put in our research completed message. And then we are going to do one more assert: assertEq technology, jsonM, caret question mark, _Just, _Array, _head, key contents, key Technology.
So, again, an array of JSON data, we are taking the first element from there, we are going to grab the contents value from it, which is again an object, and we are going to grab the Technology value of that, which should have a string value associated with it, and it should be HighSensitivitySensors.
So we have three checks into the JSON content, and then the last one is statusIs 200. So when we have an authenticated user and we are loading the pending messages, we are asserting that the JSON content has the three values that we are expecting it to have. It might have some more.
Of course, we are only checking these three things, and then we are checking that our request has resulted in a response with the HTTP status 200, which means OK.
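
The JSON checks can be tried without a running web application. The sketch below assumes the aeson, lens, lens-aeson and bytestring packages are installed. The response body is a hypothetical hard-coded value shaped like the message described above (the exact field names and casing are assumptions), and print is used instead of the test framework's assertEq.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Lens ((^?), _Just, _head)
import Control.Monad (join)
import Data.Aeson (Value, decode)
import Data.Aeson.Lens (key, _Array, _Integer, _String)
import qualified Data.ByteString.Lazy.Char8 as BL

-- Hypothetical response body; Maybe models "there may be no response".
sampleBody :: Maybe BL.ByteString
sampleBody = Just "[{\"tag\":\"ResearchCompleted\",\"starDate\":25250,\"contents\":{\"Technology\":\"HighSensitivitySensors\"}}]"

-- decode returns a Maybe, so mapping it over a Maybe nests two Maybes;
-- join flattens them into one.
jsonM :: Maybe Value
jsonM = join (decode <$> sampleBody)

main :: IO ()
main = do
  -- First element of the array, then the value under "tag" as a string.
  print (jsonM ^? (_Just . _Array . _head . key "tag" . _String))
  -- Same path, but read "starDate" as an integer.
  print (jsonM ^? (_Just . _Array . _head . key "starDate" . _Integer))
  -- Two key steps to reach into the nested "contents" object.
  print (jsonM ^? (_Just . _Array . _head . key "contents" . key "Technology" . _String))
  -- A missing key yields Nothing rather than an error.
  print (jsonM ^? (_Just . _Array . _head . key "missing" . _String))
```

With this sample body the first three reads should give Just "ResearchCompleted", Just 25250 and Just "HighSensitivitySensors", and the last one Nothing. Any step that does not match the actual JSON shape turns the whole result into Nothing, which is why the assertions compare against Just values.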
And, like with the previous database tests, the database is wiped automatically at the end of the test, or at the beginning, I'm not 100% sure when it does that. But the important thing is that when the test starts, it has a clean slate. It has a database that it can modify, and it cannot mess with other tests.
When the test starts, the database is always a clean slate, and you should be very careful not to accidentally run these against production. You should use a separate test database, because that wiping can wipe your production database.
Okay, characteristics of these API tests: there's a lot to write, again, because of the database setup. These are, again, slow and error-prone, because we are doing IO: you are reading and writing the database, you are doing HTTP requests.
And these are really cool, because they actually test completely end to end. They start when the message comes from the outside world into the system that we are testing, and it goes through the software, reaches the database, comes back to the software and goes back to the caller, so it's the whole end to end.
And that's what the customer or user or whoever is calling is interested in. They are interested in that the whole thing works together, that all the parts combined together work well.
And here we are testing JSON, but it doesn't have to be JSON. If the resource returned HTML, we could check that the HTML that was returned has some elements and some values that we expect to be there.
But I wouldn't, for example, want to test user interaction with this. If you wanted to test an application where the user clicks around in the browser, which then makes requests to the server, and the response comes back, I wouldn't use this for that. I would use something like Robot Framework for that, for example.
Okay, so, in closing, there are a lot of things that I didn't cover. This is only a 40 minute episode, so there's not enough time to cover much. This is just scratching the surface a little bit.
We skipped UI testing, performance testing, security testing, long-running testing, accessibility testing and who knows what. The list just goes on and on and on.
We also didn't talk about formal proofs. That's something I have been reading about and trying to teach myself, which is really interesting, because you don't write tests
to verify that your program is working correctly. You're just mathematically proving that your program is working correctly, which is a completely different approach.
So, yeah, that's about it, I think.
So the best way to reach me is either by email or in the fediverse, where I'm tuturto at mastodon dot social. Questions, comments and feedback are welcome. Even cooler is if you decide to write and record your own Hacker Public Radio episode.
Talk to you later.
You've been listening to Hacker Public Radio at HackerPublicRadio.org.
We are a community podcast network that releases shows every weekday, Monday through Friday.
Today's show, like all our shows, was contributed by an HPR listener like yourself.
If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is.
Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club, and is part of the binary revolution at binrev.com.
If you have comments on today's show, please email the host directly, leave a comment on the website, or record a follow-up episode yourself.
Unless otherwise stated, today's show is released under a Creative Commons Attribution-ShareAlike 3.0 license.