Episode: 3020
Title: HPR3020: Validating data in Haskell
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr3020/hpr3020.mp3
Transcribed: 2025-10-24 15:15:07
---
This is Hacker Public Radio, episode 3,020 for Friday, 28 February 2020.
Today's show is entitled, Validating Data in Haskell
and is part of the series Haskell. It is hosted by tuturto
and is about 25 minutes long and carries a clean flag. The summary is:
tuturto talks about where to validate incoming HTTP requests before acting on them.
This episode of HPR is brought to you by AnHonestHost.com.
Get 15% discount on all shared hosting with the offer code HPR15,
that's HPR15.
Better web hosting that's honest and fair at AnHonestHost.com.
Thank you very much.
Hello, you are listening to Hacker Public Radio and this is tuturto talking about how to validate data with Haskell.
So, a little bit of background. I have that space game that I have been working on and I need an admin interface for it.
That admin interface is for examining and modifying the world.
The idea is that, while the computer will run the simulation for the players,
sometimes there's a need for the game master, the person who is the administrator,
to step in a little bit there.
For example, create new planets or new stars or new people,
modify little bit of things so that the game will stay interesting.
So, for that reason, I need the admin interface.
And first up is creating, viewing and modifying people.
Later on, I will add more, but people are now what I have been concentrating on.
As usual, I won't go through the code line by line anymore.
I will concentrate on the most interesting points.
And in the show notes, there will be a little bit more code than in the actual episode.
So, there will be three HTTP endpoints:
one is for retrieving a list of people, one is for viewing or modifying a single person,
and one is for creating a completely new person.
I'll focus on adding a new person in this episode.
So, a little bit about types and parsing first.
There are two important approaches that I have learned recently.
The first one is that you should use types,
and actually you should write your program in a way that illegal state is unrepresentable.
If it's not possible to create data that is illegal or invalid, you don't have to validate it.
If you can only create data that is valid, you just don't need to validate it.
So, for example, instead of checking that the index is always 0 or more in some context,
you can just use a data type called Natural.
That data type can represent only whole numbers starting from 0 and going up.
So, now when you are given such data, you can be sure that it's 0 or greater.
It cannot be negative. You don't have to separately validate it anymore.
The type system does that for you.
You do have to remember to be careful when you are doing subtractions.
If you subtract 5 from 1, you would normally end up with a negative number.
Because Natural cannot represent that negative number, you would get a runtime error.
So, you have to check that you can do the subtraction before you do it.
This is a limitation of Natural.
It's kind of annoying. I don't like runtime exceptions at all.
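A minimal sketch of the subtraction check described above. The helper name safeSub is hypothetical and not from the episode; it only shows one way to test before subtracting.

```haskell
import Numeric.Natural (Natural)

-- Hypothetical helper: subtract only when the result fits in a Natural.
safeSub :: Natural -> Natural -> Maybe Natural
safeSub a b
  | a >= b    = Just (a - b)   -- result is zero or more, safe to compute
  | otherwise = Nothing        -- a - b would be negative, so refuse
```

With this, safeSub 1 5 gives Nothing, whereas evaluating (1 :: Natural) - 5 directly throws an arithmetic underflow exception at runtime.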
Another example is NonEmpty, which is a list that will always contain at least one item.
When you are given data that is NonEmpty, it's always safe to assume that there's a first element there.
It's not safe to assume that there is a second element or any subsequent elements.
If you are given a NonEmpty and you take the tail, that's everything after the first element, you end up with a regular list.
And if you want to use the first element of that,
you have to check that there actually is one.
So, use types to make invalid state unrepresentable.
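A small sketch of the NonEmpty behaviour described above, using Data.List.NonEmpty from base. The function names firstItem and secondItem are made up for illustration.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- Always safe: a NonEmpty has a first element by construction.
firstItem :: NonEmpty a -> a
firstItem = NE.head

-- The tail is an ordinary list, so a second element may be missing
-- and has to be checked for.
secondItem :: NonEmpty a -> Maybe a
secondItem xs = case NE.tail xs of
  (y : _) -> Just y
  []      -> Nothing
```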
The other one, the new one that I learned, is: parse, don't validate.
This is for when you are given some data that you transform into different data,
for example, if you are given a JSON object and then you parse it and construct a Haskell data type out of that.
You shouldn't build your system in a way that you accept everything and then validate it.
You should build it in a way that the parser itself makes sure that it rejects data that is invalid.
You cannot always ensure every invariant this way, so some validation is probably still needed.
So, instead of reading arbitrary JSON data and then checking that the required fields are present, as an example,
you have a data type that matches the required and optional fields, and you just parse the JSON into that.
If it succeeds, you have the data that you want. If it fails, you either have a missing field or a field has the wrong kind of data.
And it's good to clearly express what kind of data the system accepts and operates on.
So, how does this relate to our space game?
We are creating that interface for creating new persons.
So, there's a function, generatePersonM.
This is the actual thing that creates a new person, generates it.
Its signature is RandomGen g => StarDate -> PersonOptions -> Rand g Person.
Meaning that if you give it the current date and options that describe what kind of person you are after,
you are given back a computation in the Rand monad, which can then be run to get the person that is generated.
PersonOptions is currently very bare bones, because the people in the game are very bare bones.
There isn't that much detail in them yet.
I will add more options later.
So, there's only one field. It's optional, so it's a Maybe, and it's for specifying age options.
So, PersonOptions has one field, Maybe AgeOptions.
And AgeOptions is an algebraic data type that has two possibilities.
It's either an AgeBracket that takes two ages or an ExactAge that takes one age.
If you give an AgeBracket, you are essentially saying that you want a person whose age is between these two numbers.
And if you give an ExactAge, you are saying that you want a person that has this exact age.
And finally, Age is a newtype that wraps Natural, which I talked about earlier.
So, it means that age can never be negative.
It can never be fractional or decimal.
You cannot have a 1.5-year-old.
Age is always whole years, so it's one or two years.
I debated this, because for kids, those months in your age are very important.
It's a big deal that you are 5.5 years old, instead of 5 or 5.25.
But for adults, it's not that important.
We count age in whole years or even decades.
The older you get, the bigger the units you start using for counting the age.
The code for this is in the show notes if you want to have a look,
but it's not that important.
The important thing to remember is that PersonOptions has an optional field,
AgeOptions, that has two possibilities,
either ExactAge or AgeBracket,
and Age is always zero or more.
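The data types described above could be sketched roughly like this. The type and constructor names follow what is said in the episode; the field names unAge and personOptionsAge and the derived instances are assumptions.

```haskell
import Numeric.Natural (Natural)

-- Age in whole years; Natural rules out negative and fractional values.
newtype Age = Age { unAge :: Natural }
  deriving (Show, Eq, Ord)

-- Either a range of ages or one exact age.
data AgeOptions
  = AgeBracket Age Age
  | ExactAge Age
  deriving (Show, Eq)

-- Nothing means the caller does not care about age.
-- The field name is an assumption made for this sketch.
data PersonOptions = PersonOptions
  { personOptionsAge :: Maybe AgeOptions
  } deriving (Show, Eq)
```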
I wrote the JSON parsing by hand, so that in case the number is less than zero, we will reject it.
So, it is our parser that actually checks this: if the number is less than zero, we reject it.
We cannot parse it.
And that tells whoever is using that JSON parser that they sent invalid data.
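The idea of a parser that rejects negative ages can be sketched without any JSON library. This is a simplified stand-in, not the hand-written JSON parser from the episode: parseAge is a hypothetical function that takes the raw integer already read from the input.

```haskell
import Numeric.Natural (Natural)

newtype Age = Age { unAge :: Natural }
  deriving (Show, Eq, Ord)

-- Hypothetical parser: turn a raw integer into an Age, or explain
-- why that is not possible. Once it succeeds, no later code needs
-- to check the sign again.
parseAge :: Integer -> Either String Age
parseAge n
  | n < 0     = Left "age must be zero or more"
  | otherwise = Right (Age (fromInteger n))
```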
So, all this means that when you are creating a new person, you have different kinds of options.
You can say that you don't care about age at all; then the computer will pick one for you.
You can say that you specifically want a 25-year-old, for example, a specific age.
Then the computer will use the current date of the simulation to calculate the date of birth of that person.
That's why we need the current date.
Or you can give an age bracket, in which case the computer can calculate the date of birth based on the current date and the bracket.
And age is always an integer that is zero or more.
So, these things are ensured by the type system.
Okay, there's still a possibility of an error.
Nothing ensures that the age bracket makes sense.
We could have a bracket from 10 to 5, and that does not make sense.
It should be from 5 to 10.
So, we still need a little bit of validation here.
I mean, in theory, you could just decide that in that case you flip them around and end up with a bracket from 5 to 10.
But I like to build systems in a way that they don't assume that the caller made some sort of mistake and try to fix it.
I would like systems to work in a specific manner that has been agreed upon.
And if incorrect data is given, then it is rejected and the system says that, hey, there's a problem, fix it.
Because that, in the long run, leads to fewer problems, in my opinion.
So, we still need a little bit of validation here.
And here we are going to use a library called Data.Validation.
Quoting from the documentation:
Data.Validation is "a data type like Either, but with an accumulating Applicative".
That doesn't really say anything
if you haven't used this before. What this means to me is that I can validate multiple aspects of the data and collect the errors in a list.
So, the validation doesn't stop on the first error.
So, it's really handy for getting all the problems at once.
Instead of ending up in the situation where you fix one error, send the data, and then you get a message that, hey, this other thing
is incorrect, and then you fix that and try again.
It's nice that you're given a whole list of errors, that these things are wrong.
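To make the accumulating behaviour concrete, here is a self-contained sketch. The Validation type below is a minimal local stand-in written for this example, not the library's own code, and checkName, checkBracket and checkBoth are hypothetical checks.

```haskell
import Numeric.Natural (Natural)

-- Local stand-in: like Either, but the Applicative instance
-- combines the errors from both sides instead of stopping early.
data Validation e a = Failure e | Success a
  deriving (Show, Eq)

instance Functor (Validation e) where
  fmap _ (Failure e) = Failure e
  fmap f (Success a) = Success (f a)

instance Semigroup e => Applicative (Validation e) where
  pure = Success
  Failure e1 <*> Failure e2 = Failure (e1 <> e2)
  Failure e1 <*> Success _  = Failure e1
  Success _  <*> Failure e2 = Failure e2
  Success f  <*> Success a  = Success (f a)

-- Hypothetical check: a name must not be empty.
checkName :: String -> Validation [String] String
checkName name
  | null name = Failure ["name is empty"]
  | otherwise = Success name

-- Hypothetical check: a bracket must run from low to high.
checkBracket :: (Natural, Natural) -> Validation [String] (Natural, Natural)
checkBracket (start, end)
  | start <= end = Success (start, end)
  | otherwise    = Failure ["bracket start is greater than end"]

-- Both checks always run; their errors end up in one list.
checkBoth :: String -> (Natural, Natural)
          -> Validation [String] (String, (Natural, Natural))
checkBoth name bracket = (,) <$> checkName name <*> checkBracket bracket
```

Calling checkBoth "" (10, 5) reports both problems in a single Failure, where an Either-based version would stop at the empty name.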
So, in our code, we have this validateAddPerson.
This is a function that we use to validate person options.
These are the options that tell the game, the system, what kind of person to create.
validateAddPerson has a type signature of PersonOptions -> Validation [ErrorCode] PersonOptions.
So, when you are given...
When this function is given a PersonOptions, it will give back a list of error codes and the original PersonOptions.
And if validation succeeds, that list of course will be empty.
These can be combined into more complex validations, so you don't have to write all the validations in a single function.
It's actually a very, very good idea to break it into multiple functions.
In our case, we have validateAgeOptions, which takes PersonOptions and returns Validation [ErrorCode] PersonOptions.
It will take the PersonOptions data, validate the age options part of it and return the result.
The code for this is in the show notes, if you want to have a look at it.
But the thing is that in our validateAddPerson, which takes a parameter opt of type PersonOptions, we are just saying pure opt.
This pure takes the PersonOptions and transforms it into Validation [ErrorCode] PersonOptions.
And the list of error codes is empty.
So, it just transforms it into a different form.
And then it calls <* validateAgeOptions.
So, here we are validating the age options part of it.
And that <* will combine the lists of error codes.
So, pure opt turns a PersonOptions into a validation result, and then the <* operator will combine the result of validateAgeOptions with that.
And validateAgeOptions, as I said, will check the age options.
Basically, it says that if there are no age options given, it succeeds.
If there is an exact age, it succeeds.
And if there is an age bracket, it checks that the first number is greater...
sorry, smaller than or equal to the second number; then it succeeds.
If the check fails, it returns a failure, AgeBracketStartIsGreaterThanEnd.
Okay, and like I said, you can combine these things.
If we had more options present in our PersonOptions, we could just add validation for those too.
So, in validateAddPerson, we would just use that <* and add another function,
and another function, and another function, to chain all those checks together.
If we had a really complex validation, we could have some sub-validations.
I think, for example, of names, because names in the game are pretty complex; you have different kinds of formats for the names.
Sometimes you have a first name, and sometimes you have a first and last name, and you can have a cognomen and a regnal number and whatnot.
So, you could have a validateNameOptions that is called in the validation of the person options, and in that validateNameOptions, you could have multiple steps.
Because the signatures of the functions are the same, you can easily build more and more complex checks.
And then, when you are validating, you are just calling one validation function.
So, I hope you could follow that. It was probably a bit unclear, but if you have a look at the show notes, it probably makes more sense.
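The validation walked through above could look roughly like the following. This is a sketch under assumptions, not the code from the show notes: Validation is a minimal local stand-in for the library's type so that the example runs with base only, and the field name personOptionsAge is invented.

```haskell
import Numeric.Natural (Natural)

-- Local stand-in for the library's accumulating Validation type.
data Validation e a = Failure e | Success a
  deriving (Show, Eq)

instance Functor (Validation e) where
  fmap _ (Failure e) = Failure e
  fmap f (Success a) = Success (f a)

instance Semigroup e => Applicative (Validation e) where
  pure = Success
  Failure e1 <*> Failure e2 = Failure (e1 <> e2)
  Failure e1 <*> Success _  = Failure e1
  Success _  <*> Failure e2 = Failure e2
  Success f  <*> Success a  = Success (f a)

newtype Age = Age Natural deriving (Show, Eq, Ord)

data AgeOptions = AgeBracket Age Age | ExactAge Age
  deriving (Show, Eq)

newtype PersonOptions = PersonOptions
  { personOptionsAge :: Maybe AgeOptions }
  deriving (Show, Eq)

data ErrorCode = AgeBracketStartIsGreaterThanEnd
  deriving (Show, Eq)

-- Succeeds unless an age bracket runs from high to low.
validateAgeOptions :: PersonOptions -> Validation [ErrorCode] PersonOptions
validateAgeOptions opt = case personOptionsAge opt of
  Nothing           -> Success opt
  Just (ExactAge _) -> Success opt
  Just (AgeBracket start end)
    | start <= end  -> Success opt
    | otherwise     -> Failure [AgeBracketStartIsGreaterThanEnd]

-- pure opt is a successful result; each <* adds one more check and
-- keeps the left-hand value while collecting errors from the right.
validateAddPerson :: PersonOptions -> Validation [ErrorCode] PersonOptions
validateAddPerson opt =
  pure opt
    <* validateAgeOptions opt
```

Further checks with the same signature, such as a validateNameOptions, would be chained on with additional <* lines.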
So, then we want to put all these things together.
We have a function called postAdminApiAddPersonR that is of type Handler Value.
This is the handler function that handles POST requests to the admin API add person route.
And it returns a Value, meaning it returns JSON data.
I'm not going to read the code out, but I'm just going to walk through what it does.
First, it checks that the current user is an admin.
Then, it gets the JSON content with msg <- requireJsonBody.
And here you are already doing a little... here you are parsing the JSON data.
If somebody tries to pass an ExactAge of minus one here, then the parsing fails.
You get an error, and that error is returned to the caller.
Okay, but if parsing succeeds, then we are loading the current date from the database and validating the person options.
Again, if the validation fails, we are returning a list of error codes and descriptive strings.
I didn't talk about how to turn those error codes into descriptive strings or how to decide what kind of HTTP code to use as a return code.
But those two functions are basically just...
The error message one is just a function that maps from an error code to a string.
And the function to get the HTTP code is just from a list of error codes to an HTTP code.
So, nothing too tricky. You can probably imagine how to do it with a big case statement.
It says that if the error code is this, then you give this string, and if it is that, you give that string.
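A sketch of what such mapping functions might look like. The function names errorMessage and statusFor, the second error code InsufficientRights, the message texts and the choice of status codes are all assumptions made for illustration.

```haskell
data ErrorCode
  = AgeBracketStartIsGreaterThanEnd
  | InsufficientRights  -- hypothetical extra code, only for this example
  deriving (Show, Eq)

-- One case per error code, each giving a descriptive string.
errorMessage :: ErrorCode -> String
errorMessage code = case code of
  AgeBracketStartIsGreaterThanEnd ->
    "Age bracket start is greater than end"
  InsufficientRights ->
    "Insufficient rights"

-- Pick one HTTP status for the whole list of errors.
-- The policy here (403 wins over 400) is an assumption.
statusFor :: [ErrorCode] -> Int
statusFor codes
  | null codes                      = 200
  | InsufficientRights `elem` codes = 403
  | otherwise                       = 400
```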
So, then we have validated data, data that is syntactically correct and validated.
We get a new random number generator, because we are creating random persons based on the options.
We need a random number generator.
We call generatePersonM with our current star date and the parsed JSON data.
And then we use evalRand with the random number generator to run that computation.
Then we end up with the person.
And if you look into the show notes, there is a let in this part.
This let part means that this is a pure computation.
This always returns the same value.
Of course, this is a random one,
so the value depends on the random number generator that you pass in.
But the thing is that this pure computation doesn't touch the database.
It doesn't read system data or anything.
I like building the system in a way that as much of it as possible is these computations that don't touch the database or any external resources.
Those things are easy to write in a way that they don't accidentally fail or throw a random error.
Now that we have this generated person, we just call runDB $ insert person.
So we are just inserting the person into the database.
We get the primary key back, the value for the primary key of that person.
That's the person ID.
And then we just return JSON and give it the primary key and the person.
When some system calls the API, it will be given back the ID of the newly created person and the actual person that was created.
Or it will be given back an HTTP error of some kind.
That's about all; there's nothing else to creating a new person yet.
I'm hoping to add more options and more things here, but that's what it is currently.
And then there's that page where the administrators can view people, view persons of the game, edit them and modify them.
I mean, modify and create new ones.
So in closing, types should represent only valid states.
You should strive to build your data in a way that it's not possible to have incorrect states there.
While parsing, you should reject invalid data, obviously.
And you should...
And then there's a...
You could parse in several steps and validate as you go.
In our example, we could have a parser that goes from JSON to PersonOptions and from PersonOptions to validated PersonOptions.
Then you wouldn't really need a separate validation step anymore.
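Parsing in several steps, as suggested above, might be sketched like this for the age bracket alone. Bracket, parseAge, mkBracket and parseBracket are hypothetical names, and the raw input is simplified to two integers instead of JSON.

```haskell
import Numeric.Natural (Natural)

newtype Age = Age Natural
  deriving (Show, Eq, Ord)

-- A bracket meant to be built only through mkBracket, so that
-- start <= end always holds (a real module would hide the constructor).
data Bracket = Bracket Age Age
  deriving (Show, Eq)

-- Step 1: raw integer to Age, rejecting negative numbers.
parseAge :: Integer -> Either String Age
parseAge n
  | n < 0     = Left "age must be zero or more"
  | otherwise = Right (Age (fromInteger n))

-- Step 2: two ages to a bracket, rejecting reversed ranges.
mkBracket :: Age -> Age -> Either String Bracket
mkBracket start end
  | start <= end = Right (Bracket start end)
  | otherwise    = Left "bracket start is greater than end"

-- Both steps chained; a successful result needs no later validation.
parseBracket :: Integer -> Integer -> Either String Bracket
parseBracket rawStart rawEnd = do
  start <- parseAge rawStart
  end   <- parseAge rawEnd
  mkBracket start end
```

Because this sketch uses Either, it stops at the first problem instead of collecting every error the way the accumulating validation does.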
If you have questions, comments or feedback, they are very welcome.
The easiest way to reach me is by email or on Mastodon,
where I am tuturto at mastodon.social.
Ad astra.
You've been listening to Hacker Public Radio at HackerPublicRadio.org.
We are a community podcast network that releases shows every weekday, Monday through Friday.
Today's show, like all our shows, was contributed by an HPR listener like yourself.
If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is.
Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club,
and it's part of the Binary Revolution at binrev.com.
If you have comments on today's show, please email the host directly,
leave a comment on the website or record a follow-up episode yourself.
Unless otherwise stated, today's show is released under a Creative Commons Attribution-ShareAlike 3.0 license.