- MCP server with stdio transport for local use - Search episodes, transcripts, hosts, and series - 4,511 episodes with metadata and transcripts - Data loader with in-memory JSON storage 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
206 lines
18 KiB
Plaintext
206 lines
18 KiB
Plaintext
Episode: 2918
|
|
Title: HPR2918: Selecting random item from weighted list
|
|
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2918/hpr2918.mp3
|
|
Transcribed: 2025-10-24 13:15:15
|
|
|
|
---
|
|
|
|
This is HBR episode 2918 entitled Selecting Random Item from Weighted List and in part
|
|
of the series, Haskell, it is hosted by Tuku Toroto and in about 27 minutes long and
|
|
Karima Clean Flag.
|
|
The summary is how to Selected Random Item from Weighted List using Haskell.
|
|
This episode of HBR is brought to you by An Honest Host.com.
|
|
With 15% discount on all shared hosting with the offer code HBR15, that's HBR15.
|
|
Better web hosting that's Honest and Fair at An Honest Host.com.
|
|
Hello, you're listening to the Hacker Public Radio and this is Tuku Toroto talking about
|
|
picking random items from Weighted List using Haskell.
|
|
As a title suggests, I'm going to talk about how to select random items for Weighted List.
|
|
There is too much of code this time, but it certainly took quite a while to get this working correctly.
|
|
The Weighted List, we have a list of items that have some weight and then you'll see how
|
|
the algorithms will pick one item from the list and those items with higher weight should
|
|
be more likely to be picked.
|
|
An analogy I came up with is that we have a stack of building blocks of different sizes
|
|
and that height of the building block is the likelihood of it getting selected.
|
|
Then you stack them on top of each other or that doesn't matter and then you random it
|
|
show a stick so that its length is minimum of one and a maximum of the heightness of the stack
|
|
and then you put the stick next to the stack and where it reaches is the one that you're going to pick.
|
|
That's what we are going to do and on the code side we have a list of items and those items
|
|
are defined as a Frequence A and the Frequence A is a type that has a wallware constructor of
|
|
Frequence in A so it can be A can be anything and int is that date and since A can be anything
|
|
we can have we can use this algorithm or system of whatever for with any kind of data.
|
|
Of course they have to have to be same for each of the items that you put in the same list
|
|
and the total sum of those weights is the tollness of the stack in the analogy.
|
|
We need to select a random number between one and total. I'm going to look more closely into that in a moment
|
|
and then we have a little help of function called pick that actually does the comparing and picking
|
|
the element but before we look into implementation of this we have to make a quick tickle into the
|
|
random number generators. I talked about them not too long ago but the thing is that they
|
|
in Haskell functions are pure so same input same output always so if you call a function or apply
|
|
a function twice with exactly same parameters you are going to get the exactly same result which
|
|
kind of means the randomness on a first glance is extremely hard but this is sold in a way that
|
|
you are passing in a random number generator and that random number generator is
|
|
but purely deterministic meaning that if you call the same random number generator
|
|
twice it will give you the same answer but it will also give you the new random number generator
|
|
so when you call a random number generator and say that give me a number between one and ten you
|
|
are going to get a number and you are going to get a new random number generator and then you call
|
|
that one again and say that give me number to you one and then you are going to get a different answer
|
|
and yet again and yet a new random number generator but passing around that new random number generator
|
|
and you have to remember which one is the newest one you have to always pass that around the one
|
|
that has been has is the result of the latest computation it gets tedious to pass it around
|
|
and you have to remember to return the correct one because for example if if our
|
|
our picking item from the list wouldn't return the random number generator
|
|
we could call that list and call that a function that gets the item from the list twice
|
|
using the same random number generator and we could get two two same results and that
|
|
amivia current it gets two same results and that's kind of not what I would call random
|
|
so you have to always remember to return that new generator and that gets tedious
|
|
but luckily this solution to this and call monads I'm going to do a tutorial about them
|
|
because while I know how to use them in some context I don't I can't explain them very
|
|
enough and there's so many tutorials that internet already about them that if you're interested
|
|
on that theory behind them you can go and find a suitable one to work on that is
|
|
but I can show what you do what to do with them especially in this case so we have a
|
|
monarch called monadrandom that's a type class and there's a it has several functions
|
|
we are only interested in one in this case the basic idea is the same with everything but there's
|
|
a function called getrandom r that has a type of random a set arrow double of a a arrow ma so
|
|
a is anything that has a random instance basically anything that you can generate random number
|
|
random values of it doesn't have to be a number you could have a function that the
|
|
for example could generate your random booleans you just say that I want random boolean and it
|
|
basically seems to slip so coin which one you're going to get through of all but anyway random a
|
|
set arrow double a a arrow ma is the type signature so a is something that has a random
|
|
instance couple of couple a a is a couple with two values lower bound and upper bound you're
|
|
asking value between these two pounds inclusively and result is ma and ma here is the monadrandom
|
|
r actually it's a run and a is our value so if it gives a it gives you a value between lower
|
|
and higher bound and that value is returned in the context that carries the random number
|
|
of generator so now that you have the not that the m holds the random number of generator actually the monadrandom
|
|
context you can use this one as a basis of a next operator operation when you are when you are
|
|
operating in a within the monadrandom in your computation you can call get random a multiple times
|
|
and you don't have to worry about passing in the random number generator or taking the new value
|
|
with you just kea but the result of the random call and the monad x ke of the
|
|
trading that new random number generator along the computation so we just use just the get random
|
|
and that's it but in the end if you have a big computation doing some randoms but we are left with the
|
|
ma and we have to somehow turn this ma into a and for that there's a function from run run that has
|
|
a touch signature of random gen g sat arou rank kea arou g arou ag
|
|
double ag sorry okay might be a good idea to look into show note at this point what that what
|
|
does this mean is that the g is a random gen meaning that g is something that can generate random
|
|
values for you that has a specific specific functions specific interface random gen is a
|
|
tag class okay g is that so we have when we are calling run run we have to give it a run kea
|
|
this is what we get from the as a result of the computation that uses monadranda
|
|
a is our a is the result that we want a g is that random generator coming from somewhere and
|
|
result is double of ag so we get a couple of the first element is our n's and second element is
|
|
that new random generator so essentially we when we are using that that random r we are constructing
|
|
a computation that results to run kea that is that is that is what we want and then we run this
|
|
computation with run grant and we get double of ag as a result first element is being that what
|
|
we are after and second element is the new random number generator we can also do a eval
|
|
that is similar that has a similar signature it has a random gen g set our grant g a our g our a
|
|
exactly same as previously except that the final that the result will be just a instead of double
|
|
ag so we are getting a value that we are after as a result but we are discarding the new state of the
|
|
random number generator sometimes this is what we want sometimes we need that new random number
|
|
generator so it depends on the case which one to use so if you if you got the well the basic idea is
|
|
that we are using that monocrandom to construct a computation that depends on a some
|
|
something that can generate random values for us and then we use that something to give us
|
|
the final result then leave the trick here is that since that our computation is not necessary
|
|
but it can be usually it is it can be pure meaning that same data goes in same data goes out
|
|
so we can have a really big computation that uses randomness and if we call it with the exact same
|
|
values we get it's the same result that every single one and the only only thing that changes here
|
|
is that runs g that that I'm sorry front random ten g that is the what generates as random values
|
|
so if you use the same random number generator twice to call our computation that ought to run our
|
|
computation we get the same result back okay let's look into actual into the actual implementation
|
|
hopefully this make this will make my explanation a little bit clearer so first we have a
|
|
frequency which is for expressing weight of individual item in a list it's parameterized so you can
|
|
use it with any data you can have a frequency int you can have frequency pool you can have frequency
|
|
your own data data byte so this one works with everything and it's divided as data frequency a
|
|
equals frequency int a and then deriving show it equal so you can have a one-value of a in that
|
|
then next is a little function that is used to determine to determine which item to choose from the
|
|
list based on the wage based on based on the list and the random number generator random number
|
|
that has been already picked so and in case the value is outside of the
|
|
valid range so it's a less than one or it's a
|
|
created and the length of the or the created and the total sum sum of the wage some of the total
|
|
wage some some of the wage then it also doesn't require anything so it returns mapping otherwise
|
|
it returns just a so it always returns something but it can be just a in case where the values are
|
|
reasonable or it can be nothing than some parameter values okay and our definition for that is
|
|
pick has a has a signature of list of frequency a arrow int arrow may be a so given a list of
|
|
frequencies and the number it returns a may be a pick a empty list underscore equals nothing so
|
|
if you calling this with an empty list you are going to get nothing it does nothing you can
|
|
pick from the empty list the more interesting case is pick open patterns frequency x item
|
|
colon xs close burn i so we are deconstructing the parameter and i is the index to pick from or
|
|
then random number to use to picking on and frequency x item colon xs means that frequency x item
|
|
is the first item of the list and xs is rest of the list then there's a five i less or equals
|
|
an x equals just item five otherwise equals pick xs open by an i minus x close by so what happens
|
|
yeah there's a case of there's a this is called card i should explain them more closely but
|
|
hope you you can follow them maybe to a proper episode later i'm doing this is incompletely
|
|
from order apparently so there's two cases one is that the i our random number is less or equal
|
|
to wait of the first item in the list then we are returning i sorry so i is yeah i is less or
|
|
equal to the weight of the first item of the list then we are going to return that first item
|
|
otherwise we are going to pick item and we are so we are recursively calling we are going to
|
|
use pick xs so rest of the list we are dropping the first item of the list and we are
|
|
subtracting the weight of the our first item from the total weight so basically you could
|
|
visualize this as a taking the bottom cube from the stack and shorten in the stick with the height
|
|
of that height of that cube or building block the end result is that you have a
|
|
loose stack where the which has one item less and which has a foot where the stick still reaches
|
|
exactly same point and then this repeats until you find the find the item where the top of the stick
|
|
reaches or if something goes wrong and the stick is longer than the stack you end up with the empty
|
|
list and then the first case pick empty empty list underscore equals nothing gets in and you get
|
|
nothing as a result so this is how this is how you pick the item and then we need the
|
|
finally we need the calculating the total of weights for frequencies and choosing the random
|
|
number and we are using that random g maybe a as a as our pipe here so we might or might not
|
|
get our result for example like I said empty list would complete into the nothing and since we are
|
|
dealing with random numbers and don't want to pass around that random number generator we are
|
|
going to use that monarch that I talk about moment ago monarch random so the signature for our
|
|
function choose this the function that you use to actually choose the random item from the list
|
|
uh that signature is a random chain g set arrow list of frequency a, arrow, and g maybe a
|
|
and two cases again choose empty list equals further on nothing
|
|
this is the in case of the empty list doesn't matter what you what random number you have for
|
|
anything just get nothing as a result I said result and notice that because we are using that
|
|
monarch random and we are not just saying choose empty list equals nothing we have to say choose
|
|
empty list equals return nothing return is a function that rushes out nothing into that rank g
|
|
so it's not a return as in pretty much any every every other languages it's a trap our result in
|
|
is out of the trunk g then the more interesting case choose two items so we have a list equals two
|
|
uh let's total equals some dollar f map uh lambda frequency x underscore equals uh arrow x items
|
|
so what we are doing here we are using a we are creating a random numbers function that
|
|
given a frequency of frequency a will return you the weight of it that in and then we are
|
|
applying that to every item in the list so we get a list of of the free one of the wage and then
|
|
we are using some to add all those together so now total is the total sum of those wage next time
|
|
and left arrow arrow get random a open bar and one comma total close by so here we are using
|
|
that get random ah that I mentioned earlier we are picking a random number between one and the total
|
|
and because we are using a monarch we have to use the left arrow notice so that first line we
|
|
have the same up things that's a that's a regular function called that was let total equals something
|
|
here and left arrow get random ah is a you is using this specific properties or rules of what
|
|
is a often often monatrandom meaning that this left arrow notation causes causes the
|
|
causes the monatrandom to trade in that random number generator so here we are using the random number
|
|
generator even when we are not talking about it and we are also taking it the new random number generator
|
|
and passing it along when we are not mentioning it this is really convenient like we don't have to
|
|
worry about that at all and the last line is return dollar tick items in so we are calling
|
|
how a tick function with all the items and with the random number that we choose and returning
|
|
the value rather into random so that's that's all all that they is so they isn't that much of the code
|
|
but there's a relatively lot of things happening and now we can randomly now we can have a
|
|
weighted list of items and we can randomly choose items from there and for example if we have a
|
|
list of two items one with a weight of one one with a weight of two and we pick 100 times from
|
|
that list we should end up having the one item as 33 and pizza percent of times and the second item
|
|
66 percent of time and a bit because roundings so it probably sounds a lot more complicated than
|
|
the actual list I arrived to the threshold after quite many details like I had a I had a first
|
|
I had a version that just took a random number generator never returned it then I realized that I
|
|
can't just that this a part of a ticket computation without losing the random number generator or actually the state of the new random number generator
|
|
then I had a version that returned the couple of of the value that I wanted the new random number generator
|
|
and that get tedious to pass around so after some reading I found that there's this
|
|
monotrandom that I can use that the front GA to return any value and have to have the systems to take care
|
|
of the passing around that random number generator so it was it just meant that I tried a lot of things
|
|
until I found one that looked reasonable for what it's working and then did quite a cleaning and
|
|
it probably took a couple of months even though I wasn't actively working on it I will
|
|
for the one version and then work on some other stuff and learned new things and then came back and fixed things
|
|
but anyway so it's not it's not case that you just sit down and think that okay I need this kind of function
|
|
and then you write it from starting from the beginning and going to the end you and well at least I
|
|
have to do a lot of trial and error and experimenting and refining refining but now that we know
|
|
how to pick random numbers sorry random items from the way that we can do some nifty things
|
|
but I'm going to talk about those next times I'm going to do some some practical to some values
|
|
of practical things with the choose function so into to fund so what we got here is a choose that
|
|
has a signal signal of random gen g cut out of frequent list of frequency a are all run g may be a
|
|
so we can given a list we can pick one item from there and in the meantime questions comments
|
|
and feedback welcome best way to reach me now this is the email or in the fediverse where I am to
|
|
put about master on social or even cooler you could record your own HBA episode catch you later
|
|
you've been listening to hecka public radio at hecka public radio dot org we are a community
|
|
podcast network that releases shows every weekday Monday through Friday today's show like all our
|
|
shows was contributed by an hbr listener like yourself if you ever thought of recording a podcast
|
|
and click on our contributing to find out how easy it really is hecka public radio was founded by
|
|
the digital dog pound and the infonomican computer club and it's part of the binary revolution
|
|
if you have comments on today's show please email the host directly leave a comment on the website
|
|
or record a follow up episode yourself unless otherwise status today's show is released on
|
|
the creative comments attribution share a light free dot org license
|