hpr-knowledge-base/hpr_transcripts/hpr3107.txt

Episode: 3107
Title: HPR3107: Generating comfortable passwords
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr3107/hpr3107.mp3
Transcribed: 2025-10-24 16:58:18

---

This is Hacker Public Radio Episode 3107 for Tuesday, 30 June 2020. Today's show is entitled,
Generating Comfortable Passwords. It is hosted by CRVS,
and is about 30 minutes long,
and carries an explicit flag. The summary is,
Generating Passwords to Be Comfortably Typeable.
This episode of HPR is brought to you by An Honesthost.com.
Get 15% discount on all shared hosting with the offer code
HPR15, that's HPR15.
Better web hosting that's Honest and Fair at An Honesthost.com.
Music
Hello everyone, this is CRVS for Hacker Public Radio.
And I'm going to talk about today.
I'm going to talk about password generation.
So recently I had to change a password,
and given the fact that I was sort of prompted to change it not long after
having changed it into first place,
it seems like this is a password that should be discardable.
And one reason why I wouldn't want to use a password manager for this
is that it is actually the password I need to use to log into a laptop
that is issued by my employer.
And so it is a password that I need to know and need to be able to type.
And so I wanted a way of generating passwords that are nice to type anyway.
There are many ways of doing this,
and I just wanted to set out and do it by myself really.
And so I started with a template,
which is having a bunch of letters and followed by a bunch of numbers.
So that was version one of the,
that's version one of my password generator,
would have a six-letter password followed by an eight-nate digit number.
And so I did this fairly quickly in Python,
just import random,
set a variable letters that contains the letters from eight to z,
all lowercase,
and then I set a password list.
I set my password as a list of characters in this case,
and then just do a simple loop for I in range six.
I append to password random choice from my array of letters,
and then I append a random choice from the numbers,
the digit string from zero to one to nine.
And that is it, and the end I just do a join.
And this sort of was a very simple generation.
However, it generated passwords
that were not necessarily particularly easy to type.
One thing that you notice quite quickly
is that it generates these passwords.
So if you're generating your, if you're using, for example, Vorak,
one of the things that Vorak is optimized for is that in English,
it will have, it will most of the time alternate hands.
So when you're typing in English,
you type one letter with your left hand,
one letter with your right hand,
and so on and so forth.
For a good chunk of words.
And so I wanted to sort of use that.
And so I went on and did another version of my password,
script, and let me just switch to that.
Yeah, so, and this one is much simpler.
I mean, it's not much simpler, it's exactly the same thing.
Except now I sort of enforce this one left hand, right hand.
And the way I do that is just I type all the characters out in an array
or in a string called left.
And then all the characters up in the string called rights.
And these are the characters that are on the left hand,
and the characters that are on the right hand.
And other than that, the loops are all the same.
There's still six letters followed by eight digits.
And then you just take your, and then within the loops,
you just do this.
If the, if it's the, you iterate six times, right?
And you have your first, second, third, fourth, et cetera,
to the sixth iteration.
And if the, if you're in an odd iteration,
then you take your, I guess you take your left,
your characters from the left hand.
And if you're in an even iteration, you take your characters from the right hand.
And then you do the same for the numbers.
Again, you sort of split them from which side of the keyboard they fall on.
And then finally, you join that, all of that into a string.
So that was the second version.
And then I noticed, let me see if I noticed it already for the third version.
So let me just go and check the third version.
Because this, this went into, I think, into the fourth version or fifth.
I'm not sure.
I'll, we'll, we'll see in a second.
Amazing production value going on here.
Anyway, so on the third version, one thing that is kind of,
you would have noticed thing is that as you're splitting this into left hand,
right hand, and you have this strict, this, not strict,
but like you have this rigid structure of going, going from one hand,
another hand, one hand, another hand.
And then, and then the numbers, you end up being able to sort of,
you reduce the complexity of the password quite a lot.
Because this is a 14 letter, a 14 digit password.
So, if I'm mistaken, so yes, 6 plus 8, that is 14.
But it is actually just three letters from each hand,
and we know where they are, and four digits from each hand,
and we know where they are.
So that is a much, much lower number of, of combinations that can be made.
So, each of these elements is well delimited where they are.
So, you have like combination of so many combinations of right hand,
of left hand characters, so many combinations of right hand and digit characters,
so many combinations of left hand digit characters,
and so many combinations of left hand non-digit characters,
right hand non-digit, you get the picture.
So, I decided to sort of use an approach that will try to maintain this splitting
into left and right hand, and digit and non-digit.
Because once you go to the digits, so this, I don't use an Umpad.
I, and well if I use that, an Umpad that would be a whole different,
a whole kind of worms, but, so I use the digits on top of the keyboard.
And I use a Dvorak keyboard,
and so what I do is when I split my digits,
I instead of splitting my digits between left and right,
I want to split my digits and non-digits between left and right,
I want to split my digits and non-digits between rows of the keyboard
and then left or right.
And then I use those as states in the Markov model,
and then when I'm in a particular row, okay, so yeah, moving backwards a bit.
So, what I want to do is I want to eliminate this regularity,
and sort of have this regularity be there,
but not make it so well defined that it is predictable.
So I want this regularity to sort of be, to appear there randomly,
so that it's mostly regular, but some irregularities may appear.
And once you have these, and one way to get these irregularities is to use a Markov model.
And here the thing with the digits is that once you go to the digit row,
you don't really want to go back, because to get there you sort of need to move your wrist a bit.
And like the other ones you sort of have your fingers resting on the home row,
then you can sort of stretch your fingers a bit out,
and stretch your finger in the curl them up a bit,
and then you're in the top row, top non-digit row,
and the bottom row quite easily.
But if you go all the way up, then there you have to move your wrist a bit.
So you don't want to go there, if you go there you want to stay there,
or you want to not be moving in, out, in, out, in, out all the time.
So yeah, that's a lot of talking about this little technicality.
So yeah, so how do you go about using a Markov model?
Or what is a Markov model?
I think Tututo has spoken about this in a previous episode,
and they have, yeah, they have done this in Haskell.
Right now I'm doing this in Python,
because it was sort of a quick and dirty job.
So what, yeah, so what is a Markov model?
It's basically you can think of it as a finite automaton,
and the best way of explaining a complicated set of words,
a couple of words is to introduce another complicated couple of words.
But imagine you have some states,
and these can just like be markers, can sort of draw a graph,
and a mathematical sense of a graph that you have nodes,
and edges between the pointing from one node to another.
In this case it would be a complete graph,
so every node, from every node you have an edge to another node.
And then this edge is sort of a transition edge,
and then you weigh them, and they're directed.
So you have an edge from A to B, and the edge from B to A,
using random letters, alpha to beta, and beta to,
and one from beta back to alpha,
and then you weigh those edges with the probabilities
that you do that particular transition.
So because this is a probability,
they have to sum up to one, that's like the only thing you really need to do,
to ensure that you have your Markov model.
You need that, and you need your initial assumption.
Your initial assumption is going to be something that tells you
where your system starts.
Okay, but that's just a general representation.
So what are my states here?
My states, as I said before,
there are going to be rows and left rows of the keyboard
and left and right.
And basically you're going to put more,
what I do is I put more weight on the transitions
from the left side to the right side,
then from the left side to, from rows on the left side to rows on the left side.
And once we do that,
now it's no longer necessary,
it's no longer necessary,
the case that from one row on the left side,
I will go to, from one letter on the left side of the keyboard,
I will jump to letter on the right side of the keyboard,
or that if I switch to numbers,
now I will only type numbers,
or I will only type six numbers.
I will, now I no longer have these regularities
that I was codifying in version two.
So okay, I'll just walk through the code,
and now it would be a good time to look at the code in the show notes.
That's version three.
And it starts with this else list,
and it's a L, capital L, S.
And it's just a list where each element is a string,
and as a string,
it's just the letters in the i-th row of the,
so it's their labeled L1, L2, L3, L4,
R1, R2, R3, R4.
And L1, L2, L3,
are the first three rows of each of these sets,
the L set and the R set are the non-digit rows of the keyboard
on that Vorack keyboard, starting from the top,
and then the fourth row on each set,
and so L stands for left R stands for right, of course.
And the fourth row is just the digits,
it's just the, yeah, the digits.
So it's like 1, 2, 3, 4, 5, 6 for the left hand,
and 7, 8, 9, 0 for the right hand.
Yeah, and the reason for this symmetry,
and why it's not 5, 1 to 5 on the left,
and 6 to 0 on the right,
it's because I use an ultimate hacking keyboard,
which has this particular quirk,
when it comes to splitting your numbers,
that your 6 actually,
yeah, your number 6 actually falls on the left side,
so you end up stretching your left index finger
a bit more than usual.
But that's not that I can,
I kind of grew used to it anyway.
After that, there's a big list of lists
that I decided to call capital A,
which just says that it gives the transition probabilities.
So the h row is the transition probability
onto another one of the other columns.
So on row one,
you have 0.0, 3, 0.0, 3, 0, 0, 0, 0, 0, 1,
0, 2, 7, 0, 2, 7, 0, 0, 0, 9.
Okay, and these are the probabilities
of transitioning from something on the state,
so on state on the first one,
L1, 2, L1, L2, L3, L4,
R1, R2, R3, R4, respectively.
And if you look at it,
you'll see that the weights are larger on the Rs,
which means that you have a higher probability
of going from the left hand to the right hand.
And if you go from the fifth row onwards,
you'll see that it's in reverse.
You have a higher probability from going from the right hand
to the left hand.
And this is how I codified this idea
that it should be easier to go from one to the other,
from the left hand to the right hand,
then from, it should be more normal to go
from the left hand to the right hand,
then from the left hand, from one key in the left hand
to another key in the left hand.
And another thing that you can notice
is that on rows 4 and 8,
you will notice that the probabilities
are highest in columns 8 and 4, respectively.
So once you get to a number,
it's much more common to stay within the numbers
than without, than move out of the numbers.
And, yeah, and that's just basically
your idea behind this particular distribution.
This was made by hand.
And version 4 takes care of making
that a bit more configurable.
Next comes pi.
And pi is just the initial distribution.
And this distribution is just
higher on the home row, basically.
L1.
OK, so L1 is the home row, actually.
And I guess, yes, L1 and R1 are the home rows
in the respective keyboard halves.
Yes, yes.
So this is just setting up the probabilities.
And how do you actually,
this is the parameters of the Markov model.
So how do you actually go about using these parameters?
It's kind of easy.
First, you need to sample from these,
know how to sample from a distribution.
And the way you do that is that you sample a point,
a random number from zero,
uniform distributed number from zero to one.
And then you get the position
at which your vector is your vector,
your array of numbers.
OK, backing up a bit.
So you have your number, your array
with positions from one to eight,
because there's eight different row halves.
We call them row.
I would call them row halves in this case.
And so they're really states.
So I have my eight different states.
And if I want to,
so and I have my probabilities of being in each state,
at each point in time.
And I want to know what's the probability of going to another state.
Then I take the corresponding row on the matrix A.
And I found what's the,
and I sort of sample from that distribution
to go to the next state.
Yeah, but again, I'm getting ahead of myself.
So yeah, how do you sample from a distribution?
So the way to sample from one discrete distribution like this
is to produce the,
the CDF,
cumulative density function.
There we go.
Yeah, cumulative density function,
cumulative density.
Yeah, you could just produce the CDF of the distribution.
It is like what's the probability of getting,
of having that your,
if you sample from this distribution,
that you get a number that is smaller than one,
or smaller than two,
or like having, so having,
what's the probability of being the first state?
Being the first, one of the first two states,
one of the first three states,
so on and so forth.
And just then once you have this on,
is you generate a number from zero to one,
and then you just find the number,
the first number in the array
that exceeds this number that you sampled.
And this is what the function sample is doing.
So first you take in your distribution as L,
and then you compute the partial sums,
which is what I just described,
like you take the sum from,
so what this does is like sum L,
I put to colon I plus one,
comma zero, that's like sum all the elements,
sum up all the elements from zero to I plus one,
on the list L,
and start the sum with zero, with zero.
And then I do this for L every I,
from zero to the length of L,
so that I get every possible I,
and then I take a uniform,
a uniform with distributed number,
calling random dot uniform zero to one.
And then for each,
and then I just go over the distribution,
the CDF, the L partial,
and find the first element that is larger than you.
And that's the index of that element,
is the number I return from this function.
Okay, to generate a password,
with this is quite simple,
all right, we're generating passwords.
Yes, to generate a password is quite simple.
We take in empty list,
and we say that,
we first sample S to be from pi,
and then we add that to the password,
well, we sample S from pi,
and then we go into a loop,
where we,
yeah, we're going to,
yeah, we sample S,
so we never actually use the first state in this case.
But anyway, we sample S from pi,
we go into a loop,
once we're in this,
once we're in this loop,
we sample S from the S-th row of A,
and then we append a uniformally chosen number
from that particular element of the L's list.
And then L's stand for letters, by the way.
And then once we've looped through that enough times,
in this case, I put here 20,
we print, we just do a print,
we print the password,
so that's just quotes.join,
and then the password list,
which is just a list of strings,
and that will join into a big string.
And finally, I do away with this
with the past Gen 4,
give me one second there.
Okay, there's a password versions 5 later.
What do I do differently in this one?
Ah, right.
In password version 4,
what I did was all the characters were lowercase so far,
and I wanted to introduce some upper cases,
and so in this case,
I just introduced like what's the probability of something being uppercase,
and that's in line 30,
it's up, that's 0.1,
and so that 10% of the time a letter comes out uppercase.
And the way to make that happen is that we sample yet another number uniformally from 0 to 1,
and if it's smaller than upper,
then our probability of being an uppercase letter,
then we turn whatever character that is to an uppercase.
Not that this also affects digits,
but digits remain unaffected, really.
And that's not a problem.
And finally, version 5 is the final version so far,
and give me a second.
Okay, the version 5.
Version 5 is basically the exact same thing as before,
except that now no longer,
now the matrices are no longer created,
the matrix A and pi are no longer written by hand.
What I did was just define this probe function, ij,
that defines the probability of going from state i to state j,
which is controlled by only two numbers,
which is like the probability of going from left hand to right hand,
or right hand to left hand,
controlled by a switch hand variable,
and the probability of switching character type,
going from non-digit to digit,
or digit to non-digit.
And those are just numbers that are set at the top,
and then all the magic, so to speak,
is done on the probe function,
which just checks to see if i and j represent different hands,
or if state i and j are in different hands,
or if they're in different character types.
And then basically just makes it so that the whole thing
represents valid distributions.
Then we compute a,
and we compute pi,
and now it pi is much simpler,
which is just the one over eight,
and it's just one over eight times an array of ones.
And yeah, everything else is computed exactly the same,
as for version four, i'm pretty sure.
I could do a diff, but i'm not gonna.
Yeah, but those,
that was my,
and now basically this gives you a way of generating passwords,
that they're easy to write for if you're using VORAC.
If you're not using VORAC,
you can just change the else list,
and that will adapt to your keyboard.
And this generates passwords that are relatively nice to type
because they're left-right, left-right, left-right.
Maybe you don't want to do 20 characters,
maybe you want to do less,
maybe you want to
put more weight on the,
maybe you want to get rid of the digits,
I don't know, I mean,
the world is your oyster,
but anyway, this was just a fun little project,
or not so much a fun, but sort of a necessity.
Yeah, well, a good excuse to use Markov chains, as usual.
So yeah, that was it for today.
Thank you for joining.
Remember to tune in tomorrow for another episode of Hacker Public Radio.
You've been listening to Hacker Public Radio at Hacker Public Radio.
We are a community podcast network
that releases shows every weekday Monday through Friday.
Today's show, like all our shows,
was contributed by an HBR listener like yourself.
If you ever thought of recording a podcast,
then click on our contributing to find out how easy it really is.
Hacker Public Radio was founded by the digital dog pound
and the Infonomicon Computer Club,
and is part of the binary revolution at binrev.com.
If you have comments on today's show,
please email the host directly,
leave a comment on the website,
or record a follow-up episode yourself.
Unless otherwise status,
today's show is released on the creative comments,
attribution, share a life,
3.0 license.