Initial commit: HPR Knowledge Base MCP Server
- MCP server with stdio transport for local use
- Search episodes, transcripts, hosts, and series
- 4,511 episodes with metadata and transcripts
- Data loader with in-memory JSON storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
374
hpr_transcripts/hpr3228.txt
Normal file
@@ -0,0 +1,374 @@
Episode: 3228
Title: HPR3228: YAML basics
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr3228/hpr3228.mp3
Transcribed: 2025-10-24 19:11:38

---

This is Hacker Public Radio Episode 3228 for Wednesday 16th of December 2020.
Today's show is entitled, YAML Basics and is part of the series,
Programming 101. It is hosted by Klaatu and is about 34 minutes long and carries a clean flag.
The summary is, Learn about sequence and mapping in YAML.
This episode of HPR is brought to you by AnHonestHost.com.
Get 15% discount on all shared hosting with the offer code HPR15.
That's HPR15.
Better web hosting that's Honest and Fair at AnHonestHost.com.
Everyone, this is Klaatu, you're listening to Hacker Public Radio.
|
||||
This episode is about YAML.
|
||||
YAML ain't a markup language.
|
||||
Well, if it's not a markup language, what is it?
|
||||
It is a data serialization format.
|
||||
That's how it describes itself.
|
||||
YAML is a, is for data serialization.
|
||||
And what that means is that it is a text format with a specific structure.
|
||||
We all know that you can have text formats with no structure,
|
||||
just plain text, type some ASCII symbols into a file, save it.
|
||||
Now you've got text without structure.
|
||||
And that works for some things, for beat poetry and for random thoughts.
|
||||
That works fine for programmatic things that you want to parse and understand and process.
|
||||
That doesn't, that usually does not work so well.
|
||||
And it is a lot easier to ingest data into a computer or an application running on a computer.
|
||||
If you have some kind of predictability, some kind of preset structure.
|
||||
You can, of course, invent your own structure.
|
||||
It's not rocket science.
|
||||
You could list, for instance, a series of animals in a text file
|
||||
by placing the animal type or family or genus.
|
||||
I don't really know scientific terms in one column.
|
||||
And then, delimited by a, I don't know, a space.
|
||||
And then, list the other thing that animals are called, like their proper name,
|
||||
their other name, their common name.
|
||||
So, for instance, you might have penguin space, emperor, penguin space, gentoo,
penguin space, rockhopper, and so on.
|
||||
And then you might switch over to a different species.
|
||||
For instance, cat space, house, cat space, lion, cat space.
|
||||
I'm quickly realizing that I know nothing about animals or how they're categorized.
|
||||
But anyway, the point is you've got this column of an animal type,
|
||||
and then some delimiting character, and then the more common name for that animal.
|
||||
That seems pretty simple, and you could do that,
|
||||
and parse that pretty quickly, probably with cut or awk or whatever you want to use.
|
||||
And that would work.
|
||||
That would be fine.
|
||||
But the problem is that when you're inventing your own schema like that,
|
||||
you do frequently find that you've not accounted for something.
|
||||
And so, in the moment, you're thinking, okay, well, I'll just make this up as I go along,
|
||||
and it seems to be working for the first five, and then suddenly you hit the thing where it says,
|
||||
penguin space, little space blue.
|
||||
And now suddenly, your parser, which believes that it delimits each field by a space,
|
||||
thinks that this penguin is simply called little, because it sees the space
|
||||
between little and blue, little blue is a name of a New Zealand penguin.
|
||||
And it throws the rest of it out, because it doesn't know,
|
||||
it wasn't told about a third potential field.
|
||||
So, it takes the first two and moves on to the next line, and so on.
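The failure described above can be sketched in a few lines of Python. This is only an illustration: the sample lines and the `parse_line` helper are made up for this example, and it assumes a naive parser that splits on spaces and keeps exactly two fields.

```python
# Hypothetical home-made format: "<animal type> <common name>", one per line.
lines = [
    "penguin emperor",
    "penguin gentoo",
    "penguin little blue",
]


def parse_line(line):
    """Naive parser: split on spaces and keep only the first two fields."""
    fields = line.split(" ")
    return fields[0], fields[1]


records = [parse_line(line) for line in lines]
for animal, name in records:
    print(animal, "->", name)
# The last record comes out as ("penguin", "little"); "blue" is silently dropped.
```

Nothing crashes here, which is the real problem: the data is quietly wrong.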
|
||||
YAML or any structured data format that is widely known helps you prevent those kinds of mistakes.
|
||||
You can, because you learn that schema, you learn that method of serializing data,
|
||||
and then you use it, and then you are able to leverage libraries and
|
||||
applications, and other things that other people have developed to help you
|
||||
parse the data that you've entered, or that you want to ingest one way or another.
|
||||
That's the main, I think, advantage to, for instance, YAML.
|
||||
There are other advantages, but I mean, that's kind of just the fact that it exists,
|
||||
and that there are other people using it is one of the main advantages.
|
||||
People also like YAML, because it appears to be relatively intuitive,
|
||||
and that's kind of what I wanted to talk about today.
|
||||
It's actually deceptively not as intuitive as you think, or said a different way.
|
||||
It's very intuitive, but the thing that you figured out through your intuition might be wrong.
|
||||
I'll start then with something that I rarely do start with, which is the wrong way to interpret,
|
||||
YAML, and I don't generally like to talk about the wrong way of doing something,
|
||||
because that generally just confuses people.
|
||||
But in this case, I kind of feel like it's important to get this out in the open,
|
||||
because when you look at YAML, you think that it looks more familiar than it actually is.
|
||||
Here's what I mean. Let's say that you've got a YAML file, and you want to list those penguins.
|
||||
You might think, okay, well, I get the gist, I get the idea here.
|
||||
I do a dash and a space, and I put an item, and then when I want to, and remember, this is wrong,
when I want to show a child-parent relationship, then I indent and continue with my dashes.
|
||||
It's just like a bullet list, right? No, it's not. I'm telling you the wrong thing.
|
||||
I'm telling you lies right now. Don't listen to me. Don't internalize any of this.
|
||||
So you might think that YAML is sort of a structured bullet list, and as long as you've done a bullet
|
||||
list in your notebook, like in a scrap piece of paper, when you're going out to do groceries or
|
||||
something, you might think that it's essentially the same thing. It is not, though.
|
||||
If you were to do that, if you were to, for instance, make a YAML file that opened with dash,
|
||||
space penguins, and then the next line, space, space, so you're indenting now, dash, space,
|
||||
emperor, next line, dash, or space, space, dash, space, gentoo, next line, space, space,
|
||||
dash, and so on. So you're indenting things as you would for, you know, your own little,
|
||||
if you were going out to the store, you would, you know, bullet point hardware store,
|
||||
and then under that, you would put a little dash, maybe hammer, dash, nails, dash, screwdriver,
|
||||
dash screws, and so on. And you would know, when you look at that, you think, okay, well,
|
||||
I see that the heading here is hardware store, and under that are all the things that I need to get
|
||||
from the hardware store, from within the hardware store. And then the next heading, which I'll
|
||||
de-dent, will be the grocery store. And so then I'll put under that, all of the things that I need
|
||||
within the grocery store, and so on. That's a very intuitive kind of natural way that most of us
|
||||
have learned how to list things. Heading, sub-item, sub-item, sub-item, heading, sub-item, sub-item,
|
||||
sub-item. That is not the structure of a YAML file though. So let's talk about YAML correctly.
|
||||
Let's talk about what it actually is now that we know what it is not. There are, luckily,
|
||||
you'll be happy to know two data structures in YAML. That's all there are. Now you can embed
|
||||
those data structures within one another, but the building blocks that you have for YAML files,
|
||||
there are only two of them. For this process, for this exercise, or when working with YAML,
|
||||
you want to have two things available to you. One, you want YAML Lint. YAML Lint is, I think it's
|
||||
a Python script, and that looks at a YAML file and tells you whether or not it is valid. But more
|
||||
important than that, because you might be thinking, well, I can just throw my YAML file at my
|
||||
application, and if my application crashes, then I'll know that it's not valid, right? Or my parser,
|
||||
or whatever. YAML Lint gives you really good description of what is wrong. It tells you,
|
||||
generally, it tells you exactly what's wrong with your YAML syntax. You can install it with
pip install yamllint, and that's a double L there in the middle, y-a-m-l-l-i-n-t, yamllint. So install that,
|
||||
you'll do yourself that favor, you'll thank yourself later. You'll also need a text editor.
|
||||
That's how you make YAML, you type it into a text editor, so that's pretty simple as well. Now
|
||||
there's another thing that I tend to use, which you're welcome to use yourself. It is called
|
||||
YAML2JSON. This is a Python script that I wrote myself for my use. It's online. I'll link to it
|
||||
in the show notes. You can use that one, or probably I imagine there must be half a dozen other
|
||||
ones out there online that you could find. But I do like to use this because YAML, again,
|
||||
intuitively, you look at it, and it may look to you like it's one thing. But seeing it with the
|
||||
delimiters or rather scope characters to define the scope of things sometimes changes the data structure
|
||||
or makes it a little bit more apparent. Now that might just be me. It might just be what my
|
||||
eyes prefer to see, so that may or may not be important to you. But I do find it useful to have
|
||||
a YAML2JSON parser. JSON is a subset of YAML, technically speaking. I mean, I don't know that
|
||||
the creators of JSON talk to the creators of YAML. I don't know what the overlap there is. But
|
||||
when looking at the structure of data serialization, I guess people generally consider JSON a subset
|
||||
of YAML. But it is kind of wildly different when you are looking at it. Visually, it's quite
|
||||
different. But the two translate, or well, at least YAML translates to JSON more or less naturally.
|
||||
And certainly with the Python YAML library, it's literally just one method that you call or
|
||||
function that you call. And then you have all your data in JSON. So you can kind of compare the two
|
||||
and make sure that your logic and the way that you're sort of thinking of your data is reflected
|
||||
in the way that you have structured it in YAML. Okay, let's talk YAML. So first of all, YAML,
|
||||
as I said, two data structures. There are sequences and there are mappings. And I guess I should
|
||||
really say those as singular. So there is a sequence and there is a mapping. So let's talk about
|
||||
each one. A sequence is exactly what it sounds like. It is a sequential list. It is a list of items
|
||||
in a sequence. These items are indicated to YAML by a dash followed by a space. So it's a
|
||||
dash space and then some string or some value, I should say, doesn't have to be a string. That's a
|
||||
sequence. So in a YAML file, if you wanted to create a list, a sequence of things you could do,
|
||||
for instance, let's do emacs list.yaml. Well, now that we're typing, I have to mention that
the first line of a YAML file needs to be three dashes. That is the, what they
|
||||
call a YAML document delimiter. When there are three dashes, when YAML sees three dashes all on one
|
||||
line, it knows that this has started a new record, essentially. Okay, so we've got our three dashes
|
||||
on its own line and then I'll hit return. And then we'll just do, like I say, a sequence. So we'll do
|
||||
dash space, emperor, and then a new line, dash space, gentoo, new line, dash space, little blue.
|
||||
And that's where we'll end it, sort of. Now YAML really, really likes to see a new line character
|
||||
at the end of a record. So I'm going to not do that right now, just to kind of prove to you how
|
||||
useful YAML Lint is. And then I'm going to run my new file through YAML Lint, that's yamllint, space, list.yaml.
|
||||
And it gives me an error. It says no new line character at the end of the file, new line at
|
||||
end of file. So there, it's telling me exactly what the problem is. And it's giving me the
|
||||
opportunity to fix that. So, well, it's not giving me the opportunity. It's refusing to continue
|
||||
until I fix it. So I'm going to open the file back up. I'm just going to hit return at the end
|
||||
at the, on the very last line. And then I'll run it through YAML Lint again. And it says that it's
|
||||
valid. So now I know that I've got good YAML. And that's the sequence. That's a list of items
|
||||
in a YAML document. That's valid YAML. Looking at it, honestly, it is pretty obvious as to what
|
||||
that is. We understand that that's a list. But if we wanted to see it in a different structure,
|
||||
just to kind of really, really drive home this point, we can do a yaml2json list.yaml.
|
||||
Remember, this is that little YAML to JSON converter that I wrote. But like I say, there's probably
|
||||
a dozen others out there online. And I'll also link to this one. So on this, the output of this
|
||||
script shows my YAML in JSON format. And it kind of confirms what I said. This is a list,
|
||||
right? Well, as it happens in JSON, this looks exactly like, well, to my eyes, it looks exactly like
|
||||
a Python list, square bracket, quote, emperor, close quote, comma, quote, gentoo, close quote,
comma, quote, little space blue, close quote, close square bracket. There you go. It's a list.
|
||||
Each item is a distinct element in this simple array. So if I were to take that data and pass it
|
||||
to something like Python or Java or Lua and say, hey, treat this as an array and give me the, I
|
||||
don't know, first element, then I would, I am pretty confident that I would get gentoo back. If I said
the zeroth element, I'm pretty confident I'd get emperor back and so on. And that's what we want.
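As a quick check of the indexing described above, here is a small Python sketch. It assumes the three-item sequence has already been loaded into a plain Python list (which is what a YAML loader would be expected to hand back); the variable name is just for illustration.

```python
import json

# Assumed result of loading the three-item YAML sequence into Python.
penguins = ["emperor", "gentoo", "little blue"]

print(json.dumps(penguins))  # ["emperor", "gentoo", "little blue"]
print(penguins[0])           # emperor (the zeroth element)
print(penguins[1])           # gentoo (the first element)
```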
|
||||
So now, just to prove to you that a list is very specific in how it can be structured in YAML,
|
||||
I'm going to, I'm going to second guess ourselves. And I'm going to go back into this list.yaml.
|
||||
I've got my three dashes. That's good. We'll keep that. We know that's necessary.
|
||||
And then now this is wrong. So be, be alert that this is incorrect. I'm going to do a dash
|
||||
space penguins. So in my incorrect thinking, I'm pretending like this is my header penguins. It's
|
||||
not. This isn't going to work. And then I'll go to the next line and indent emperor, gentoo,
and little blue. So now I've got penguins on its own line. And then indented, I've got
dash space emperor, dash space gentoo, dash space little blue. I'm going to save that. We're going to
run that through YAML Lint. It says it's valid YAML. So I guess we're good to go, right? I could
end the demonstration here. Well, no, I can't. So I'm going to do yaml2json list.yaml again.
|
||||
And now the data structure looks really different. I've got the square brackets and I've got the
|
||||
quotes. And inside the quotes, I've got penguins, space dash space emperor, space dash space
gentoo, space dash space little space blue. So now my array, my list, contains exactly one
item, which is penguins, emperor, gentoo, little blue. That's the item that it contains, right?
|
||||
So if I said, if I passed this to a parser of some sort or to a language of some sort and said,
|
||||
hey, give me the first element of this array, it would tell me that there was an index error. There
|
||||
is no first element. If I asked it for the zeroth element of this array, it would return the entire
string, penguins, dash, emperor, dash, gentoo, dash, little blue. So these are not distinct elements
any longer. As far as YAML, and now translated to JSON, as far as those two markup,
not markup languages, data serializers, understand, this is all one element and that is not what we want.
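To make that concrete, here is a small Python sketch of what the wrongly indented file turns into. The list below is written by hand based on the converter output described above, so treat the exact spacing inside the string as an assumption.

```python
# Assumed parse of the wrongly indented file: one item, a single string.
bad = ["penguins - emperor - gentoo - little blue"]

print(len(bad))  # 1
print(bad[0])    # the whole string comes back as the zeroth element

try:
    bad[1]
except IndexError as err:
    # There is no first element, only a zeroth one.
    print("no element at index 1:", err)
```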
|
||||
So a sequence is exactly, it is exactly a single, I guess, column of items, delimited by a dash
|
||||
space. You cannot just indent things willy-nilly, as they say, in order to sort of suggest, as
|
||||
your brain wants to do, is to suggest a parent-child relationship between those items. That's not what YAML,
|
||||
that's not actually talking to YAML the way that you think it's talking to YAML. So that is one
|
||||
kind of data element, a sequence. And that is, that's as simple as it gets. It's just a sequence of
|
||||
items with a dash space in front of each item. Each item is on its own line and there is no
|
||||
indentation happening here. So that seems a little bit too simple. Luckily, there's another kind
|
||||
of data and it's called a mapping. So I'm going to emacs map.yaml. Let's do that. And we open with
|
||||
three dashes. If you said that, then you have learned the first important stage of YAML. So that's
|
||||
good. Three dashes on its own line. And then we'll do something like, I don't know, penguin,
|
||||
and I'm just typing, just literally, just, I'm starting with p-e-n-g-u-i-n, and then colon,
|
||||
emperor. And that's sort of, that's an element. Penguin, colon, emperor. That is a mapping
|
||||
element, according to YAML. It has a key, which is penguin, and a value, which is emperor. Now,
|
||||
what if I went to the second line? Well, actually, let's, let's stop there. So I'll go to the
|
||||
second line just to get that new line character. Remember, I said that YAML really likes those new
|
||||
line characters at the end. We'll run that through yamllint map.yaml. And I get no errors. We'll run
it through my fancy little YAML to JSON conversion program, yaml2json map.yaml. And I get a
JSON data structure back that is a brace with quote penguin, close quote colon, and then quote
emperor, close quote, close brace. So to me, that looks more or less like a Python dictionary,
for instance. You've got your key and your value. It's a key and value pair. It's a pretty common
|
||||
structure in, well, certainly configuration files. But lots, lots of different things. You use key
|
||||
and value pairs, databases, spreadsheets, and so on. That's pretty common. And that's what that
gives you. Now, let's explore a subtlety of YAML, which is arguably not necessarily
related to this, but it's something that I want to bring up. So you've got three dashes,
and penguin, colon, emperor, and then I'm going to do, on the next line, penguin, colon,
gentoo, and then a blank line or a carriage return or whatever it's called, a new line character
at the end of that string. Now, I'm going to pipe it through YAML Lint and it tells me that there's
an error, duplication of key "penguin" in mapping. So what that's saying is that there are two keys
|
||||
in the same record. Now, if I pipe this through, interestingly, through my little YAML
|
||||
to JSON parser or converter, it actually doesn't error out. It just gives me the most recent key
|
||||
value pair. So I pipe that through and it says penguin, gentoo. And I mean, that's not wrong. It's
|
||||
just, it, my emperor penguin got eaten. So I'm going to go back into my map here. And remember,
|
||||
I said those three dashes are really important to YAML and that they sort of delineate these YAML
|
||||
documents as in YAML lingo. They're documents. I mean, they're in the same text file. So to you
|
||||
and me, it probably feels like, well, that is one document. But YAML, you can separate it for YAML
|
||||
with these three dashes so that you can have sort of two documents in one file. So I've got a
|
||||
three dashes, penguin, colon, emperor, three dashes, penguin, colon, gentoo, and then my new line
|
||||
character. Oh, and I think I actually have to close that with three dashes. Pretty darn sure
|
||||
that I have to do that. Pipe that through YAML lint. I get no errors. And then I pipe it through my
|
||||
little converter and I get penguin emperor, penguin gentoo, and then null. So that's just kind of
|
||||
a point of order that you, that any, that any YAML document or if you want to think of them as
|
||||
records, you, you cannot double up. You cannot validly double up on your keys. There, there must be
|
||||
unique keys within each record or each document. Okay. So point is we've got this new mechanism
|
||||
available to us now, which is a word followed by a colon and then a space and then another word or
|
||||
another value. And, and this maps a value to a specific key. It's like assigning a variable
|
||||
in a programming language very, very much like that. You're just saying, well, here's, here's a term
|
||||
that I'll use broadly to represent whatever I need to represent at any given time. So we say penguin,
|
||||
sometimes we're meaning emperor, sometimes we're meaning gentoo, sometimes we're meaning little
|
||||
blue, whatever. And we can specify what kind of penguin by mapping that value, the specific value
|
||||
to this sort of generic term that that we've chosen doesn't have to be penguin. But in this case,
|
||||
it makes sense that it is penguin because that is what we're talking about. And certainly descriptive
|
||||
keys tend to, it's very advantageous. So we'll, we'll keep it at penguin, colon, space, emperor.
|
||||
And that's a mapping. That's all you need to know about mappings to be honest. That's, that's,
|
||||
again, that's as complex as it gets. Now you know the two data types, the two
building blocks of YAML. You know a sequence, which is a bunch of different lines with a dash and
a space in front of each value, and you know a mapping, which is a key, colon, and then a value.
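Seen from Python, the two building blocks line up with two familiar types. This is a hand-written sketch of what a YAML loader would be expected to return, not output captured from the converter used in the show.

```python
# A YAML sequence (lines starting with "- ") corresponds to a list.
sequence = ["emperor", "gentoo", "little blue"]

# A YAML mapping ("key: value") corresponds to a dictionary.
mapping = {"penguin": "emperor"}

print(type(sequence).__name__)  # list
print(type(mapping).__name__)   # dict
print(mapping["penguin"])       # emperor
```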
|
||||
You're done practically. But of course, you're not really done because in fact, you can use,
|
||||
these are building blocks. These things are, are building blocks, meaning that you can, you can
|
||||
combine them in new and interesting ways. So let's say that you do want to list those penguins,
|
||||
again. So you might have, for instance, in map.yaml, three dashes, and then we'll open it up with
|
||||
just the word penguin, colon. But then instead of giving just one value, emperor, what if we fed
|
||||
it, what if we entered a sequence instead? Let's try that. So we're going to do penguin, colon,
|
||||
and then new line. So no, no value to this key. And then we're going to indent. I'm just going
|
||||
to do two, two indents. You, you, you can do more or fewer actually indentations. But I think the,
|
||||
the convention seems to be two spaces, although I don't know, maybe some Python people prefer like
|
||||
four spaces or something. I'm not sure. But I'm going to do a space space. And then I'm going to
|
||||
do my dash space to open up a sequence. And I'm going to type in the word emperor. And then I'm
|
||||
going to go to the next line, which emacs automatically indents for me. So I'll just do
dash space, gentoo, next line, dash space, little blue. So now if you can picture it, I've got
|
||||
penguin, colon, and then an indented block that is a sequence. I'm, I'm terminating that with a
|
||||
new line character. Of course, I'm going to run that through yaml lint. I get no errors. And then
|
||||
I'm going to pass it through my JSON file just so that we can kind of see it in a different context.
|
||||
And this is exactly what I'd hoped for. So now I have a JSON entity here to, again, to my eyes.
|
||||
I would call this a dictionary from Python experience. So penguin, well, quote, penguin,
|
||||
close, quote, colon. And then within this, within this, this bracketed section, or this brace,
the curly brace section, I have an embedded square bracket list, emperor, comma, gentoo,
comma, little blue. So I have a dictionary element, penguin, colon, some value. But the value is
a list of three different values, emperor, gentoo, and little blue. And those are all distinct.
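Here is that structure written out as a Python sketch, with the YAML it is assumed to come from shown in the comment. The dictionary is typed by hand to mirror the JSON described above.

```python
import json

# Assumed result of loading:
#   penguin:
#     - emperor
#     - gentoo
#     - little blue
data = {"penguin": ["emperor", "gentoo", "little blue"]}

print(json.dumps(data))
# Pick the key first, then zero in on one item of its list.
print(data["penguin"][2])  # little blue
```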
|
||||
So I could identify with, again, some programming language of my choosing. I could identify the key
|
||||
of penguin, and then zero in on which kind of penguin I wanted from, from this list of all possible
|
||||
penguin types. And just to build on this example really quick, we could go back into this file and
|
||||
add another key. Like, so we have penguins. And then the sequence of penguins,
emperor, gentoo, little blue. We could also do, for instance, I don't know, demon, colon,
space, space, dash, space. We could do Beastie. That's the name of the BSD guy, right? Beastie.
And then new line, space, space, dash, space, imp, and then space, space, dash, space,
glabrezu. And then end that with a new line, then we can do a YAML Lint on that,
just to make sure that we're still valid. Yep, no problems there. And then run that through my
little converter. And sure enough, we've got curly braces, quote, penguin, close quote, colon,
and then a list of all the penguins, emperor, gentoo, little blue. And then a comma, and then a new
key, quote, demon, close quote, colon. And then a list assigned to that value, or to that key,
rather, with Beastie, imp, and glabrezu. So you can have different keys, each with its own
|
||||
little sequence embedded into each key, all within the same YAML document. Now, if we did that
|
||||
with, for instance, penguins again, then we would have to make a new document, right? That would
|
||||
be a different data set. But because the keys are unique, penguin, demon, wildebeest, or,
I don't know what kind of animal a wildebeest is, or a gnu, bovine, cow. Are they cows,
|
||||
bulls? I'm not sure. Whatever. Shouldn't have chosen animals for my example set here.
|
||||
The point, though, is that a unique key, as long as you get unique keys, you can fill up your YAML
|
||||
document with whatever you need. It's just, you can't have the same key appearing more than once.
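The "last one wins" behaviour mentioned earlier for the converter is easy to reproduce with Python's own `json` module. This sketch uses JSON text rather than YAML, so it is an analogy and not a test of any YAML parser: when a key appears twice, `json.loads` keeps only the last value.

```python
import json

# JSON text with the same key twice, analogous to two "penguin:" lines
# in a single YAML document.
text = '{"penguin": "emperor", "penguin": "gentoo"}'

data = json.loads(text)
print(data)  # {'penguin': 'gentoo'}; the emperor got eaten

# With unique keys, nothing is lost.
ok = json.loads('{"penguin": "emperor", "demon": "imp"}')
print(sorted(ok))  # ['demon', 'penguin']
```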
|
||||
Okay, so now let's talk about embedding, I don't know, maps into a sequence. Let's try that,
|
||||
should we? Yeah, why not? Let's try it, emacs map.yaml. This time, we're going to, let's make our key
just like animal. And then we'll do a colon, colon space, colon new line. And then on the next,
wait, what am I doing? I'm embedding maps into a sequence. Okay, got it. So I've got animal,
colon, space, space, dash, space, penguin, colon, emperor, penguin, colon, gentoo, penguin,
|
||||
colon, little blue. Close it with a new line character of course, and then run it through. So
|
||||
before I hit return here on this YAML, do you think that's going to work? Or do you think
|
||||
that's going to fail? So I had animal as my key. And then as the value, I had a sequence. And in
|
||||
each sequence item I had, the key is penguin, followed by the type of penguin. There's no wrong
|
||||
answer. Well, there is a wrong answer, but you shouldn't fear. There's no penalty for having
|
||||
the wrong answer. Just kind of think about it for a moment. It is a bit tricky. Well, it turns out,
|
||||
if you run YAML Lint, it does not fail. It succeeds. That's some valid stuff there. And again,
|
||||
I feel like, I don't know, from my brain, YAML doesn't necessarily make that super obvious as to
|
||||
why. I mean, it's kind of obvious now, but when first sort of grappling with YAML, it was very
|
||||
confusing to me. And it sort of has everything to do with scope. So when you're separating the
|
||||
different keys with those three dashes, you're scoping out where that key appears. And you're
|
||||
saying, well, this key is valid here. And then I'm going to put three dashes. And now I can have
|
||||
that key again with a different value, and nobody cares, because we all understand those three
|
||||
dashes mean that we're in a different document. Well, it's kind of similar in this setup where we have
|
||||
animal as the key, but then we embed a list into that value. So the value of this key is its own
|
||||
list. Well, that's one thing, but to make it even more complex within that list, we've embedded
|
||||
mappings, which have their own little scope, because they're each in their own little items.
|
||||
So you've got penguin emperor enclosed in curly braces, you've got penguin
gentoo enclosed by curly braces. You've got penguin little blue enclosed by curly braces.
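Since this is hard to follow by ear, here is the nesting just described as a Python sketch. The structure is typed by hand from the description, and the YAML in the comment is the layout it is assumed to come from.

```python
import json

# Assumed result of loading:
#   animal:
#     - penguin: emperor
#     - penguin: gentoo
#     - penguin: little blue
data = {
    "animal": [
        {"penguin": "emperor"},
        {"penguin": "gentoo"},
        {"penguin": "little blue"},
    ]
}

print(json.dumps(data))

# Each list item is its own small mapping, so the repeated "penguin" keys
# never collide with one another.
names = [item["penguin"] for item in data["animal"]]
print(names)  # ['emperor', 'gentoo', 'little blue']
```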
|
||||
So the exact JSON of that, and you can try this on your own if you need to see it to kind of wrap
|
||||
your mind around it, but it's curly brace, quote, animal, close quote, colon space square bracket
curly brace, quote, penguin, close quote, colon, quote, emperor, close quote,
|
||||
close curly brace, comma, and so on. Until you get to the very end, and then you close your
|
||||
final curly brace, which would be one after little blue, you close your square bracket, which kind
|
||||
of closes the value out for the animal, and then finally, close the whole thing, the whole document,
|
||||
which is the curly brace. So I don't know how well that comes across through audio, but you can
|
||||
try it on your own, and you can kind of see, you'll see why, oh yeah, okay, I get why those
|
||||
keys can be distinct from one another in this setup while they couldn't be distinct in some other
|
||||
setup. And this underscores, even if you don't sort of internalize what I just said,
|
||||
this does underscore the flexibility of those simple two little building blocks that you can embed
|
||||
in one another, and it completely changes sort of the structure or the scope of the data, the way
|
||||
that the data relates to each other. In one format, you're only allowed to have one kind of penguin,
|
||||
and then you embed it into a mapping and into a sequence, and suddenly you can have all the
|
||||
penguins you want. It can be confusing to look at when you're looking at, for instance, an
|
||||
Ansible Playbook, and you're looking at this thing, and you see name, name, name, name, you know,
|
||||
you see all these repeating keys, and you just think, why are these here? Well, they're
there because they've been scoped into a different data element, and so they're not
|
||||
interfering with one another. And that's a very important thing to realize when looking, especially
|
||||
I find at Ansible Playbooks, because you do see a lot of what appears to be repetition in an
|
||||
Ansible Playbook. And I think mentioning Ansible Playbooks is important in a way, because it also
|
||||
kind of exposes the fact that there really isn't a right or wrong way to structure
|
||||
your valid YAML data. So there is a wrong way to structure YAML, and YAML Lint will tell you if
|
||||
you've done that. But as long as it is valid, there isn't really necessarily, I mean, a data scientist
or rather information scientists might very much argue with me on that concept, and I would
|
||||
gladly concede, but in general, like for your own purposes, I mean, there might be optimal ways,
|
||||
right? But as long as it's valid YAML, you can structure your data practically any way that you
|
||||
want as long as you have anticipated that in your parser, or as long as you account for it in your
|
||||
parser. So again, going back to Ansible, which is what made me think of this, there are
certain structures in an Ansible play that you look at, and you just think, well, wait a minute,
now that is neither a mapping nor a sequence. So why is that valid? And it kind of goes back to
that wrong example that I did where we were mapping list items to another list item, a sequence
to a sequence, which is not possible, right? You can map a key to a sequence,
or a sequence to a key, I guess, but you can't just indent things whenever you want in hopes of
there being some kind of suggestion of inheritance. And the reason
that Ansible tends to sometimes allow that sort of thing is because the parser knows that that's
what it's going to get. So in other words, if I go back to the bad YAML example here, which I think
I called bad.yaml, if I go back to that and run it back through my converter, remember we got
this big huge chunk of a list that is one item: penguins, dash, emperor, dash, gentoo, dash,
little blue. Well, as long as my parser knows that I am going to feed it an array that contains one
item, but that that one item contains four strings separated by space, dash, space, then there's
no problem. So I'm saying this because I want you to know that there are no YAML police. They're
not going to come knocking on your door, making sure that your YAML makes the most
sense and is the most optimal and logical order of data that you could possibly have written.
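
To make the bad.yaml point concrete, here is a minimal Python sketch. It assumes the file looked roughly like the reconstruction in the comment, and that the converter handed back a one-item list as described above. The function name `split_penguins` is made up for illustration, and the loaded value is written out by hand rather than produced by a real YAML library.

```python
# Assumed reconstruction of bad.yaml (not quoted from the episode):
#
#   - Penguins
#     - Emperor
#     - Gentoo
#     - Little blue
#
# As described in the episode, a YAML converter reads this as a list with
# one item: a single string with " - " between the names.
loaded = ["Penguins - Emperor - Gentoo - Little blue"]


def split_penguins(data):
    """Hypothetical parser step: expect a one-item list and split that
    item on space-dash-space into its separate strings."""
    if len(data) != 1 or not isinstance(data[0], str):
        raise ValueError("expected a list containing exactly one string")
    return data[0].split(" - ")


parts = split_penguins(loaded)
print(parts)
```

The data is still awkward, but because the consuming code anticipates that exact shape, it can recover the four strings. Feeding it anything else raises an error instead of guessing.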
If you're writing YAML for your own data from scratch, and you're having to think of how you're
structuring it yourself, then go for it. Do whatever makes sense for you and then have your parser
process that accordingly. If you have to adapt it later or change it, you modify it later
because you realized, well, that really shouldn't just be a list. There should be a parent item
there. And of course, you know, because you know the two different types of YAML building block
data types, you'd know that, well, okay, if I want a heading item there, I need to make a mapping.
And in that mapping, there will be a list. So I need to indent that list under this mapping
that has a key and no value, or rather a key whose value is the sequence that is then
indented. That sort of thing. As long as you're making valid YAML choices, then your YAML is valid
and you will be able to parse it with lots and lots of different pre-written YAML libraries
without fear. As long as your parser knows how to then interpret what you're giving it, you are
good as gold. So that's it. That's YAML. I hope that really does help. YAML, like I say, can be
deceptively simple when you look at it. Then you start thinking about it and figuring out how to
recreate it and you realize you have no idea what the structure is. It looks like you do, but you
don't. No worries. Now you really do. Sequence, mapping, and several combinations of those. That's
all you need to know. Good luck. Have fun. Thanks for listening.
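
As a recap of those two building blocks, the sketch below writes out by hand the Python structures that a typical YAML library would be expected to return for three small, made-up documents: a bare sequence, the same sequence moved under a parent key, and a sequence of mappings that reuse the key name. The YAML in the comments and every variable name in the code are illustrative assumptions, not taken from the episode's files.

```python
# 1. A bare sequence:
#      - Emperor
#      - Gentoo
flat = ["Emperor", "Gentoo"]

# 2. The same list given a parent item: a mapping whose key has the
#    indented sequence as its value.
#      penguins:
#        - Emperor
#        - Gentoo
with_parent = {"penguins": flat}

# 3. A sequence of mappings. Each "name" key lives in its own mapping,
#    so the repeated keys do not interfere with one another.
#      - name: Emperor
#      - name: Gentoo
scoped = [{"name": "Emperor"}, {"name": "Gentoo"}]

# The code that consumes the data has to know which shape to expect.
names_from_parent = with_parent["penguins"]
names_from_scoped = [entry["name"] for entry in scoped]
print(names_from_parent == names_from_scoped)
```

All three shapes carry the same penguin names. What changes is how the consuming code has to reach them, which is why the structure and the parser need to be designed together.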
You've been listening to Hacker Public Radio at HackerPublicRadio.org. We are a community podcast
network that releases shows every weekday, Monday through Friday. Today's show, like all our shows,
was contributed by an HPR listener like yourself. If you ever thought of recording a podcast,
then click on our contribute link to find out how easy it really is. Hacker Public Radio was founded
by the Digital Dog Pound and the Infonomicon Computer Club, and is part of the binary revolution
at binrev.com. If you have comments on today's show, please email the host directly, leave a comment
on the website or record a follow-up episode yourself. Unless otherwise stated, today's show is
released under the Creative Commons Attribution-ShareAlike 3.0 license.