Initial commit: HPR Knowledge Base MCP Server
- MCP server with stdio transport for local use
- Search episodes, transcripts, hosts, and series
- 4,511 episodes with metadata and transcripts
- Data loader with in-memory JSON storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
245
hpr_transcripts/hpr1545.txt
Normal file
@@ -0,0 +1,245 @@
Episode: 1545
Title: HPR1545: 32 - LibreOffice Calc - Introduction to Charts and Graphs
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr1545/hpr1545.mp3
Transcribed: 2025-10-18 04:51:17

---

Music
Hello, this is Ahuka, welcoming you to Hacker Public Radio in another exciting episode
in our ongoing series on LibreOffice Calc.
What I want to do today is I want to start introducing the whole idea of charts and graphs.
One of the nice features of spreadsheets is that they come with a pretty decent capability
for presenting your data in a good way with charts and graphs.
It is said a picture is worth a thousand words, and a well-done graph can communicate
a lot of information in a very concise form.
I really appreciate good graphing, and I abhor misleading graphs, which are very easy
to create.
For anyone who wants to become an expert in this area, you cannot do better than to study
the works of Edward Tufte, particularly his book, The Visual Display of Quantitative Information,
and links for all of the things that I mention are going to be in the show notes.
For pure pleasure, if you have a few minutes, there's a video on YouTube from a channel
called Numberphile that discusses what they call perhaps the greatest infographic ever created.
You probably would enjoy taking a look at that, and there is a TED Talk, The Best Stats
You've Ever Seen, which is simply mind-blowing if you have any feel at all for visually
displaying information.
And finally, I want to mention the David McCandless TED Talk, The Beauty of Data Visualization.
So if you take a little time to take a look at these, you'll start to see what good presentation
is on this stuff.
Now returning to the Tufte book for a moment, notice the key word in the title, Quantitative.
Now we discussed the distinction between quantitative and qualitative data in an earlier lesson,
but it never hurts to be clear about this, because the choices you make about the right
chart or graph depend crucially on knowing what is appropriate for the data you have.
Quantitative data is data that is measured in terms of numbers, and the measurements make
sense as numbers.
If I ask you how many apples you have, and you say I have three apples, I could record
that data, and it would be quantitative.
But if I asked you what apartment number you live in, and you said three, I could record
that data, but it is in no sense quantitative.
Three is just a label in this instance, and the data is actually qualitative, which means
it can be used to distinguish your apartment from the one next door that is apartment 2,
but you would never claim that apartment 3 is 50% more apartment-y than apartment 2.
The number in this case has no meaning as a number, and the software won't prevent
you from making a mistake here.
You can create a graph using this data, but it may be completely useless if you make the
wrong choice.
So what is qualitative data, and how do you make charts out of this?
Qualitative data measures a particular quality of each object, hence the name.
And these are very common in social science research.
People can be divided by sex, male versus female, by religion, by race, by nationality,
by province, etc.
And I like to think of these qualitative variables as being buckets into which the data
is sorted.
If I am sorting by sex, I have buckets for male and female, and each person gets placed
in the appropriate bucket.
And when I have finished sorting them into buckets, I can do one meaningful mathematical
measurement, and that is to count the number in each bucket.
With these counts, I do have numbers that can be placed into a chart.
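
The bucket idea can be sketched in a few lines of Python. This is only an illustration: the sample list and the variable names are assumptions invented for the example, not data from the episode.

```python
from collections import Counter

# Hypothetical observations: each value is a label (a quality), not a measurement.
observations = ["male", "female", "male", "male", "female", "male"]

# Sorting into buckets and counting is the one meaningful
# mathematical operation on qualitative data.
bucket_counts = Counter(observations)

# These counts are the numbers a column, bar, or pie chart would display.
for bucket, count in sorted(bucket_counts.items()):
    print(f"{bucket}: {count}")
```

The counts, rather than the labels themselves, are what end up on the chart.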
So I have several options here.
First is a column.
A column chart lets you display a column whose height is proportional to the number in each
bucket.
If your data had 20 men and 10 women, the column for men should be twice as high as the column
for women.
If this is not the case, you may very well be lying with data, whether deliberately
or inadvertently.
This came up recently when a major television network in the United States had to apologize
and correct a column graph that made the difference between 6 million and 7 million
look like the difference between 1 and 7.
Not good.
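
One common way a column chart ends up out of proportion is a vertical axis that does not start at zero. The sketch below is an assumption about how such a distortion can arise, not a reconstruction of the network's actual graph; the function name and the truncated starting value are hypothetical.

```python
def drawn_height_ratio(larger, smaller, axis_start=0.0):
    """Ratio of the drawn column heights when the vertical axis begins at axis_start."""
    if axis_start >= smaller:
        raise ValueError("axis_start must be below the smaller value")
    return (larger - axis_start) / (smaller - axis_start)


# With the axis starting at zero, 20 men versus 10 women is drawn 2 to 1.
honest_ratio = drawn_height_ratio(20, 10)

# 7 million versus 6 million is really about 1.17 to 1 ...
true_ratio = drawn_height_ratio(7_000_000, 6_000_000)

# ... but starting the axis at a hypothetical 5.5 million draws it 3 to 1.
truncated_ratio = drawn_height_ratio(7_000_000, 6_000_000, axis_start=5_500_000)

print(honest_ratio, round(true_ratio, 2), truncated_ratio)
```

Comparing the drawn ratio with the true ratio is a quick check that the columns are still proportional to the data.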
Bar. What's a bar graph, or a bar chart?
This is just a column chart turned on its side.
Instead of columns going up, you have bars going from left to right.
There really is no other difference.
But there are reasons why you choose one over the other.
If your chart has both positive and negative numbers, that will be much clearer using columns.
And if you have a lot of bars and they have long names, a horizontal bar graph is probably
clearer.
None of these involve any actual change in what they display, just in making things
easier to read, but that is a good consideration.
And there is the pie chart.
This is the chart to use when you want to discuss the relative percentage of each bucket.
The entire pie is 100% and each bucket gets a slice of the pie proportional to its percentage
within the total.
The use case here is for qualitative data where the number of categories is fairly small.
I find if there are more than about a half a dozen slices of the pie, it becomes progressively
harder to read and understand the chart.
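
The slice sizes follow directly from the bucket counts. A minimal sketch, reusing the 20 men and 10 women from the column example; the function name is an assumption for illustration.

```python
def pie_shares(bucket_counts):
    """Return each bucket's percentage of the whole pie."""
    total = sum(bucket_counts.values())
    if total <= 0:
        raise ValueError("a pie chart needs a positive total")
    return {label: 100.0 * count / total for label, count in bucket_counts.items()}


shares = pie_shares({"men": 20, "women": 10})

for label, percent in shares.items():
    # Each slice's angle is the same fraction of the full 360 degrees.
    print(f"{label}: {percent:.1f}% of the pie, {360 * percent / 100:.0f} degrees")
```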
And that last point raises a general point about all qualitative charts.
You really don't want a huge number of buckets or categories in these charts.
Even with a bar graph that can, in theory, accommodate more categories, you can have a chart
that is hard to make sense of.
A good way to resolve this is to broaden your categories.
For example, suppose you are doing an analysis of the proportion of evangelical Protestants
in different parts of the United States.
You might start with data breaking this down by each state, but there are 50 states in
the United States so you would have 50 buckets.
None of these charts would work well in this case.
But if you group the states into regions such as East Coast, Midwest, South, West Coast,
something like that, you can get it down to a manageable number and produce a chart that
makes sense and is easy to understand.
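
Broadening categories amounts to mapping each narrow bucket to a wider one and adding up the counts. In the sketch below, the state-to-region mapping covers only a handful of states, and the counts are made-up placeholder numbers, not real survey data.

```python
from collections import defaultdict

# Hypothetical, partial mapping from state to region.
REGION_OF = {
    "Maine": "East Coast", "New York": "East Coast",
    "Ohio": "Midwest", "Michigan": "Midwest",
    "Georgia": "South", "Texas": "South",
    "Oregon": "West Coast", "California": "West Coast",
}

# Placeholder counts per state, invented for the example.
state_counts = {
    "Maine": 3, "New York": 7, "Ohio": 5, "Michigan": 4,
    "Georgia": 9, "Texas": 11, "Oregon": 2, "California": 6,
}


def group_by_region(counts, region_of):
    """Add up the per-state counts within each region."""
    totals = defaultdict(int)
    for state, count in counts.items():
        totals[region_of[state]] += count
    return dict(totals)


region_counts = group_by_region(state_counts, REGION_OF)
print(region_counts)
```

If the quantity of interest is a proportion, it would need to be recomputed from regional counts and regional populations rather than averaged across states.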
Now onto the quantitative analysis.
This is where we get to more complicated mathematical analysis and the nature of the charts and
graphs available changes as a result.
The most interesting cases are ones where you have several quantitative variables interacting.
For example, a typical economics question might be how the unemployment rate has varied
over time.
The unemployment rate is one quantitative variable and time itself is another.
But you can come up with examples in many other fields as well.
In chemistry, you might measure the rate of reaction as the concentration varies, for
instance.
In these types of analysis, each variable needs to be graphed on an axis that has numbers
arranged in order.
Given the limitations of the human brain, that generally means no more than three axes
if you're trying to do a graph, although there have been some clever ways to get around
this limitation, and the resources I have given at the beginning of this tutorial will give
you some wonderful examples of multiple variable graphing.
The question you need to consider is what kind of relationship do you think exists in
this data?
In scientific analysis, there are in general two kinds of variables in any analysis.
They can be called independent versus dependent.
And sometimes the independent variable is instead referred to as the explanatory variable.
Whatever you call them, the basic idea is that one variable is explaining the other.
To take an example from medicine, you might want to examine the idea that there is a relationship
between age of death and body weight, and collect data from a group of individuals to examine
this idea and see what relationship exists.
I hope you will agree
that it makes sense to think of body weight as being something that helps to determine
the age of death, but it makes no sense to think of the age of death determining body weight.
So if you're graphing this particular data set, you would always put body weight on the
horizontal axis and age of death on the vertical axis.
This is a convention, it's not a scientific necessity, perhaps, but conventions are important
since they govern how people will read the graph.
You should never violate conventions without a very compelling reason, since in most cases
it will cause people to misinterpret your graph.
Okay, so what kind of options do we have?
The line graph. This is the most basic type of quantitative chart.
It places one variable on the horizontal axis, conventionally called the x-axis, and the
other on the vertical axis, usually called the y-axis.
Each point of the graph represents a particular data point, or observation as scientists refer
to it, which is entered by selecting the correct values for each axis.
The last step is to draw a line that connects each of these data points.
Line graphs carry certain implications, though.
First of all, for each value on the x-axis, there should be one and only one value on the
y-axis.
A variable that changes over time is an excellent case in point.
If you have a graph of gross domestic product, also known as GDP, for each year, a line graph
would be perfectly appropriate since you can have only one value of GDP for any given year.
In this type of analysis, called time series in statistics, the convention is to always
place the time variable on the x-axis and the corresponding measurement on the y-axis.
This type of graph generally presumes an orderly progression, therefore, across the x-axis.
Now, sometimes we don't have those presumptions, and that's where something like the x-y or
scatter diagram becomes useful.
This is the preferred graph to use when there can be more than one value on the y-axis,
for any given x-axis value, or where there is not yet a presumption of orderly progress
on the x-axis.
As an example, consider a graph that relates body weight to height for a group of individuals.
Even if you presume some general relationship here, it is clear that for any given height,
you could have multiple weights, and for any given weight, there can be multiple heights.
So the scatter graph is often used to see whether a relationship might exist between these
variables.
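
The "one and only one y value for each x value" rule can be checked mechanically before choosing between a line graph and a scatter diagram. The sketch below tests only that one rule, not the presumption of orderly progression; the function name and both sample data sets are hypothetical placeholders.

```python
def suggest_chart(points):
    """Return "line" if every x value has exactly one y value, otherwise "scatter"."""
    y_for_x = {}
    for x, y in points:
        if x in y_for_x and y_for_x[x] != y:
            # The same x value appears with two different y values.
            return "scatter"
        y_for_x[x] = y
    return "line"


# Placeholder time series: one value per year.
gdp_by_year = [(2010, 1.0), (2011, 1.1), (2012, 1.3)]

# Placeholder (height, weight) pairs: two people share the same height.
height_and_weight = [(170, 65), (170, 80), (180, 75)]

print(suggest_chart(gdp_by_year))
print(suggest_chart(height_and_weight))
```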
Area graph.
This is a way to combine multiple series of data that could just as well be displayed in
a line graph.
The idea here is to fill in the area under the graph.
Bubble.
This chart is designed to show the relationship between three variables.
One is on the horizontal axis, another on the vertical axis, and the third is shown by
the size of the bubble that's drawn.
This could be drawn for three quantitative variables, or it could be a hybrid graph where
one variable is qualitative.
This is a clever alternative to drawing a perspective rendering of a 3D graph.
The limitation is you cannot display zero or negative numbers in the bubble.
The next one is called in LibreOffice a net graph, or a net diagram.
It's a very odd choice.
I'm not quite sure why, because as far as I can tell, no one other than LibreOffice uses this
terminology.
Now, if you go looking for this on the web, you will find it referred to most commonly as
a radar chart or a spider chart.
For radar chart and bubble chart, by the way, there's links in the show notes to Wikipedia
so you can read up more about this.
The radar chart has spokes that represent different variables which radiate out from a common
center.
Along each spoke, the distance from the center represents a measurement.
Then you connect the dots going around the spokes to form a very irregular shape.
By repeating this process for a number of selected objects, you can do a comparison.
So, if you go to the Wikipedia article that I linked, they give an example of several different
cars, and for each you do a measurement of variables such as price, mileage, headroom, etc.
And by comparing the shapes you get for each automobile, you can kind of do a quick comparison
among them.
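
The geometry of the radar chart is simple enough to sketch: spread the spokes evenly around a circle and place each dot at its measured distance from the center. This is an illustrative sketch, not how LibreOffice draws the chart; it assumes the measurements have already been scaled to a comparable range, and the sample values are invented.

```python
import math


def radar_vertices(measurements):
    """Return the (x, y) corner points of the radar shape, one per spoke."""
    spoke_count = len(measurements)
    vertices = []
    for index, distance in enumerate(measurements):
        # Spokes are spread evenly around the full circle.
        angle = 2 * math.pi * index / spoke_count
        vertices.append((distance * math.cos(angle), distance * math.sin(angle)))
    return vertices


# Hypothetical scaled scores for one car on four variables.
car_shape = radar_vertices([1.0, 2.0, 1.0, 2.0])

for x, y in car_shape:
    print(f"({x:.2f}, {y:.2f})")
```

Connecting these points in order, and back to the first, gives the irregular shape that is compared from one object to the next.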
Then there's the hybrid charts and graphs.
Sometimes you need to combine both kinds of variables in a single analysis, and for that
it helps to have a hybrid graph that combines both types of data in a good way.
Now, technically you can view bar charts, column charts, and pie charts as hybrids in that
the count of members in each bucket is actually a quantitative measure, but that is not how
most people think of it.
Stock chart. This is a specialized type of column graph that essentially combines
three different numerical measures on one diagram.
The height of each column generally represents the closing price, but it adds both the high
and the low for the day.
Now, this can be used for more than stock prices.
You could use it to display the average, minimum, and maximum for a group of measurements,
or in statistics, maybe the mean of the measurements combined with the standard deviation or estimated
error, and so on.
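
Preparing non-stock data for this kind of chart means reducing each group of measurements to three numbers. A minimal sketch, assuming the three values wanted are the minimum, the mean, and the maximum; the function name, dictionary keys, and sample readings are hypothetical.

```python
from statistics import mean


def low_mid_high(measurements):
    """Reduce a group of measurements to the three values a stock-style chart displays."""
    if not measurements:
        raise ValueError("need at least one measurement")
    return {
        "low": min(measurements),
        "mid": mean(measurements),
        "high": max(measurements),
    }


# Placeholder readings for one group.
summary = low_mid_high([2, 4, 6])
print(summary)
```

Each group then supplies one column, with "mid" taking the place of the closing price and "low" and "high" taking the place of the day's range.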
There's a chart called column and line, which combines two types of data in a single chart
presumably because they are related.
For example, you have data on how many cars were sold in a dealership broken down by model.
You might display this as a column or a bar chart if you were just displaying this by
itself, but suppose you wanted to add in a related quantitative variable such as the
amount of display space each model was given or the price of each model.
You could do this by putting the cars sold into a column graph and then adding a line graph
on top of it to show the amount of display space each model received.
So the point of this analysis is to understand that choosing a graph should not be random.
You should have a reason for your choice, and the graph you choose should be a good fit
for the point you are trying to communicate.
I regularly see examples in the media of graphs that are not appropriate, that are done incorrectly,
or that violate the conventions.
At best these are just stupid mistakes, but at worst they are examples of what to me
is just plain lying.
As an example, just in the last few days I saw an example of a graph where the vertical
axis was reversed, so that the lower numbers were on top and the numbers increased as you
went down.
This is of course the exact opposite of what convention tells all of us to expect, and
I believe it was done deliberately to mislead people.
And I was not totally surprised that this happened to be a contentious political issue.
I hear defenders say, well, if you are too stupid to read a graph..., but I am firmly in the
school that says clear and honest communication is the point, and using graphs properly is essential
to that communication.
To see some examples of creative ways of lying with graphs, and therefore things you should
avoid, there is an article from simply speaking.
It goes through some of these, and that is also in the show notes.
So wrapping up now, this is Ahuka, signing off, and reminding you as always, don't forget
to support free software.
Bye-bye.
You have been listening to Hacker Public Radio at HackerPublicRadio.org.
We are a community podcast network that releases shows every weekday Monday through Friday.
Today's show, like all our shows, was contributed by an HPR listener like yourself.
If you ever consider recording a podcast, then visit our website to find out how easy
it really is.
Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer
Club.
HPR is funded by the Binary Revolution at binrev.com. All binrev projects are proudly sponsored
by Lunarpages.
From shared hosting to custom private clouds, go to lunarpages.com for all your hosting
needs.
Unless otherwise stated, today's show is released under a Creative Commons Attribution-ShareAlike
3.0 license.