hpr-knowledge-base/hpr_transcripts/hpr1655.txt

Episode: 1655
Title: HPR1655: 43 - LibreOffice Calc - Creating Pivot Tables
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr1655/hpr1655.mp3
Transcribed: 2025-10-18 06:26:56

---

It's Friday 5th of December 2014.
This is an HBR episode 1655 entitled 43 Libra Office Calk, creating pivot tables and
in part of the series, Libra Office.
It is hosted by AYUKA and is about 16 minutes long.
Feedback can be sent to Wilnicat and Wilnic.com or by leaving a comment on this episode.
The summary is how to create a pivot table.
This episode of HBR is brought to you by an honesthost.com.
With 15% discount on all shared hosting with the offer code HBR15, that's HBR15.
Better web hosting that's honest and fair at An Honesthost.com.
Hello, this is AYUKA, welcoming you to Hacker Public Radio and another exciting episode
in our ongoing series on Libra Office.
We are continuing on our investigation of Calk and getting to something that I think
will be of some interest to a lot of people.
It's something that spreadsheets have made very easy to do but can be a little bit frightening
to people and that is pivot tables.
On this one I want to talk about how you create a pivot table.
Pivot tables are extremely useful.
I have seen people say that that was one of the big reasons that Excel replaced Lotus
1-2-3 as the spreadsheet of choice.
I think there were probably several things going on there, but still it does illustrate
how important they are.
They're also very confusing the most spreadsheet users.
I think part of the reason for that is that they are very flexible and powerful, which
means that it is easy to get confused.
The term pivot is actually a clue to how you can use them.
Imagine a table of summary data where you can rotate the table so that rows become columns
depending on your needs.
That's what the pivot part refers to.
The other thing you want to pay attention to here is that the power of pivot tables comes
from using raw data, not summarized data.
As an example, the biostatistics data we used in the previous tutorials was raw data.
You had an individual row for each observation and multiple measurements were made for each
individual covering a variety of variables.
But for this analysis, I'm going to use another data set which I found at the University
of South Florida.
It contains fictional data of sales at a company and this is the canonical example of using
pivot tables.
Note that each row of the data set is for an individual order.
I have a link to download this that is in the show notes, so please feel free to download
the data set and take a look and follow along with what we're doing.
Now the rules for using pivot tables are not too bad, but pay attention.
First, you cannot have any empty rows or columns.
These people insert blanks for formatting purposes.
In fact, I've often done that myself on spreadsheets, but if you wish to use pivot tables, they
must be removed.
The reason for this is that count searches in all four directions from the cell you selected
to locate your data.
And if it encounters a completely blank row or column, it treats that as a signal of the
end of the data.
Secondly, always select only one cell when starting the pivot table.
The program will automatically infer the whole range using the algorithm I just mentioned.
It's going to look in all four directions.
If you select more than one cell, it will assume you're putting in a list and the sorting
will be mixed up.
Finally, the data must have simple, linear structure, in other words, the normal form.
For instance, you cannot have data divided into different columns that is essentially the
same data.
In our data set, we would not want to have sales for the Northeast region in one column,
sales for the South in another column, et cetera.
You put all the sales in one column and then have a variable that says what region in
a different column.
Also while it is possible to use external data sources for this kind of analysis, I will
stick to doing this with data that is already in a spreadsheet.
Starting to access external data sources is a more advanced topic we may get to that
at some future date.
Now creating the pivot table, you've got a spreadsheet, you've got a bunch of data.
Click on a cell anywhere in that data, just one cell, and then go to the data menu, then
select pivot table, then select create.
And this is going to pop up a window that is going to allow you to start sticking your
variables in various rectangular spaces here.
Now note that along the right side of this window is a list of all of your fields, generally
based on columns.
But for some reason you don't get all of your fields, highlight all of the columns you
want before you create the pivot table, and that should take care of it.
Now this window is called the data pilot dialogue.
Now to set your different fields in the pivot table, you just drag and drop each field to
the appropriate area in this window.
Page fields is the first category.
This is a place to potentially limit the data to one value in one of your columns.
For instance, if we look at the region field, we may want to look individually at each region,
so putting this field in page fields will let us do that.
Then data fields.
This area must contain at least one variable.
columns in this area are aggregated, and the obvious candidate for that in our sample
data set is the total field that records total sales on each order.
The idea is that you are going to add up the sales for some set of values to be defined
in the row and column fields.
Column fields, whatever you put here, will be a column in the resulting pivot table.
As a general rule, if you have two possible fields to use, make one a column field and
one a row field.
And finally, row fields will be rows in the resulting pivot table.
So here is an example.
You can take a look, and I created a spreadsheet that has this, which you can download, and you
can see what I'm referring to, and it uses that data set that I told you I got from
University of South Florida.
So I created with region as the page field, total as the data field, and rep and item as
the row fields.
And no column fields were used in this particular example, and you can see I get a particular kind
of a pivot table.
With little practice, you can see how to use these fields.
The page field allows you to select one, several, or all categories of the selected field.
Since looking at regions is a pretty reasonable thing to do, it makes sense to use this.
Total is the only field that you would want to aggregate.
By this, I mean that you are adding together all of the individual orders that fit the classification.
As an example, looking at representative Andrews, we see a total of $298.65 in pencil sales.
But looking at the raw data, we see that this came from three separate orders.
I don't see any other field here where you want to do that kind of a calculation.
That's why we put total into that data field.
Note also that you could have gotten this answer laboriously by going through the orders,
sorting them, and adding up the individual orders to get the totals.
But the pivot table does this almost effortlessly.
Now the remaining questions about row and or column fields.
The fields that are suitable for this kind of thing are in one sense similar to the page field options.
You will note that an old friend of ours has returned, and that is the distinction between qualitative and quantitative variables.
Data fields will pretty much need to be quantitative variables, and we only have two of those here, total and units.
Either one can be used in some types of analysis, but that depends on the rest of your fields as to which makes sense.
Given the page field and row fields we already chose, units would make no sense because the numbers are not comparable.
How do you add the number of pencils to the number of binders and get a meaningful total you can't?
But that is partly because we started out to compare sales by region anyway.
But page fields, row fields, and column fields will all be qualitative.
We have three good candidates here, region, rep, and item, and we used all of them in our analysis.
But they could be interchanged depending on the analysis we want to focus on, as to whether they should be rows or columns that is simply a matter of presentation.
For example, I did a second view of the pivot table, except that now I've moved item into being a column field.
This probably is a better presentation than the first one.
It's definitely easier to read.
And the second reason is to get sub-totals by both of your fields.
You can see total sales for each rep.
And since I left rep as a row field, the very last column gives the total for each representative.
And then the total sales for each item.
Now since item is column fields now, if I look at the very last row, it gives the total for each of those columns.
I can see the total sales for binders, the total sales for desks, the total sales for pens, and so on.
So that's a good thing to remember.
The final option here, creating a pivot table with both ref and item as column fields produces a pivot table that is very wide and just about unreadable.
Now, suppose you put fields in the wrong area and did not realize it until after you create the pivot table.
Well, that's an easy fix.
Note that in the top left cell, which on my sample spreadsheet is cell A1, it has the word filter.
If you right-click on that cell, a menu of options opens, and the very first item in this menu is edit layout.
If you click on this, the data pilot window will open again, and you change things by simply dragging fields to where you want them.
You can drag a field from one layout area to another.
You can add a field you didn't have before, or you can remove a field by just dragging it outside the layout area.
So you have a lot of flexibility here.
OK, so this is kind of hard.
I realize a lot of this stuff is hard to visualize just listening to an audio program.
And that's why I created these sample spreadsheet files that you can download.
So if you take a look at the show notes, you can see that I have created one that has these pivot tables, the different variations that I talk about in this particular podcast.
You can see as separate tabs in the spreadsheet, so you can compare them and take a look at them and see why.
For instance, making both of those fields as row fields was not as good as making one row and one column.
And I think when you look at that, you'll understand right away why that makes so much sense.
So I encourage you to download and play with it.
And remember that pivot tables really are a very important and powerful use of spreadsheets.
And it's probably worth taking a look at it.
So this is Ahuka for Hacker Public Radio, signing off and reminding you as always to support free software. Bye bye.
You've been listening to Hacker Public Radio at Hacker Public Radio dot org.
We are a community podcast network that releases shows every weekday, Monday through Friday.
Today's show, like all our shows, was contributed by an HPR listener like yourself.
If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is.
Hacker Public Radio was founded by the digital dog pound and the Infonomicon Computer Club.
And it's part of the binary revolution at binrev.com.
If you have comments on today's show, please email the host directly, leave a comment on the website or record a follow-up episode yourself.
On this otherwise stated, today's show is released on the creative comments, attribution, share a like, 3.0 license.