Pandas Pivot Table Example Explained

Q: What is the default aggregation function in pd.pivot_table?

The default aggregation function is mean. Always specify aggfunc explicitly to avoid unintended results.

Q: How do I handle NaN values in a pandas pivot table?

Use the fill_value parameter to replace NaN with a specific value, such as fill_value=0.

Q: How do I add totals to a pandas pivot table?

Use margins=True and optionally margins_name='Total' to add grand total rows and columns.

Q: Can I use multiple aggregation functions in a pivot table?

Yes. Pass a list to aggfunc like aggfunc=['sum', 'mean', 'count'] to apply multiple aggregations at once.

If you have ever used pivot tables in Microsoft Excel, you already know how powerful they are for summarizing and analyzing data quickly.

Now imagine having that same power but with the full flexibility of Python.

That is exactly what pandas pivot tables give you.

In this guide, we will break down everything you need to know about pandas pivot tables. What they are, how they work, and how to use them with real, practical examples. By the end, you will be able to build your own pivot tables confidently in any data analysis project.

What Is a Pivot Table?

A pivot table is a data summarization tool that lets you reorganize, group, and aggregate data from a larger dataset into a more meaningful summary.

Instead of scrolling through thousands of rows trying to find patterns, a pivot table condenses that data into a clean, structured summary showing totals, averages, counts, or any other aggregation you need.

Simple Analogy

Imagine you have a spreadsheet with 10,000 rows of sales transactions, each row showing the salesperson, region, product, and amount sold.

A pivot table can instantly answer questions like:

What is the total sales amount per region?
What is the average sale per salesperson?
Which product sold the most in each region?

Without a pivot table, answering these questions would require sorting, filtering, and manually calculating which is slow and error-prone.

Setting Up the Dataset

Before we dive into examples, let us create a simple dataset to work with throughout this guide.

python

import pandas as pd

data = {
    'salesperson': ['Alice', 'Bob', 'Alice', 'Bob', 'Alice', 'Bob', 'Alice', 'Bob'],
    'region': ['North', 'North', 'South', 'South', 'North', 'South', 'South', 'North'],
    'product': ['Laptop', 'Laptop', 'Phone', 'Phone', 'Phone', 'Laptop', 'Laptop', 'Phone'],
    'quantity': [5, 3, 7, 2, 4, 6, 3, 5],
    'sales_amount': [25000, 15000, 21000, 6000, 12000, 18000, 9000, 10000]
}

df = pd.DataFrame(data)
print(df)

Output:

salesperson	region	product	quantity	sales_amount
Alice	North	Laptop	5	25000
Bob	North	Laptop	3	15000
Alice	South	Phone	7	21000
Bob	South	Phone	2	6000
Alice	North	Phone	4	12000
Bob	South	Laptop	6	18000
Alice	South	Laptop	3	9000
Bob	North	Phone	5	10000

This is our raw data. Now let us start building pivot tables from it.

Basic Syntax of pd.pivot_table

python

pd.pivot_table(
    data,          # Your DataFrame
    values,        # Column to aggregate
    index,         # Row grouping
    columns,       # Column grouping (optional)
    aggfunc,       # Aggregation function (default: mean)
    fill_value,    # Replace NaN values (optional)
    margins        # Add row/column totals (optional)
)

Let us go through each parameter one by one with practical examples.

Example 1: Basic Pivot Table (Total Sales per Salesperson)

The simplest pivot table groups data by one column and aggregates another.

python

pivot1 = pd.pivot_table(
    df,
    values='sales_amount',
    index='salesperson',
    aggfunc='sum'
)

print(pivot1)

Output:

salesperson	sales_amount
Alice	67000
Bob	49000

What This Tells Us: Alice has generated a total of $67,000 in sales while Bob has generated $49,000. This is already a valuable business insight from just three lines of code.

Example 2: Pivot Table with Rows and Columns

Now let us add a columns parameter to break the data down further showing sales per salesperson per product.

python

pivot2 = pd.pivot_table(
    df,
    values='sales_amount',
    index='salesperson',
    columns='product',
    aggfunc='sum'
)

print(pivot2)

Output:

salesperson	Laptop	Phone
Alice	34000	33000
Bob	33000	16000

What This Tells Us: Now we can see that Alice performed similarly across both Laptop and Phone sales, while Bob sold significantly more Laptops than Phones. This kind of breakdown is where pivot tables really shine.

Example 3: Using Different Aggregation Functions

By default, pandas pivot tables use the mean as the aggregation function. But you can use any function that makes sense for your analysis.

Sum

python

pd.pivot_table(df, values='sales_amount', index='salesperson', aggfunc='sum')

Mean (Default)

python

pd.pivot_table(df, values='sales_amount', index='salesperson', aggfunc='mean')

Count

python

pd.pivot_table(df, values='sales_amount', index='salesperson', aggfunc='count')

Max and Min

python

pd.pivot_table(df, values='sales_amount', index='salesperson', aggfunc='max')
pd.pivot_table(df, values='sales_amount', index='salesperson', aggfunc='min')

Multiple Aggregation Functions at Once

python

pivot3 = pd.pivot_table(
    df,
    values='sales_amount',
    index='salesperson',
    aggfunc=['sum', 'mean', 'count']
)

print(pivot3)

Output:

salesperson	sum	mean	count
Alice	67000	16750	4
Bob	49000	12250	4

This gives you a complete picture in a single table which includes total sales, average sales, and number of transactions for each salesperson.

Example 4: Pivot Table with Multiple Index Columns

You can group by more than one column by passing a list to the index parameter.

python

pivot4 = pd.pivot_table(
    df,
    values='sales_amount',
    index=['salesperson', 'region'],
    aggfunc='sum'
)

print(pivot4)

Output:

salesperson	region	sales_amount
Alice	North	37000
Alice	South	30000
Bob	North	25000
Bob	South	24000

What This Tells Us: Now we can see performance broken down by both salesperson and region. Alice performed better in the North while Bob was more evenly split. This kind of multi-level grouping is very powerful for business analysis.

Example 5: Aggregating Multiple Value Columns

You can also summarize more than one value column at the same time.

python

pivot5 = pd.pivot_table(
    df,
    values=['sales_amount', 'quantity'],
    index='salesperson',
    aggfunc='sum'
)

print(pivot5)

Output:

salesperson	quantity	sales_amount
Alice	19	67000
Bob	16	49000

Now you have both total quantity sold and total sales amount in one clean summary table.

Example 6: Handling Missing Values with fill_value

When you create a pivot table with rows and columns, some combinations might not have data. Pandas fills these with NaN by default. You can replace NaN with a specific value using fill_value.

python

pivot6 = pd.pivot_table(
    df,
    values='sales_amount',
    index='salesperson',
    columns='region',
    aggfunc='sum',
    fill_value=0
)

print(pivot6)

Output:

salesperson	North	South
Alice	37000	30000
Bob	25000	24000

Using fill_value=0 ensures that any missing combinations show zero instead of NaN, which makes the table cleaner and easier to work with mathematically.

Example 7: Adding Totals with margins

The margins=True parameter adds a grand total row and column to your pivot table just like the totals row in an Excel pivot table.

python

pivot7 = pd.pivot_table(
    df,
    values='sales_amount',
    index='salesperson',
    columns='product',
    aggfunc='sum',
    fill_value=0,
    margins=True,
    margins_name='Total'
)

print(pivot7)

Output:

salesperson	Laptop	Phone	Total
Alice	34000	33000	67000
Bob	33000	16000	49000
Total	67000	49000	116000

This is extremely useful when presenting data to stakeholders or building summary reports.

Example 8: Sorting a Pivot Table

Once you have your pivot table, you can sort it just like any other DataFrame.

python

pivot8 = pd.pivot_table(
    df,
    values='sales_amount',
    index='salesperson',
    aggfunc='sum'
)

pivot8_sorted = pivot8.sort_values('sales_amount', ascending=False)
print(pivot8_sorted)

Output:

salesperson	sales_amount
Alice	67000
Bob	49000

Sorting helps you quickly identify top and bottom performers.

Example 9: Resetting the Index

By default, pivot tables use the index column as the row label. If you want to convert it back to a regular column, use reset_index().

python

pivot9 = pd.pivot_table(
    df,
    values='sales_amount',
    index='salesperson',
    aggfunc='sum'
).reset_index()

print(pivot9)

This is especially useful when you want to export your pivot table to a CSV or use it as input for another operation.

Pandas Pivot Table vs GroupBy — What Is the Difference?

A common question beginners ask is: when should I use pivot_table vs groupby?

python

# GroupBy equivalent
df.groupby('salesperson')['sales_amount'].sum()

# Pivot Table equivalent
pd.pivot_table(df, values='sales_amount', index='salesperson', aggfunc='sum')

Both produce similar results for simple cases. But pivot tables are more powerful when you need:

Two-dimensional grouping (rows AND columns)
Multiple aggregation functions at once
Automatic handling of missing combinations
Margins and totals built in

Feature	GroupBy	Pivot Table
Single dimension grouping	Yes	Yes
Two-dimensional grouping	Limited	Yes
Multiple aggfunc at once	Possible but complex	Easy
Built-in totals (margins)	No	Yes
Handles missing combinations	No	Yes (fill_value)
Best for	Simple aggregations	Complex summaries

Real-World Use Cases of Pandas Pivot Tables

Sales and Revenue Analysis

Summarize total revenue by region, product, and salesperson to identify top performers and underperforming areas.

HR and Workforce Analytics

Analyze employee headcount, average salary, and tenure across departments and job levels.

E-commerce Analytics

Break down order volumes, average order value, and return rates by product category and customer segment.

Financial Reporting

Summarize income, expenses, and profit by month and department for financial dashboards.

Healthcare Data

Aggregate patient counts, average treatment duration, and readmission rates by ward and diagnosis category.

Advantages and Disadvantages of Pandas Pivot Tables

Advantages

Summarizes large datasets quickly and efficiently
Supports multiple aggregation functions simultaneously
Handles two-dimensional grouping naturally
Built-in support for totals and missing value handling
Output is a standard DataFrame and is easy to export or visualize

Disadvantages

Can be confusing for beginners due to multiple parameters
Large pivot tables can be slow on very big datasets
Multi-level index output can be tricky to work with
Default mean aggregation catches many beginners off guard

Common Mistakes to Avoid

Forgetting the default aggfunc is mean — Always specify aggfunc explicitly to avoid unexpected results
Not using fill_value — Missing combinations will show NaN, which can break downstream calculations
Confusing index and columns parameters — Index controls rows, columns controls the column breakdown
Not resetting the index — If you need a flat DataFrame for export or visualization, always call reset_index()
Using pivot_table when groupby is enough — For simple single-dimension aggregations, groupby is faster and simpler
Ignoring multi-level column headers — When using both values and columns, pandas creates multi-level headers that need to be flattened before further processing

Pandas pivot tables are one of the most powerful and practical tools in any data analyst or data scientist’s toolkit.

They let you go from thousands of rows of raw data to a clean, meaningful summary in just a few lines of code.

Here is a quick recap of everything we covered:

A pivot table summarizes and reorganizes data using grouping and aggregation
Use index to control row grouping and columns for column breakdown
Use aggfunc to specify how data is aggregated — sum, mean, count, max, min
Use fill_value to handle missing combinations cleanly
Use margins=True to add grand totals
Pivot tables are more powerful than groupby for two-dimensional summaries

Once you get comfortable with pivot tables, you will find yourself reaching for them constantly whether you are doing exploratory data analysis, building reports, or preparing data for visualization.

FAQs

What is a pandas pivot table?

A pandas pivot table is a function that summarizes and aggregates data from a DataFrame into a structured summary table, similar to pivot tables in Excel.

What is the default aggregation function in pd.pivot_table?

The default aggregation function is mean. Always specify aggfunc explicitly to avoid unintended results.

What is the difference between pivot_table and groupby in pandas?

GroupBy is better for simple single-dimension aggregations. Pivot tables are better for two-dimensional summaries, multiple aggregation functions, and built-in totals.

How do I handle NaN values in a pandas pivot table?

Use the fill_value parameter to replace NaN with a specific value, such as fill_value=0.

How do I add totals to a pandas pivot table?

Use margins=True and optionally margins_name='Total' to add grand total rows and columns.

Can I use multiple aggregation functions in a pivot table?

Yes. Pass a list to aggfunc like aggfunc=['sum', 'mean', 'count'] to apply multiple aggregations at once.

Pandas Pivot Table Example Explained

What Is a Pivot Table?

Simple Analogy

Setting Up the Dataset

Basic Syntax of pd.pivot_table

Example 1: Basic Pivot Table (Total Sales per Salesperson)

Example 2: Pivot Table with Rows and Columns

Example 3: Using Different Aggregation Functions

Sum

Mean (Default)

Count

Max and Min

Multiple Aggregation Functions at Once

Example 4: Pivot Table with Multiple Index Columns

Example 5: Aggregating Multiple Value Columns

Example 6: Handling Missing Values with fill_value

Example 7: Adding Totals with margins

Example 8: Sorting a Pivot Table

Example 9: Resetting the Index

Pandas Pivot Table vs GroupBy — What Is the Difference?

Real-World Use Cases of Pandas Pivot Tables

Sales and Revenue Analysis

HR and Workforce Analytics

E-commerce Analytics

Financial Reporting

Healthcare Data

Advantages and Disadvantages of Pandas Pivot Tables

Advantages

Disadvantages

Common Mistakes to Avoid

FAQs

What is a pandas pivot table?

What is the default aggregation function in pd.pivot_table?

What is the difference between pivot_table and groupby in pandas?

How do I handle NaN values in a pandas pivot table?

How do I add totals to a pandas pivot table?

Can I use multiple aggregation functions in a pivot table?

Leave a Comment Cancel Reply

Copyright © 2025 codewithfimi.com - All Rights Reserved