Pandas Pivot Table Example Explained

Pandas Pivot Table Example Explained

If you have ever used pivot tables in Microsoft Excel, you already know how powerful they are for summarizing and analyzing data quickly.

Now imagine having that same power but with the full flexibility of Python.

That is exactly what pandas pivot tables give you.

In this guide, we will break down everything you need to know about pandas pivot tables. What they are, how they work, and how to use them with real, practical examples. By the end, you will be able to build your own pivot tables confidently in any data analysis project.

What Is a Pivot Table?

A pivot table is a data summarization tool that lets you reorganize, group, and aggregate data from a larger dataset into a more meaningful summary.

Instead of scrolling through thousands of rows trying to find patterns, a pivot table condenses that data into a clean, structured summary showing totals, averages, counts, or any other aggregation you need.

Simple Analogy

Imagine you have a spreadsheet with 10,000 rows of sales transactions, each row showing the salesperson, region, product, and amount sold.

A pivot table can instantly answer questions like:

  • What is the total sales amount per region?
  • What is the average sale per salesperson?
  • Which product sold the most in each region?

Without a pivot table, answering these questions would require sorting, filtering, and manually calculating which is slow and error-prone.

Setting Up the Dataset

Before we dive into examples, let us create a simple dataset to work with throughout this guide.

python

import pandas as pd

data = {
    'salesperson': ['Alice', 'Bob', 'Alice', 'Bob', 'Alice', 'Bob', 'Alice', 'Bob'],
    'region': ['North', 'North', 'South', 'South', 'North', 'South', 'South', 'North'],
    'product': ['Laptop', 'Laptop', 'Phone', 'Phone', 'Phone', 'Laptop', 'Laptop', 'Phone'],
    'quantity': [5, 3, 7, 2, 4, 6, 3, 5],
    'sales_amount': [25000, 15000, 21000, 6000, 12000, 18000, 9000, 10000]
}

df = pd.DataFrame(data)
print(df)

Output:

salespersonregionproductquantitysales_amount
AliceNorthLaptop525000
BobNorthLaptop315000
AliceSouthPhone721000
BobSouthPhone26000
AliceNorthPhone412000
BobSouthLaptop618000
AliceSouthLaptop39000
BobNorthPhone510000

This is our raw data. Now let us start building pivot tables from it.

Basic Syntax of pd.pivot_table

python

pd.pivot_table(
    data,          # Your DataFrame
    values,        # Column to aggregate
    index,         # Row grouping
    columns,       # Column grouping (optional)
    aggfunc,       # Aggregation function (default: mean)
    fill_value,    # Replace NaN values (optional)
    margins        # Add row/column totals (optional)
)

Let us go through each parameter one by one with practical examples.

Example 1: Basic Pivot Table (Total Sales per Salesperson)

The simplest pivot table groups data by one column and aggregates another.

python

pivot1 = pd.pivot_table(
    df,
    values='sales_amount',
    index='salesperson',
    aggfunc='sum'
)

print(pivot1)

Output:

salespersonsales_amount
Alice67000
Bob49000

What This Tells Us: Alice has generated a total of $67,000 in sales while Bob has generated $49,000. This is already a valuable business insight from just three lines of code.

Example 2: Pivot Table with Rows and Columns

Now let us add a columns parameter to break the data down further showing sales per salesperson per product.

python

pivot2 = pd.pivot_table(
    df,
    values='sales_amount',
    index='salesperson',
    columns='product',
    aggfunc='sum'
)

print(pivot2)

Output:

salespersonLaptopPhone
Alice3400033000
Bob3300016000

What This Tells Us: Now we can see that Alice performed similarly across both Laptop and Phone sales, while Bob sold significantly more Laptops than Phones. This kind of breakdown is where pivot tables really shine.

Example 3: Using Different Aggregation Functions

By default, pandas pivot tables use the mean as the aggregation function. But you can use any function that makes sense for your analysis.

Sum

python

pd.pivot_table(df, values='sales_amount', index='salesperson', aggfunc='sum')

Mean (Default)

python

pd.pivot_table(df, values='sales_amount', index='salesperson', aggfunc='mean')

Count

python

pd.pivot_table(df, values='sales_amount', index='salesperson', aggfunc='count')

Max and Min

python

pd.pivot_table(df, values='sales_amount', index='salesperson', aggfunc='max')
pd.pivot_table(df, values='sales_amount', index='salesperson', aggfunc='min')

Multiple Aggregation Functions at Once

python

pivot3 = pd.pivot_table(
    df,
    values='sales_amount',
    index='salesperson',
    aggfunc=['sum', 'mean', 'count']
)

print(pivot3)

Output:

salespersonsummeancount
Alice67000167504
Bob49000122504

This gives you a complete picture in a single table which includes total sales, average sales, and number of transactions for each salesperson.

Example 4: Pivot Table with Multiple Index Columns

You can group by more than one column by passing a list to the index parameter.

python

pivot4 = pd.pivot_table(
    df,
    values='sales_amount',
    index=['salesperson', 'region'],
    aggfunc='sum'
)

print(pivot4)

Output:

salespersonregionsales_amount
AliceNorth37000
AliceSouth30000
BobNorth25000
BobSouth24000

What This Tells Us: Now we can see performance broken down by both salesperson and region. Alice performed better in the North while Bob was more evenly split. This kind of multi-level grouping is very powerful for business analysis.

Example 5: Aggregating Multiple Value Columns

You can also summarize more than one value column at the same time.

python

pivot5 = pd.pivot_table(
    df,
    values=['sales_amount', 'quantity'],
    index='salesperson',
    aggfunc='sum'
)

print(pivot5)

Output:

salespersonquantitysales_amount
Alice1967000
Bob1649000

Now you have both total quantity sold and total sales amount in one clean summary table.

Example 6: Handling Missing Values with fill_value

When you create a pivot table with rows and columns, some combinations might not have data. Pandas fills these with NaN by default. You can replace NaN with a specific value using fill_value.

python

pivot6 = pd.pivot_table(
    df,
    values='sales_amount',
    index='salesperson',
    columns='region',
    aggfunc='sum',
    fill_value=0
)

print(pivot6)

Output:

salespersonNorthSouth
Alice3700030000
Bob2500024000

Using fill_value=0 ensures that any missing combinations show zero instead of NaN, which makes the table cleaner and easier to work with mathematically.

Example 7: Adding Totals with margins

The margins=True parameter adds a grand total row and column to your pivot table just like the totals row in an Excel pivot table.

python

pivot7 = pd.pivot_table(
    df,
    values='sales_amount',
    index='salesperson',
    columns='product',
    aggfunc='sum',
    fill_value=0,
    margins=True,
    margins_name='Total'
)

print(pivot7)

Output:

salespersonLaptopPhoneTotal
Alice340003300067000
Bob330001600049000
Total6700049000116000

This is extremely useful when presenting data to stakeholders or building summary reports.

Example 8: Sorting a Pivot Table

Once you have your pivot table, you can sort it just like any other DataFrame.

python

pivot8 = pd.pivot_table(
    df,
    values='sales_amount',
    index='salesperson',
    aggfunc='sum'
)

pivot8_sorted = pivot8.sort_values('sales_amount', ascending=False)
print(pivot8_sorted)

Output:

salespersonsales_amount
Alice67000
Bob49000

Sorting helps you quickly identify top and bottom performers.

Example 9: Resetting the Index

By default, pivot tables use the index column as the row label. If you want to convert it back to a regular column, use reset_index().

python

pivot9 = pd.pivot_table(
    df,
    values='sales_amount',
    index='salesperson',
    aggfunc='sum'
).reset_index()

print(pivot9)

This is especially useful when you want to export your pivot table to a CSV or use it as input for another operation.

Pandas Pivot Table vs GroupBy — What Is the Difference?

A common question beginners ask is: when should I use pivot_table vs groupby?

python

# GroupBy equivalent
df.groupby('salesperson')['sales_amount'].sum()

# Pivot Table equivalent
pd.pivot_table(df, values='sales_amount', index='salesperson', aggfunc='sum')

Both produce similar results for simple cases. But pivot tables are more powerful when you need:

  • Two-dimensional grouping (rows AND columns)
  • Multiple aggregation functions at once
  • Automatic handling of missing combinations
  • Margins and totals built in
FeatureGroupByPivot Table
Single dimension grouping Yes Yes
Two-dimensional groupingLimited Yes
Multiple aggfunc at once Possible but complexEasy
Built-in totals (margins) No Yes
Handles missing combinations No Yes (fill_value)
Best forSimple aggregationsComplex summaries

Real-World Use Cases of Pandas Pivot Tables

Sales and Revenue Analysis

Summarize total revenue by region, product, and salesperson to identify top performers and underperforming areas.

HR and Workforce Analytics

Analyze employee headcount, average salary, and tenure across departments and job levels.

E-commerce Analytics

Break down order volumes, average order value, and return rates by product category and customer segment.

Financial Reporting

Summarize income, expenses, and profit by month and department for financial dashboards.

Healthcare Data

Aggregate patient counts, average treatment duration, and readmission rates by ward and diagnosis category.

Advantages and Disadvantages of Pandas Pivot Tables

Advantages

  • Summarizes large datasets quickly and efficiently
  • Supports multiple aggregation functions simultaneously
  • Handles two-dimensional grouping naturally
  • Built-in support for totals and missing value handling
  • Output is a standard DataFrame and is easy to export or visualize

Disadvantages

  • Can be confusing for beginners due to multiple parameters
  • Large pivot tables can be slow on very big datasets
  • Multi-level index output can be tricky to work with
  • Default mean aggregation catches many beginners off guard

Common Mistakes to Avoid

  • Forgetting the default aggfunc is mean — Always specify aggfunc explicitly to avoid unexpected results
  • Not using fill_value — Missing combinations will show NaN, which can break downstream calculations
  • Confusing index and columns parameters — Index controls rows, columns controls the column breakdown
  • Not resetting the index — If you need a flat DataFrame for export or visualization, always call reset_index()
  • Using pivot_table when groupby is enough — For simple single-dimension aggregations, groupby is faster and simpler
  • Ignoring multi-level column headers — When using both values and columns, pandas creates multi-level headers that need to be flattened before further processing

Pandas pivot tables are one of the most powerful and practical tools in any data analyst or data scientist’s toolkit.

They let you go from thousands of rows of raw data to a clean, meaningful summary in just a few lines of code.

Here is a quick recap of everything we covered:

  • A pivot table summarizes and reorganizes data using grouping and aggregation
  • Use index to control row grouping and columns for column breakdown
  • Use aggfunc to specify how data is aggregated — sum, mean, count, max, min
  • Use fill_value to handle missing combinations cleanly
  • Use margins=True to add grand totals
  • Pivot tables are more powerful than groupby for two-dimensional summaries

Once you get comfortable with pivot tables, you will find yourself reaching for them constantly whether you are doing exploratory data analysis, building reports, or preparing data for visualization.

FAQs

What is a pandas pivot table?

A pandas pivot table is a function that summarizes and aggregates data from a DataFrame into a structured summary table, similar to pivot tables in Excel.

What is the default aggregation function in pd.pivot_table?

The default aggregation function is mean. Always specify aggfunc explicitly to avoid unintended results.

What is the difference between pivot_table and groupby in pandas?

GroupBy is better for simple single-dimension aggregations. Pivot tables are better for two-dimensional summaries, multiple aggregation functions, and built-in totals.

How do I handle NaN values in a pandas pivot table?

Use the fill_value parameter to replace NaN with a specific value, such as fill_value=0.

How do I add totals to a pandas pivot table?

Use margins=True and optionally margins_name='Total' to add grand total rows and columns.

Can I use multiple aggregation functions in a pivot table?

Yes. Pass a list to aggfunc like aggfunc=['sum', 'mean', 'count'] to apply multiple aggregations at once.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top