If you work with pandas in Python, one of the most common tasks you will face is applying a function to your data, whether to a single column, an entire DataFrame, or element by element.
Pandas gives you three main tools to do this:
- apply()
- map()
- applymap() (renamed to map() on DataFrames in pandas 2.1)
At first glance, they seem to do the same thing. But they work very differently, operate on different structures, and are suited for completely different situations.
Choosing the wrong one can lead to errors, unexpected results, or unnecessarily slow code.
In this guide, we will break down exactly what each function does, how it works, when to use it, and how they compare, with clear, practical examples.
Quick Summary Before We Dive In
Before going deep, here is a one-line summary of each:
- map() — Applies a function or mapping element by element on a Series (single column)
- apply() — Applies a function along rows or columns of a DataFrame, or element by element on a Series
- applymap() — Applies a function element by element across an entire DataFrame (deprecated in pandas 2.1+, replaced by DataFrame.map())
Now let us go through each one in detail.
Setting Up the Example Dataset
We will use a consistent dataset throughout this guide so you can follow along easily.
python
import pandas as pd
import numpy as np
data = {
    'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'department': ['Engineering', 'Marketing', 'Engineering', 'HR', 'Marketing'],
    'salary': [95000, 62000, 88000, 55000, 71000],
    'experience': [8, 4, 6, 3, 5]
}
df = pd.DataFrame(data)
print(df)
Output:
| name | department | salary | experience |
|---|---|---|---|
| Alice | Engineering | 95000 | 8 |
| Bob | Marketing | 62000 | 4 |
| Charlie | Engineering | 88000 | 6 |
| Diana | HR | 55000 | 3 |
| Eve | Marketing | 71000 | 5 |
Part 1: pandas map()
What Is map()?
map() is a Series method. This means it operates on a single column (Series), not an entire DataFrame.
It applies a function, dictionary, or another Series to each element in the Series one at a time.
Syntax
python
Series.map(arg, na_action=None)
- arg — A function, dictionary, or Series to apply to each element
- na_action — If set to 'ignore', NaN values are left unchanged
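To make the effect of na_action concrete, here is a minimal sketch on a small, made-up Series (the `pets` data is hypothetical). Without `na_action='ignore'`, the lambda would be called on the NaN itself, and `.upper()` would fail because NaN is a float.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value
pets = pd.Series(['cat', np.nan, 'dog'])

# na_action='ignore' skips NaN, so the lambda only ever sees strings
upper = pets.map(lambda x: x.upper(), na_action='ignore')
print(upper)
```

The missing entry stays missing in the result, and the two strings are upper-cased.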
Example 1: Using map() with a Function
python
# Convert salary to thousands
df['salary'].map(lambda x: x / 1000)
Output:
0 95.0
1 62.0
2 88.0
3 55.0
4 71.0
Name: salary, dtype: float64
Example 2: Using map() with a Dictionary
One of the most powerful uses of map() is substituting values using a dictionary. This is perfect for encoding categories.
python
# Map department names to department codes
dept_codes = {
    'Engineering': 'ENG',
    'Marketing': 'MKT',
    'HR': 'HR'
}
df['department'].map(dept_codes)
Output:
0 ENG
1 MKT
2 ENG
3 HR
4 MKT
Name: department, dtype: object
Example 3: Using map() with Another Series
python
# Create a bonus Series indexed by name
bonus = pd.Series({
    'Alice': 10000,
    'Bob': 5000,
    'Charlie': 8000,
    'Diana': 4000,
    'Eve': 6000
})
df['name'].map(bonus)
Output:
0 10000
1 5000
2 8000
3 4000
4 6000
Name: name, dtype: int64
Example 4: Handling NaN Values with map()
python
# What happens when a key is missing from the dictionary
dept_codes_incomplete = {'Engineering': 'ENG', 'Marketing': 'MKT'}
df['department'].map(dept_codes_incomplete)
Output:
0 ENG
1 MKT
2 ENG
3 NaN ← HR was not in the dictionary
4 MKT
When a value is not found in the mapping dictionary, map() returns NaN. Always make sure your dictionary covers all possible values, or handle NaN values afterwards.
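Two common ways to deal with missing keys are sketched below. The `'UNKNOWN'` placeholder is an arbitrary choice for illustration, and the department values are repeated here so the snippet runs on its own.

```python
import pandas as pd

departments = pd.Series(['Engineering', 'Marketing', 'Engineering', 'HR', 'Marketing'])
dept_codes_incomplete = {'Engineering': 'ENG', 'Marketing': 'MKT'}

# Option 1: map, then replace the resulting NaN with a placeholder
with_placeholder = departments.map(dept_codes_incomplete).fillna('UNKNOWN')

# Option 2: fall back to the original value when the key is missing
keep_original = departments.map(lambda d: dept_codes_incomplete.get(d, d))

print(with_placeholder.tolist())  # ['ENG', 'MKT', 'ENG', 'UNKNOWN', 'MKT']
print(keep_original.tolist())     # ['ENG', 'MKT', 'ENG', 'HR', 'MKT']
```

Option 1 makes unmapped values easy to spot later. Option 2 is handy when only some values need renaming.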
When to Use map()
- You are working with a single column (Series)
- You want to substitute or encode values using a dictionary
- You want to apply a simple element-wise transformation
- You want to map one Series to values from another Series
Part 2: pandas apply()
What Is apply()?
apply() is the most flexible of the three functions. It can operate on:
- A Series — applying a function to each element
- A DataFrame — applying a function to each row or each column
Unlike map(), which only works element by element on a Series, apply() can pass an entire row or column as a Series to your function. This makes it much more powerful for complex transformations.
Syntax
python
# On a DataFrame
DataFrame.apply(func, axis=0)
# On a Series
Series.apply(func)
- func — The function to apply
- axis — 0 or 'index' to apply the function to each column (default), 1 or 'columns' to apply it to each row
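The axis argument is easiest to see on a tiny frame. The sketch below uses a hypothetical two-column DataFrame named `small` and sums in each direction.

```python
import pandas as pd

# Small hypothetical frame to make the two directions easy to see
small = pd.DataFrame({'a': [1, 2], 'b': [10, 20]})

# axis=0: the function is called once per column
per_column = small.apply(lambda col: col.sum(), axis=0)

# axis=1: the function is called once per row
per_row = small.apply(lambda row: row.sum(), axis=1)

print(per_column.tolist())  # [3, 30]
print(per_row.tolist())     # [11, 22]
```

With axis=0 the result is indexed by column name. With axis=1 it is indexed by the original row labels.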
Example 1: apply() on a Single Column (Series)
python
# Apply a function to the salary column
df['salary'].apply(lambda x: 'High' if x > 80000 else 'Medium' if x > 65000 else 'Low')
Output:
0 High
1 Low
2 High
3 Low
4 Medium
Name: salary, dtype: object
Example 2: apply() Column-Wise on a DataFrame (axis=0)
When axis=0, the function receives each column as a Series.
python
# Get the max value of each numeric column
df[['salary', 'experience']].apply(max, axis=0)
Output:
salary 95000
experience 8
dtype: int64
Example 3: apply() Row-Wise on a DataFrame (axis=1)
When axis=1, the function receives each row as a Series. This is where apply() becomes truly powerful.
python
# Create a performance label based on both salary and experience
def performance_label(row):
    if row['salary'] > 80000 and row['experience'] > 6:
        return 'Senior High Performer'
    elif row['salary'] > 70000:
        return 'Mid-Level Performer'
    else:
        return 'Junior'

df.apply(performance_label, axis=1)
Output:
0 Senior High Performer
1 Junior
2 Mid-Level Performer
3 Junior
4 Mid-Level Performer
dtype: object
This is something map() simply cannot do: map() only sees one element at a time, while apply() with axis=1 sees the entire row, giving you access to multiple columns simultaneously.
Example 4: apply() with a Custom Aggregation
python
# Calculate salary range (max - min) for each numeric column
df[['salary', 'experience']].apply(lambda x: x.max() - x.min())
Output:
salary 40000
experience 5
dtype: int64
Example 5: apply() Returning Multiple Values
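When your function returns a Series, apply() assembles the results into a DataFrame. Applied column-wise (the default, axis=0), each label in the returned Series becomes a row; applied row-wise (axis=1), each label becomes a column.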
python
def salary_stats(col):
    return pd.Series({
        'mean': col.mean(),
        'median': col.median(),
        'std': col.std()
    })

df[['salary', 'experience']].apply(salary_stats)
Output:
| | salary | experience |
|---|---|---|
| mean | 74200.0 | 5.2 |
| median | 71000.0 | 5.0 |
| std | 16961.7 | 1.92 |
When to Use apply()
- You need to apply a function that uses multiple columns (row-wise with axis=1)
- You need to apply a function to each column for aggregation (axis=0)
- Your transformation is too complex for map() — involving conditions across columns
- You are working with both Series and DataFrames
- You need your function to return multiple values as new columns
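The last point can be sketched as follows. The `derive_fields` function and its two output fields are hypothetical, chosen only to show how a Series returned from a row-wise function turns into new columns.

```python
import pandas as pd

people = pd.DataFrame({
    'name': ['Alice', 'Bob'],
    'salary': [95000, 62000],
})

def derive_fields(row):
    # Hypothetical derived fields, chosen only for illustration
    return pd.Series({
        'initial': row['name'][0],
        'salary_k': row['salary'] / 1000,
    })

# With axis=1, each returned Series becomes one row of a new DataFrame
derived = people.apply(derive_fields, axis=1)

# Attach the derived columns to the original frame by index
people = people.join(derived)
print(people.columns.tolist())  # ['name', 'salary', 'initial', 'salary_k']
```

Joining on the index works here because apply() preserves the original row labels.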
Part 3: applymap() / DataFrame.map()
What Is applymap()?
applymap() applies a function element by element across an entire DataFrame, i.e. every single cell gets the function applied to it independently.
Important Note on Deprecation
In pandas 2.1.0 and later, applymap() was deprecated and replaced with DataFrame.map(). The functionality is identical — only the name changed.
python
# Old way (pandas < 2.1)
df.applymap(func)
# New way (pandas >= 2.1)
df.map(func)
For compatibility, we will show both in the examples.
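If the same code has to run on pandas versions both before and after 2.1, one option is a small wrapper that picks whichever method exists. This is only a sketch: the helper name `elementwise` is made up here and is not part of pandas.

```python
import pandas as pd

def elementwise(frame, func):
    """Apply func to every cell, using whichever method this pandas provides."""
    # Check the class, not the instance: on an instance, attribute access
    # can also resolve to a column that happens to be named 'map'.
    if hasattr(pd.DataFrame, 'map'):
        return frame.map(func)       # pandas >= 2.1
    return frame.applymap(func)      # older pandas

numbers = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
doubled = elementwise(numbers, lambda x: x * 2)
print(doubled.to_dict('list'))  # {'a': [2, 4], 'b': [6, 8]}
```

Once older pandas versions no longer need support, the wrapper can be dropped in favor of calling DataFrame.map() directly.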
Syntax
python
# New syntax (pandas 2.1+)
DataFrame.map(func, na_action=None)
# Old syntax (still works but deprecated)
DataFrame.applymap(func)
Example 1: Apply a Function to Every Cell
python
# Convert all values to strings
df.map(str)
Output:
| name | department | salary | experience |
|---|---|---|---|
| Alice | Engineering | 95000 | 8 |
| Bob | Marketing | 62000 | 4 |
| Charlie | Engineering | 88000 | 6 |
| Diana | HR | 55000 | 3 |
| Eve | Marketing | 71000 | 5 |
Every value in the entire DataFrame is now a string.
Example 2: Format Numbers Across the Entire DataFrame
python
numeric_df = df[['salary', 'experience']]
# Add a dollar sign to every value (for demonstration)
numeric_df.map(lambda x: f"${x:,}" if isinstance(x, int) else x)
Output:
| salary | experience |
|---|---|
| $95,000 | $8 |
| $62,000 | $4 |
| $88,000 | $6 |
| $55,000 | $3 |
| $71,000 | $5 |
Example 3: Check a Condition Across All Cells
python
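# Check which values are above the overall mean of all numeric cells
# (stack() pools every cell, so this is not a per-column mean)
overall_mean = numeric_df.stack().mean()
numeric_df.map(lambda x: x > overall_mean)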
Example 4: Handling NaN Values
python
import numpy as np
df_with_nan = pd.DataFrame({
    'A': [1, 2, np.nan],
    'B': [4, np.nan, 6]
})
# na_action='ignore' skips NaN values
df_with_nan.map(lambda x: x * 2, na_action='ignore')
Output:
| A | B |
|---|---|
| 2.0 | 8.0 |
| 4.0 | NaN |
| NaN | 12.0 |
With na_action='ignore', the function is never called on NaN cells, so they are left unchanged.
When to Use applymap() / DataFrame.map()
- You want to apply a simple transformation to every cell in a DataFrame
- You are doing element-wise formatting, e.g. adding currency symbols, rounding numbers, or converting types
- You need uniform processing across all columns simultaneously
- Your function operates on individual values and not rows or columns
apply vs map vs applymap
Example: Doubling Salary Values
python
# Using map() — on a Series only
df['salary'].map(lambda x: x * 2)
# Using apply() — on a Series
df['salary'].apply(lambda x: x * 2)
# Using DataFrame.map() — on entire DataFrame
df[['salary', 'experience']].map(lambda x: x * 2)
# Using apply() — on entire DataFrame column-wise
df[['salary', 'experience']].apply(lambda x: x * 2)
For simple element-wise operations on a single column, map() and apply() produce identical results. The difference becomes clear in more complex scenarios.
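A quick way to confirm this is to compare the results directly. The snippet below rebuilds the salary column so it runs on its own, and also includes the plain vectorized form for reference.

```python
import pandas as pd

salary = pd.Series([95000, 62000, 88000, 55000, 71000], name='salary')

by_map = salary.map(lambda x: x * 2)
by_apply = salary.apply(lambda x: x * 2)
vectorized = salary * 2

# equals() compares both the values and the dtype
print(by_map.equals(by_apply), by_map.equals(vectorized))  # True True
```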
| Feature | map() | apply() | applymap() / DataFrame.map() |
|---|---|---|---|
| Works on Series | Yes | Yes | No |
| Works on DataFrame | No | Yes | Yes |
| Element-wise on Series | Yes | Yes | No |
| Element-wise on DataFrame | No | No | Yes |
| Row-wise on DataFrame | No | Yes (axis=1) | No |
| Column-wise on DataFrame | No | Yes (axis=0) | No |
| Accepts dictionary mapping | Yes | No | No |
| Accepts another Series | Yes | No | No |
| Multi-column logic | No | Yes | No |
| Returns multiple columns | No | Yes | No |
| Performance on large data | Fast | Slower | Moderate |
| Available since | Early pandas | Early pandas | applymap(): early pandas, deprecated in 2.1; DataFrame.map(): added in 2.1 |
Real-World Use Cases
Using map() — Data Encoding for Machine Learning
Before training a machine learning model, categorical variables need to be converted to numbers. map() is perfect for this.
python
# Assumes the DataFrame has a 'gender' column (not part of the example dataset above)
gender_map = {'Male': 0, 'Female': 1, 'Other': 2}
df['gender'].map(gender_map)
Using apply() — Row-Level Business Logic
When a transformation depends on multiple columns, such as calculating a bonus based on both salary and years of experience, apply() with axis=1 is the right tool.
python
def calculate_bonus(row):
    base = row['salary'] * 0.10
    experience_multiplier = 1 + (row['experience'] * 0.05)
    return round(base * experience_multiplier, 2)

df['bonus'] = df.apply(calculate_bonus, axis=1)
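Because this particular formula is plain arithmetic, it can also be written on whole columns. The sketch below repeats the relevant columns so it runs on its own and checks that both versions agree. Row-wise apply() remains the fallback when the logic cannot be expressed this way.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'salary': [95000, 62000, 88000, 55000, 71000],
    'experience': [8, 4, 6, 3, 5],
})

def calculate_bonus(row):
    base = row['salary'] * 0.10
    experience_multiplier = 1 + (row['experience'] * 0.05)
    return round(base * experience_multiplier, 2)

# Row-wise version: the function is called once per row
row_wise = df.apply(calculate_bonus, axis=1)

# Same arithmetic expressed on whole columns at once
vectorized = (df['salary'] * 0.10 * (1 + df['experience'] * 0.05)).round(2)

print(np.allclose(row_wise, vectorized))  # True
```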
Using applymap() / DataFrame.map() — Data Cleaning and Formatting
When preparing data for a report or export, you might need to apply the same formatting to every cell in a DataFrame.
python
# Round all numeric values to 2 decimal places
numeric_df.map(lambda x: round(x, 2))
# Strip whitespace from every string cell
# (string_df stands for any DataFrame holding text columns; it is not defined above)
string_df.map(lambda x: x.strip() if isinstance(x, str) else x)
Performance Considerations
Understanding performance is important when working with large datasets.
Vectorized Operations Are Almost Always Faster
Before reaching for apply(), map(), or applymap(), always ask if a vectorized pandas or NumPy operation can do the job.
python
# Slow — using apply
df['salary'].apply(lambda x: x * 1.1)
# Fast — vectorized operation
df['salary'] * 1.1
Vectorized operations work on the entire array at once using optimized C code under the hood. This makes them significantly faster than Python-level loops.
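To measure the gap on your own machine, a rough timing sketch like the one below can help. The 100,000-row Series is synthetic and generated only for the benchmark, and absolute numbers will vary by hardware and pandas version, so none are shown here.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical column of 100,000 salaries, generated only for timing
rng = np.random.default_rng(0)
salary = pd.Series(rng.integers(40_000, 120_000, size=100_000))

# Time five runs of each approach
apply_time = timeit.timeit(lambda: salary.apply(lambda x: x * 1.1), number=5)
vector_time = timeit.timeit(lambda: salary * 1.1, number=5)

print(f"apply:      {apply_time:.4f} s")
print(f"vectorized: {vector_time:.4f} s")
```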
Speed Ranking (Fastest to Slowest)
For element-wise operations:
- Vectorized NumPy/pandas operations — Almost always fastest
- map() — Fast for Series element-wise operations
- applymap() / DataFrame.map() — Moderate for DataFrame element-wise operations
- apply() with axis=1 — Slowest because it loops through rows in Python
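Some row-wise logic that looks like it needs apply(axis=1) can be moved up this ranking. As one possible approach, the if/elif labeling from the earlier performance_label example can be written with numpy.select, which evaluates whole-column conditions. The column name `label` is chosen here only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'salary': [95000, 62000, 88000, 55000, 71000],
    'experience': [8, 4, 6, 3, 5],
})

# Conditions are checked in order; the first match wins, like if/elif
conditions = [
    (df['salary'] > 80000) & (df['experience'] > 6),
    df['salary'] > 70000,
]
labels = ['Senior High Performer', 'Mid-Level Performer']

# Rows matching no condition get the default, like the final else branch
df['label'] = np.select(conditions, labels, default='Junior')
print(df['label'].tolist())
# ['Senior High Performer', 'Junior', 'Mid-Level Performer', 'Junior', 'Mid-Level Performer']
```

The result matches the row-wise version, without a Python-level loop over rows.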
When apply() Is Worth the Performance Cost
Despite being slower, apply() is worth using when:
- Your logic is too complex for vectorized operations
- You need row-level access to multiple columns
- The dataset is small enough that performance is not a concern
- Readability and maintainability matter more than raw speed
Advantages and Disadvantages
map()
Advantages: Fast for Series operations, accepts dictionaries and Series for mapping, clean and simple syntax, great for encoding and value substitution
Disadvantages: Only works on Series, cannot access multiple columns, cannot be used directly on DataFrames
apply()
Advantages: Works on both Series and DataFrames, supports row-wise and column-wise operations, handles complex multi-column logic, can return multiple values
Disadvantages: Slower than vectorized operations, can be overused when simpler alternatives exist, axis parameter confuses beginners
applymap() / DataFrame.map()
Advantages: Simple element-wise operations across entire DataFrame, uniform processing of all cells, easy to understand
Disadvantages: Deprecated name (applymap) in newer pandas versions, slower than vectorized operations, no access to row or column context
Common Mistakes to Avoid
- Using apply() when map() is enough — If you are transforming a single column element by element, map() is simpler and often faster
- Using apply(axis=1) for simple math — Operations like multiplying a column by a constant should use vectorized operations, not apply
- Forgetting the axis parameter in apply() — axis=0 applies the function to each column, axis=1 applies it to each row. Getting this wrong is one of the most common pandas mistakes
- Using applymap() in pandas 2.1+ — It still works but raises a deprecation warning. Switch to DataFrame.map() for future compatibility
- Not handling NaN values — All three functions can behave unexpectedly with NaN. Use na_action='ignore' with map() and DataFrame.map(); apply() has no such parameter, so handle or filter NaNs yourself
- Returning inconsistent types from apply() — If your function returns different types for different rows, pandas may produce unexpected results. Keep return types consistent
Quick Decision Guide
Use this simple flowchart to decide which function to use:
Are you working with a single column (Series)?
- Need dictionary or Series mapping → map()
- Need simple element-wise function → map() or apply()
- Need complex logic using the element → apply()
Are you working with an entire DataFrame?
- Need to apply a function to every individual cell → DataFrame.map() (formerly applymap)
- Need to apply a function to each column → apply(axis=0)
- Need to apply a function to each row using multiple columns → apply(axis=1)
Can a vectorized operation do the job?
- Always check this first → Use vectorized pandas/NumPy operations
apply(), map(), and applymap() are three of the most useful and most confused functions in the pandas library. Understanding the difference between them will make your data analysis code cleaner, faster, and more intentional.
Here is the simplest way to remember the difference:
- map() — One column, one element at a time. Great for substitution and encoding
- apply() — Rows or columns of a DataFrame, or elements of a Series. Great for complex logic across multiple columns
- DataFrame.map() — Every cell in a DataFrame, one at a time. Great for uniform formatting and transformation
And always remember, before using any of these three functions, check if a simple vectorized operation can get the job done faster.
FAQs
What is the difference between map and apply in pandas?
Series.map() works only on a Series and applies a function, dictionary, or another Series element by element. apply() works on both Series and DataFrames and can apply functions row-wise or column-wise, making it more powerful for complex transformations.
When should I use apply() instead of map()?
Use apply() when your transformation requires access to multiple columns at once (row-wise with axis=1), or when you need to apply a function across an entire DataFrame column by column.
Is applymap() deprecated in pandas?
Yes. applymap() was deprecated in pandas 2.1.0 and replaced with DataFrame.map(). The functionality is identical, only the name changed.
Which is faster — apply, map, or applymap?
For element-wise operations on a Series, map() is generally fastest. However, all three are slower than vectorized pandas or NumPy operations. Always prefer vectorized operations when possible.
Can map() work on a DataFrame?
In pandas 2.1+, DataFrame.map() works on DataFrames element-wise (replacing applymap). The older Series.map() only works on a Series.
Can apply() replace map() completely?
apply() can perform element-wise operations on a Series just like map(), but map() has the added ability to accept dictionaries and other Series as mapping inputs — something apply() cannot do directly.