Pandas is one of the most important Python libraries for data analysis, but remembering everything is hard.
That’s why analysts rely on cheat sheets.
This Pandas cheat sheet covers the most-used commands in real data analysis work, not rare edge cases. If you master what’s below, you’ll handle most day-to-day data tasks with confidence.
Importing Pandas
import pandas as pd
Loading Data
df = pd.read_csv("data.csv")
df = pd.read_excel("data.xlsx")
df = pd.read_json("data.json")
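To try the loaders without a file on disk, here is a minimal sketch that reads CSV text from an in-memory buffer. The column names and values are made up for illustration; in real work you would pass a file path as shown above.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for "data.csv"
csv_text = "region,sales,profit\nEurope,1200,80\nAsia,450,30\nEurope,700,60\n"

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names taken from the header row
```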
Viewing Data
df.head()
df.tail()
df.sample(5)
df.info()
df.describe()
These help you quickly understand structure, size, and data types.
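As a rough illustration of what each inspection call tells you, the sketch below builds a small DataFrame (the data is invented) and runs the most common checks.

```python
import pandas as pd

# Small made-up dataset for illustration
df = pd.DataFrame({
    "region": ["Europe", "Asia", "Europe", "Americas"],
    "sales": [1200, 450, 700, 1500],
    "profit": [80.0, 30.0, 60.0, 120.0],
})

print(df.shape)    # size as (rows, columns)
print(df.dtypes)   # data type of each column
df.info()          # column names, non-null counts, dtypes, memory usage

# describe() summarizes numeric columns: count, mean, std, min, quartiles, max
summary = df.describe()
print(summary)
```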
Selecting Columns
df["column"]
df[["col1", "col2"]]
Selecting Rows
df.loc[0]   # row whose index label is 0
df.iloc[0]  # first row by position
df.loc[df["sales"] > 1000]
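The difference between loc and iloc only becomes visible when index labels do not match row positions. The sketch below uses an invented index of 10, 20, 30 to make that visible.

```python
import pandas as pd

df = pd.DataFrame(
    {"sales": [1200, 450, 700]},
    index=[10, 20, 30],  # labels deliberately differ from positions
)

by_label = df.loc[10]      # row whose index label is 10
by_position = df.iloc[0]   # first row, whatever its label is
last_row = df.iloc[-1]     # last row by position

# loc also accepts a boolean condition
high_sales = df.loc[df["sales"] > 1000]
```

With this index, df.loc[0] would raise a KeyError because no row carries the label 0, while df.iloc[0] still returns the first row.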
Filtering Data
df[df["region"] == "Europe"]
df[(df["sales"] > 500) & (df["profit"] > 50)]
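Filters can be combined in a few more ways than the two shown above. The following sketch, on made-up data, shows and (&), or (|), not (~), and membership tests with isin.

```python
import pandas as pd

df = pd.DataFrame({
    "region": ["Europe", "Asia", "Europe", "Americas"],
    "sales": [1200, 450, 700, 1500],
    "profit": [80, 30, 40, 120],
})

# Wrap each condition in parentheses: & and | bind tighter than comparisons
both = df[(df["sales"] > 500) & (df["profit"] > 50)]
either = df[(df["region"] == "Asia") | (df["sales"] > 1000)]
not_europe = df[~(df["region"] == "Europe")]

# isin replaces a chain of == comparisons joined with |
in_list = df[df["region"].isin(["Europe", "Asia"])]
```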
Sorting Data
df.sort_values("sales")
df.sort_values("sales", ascending=False)
Handling Missing Values
df.isna()
df.dropna()
df.fillna(0)
Missing data handling is one of the most common real-world tasks.
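A typical workflow is to count the gaps first and then decide whether to drop or fill them. Here is a minimal sketch on invented data; the fill values (0 and "Unknown") are only examples, and the right choice depends on your dataset.

```python
import pandas as pd

df = pd.DataFrame({
    "sales": [1200.0, None, 700.0],      # None becomes NaN in a numeric column
    "region": ["Europe", "Asia", None],
})

# Count missing values per column
missing_per_column = df.isna().sum()

# Option 1: keep only rows with no missing values
complete_rows = df.dropna()

# Option 2: fill each column with its own replacement value
filled = df.fillna({"sales": 0, "region": "Unknown"})
```

Both dropna and fillna return a new DataFrame, so df itself still contains the missing values afterwards.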
Renaming Columns
df.rename(columns={"old_name": "new_name"})  # returns a new DataFrame; assign it back to keep the change
df.columns = df.columns.str.lower()
Creating New Columns
df["revenue"] = df["price"] * df["quantity"]
Grouping & Aggregation
df.groupby("region")["sales"].sum()
df.groupby("category").agg(
    total_sales=("sales", "sum"),
    avg_profit=("profit", "mean"),
)
This is where Pandas replaces many Excel pivot tables.
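To make the pivot-table comparison concrete, the sketch below runs the named aggregation from above on made-up data and then builds a two-dimensional summary with pivot_table, which is the closest match to an Excel pivot.

```python
import pandas as pd

df = pd.DataFrame({
    "category": ["A", "A", "B", "B"],
    "region": ["Europe", "Asia", "Europe", "Asia"],
    "sales": [100, 200, 300, 400],
    "profit": [10, 30, 20, 40],
})

# One row per category, one column per named aggregation
summary = df.groupby("category").agg(
    total_sales=("sales", "sum"),
    avg_profit=("profit", "mean"),
)

# Categories as rows, regions as columns, summed sales in the cells
pivot = df.pivot_table(index="category", columns="region", values="sales", aggfunc="sum")
```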
Value Counts
df["status"].value_counts()
Perfect for quick distributions.
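For example, on an invented status column, value_counts returns counts sorted from most to least frequent, and normalize=True turns them into proportions.

```python
import pandas as pd

status = pd.Series(["open", "closed", "open", "open", "pending"], name="status")

counts = status.value_counts()                # absolute counts, most frequent first
shares = status.value_counts(normalize=True)  # proportions that sum to 1

print(counts)
print(shares)
```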
Removing Duplicates
df.drop_duplicates()
Changing Data Types
df["date"] = pd.to_datetime(df["date"])
df["price"] = df["price"].astype(float)
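Converting types matters because text columns cannot be summed or broken into date parts. The sketch below uses made-up values; the pd.to_numeric call at the end is an addition to the cheat sheet, shown as one way to handle text that cannot be parsed as a number.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["2024-01-15", "2024-02-20"],  # dates stored as text
    "price": ["19.99", "5"],               # numbers stored as text
})

df["date"] = pd.to_datetime(df["date"])
df["price"] = df["price"].astype(float)

# After conversion, date parts and arithmetic are available
df["month"] = df["date"].dt.month
total = df["price"].sum()

# astype(float) raises on unparseable text; to_numeric can turn it into NaN instead
messy = pd.Series(["19.99", "n/a"])
cleaned = pd.to_numeric(messy, errors="coerce")
```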
Merging DataFrames
pd.merge(df1, df2, on="id", how="inner")
pd.merge(df1, df2, on="id", how="left")
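The how argument decides which rows survive the merge. A small sketch with two invented tables shows the difference between inner and left.

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
df2 = pd.DataFrame({"id": [1, 2, 4], "sales": [100, 200, 400]})

# inner: only ids present in both tables (1 and 2)
inner = pd.merge(df1, df2, on="id", how="inner")

# left: every row of df1; id 3 has no match, so its sales value is NaN
left = pd.merge(df1, df2, on="id", how="left")
```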
Applying Functions
df["profit_margin"] = df["profit"] / df["revenue"]
df["sales_level"] = df["sales"].apply(lambda x: "High" if x > 1000 else "Low")
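For simple if/else labels like sales_level, a vectorized alternative to apply is NumPy's where function. This is a sketch of one common option on made-up data, not the only way to do it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"sales": [1500, 800, 1000]})

# Row-by-row with apply: flexible, but calls the lambda once per value
df["level_apply"] = df["sales"].apply(lambda x: "High" if x > 1000 else "Low")

# Vectorized with np.where: evaluates the condition on the whole column at once
df["level_where"] = np.where(df["sales"] > 1000, "High", "Low")
```

Both columns hold the same labels; the vectorized version is usually faster on large tables.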
Basic String Operations
df["name"].str.lower()
df["email"].str.contains("@")
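String methods can be chained, and they pass missing values through unchanged. The sketch below uses invented names and emails; the na=False argument is an addition here, used so that missing emails count as "no match" rather than staying missing.

```python
import pandas as pd

names = pd.Series(["  Alice ", "BOB"])
emails = pd.Series(["a@x.com", "invalid", None])

# Chain .str calls to strip whitespace and normalize case
clean_names = names.str.strip().str.lower()

# na=False treats missing values as False, so the result is a clean boolean mask
has_at = emails.str.contains("@", na=False)
```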
Exporting Data
df.to_csv("output.csv", index=False)
df.to_excel("output.xlsx", index=False)
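To see what index=False changes, here is a small sketch that relies on to_csv returning the CSV text when no path is given. The data is made up.

```python
import pandas as pd

df = pd.DataFrame({"region": ["Europe", "Asia"], "sales": [1200, 450]})

# With no path argument, to_csv returns the CSV text instead of writing a file
with_index = df.to_csv()
without_index = df.to_csv(index=False)

header_with = with_index.splitlines()[0]        # starts with an empty index column
header_without = without_index.splitlines()[0]  # only the real column names
print(header_with)
print(header_without)
```

Without index=False, the row index is written as an extra unnamed first column, which then shows up as a stray column when the file is read back.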
Common Pandas Mistakes Beginners Make
- Forgetting to assign results back to the DataFrame
- Mixing loc and iloc
- Ignoring data types
- Overusing loops instead of vectorized operations
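Two of these mistakes are easy to show in code. The sketch below, on invented data, demonstrates a dropna call whose result is thrown away, and then a row loop next to its vectorized equivalent.

```python
import pandas as pd

df = pd.DataFrame({"price": [10.0, 20.0, None], "quantity": [1, 2, 3]})

# Mistake: dropna returns a new DataFrame, so this line changes nothing
df.dropna()
rows_before = len(df)

# Fix: assign the result back
df = df.dropna()
rows_after = len(df)

# Slow pattern: a Python loop over rows
loop_revenue = [row["price"] * row["quantity"] for _, row in df.iterrows()]

# Vectorized: one expression over whole columns
vectorized_revenue = df["price"] * df["quantity"]
```

Both approaches give the same numbers; the vectorized form is shorter and scales much better as the row count grows.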
How Analysts Use Pandas in Real Jobs
Pandas is commonly used for:
- data cleaning
- exploratory analysis
- feature creation
- report preparation
- feeding dashboards and ML models
You don’t need to memorize every Pandas function.
If you understand:
- loading data
- filtering
- grouping
- cleaning
you’re already job-ready for most entry-level data analysis tasks.
Bookmark this cheat sheet. You’ll come back to it often.