Getting a data role at Amazon or Google is one of the most competitive processes in the industry. The technical bar is high, the interviews are structured, and SQL is not just one of many topics they assess. It is a primary filter. Both companies use SQL questions to determine whether a candidate can sit in front of a real business problem, translate it into a correct query, and handle the edge cases that would break a naive solution on production data.
The good news is that both companies follow recognizable patterns. Amazon focuses heavily on multi-table queries, aggregations, and real-world scenarios tied to e-commerce and marketplace data. Google leans toward product analytics problems, user engagement metrics, and window functions applied to behavioral data at scale. Understanding what each company tests, and practicing the specific question types they use, is the most efficient way to prepare.
This guide covers the questions that actually come up, with the SQL to answer them and an explanation of what the interviewer is really looking for behind each one.
How Amazon and Google Structure SQL Interviews
Amazon’s SQL interview process typically involves a phone screen with basic to intermediate questions, followed by one or more technical rounds that include live coding in a shared editor. For data analyst and business intelligence engineer roles, expect two to four SQL problems per round. Amazon is particularly known for multi-part questions where you answer a simple version first and the interviewer adds constraints and edge cases in subsequent steps to test how your thinking evolves.
Google’s SQL rounds for data analyst and data scientist roles tend to focus on product analytics scenarios. You will be given a table schema representing a product like YouTube, Google Search, or Google Ads and asked to compute metrics that a product manager or growth team would actually need. The questions sound simple until you start writing the query and realize there are three hidden layers of complexity in the data.
Both companies care more about your thought process than your syntax. Explain what you are doing as you write. Name the edge cases before the interviewer has to point them out. Ask clarifying questions about the data before assuming anything about its structure.
Amazon SQL Interview Questions
Amazon’s questions are almost always grounded in the context of orders, products, customers, reviews, or marketplace sellers because these are the domains their analysts work in every day. Knowing that context before you walk in helps you interpret the business logic behind each question faster.
Question 1: Average product review rating by month
This question appears in Amazon data analyst interviews regularly. You are given a reviews table with columns product_id, stars, and review_date. Write a query that returns the month as a number, the product_id, and the average star rating rounded to two decimal places, sorted by month then by product_id.
The key thing Amazon is testing here is whether you know how to extract a month from a date, apply rounding correctly, and sort by multiple columns in the right order.
SELECT
EXTRACT(MONTH FROM review_date) AS review_month,
product_id,
ROUND(AVG(stars), 2) AS avg_rating
FROM reviews
GROUP BY EXTRACT(MONTH FROM review_date), product_id
ORDER BY review_month, product_id;
What the interviewer watches for: Whether you use EXTRACT or MONTH() correctly for the database in use, whether you group by the extracted month rather than the raw date, and whether your ORDER BY matches the specified sort order exactly.
Question 2: Top two highest-grossing products per category
This is one of the most consistently reported Amazon SQL questions on platforms like DataLemur and StrataScratch. You have a product_spend table with category, product, spend, and transaction_date. Return the top two products by total spend for each category.
This is a classic top-N-per-group problem. The key is using RANK() or DENSE_RANK() inside a window function partitioned by category and then filtering in an outer query.
WITH ranked_products AS (
SELECT
category,
product,
SUM(spend) AS total_spend,
RANK() OVER (
PARTITION BY category
ORDER BY SUM(spend) DESC
) AS spend_rank
FROM product_spend
GROUP BY category, product
)
SELECT category, product, total_spend
FROM ranked_products
WHERE spend_rank <= 2
ORDER BY category, spend_rank;
What the interviewer watches for: Whether you use RANK() versus DENSE_RANK() and can explain the difference. Whether you GROUP BY before or inside the window function. Whether your WHERE clause filters correctly on the alias. A common mistake is trying to filter on spend_rank directly in the same SELECT without a subquery or CTE, which fails because window functions are evaluated after WHERE.
Question 3: Seven-day rolling revenue per marketplace
Amazon asks candidates to compute a 7-day rolling sum of daily revenue per marketplace. The table is daily_revenue with columns marketplace_id, date, and revenue. Return each date, revenue, and the rolling 7-day revenue for each marketplace.
SELECT
marketplace_id,
dt,
revenue,
SUM(revenue) OVER (
PARTITION BY marketplace_id
ORDER BY dt
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS rolling_7day_revenue
FROM daily_revenue
ORDER BY marketplace_id, dt;
What the interviewer watches for: The correct ROWS BETWEEN 6 PRECEDING AND CURRENT ROW syntax. Using PARTITION BY marketplace_id so the rolling window restarts for each marketplace rather than running across all marketplaces combined. Understanding that ROWS BETWEEN is more precise than RANGE BETWEEN for date-based rolling windows.
Question 4: Customers who bought product A but not product B
This is a common filtering problem that Amazon uses to test whether candidates know multiple approaches and can explain the tradeoffs between them. You have an orders table with customer_id and product_name. Find all customers who purchased product A but have never purchased product B.
-- Approach 1: Using NOT EXISTS (most readable)
SELECT DISTINCT customer_id
FROM orders o1
WHERE product_name = 'Product A'
AND NOT EXISTS (
SELECT 1
FROM orders o2
WHERE o2.customer_id = o1.customer_id
AND o2.product_name = 'Product B'
);
-- Approach 2: Using LEFT JOIN with NULL filter
SELECT DISTINCT a.customer_id
FROM orders a
LEFT JOIN orders b
ON a.customer_id = b.customer_id
AND b.product_name = 'Product B'
WHERE a.product_name = 'Product A'
AND b.customer_id IS NULL;
What the interviewer watches for: Whether you know more than one approach and when each is appropriate. NOT EXISTS is generally more readable. LEFT JOIN with a NULL filter performs better on some databases. Knowing both and being able to articulate the difference is the senior-level answer.
Question 5: Find employees earning more than their department average
This is a classic subquery or window function problem that tests whether you can compute group-level aggregates and compare them to individual row values.
-- Approach 1: Correlated subquery
SELECT employee_id, name, department, salary
FROM employees e
WHERE salary > (
SELECT AVG(salary)
FROM employees
WHERE department = e.department
);
-- Approach 2: Window function (more efficient at scale)
SELECT employee_id, name, department, salary
FROM (
SELECT *,
AVG(salary) OVER (PARTITION BY department) AS dept_avg
FROM employees
) ranked
WHERE salary > dept_avg;
What the interviewer watches for: The ability to write both approaches. The window function approach avoids scanning the table twice for every row, which matters at Amazon’s scale. Mentioning performance when you offer the second approach signals engineering maturity, not just SQL literacy.
Google SQL Interview Questions
Google’s questions are framed around product scenarios. The data structures represent real products such as a social platform, a search engine, or an advertising system, and you are expected to produce metrics that a product or growth team would actually use to make decisions.
Question 6: Daily active users by country
This is one of the most reported Google SQL questions. You have an events table with user_id, country, and event_date. Write a query to find the number of daily active users per country for the last 30 days.
SELECT
event_date,
country,
COUNT(DISTINCT user_id) AS daily_active_users
FROM events
WHERE event_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY event_date, country
ORDER BY event_date, country;
What the interviewer watches for: Using COUNT(DISTINCT user_id) and not COUNT(). The difference is that COUNT() counts every event row while COUNT(DISTINCT user_id) counts each user once per day per country, which is what daily active users actually means. Getting this wrong on a 10-billion-row events table produces completely wrong numbers. Google interviewers almost always follow up by asking what you would do if the query was running slowly on a table that large.
Question 7: Week-over-week user retention
Google product analytics interviews frequently test retention logic. You have a user_activity table with user_id and week. Write a query to find the percentage of users who were active in a given week and also active the following week.
WITH weekly_users AS (
SELECT DISTINCT user_id, week
FROM user_activity
),
retained AS (
SELECT
w1.week AS base_week,
COUNT(DISTINCT w1.user_id) AS active_users,
COUNT(DISTINCT w2.user_id) AS retained_users
FROM weekly_users w1
LEFT JOIN weekly_users w2
ON w1.user_id = w2.user_id
AND w2.week = w1.week + 1
GROUP BY w1.week
)
SELECT
base_week,
active_users,
retained_users,
ROUND(retained_users * 100.0 / active_users, 2) AS retention_rate_pct
FROM retained
ORDER BY base_week;
What the interviewer watches for: Using a LEFT JOIN so weeks with zero retention still appear in the output rather than being silently dropped. Using DISTINCT on user_id before the join so users with multiple events in a week are not counted multiple times. Handling division by zero if active_users can ever be zero.
Question 8: Find the third highest salary without using LIMIT or TOP
Google sometimes bans LIMIT to force you to demonstrate window function knowledge. This is also asked at Amazon.
SELECT DISTINCT salary
FROM (
SELECT
salary,
DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees
) ranked
WHERE salary_rank = 3;
What the interviewer watches for: Using DENSE_RANK() rather than RANK() so ties do not skip ranks and cause the third-highest to disappear from the result. Understanding why LIMIT 1 OFFSET 2 might give a wrong answer when there are ties.
Question 9: Consecutive logins for three or more days
This is a harder problem that tests gaps and islands logic, a pattern that comes up in user engagement analysis at both companies. Given a logins table with user_id and login_date, find all users who logged in for at least three consecutive days.
WITH numbered AS (
SELECT
user_id,
login_date,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_date) AS rn
FROM (
SELECT DISTINCT user_id, login_date FROM logins
) deduplicated
),
grouped AS (
SELECT
user_id,
login_date,
login_date - INTERVAL '1 day' * rn AS grp
FROM numbered
)
SELECT DISTINCT user_id
FROM grouped
GROUP BY user_id, grp
HAVING COUNT(*) >= 3;
What the interviewer watches for: The gaps and islands technique using date minus row number to create a group identifier for consecutive dates. Deduplicating login_date before the ROW_NUMBER so a user who logged in twice in one day does not break the consecutive logic. This is an advanced question and interviewers know it. Narrating your approach before you write it is just as important as the final query.
Question 10: Sessions longer than the user’s average session length
Google asks questions like this to test whether candidates can compute user-level aggregates and compare individual values against them in a single query. Given a sessions table with user_id, session_id, and duration_minutes, return all sessions that are longer than that user’s own average session length.
SELECT
session_id,
user_id,
duration_minutes
FROM (
SELECT
session_id,
user_id,
duration_minutes,
AVG(duration_minutes) OVER (PARTITION BY user_id) AS user_avg_duration
FROM sessions
) user_sessions
WHERE duration_minutes > user_avg_duration;
What the interviewer watches for: Using AVG as a window function with PARTITION BY rather than a correlated subquery. The window function approach is cleaner and more performant. Interviewers at Google specifically look for awareness of query performance, not just correctness.
How to Approach SQL Questions in the Interview Room
Both Amazon and Google interviewers evaluate your process as much as your output. The candidates who succeed follow a consistent approach before writing a single line of SQL.
State the grain of the output first. Before writing anything, say out loud what one row in your result represents. One row per customer per month. One row per product per category. Locking in the grain before you write prevents the most common mistakes in aggregation and join logic.
Ask about edge cases explicitly. Is the join key guaranteed to be unique? Can the date column contain NULLs? What should happen in the case of a tie? Asking these questions before writing shows the interviewer you think about production data, not just tutorial data.
Write in steps using CTEs. Rather than writing one long nested query, use CTEs to break the problem into named steps. Each CTE is one logical operation with a clear name. This makes your code readable, debuggable, and easy to narrate as you write it.
Mention the approach you did not use. If you write a window function solution, briefly note that a correlated subquery would also work but would be slower at scale. Demonstrating awareness of alternatives and their tradeoffs is what separates a mid-level answer from a senior-level one.
SQL Concepts Amazon and Google Test Most
| Concept | Frequency | Example Question Type |
|---|---|---|
| Window functions | Very high | Rolling averages, rankings, retention |
| CTEs | Very high | Multi-step logic, readability |
| JOINs including self-joins | Very high | Retention, consecutive events |
| Aggregation with GROUP BY | Very high | Revenue by category, user counts |
| Filtering with HAVING | High | Departments with more than N employees |
| Date functions and intervals | High | Monthly grouping, 30-day windows |
| Subqueries and NOT EXISTS | High | Exclusion logic, above-average queries |
| DISTINCT vs COUNT(*) | High | Active users, unique purchases |
| RANK vs DENSE_RANK vs ROW_NUMBER | High | Top-N per group problems |
| NULL handling | Medium | LEFT JOIN filtering, COALESCE |
Common Mistakes to Avoid
Filtering on a window function alias in the same query level. Window functions are evaluated after WHERE, so WHERE spend_rank <= 2 fails if spend_rank is defined in the same SELECT. Always wrap in a subquery or CTE and filter in the outer query.
Using COUNT() when the question asks for unique users or unique entities. COUNT() counts rows. COUNT(DISTINCT user_id) counts users. These produce different numbers in any table where one user generates multiple rows. Read the question carefully before deciding which to use.
Ignoring ties in ranking questions. RANK() skips numbers after a tie. DENSE_RANK() does not. ROW_NUMBER() assigns unique numbers regardless of ties. Choosing the wrong one gives the wrong answer. Know the difference before the interview so you do not have to think about it under pressure.
Writing queries that would break on NULL join keys. If the column you are joining on can contain NULLs in either table, rows with NULL keys will never match and will silently disappear from the result. Know your data before you join on it.
Not mentioning performance on large table questions. Google and Amazon operate at scales where a correct but inefficient query is a real problem. When you finish a correct answer, add one sentence about what would happen to that query on a billion-row table and what you would do about it. That sentence often makes the difference between a hire and a no-hire at the senior level.
SQL interviews at Amazon and Google are not designed to trick you. They are designed to simulate the actual work their analysts do every day. The questions are grounded in business problems, the data structures reflect real product scenarios, and the edge cases they test are the same ones that break production queries. Study the patterns, practice writing queries from scratch without looking anything up, and explain your thinking out loud as you write. That combination is what gets you through.
FAQs
What SQL topics does Amazon test most in interviews?
Amazon focuses heavily on JOINs across multiple tables, aggregations with GROUP BY, window functions for ranking and rolling calculations, and filtering logic using subqueries and NOT EXISTS. Nearly 70% of Amazon SQL interview questions involve combining data across multiple tables. Time-based analysis using date functions and interval arithmetic also appears consistently across data analyst and business intelligence engineer roles.
What SQL topics does Google test most in interviews?
Google’s SQL questions are framed around product analytics scenarios involving user behavior, engagement, and retention. Window functions are tested in almost every Google SQL round. Expect questions on daily and weekly active users, retention rate calculations, session analysis, and top-N-per-group problems. Google also tests query performance awareness, asking candidates what they would do to optimize a correct query running on a very large table.
How hard are Amazon and Google SQL interviews compared to other companies?
Both companies sit at the higher end of difficulty for data analyst SQL interviews. Amazon’s difficulty comes from multi-part questions with evolving constraints and edge cases that expose gaps in JOIN and aggregation logic. Google’s difficulty comes from product analytics framing that requires both SQL accuracy and business intuition about what the metric means and why it matters. Compared to typical SQL interviews at mid-size companies, both are harder but not unreachable with consistent practice on platforms like DataLemur, StrataScratch, and LeetCode.
How should I practice for an Amazon or Google SQL interview?
Practice writing queries from scratch in a blank editor rather than choosing from multiple-choice options. Use DataLemur and StrataScratch for company-specific practice questions with verified solutions. For each question, narrate your approach out loud before writing and explicitly state the grain of the output, the join conditions you are assuming, and at least one edge case you are accounting for. Review your solutions against the official answers and understand why any differences exist rather than just checking whether your output matches.
Do Amazon and Google ask theoretical SQL questions or only coding?
Both companies prioritize live coding over theory. You will spend most of the SQL round writing queries, not defining concepts. That said, Amazon interviewers sometimes ask about the difference between RANK, DENSE_RANK, and ROW_NUMBER, or when to use a subquery versus a CTE, as follow-up questions after a coding problem. Google occasionally asks about query optimization concepts at the senior level. Know the concepts but spend the majority of your preparation time writing actual queries.