Have you ever pulled data from a database and found a column that stores multiple values crammed into a single cell — product tags separated by commas, customer interests joined with semicolons, team members listed in one field with pipe delimiters, or category paths written as slash separated strings?
In most cases, you might try to handle this in the application layer — fetching the raw data, splitting the strings in Python or JavaScript, and then processing each value separately. Or you might write a messy sequence of string functions trying to extract each position manually, which breaks the moment the number of values in any cell changes.
This is brittle, hard to maintain, and does not scale when you need to join these values against other tables, count occurrences, filter by individual tags, or aggregate across the split values.
SQL string splitting solves this directly in the query layer. You tell SQL which column contains the delimited values, what delimiter separates them, and SQL returns one row per value, expanding your original row into as many rows as there are values in the string. Every downstream operation i.e. filtering, joining, grouping, counting then works on clean individual values rather than packed strings.
In this guide we will break down SQL string splitting completely: what it is, which functions each major database provides, how to write the queries step by step, real world examples across different scenarios, and when to use SQL splitting versus handling it in your application code.
What Is String Splitting in SQL?
String splitting is the process of taking a single string that contains multiple values joined by a delimiter and transforming it into multiple separate rows i.e. one row per value so that each value can be treated as an independent data point.
It answers the question: “This cell contains five values packed together — how do I turn that into five rows so I can work with each value individually?”
Simple Analogy
Imagine you receive a paper form where someone wrote all their food allergies in one box separated by commas: “peanuts, dairy, gluten, shellfish.” To enter each allergy into a medical database correctly, you need to split that one written entry into four separate database records. SQL string splitting does exactly this automatically, for every row in your table, regardless of how many values are packed into each cell.
The Three Components of a String Split Operation
Every string split query in SQL requires three things:
The input string is the column or value containing the packed delimited data that needs to be expanded into rows. The delimiter is the character or sequence of characters separating the individual values within the string — a comma, semicolon, pipe, space, or any other separator. The row expansion mechanism is the SQL function or technique that takes the single input and produces multiple output rows — one per extracted value.
Where to Find String Splitting in SQL
Different database systems provide different tools for splitting strings into rows. Here is what each major platform offers:
| Database | Primary Function | Notes |
|---|---|---|
| SQL Server 2016+ | STRING_SPLIT() | Built in, simple syntax |
| PostgreSQL | STRING_TO_TABLE() or regexp_split_to_table() | Multiple options |
| MySQL 8.0+ | No native function | Requires JSON workaround or recursive CTE |
| BigQuery | SPLIT() with UNNEST() | Array based approach |
| Snowflake | SPLIT_TO_TABLE() | Table function syntax |
| Oracle 12c+ | REGEXP_SUBSTR with CONNECT BY | Recursive generation approach |
| SQLite | No native function | Requires recursive CTE workaround |
The syntax differs across platforms but the concept is identical in all of them. It transform one row with a packed string into multiple rows with individual values.
Step by Step: Your First String Split Query
Example 1: Split a Comma Separated Product Tags Column
Setup — a products table with packed tags:
| product_id | product_name | tags |
|---|---|---|
| 1 | Running Shoes | sport,outdoor,fitness,sale |
| 2 | Coffee Maker | kitchen,appliance,morning |
| 3 | Notebook | stationery,office,writing,school,sale |
You want one row per tag so you can count how many products have each tag, filter by a specific tag, or join the tags against a taxonomy table.
Step 1: Write the base query for SQL Server
sql
SELECT
product_id,
product_name,
tags
FROM products;
Start by confirming your source data looks correct before adding the split logic.
Step 2: Apply STRING_SPLIT
sql
SELECT
p.product_id,
p.product_name,
s.value AS tag
FROM products p
CROSS APPLY STRING_SPLIT(p.tags, ',') s;
Step 3: Understand each component
STRING_SPLIT(p.tags, ',') takes the tags column and the comma delimiter and returns a table of individual values. CROSS APPLY joins each row from products with every row returned by STRING_SPLIT for that row — expanding one product row into as many rows as it has tags. s.value is the column name STRING_SPLIT uses for the extracted values.
Step 4: Check the result
| product_id | product_name | tag |
|---|---|---|
| 1 | Running Shoes | sport |
| 1 | Running Shoes | outdoor |
| 1 | Running Shoes | fitness |
| 1 | Running Shoes | sale |
| 2 | Coffee Maker | kitchen |
| 2 | Coffee Maker | appliance |
| 2 | Coffee Maker | morning |
| 3 | Notebook | stationery |
| 3 | Notebook | office |
| 3 | Notebook | writing |
| 3 | Notebook | school |
| 3 | Notebook | sale |
Three input rows have become twelve output rows i.e. one per individual tag value. Every downstream operation now works on clean single values.
Step 5: Trim whitespace from split values
Delimited strings often contain spaces after the delimiter — “sport, outdoor, fitness” instead of “sport,outdoor,fitness”. Always trim the split values:
sql
SELECT
p.product_id,
p.product_name,
TRIM(s.value) AS tag
FROM products p
CROSS APPLY STRING_SPLIT(p.tags, ',') s;
Best practice: Always wrap split values in TRIM() as the very first operation. Spaces embedded in string values cause silent matching failures — “sale” and ” sale” look identical visually but will not match in WHERE clauses or JOIN conditions.
How SQL String Splitting Works Internally
When SQL processes a string split operation, it uses a set based expansion mechanism rather than iterating row by row the way application code would. The database engine reads the input string, scans for the delimiter character, extracts each substring between delimiters, and generates one output row per substring in a single operation optimised by the query engine.
The CROSS APPLY (SQL Server) or CROSS JOIN LATERAL (PostgreSQL) operator is what allows the split function to be called once per input row and return a variable number of output rows. This is fundamentally different from a scalar function which always returns exactly one value per input row. Table valued functions like STRING_SPLIT can return zero, one, or many rows per input, and CROSS APPLY handles that variable output correctly.
For databases without a native split function, recursive CTEs achieve the same result by repeatedly extracting the substring before the first delimiter, removing it from the string, and recursing until no delimiter remains building one row per iteration.
Real World Examples
Example 2: PostgreSQL — Split Emails by Semicolon
Goal: A contact table stores multiple email addresses per contact separated by semicolons. Split them into individual rows for an email validation check.
sql
SELECT
contact_id,
contact_name,
TRIM(email_address) AS email_address
FROM contacts,
LATERAL regexp_split_to_table(emails, ';') AS email_address
WHERE emails IS NOT NULL
AND emails <> '';
PostgreSQL’s regexp_split_to_table splits on a regular expression pattern rather than a fixed character, making it flexible enough to handle variable whitespace around delimiters:
sql
-- Split on semicolon with optional surrounding whitespace
SELECT
contact_id,
TRIM(email_address) AS email_address
FROM contacts,
LATERAL regexp_split_to_table(emails, '\s*;\s*') AS email_address;
The pattern \s*;\s* matches a semicolon with zero or more whitespace characters on either side — handling “alice@co.com;bob@co.com“, “alice@co.com ; bob@co.com“, and “alice@co.com; bob@co.com” all with the same expression.
Example 3: BigQuery — Unnest a Split Array
Goal: A BigQuery table stores product categories as comma separated strings. Split them and count how many products belong to each category.
sql
SELECT
category,
COUNT(DISTINCT product_id) AS product_count
FROM (
SELECT
product_id,
TRIM(category) AS category
FROM products,
UNNEST(SPLIT(categories, ',')) AS category
)
GROUP BY category
ORDER BY product_count DESC;
BigQuery’s approach uses two steps. SPLIT(categories, ',') converts the delimited string into an array of strings. UNNEST() expands that array into individual rows. Combining them with a comma join (implicit CROSS JOIN) produces one row per array element per product row.
Result:
| category | product_count |
|---|---|
| sale | 47 |
| outdoor | 31 |
| kitchen | 28 |
| fitness | 24 |
| office | 19 |
This query would be impossible to write cleanly against the original packed category strings. After splitting, it becomes a straightforward GROUP BY COUNT.
Example 4: Snowflake — Split Team Members and Join to Employee Table
Goal: A projects table stores assigned team members as pipe delimited employee IDs. Split them and join to the employees table to get full names and departments.
sql
SELECT
p.project_id,
p.project_name,
TRIM(s.value) AS employee_id,
e.full_name,
e.department
FROM projects p,
TABLE(SPLIT_TO_TABLE(p.assigned_employees, '|')) s
JOIN employees e
ON TRIM(s.value) = e.employee_id
WHERE p.status = 'active'
ORDER BY p.project_id, e.full_name;
Snowflake’s SPLIT_TO_TABLE is a table function that returns three columns: SEQ (sequence number of the original row), INDEX (position of the value within the split), and VALUE (the extracted string). Using VALUE gives the individual employee ID which can then be joined directly to the employees table.
Result:
| project_id | project_name | employee_id | full_name | department |
|---|---|---|---|---|
| PRJ001 | Platform Rebuild | E042 | Alice Chen | Engineering |
| PRJ001 | Platform Rebuild | E087 | Marcus Webb | Engineering |
| PRJ001 | Platform Rebuild | E103 | Sara Patel | Product |
| PRJ002 | Brand Refresh | E019 | James Okafor | Marketing |
| PRJ002 | Brand Refresh | E055 | Lisa Torres | Design |
The join brings full employee details into the result something that would require application layer processing if the employee IDs stayed packed in a single string.
Example 5: MySQL — Recursive CTE Workaround
Goal: MySQL 8.0 has no native string split function. Use a recursive CTE to split comma separated values into rows.
sql
WITH RECURSIVE split_cte AS (
SELECT
product_id,
product_name,
CONCAT(tags, ',') AS remaining,
'' AS current_tag,
0 AS iteration
FROM products
UNION ALL
SELECT
product_id,
product_name,
SUBSTRING(remaining, LOCATE(',', remaining) + 1),
SUBSTRING(remaining, 1, LOCATE(',', remaining) - 1),
iteration + 1
FROM split_cte
WHERE remaining <> ''
)
SELECT
product_id,
product_name,
TRIM(current_tag) AS tag
FROM split_cte
WHERE current_tag <> ''
ORDER BY product_id, iteration;
Breaking down how this recursive CTE works:
The anchor member (first SELECT) initialises each row by appending a trailing comma to the tags string so the recursion logic works uniformly for the last value. It sets remaining to the full string and current_tag to empty.
The recursive member (second SELECT after UNION ALL) extracts the substring before the first comma as current_tag, then removes that substring plus the comma from remaining. It continues recursing until remaining is empty.
The final SELECT filters out the empty current_tag rows from the initialisation step and returns clean individual tag values.
Note on MySQL recursion depth: MySQL defaults to a maximum recursion depth of 1000 controlled by the
cte_max_recursion_depthsystem variable. If any cell contains more than 1000 delimited values, the query will error. Increase the limit if needed:SET cte_max_recursion_depth = 5000;
Example 6: Preserving Position With Ordinal Values
Goal: Split a path string like “Electronics/Computers/Laptops” and preserve the position of each segment so you know which level of the hierarchy each value represents.
SQL Server with ordinal support (SQL Server 2022+):
sql
SELECT
product_id,
s.ordinal AS hierarchy_level,
s.value AS category_name
FROM products p
CROSS APPLY STRING_SPLIT(p.category_path, '/', 1) s
ORDER BY product_id, s.ordinal;
The third argument 1 in STRING_SPLIT(column, delimiter, 1) enables ordinal output — SQL Server 2022 added this capability. The result includes an ordinal column with the one based position of each value in the original string.
Result:
| product_id | hierarchy_level | category_name |
|---|---|---|
| 1 | 1 | Electronics |
| 1 | 2 | Computers |
| 1 | 3 | Laptops |
| 2 | 1 | Home |
| 2 | 2 | Kitchen |
| 2 | 3 | Appliances |
| 2 | 4 | Coffee Makers |
With ordinal values you can pivot the results to reconstruct the hierarchy levels as separate columns, filter to only the leaf category (the highest ordinal per product), or validate that all category paths have the expected depth.
Splitting Strings vs Storing Properly Normalised Data
Ideally, data that represents multiple values — tags, categories, team members should be stored in a separate table with one value per row from the start, not packed into a delimited string. Splitting strings in SQL is a corrective technique for data that was not stored in a normalised structure. Understanding when you are correcting a design problem versus solving a legitimate use case helps you make better decisions about when to use string splitting and when to push for a schema change.
When string splitting is the right SQL solution:
You are working with a legacy system you cannot modify. The data arrives from an external source in delimited format. The packed string represents genuinely flexible or variable length data that is impractical to normalise. You need a quick analytical query without a schema migration.
When you should advocate for a normalised schema instead:
You find yourself splitting the same string column in every query that touches it. You need to filter, join, or aggregate on the individual values regularly. The number of values per cell varies wildly making the data difficult to reason about. New values are being added to the packed string by application code rather than inserted as new rows.
| Consideration | String Splitting | Normalised Schema |
|---|---|---|
| Query complexity | Higher — split logic in every query | Lower — straightforward joins |
| Storage efficiency | Higher — repeated delimiters and context | Lower — clean individual values |
| Index performance | Poor — cannot index within a string | Good — index on individual value column |
| Flexibility | Low — delimiter must be consistent | High — any number of values per entity |
| Schema change required | No — works on existing structure | Yes — requires migration |
| Best for | Legacy data, external feeds, one off analysis | Production systems, repeated analytical use |
Common Limitations
Cannot Index Within Delimited Strings
SQL indexes work on complete column values or prefixes not on values embedded within a larger string. A query filtering by a specific tag using LIKE ‘%sale%’ cannot use an index and will perform a full table scan regardless of table size. After splitting, the individual tag values can be indexed normally. If you query a split column frequently, consider materialising the split results into a separate table with proper indexing.
Inconsistent Delimiters Break the Split
If the delimiter character appears in the actual values — a comma inside an address field, a pipe inside a product description, the split will incorrectly fragment those values. String splitting assumes the delimiter is reserved exclusively for separation and never appears in the data itself. Always audit your data for delimiter collisions before designing a split based query.
Empty Strings and NULL Values Need Explicit Handling
Most split functions return an empty string row when the input contains consecutive delimiters (like “a,,b” producing “a”, “”, “b”) and return no rows at all when the input is NULL. Always add WHERE column IS NOT NULL AND column <> '' before splitting, and filter out empty value rows after splitting.
Performance Degrades on Very Large Tables
String splitting requires the database to process every row’s string value at query time. On tables with millions of rows and long delimited strings this becomes computationally expensive and difficult to optimise. Consider materialising split results into a staging table with scheduled refreshes for performance critical use cases.
Row Count Explosion Can Surprise You
If a table has one million rows and each row’s delimited string contains an average of ten values, the split query returns ten million rows. Joins and aggregations downstream of the split operate on this expanded row count. Always check cardinality implications before running split queries in production especially in stored procedures or views that other queries build on.
Common Mistakes to Avoid
Forgetting to TRIM split values before joining. The most common silent bug in string splitting queries is joining split values against a reference table and getting zero matches because the split value has a leading or trailing space. Always wrap split values in TRIM() immediately and do the same on the column you are joining against if the reference data might also have whitespace issues.
Using LIKE to filter instead of splitting first. Writing WHERE tags LIKE '%sale%' instead of splitting and filtering on the clean value will match “wholesale” and “resale” along with “sale”. Always split first then filter on exact value matches for precision and index eligibility.
Not handling NULL inputs. Most split functions passed a NULL input return no rows rather than an error which means NULL rows silently disappear from your result set. If NULLs in the split column represent meaningful data (no tags assigned, no categories set), handle them explicitly with COALESCE before splitting or document that they are intentionally excluded.
Splitting in a subquery and then joining without aliasing carefully. When the split produces a column also named “value” or “tag” and your outer query also references columns with similar names, ambiguous column references cause errors or silent incorrect results. Always alias split output columns clearly and immediately.
Assuming ordinal position is stable without enabling ordinal output. SQL Server’s STRING_SPLIT without the ordinal argument does not guarantee the order of output rows matches the order of values in the original string. If position matters — first tag, second category, third team member — always use a function or technique that explicitly tracks position. In SQL Server 2022+ pass the third argument 1. In other databases use SPLIT_TO_TABLE’s INDEX column (Snowflake) or generate row numbers using ROW_NUMBER() partitioned per original row.
Building the split logic without testing on edge cases first. Always test your split query against strings with one value only, strings with the maximum expected number of values, strings with consecutive delimiters, strings with leading or trailing delimiters, and NULL values before deploying. These edge cases expose assumptions in your split logic that typical sample data will not reveal.
String Split Cheat Sheet
| Database | Function | Example Syntax |
|---|---|---|
| SQL Server 2016+ | STRING_SPLIT | CROSS APPLY STRING_SPLIT(col, ',') |
| SQL Server 2022+ | STRING_SPLIT with ordinal | CROSS APPLY STRING_SPLIT(col, ',', 1) |
| PostgreSQL | STRING_TO_TABLE | CROSS JOIN LATERAL STRING_TO_TABLE(col, ',') |
| PostgreSQL | regexp_split_to_table | LATERAL regexp_split_to_table(col, '\s*,\s*') |
| BigQuery | SPLIT + UNNEST | UNNEST(SPLIT(col, ',')) |
| Snowflake | SPLIT_TO_TABLE | TABLE(SPLIT_TO_TABLE(col, ',')) |
| MySQL 8+ | Recursive CTE | Recursive substring extraction |
| Oracle | REGEXP_SUBSTR + CONNECT BY | Hierarchical query approach |
| Always | Trim after split | TRIM(split_value) on every result |
| Always | Handle NULL | WHERE col IS NOT NULL AND col <> '' |
SQL string splitting transforms one of the messiest data quality problems — multiple values crammed into a single column into clean, individually addressable rows that every standard SQL operation can work with naturally.
Here is the simplest summary of everything we covered:
String splitting expands one row with packed delimited values into multiple rows with individual values using a table function and a cross apply or lateral join. The delimiter tells SQL where one value ends and the next begins. Always trim split values immediately to remove whitespace that causes silent join and filter failures. Each major database provides its own function — STRING_SPLIT in SQL Server, regexp_split_to_table in PostgreSQL, SPLIT with UNNEST in BigQuery, SPLIT_TO_TABLE in Snowflake, and recursive CTEs in MySQL. Filter NULL and empty string inputs before splitting to prevent missing rows in your output. Use ordinal output when position within the original string carries meaning. Consider whether the need to split frequently signals a schema normalisation problem worth addressing at the data model level.
Start with a simple comma separated column and STRING_SPLIT (or your database’s equivalent) returning individual values. Once that works cleanly, add a TRIM, filter out empties, and join the split values against a reference table. From that point, string splitting becomes a routine technique you reach for confidently whenever delimited data stands between you and clean analytical results.
FAQs
What is string splitting in SQL?
String splitting is the process of taking a column that contains multiple values joined by a delimiter like “sport,outdoor,fitness” and expanding it into multiple rows with one value each, so each value can be filtered, joined, or aggregated individually.
How do I split a comma separated string in SQL Server?
Use STRING_SPLIT with CROSS APPLY: SELECT p.id, TRIM(s.value) FROM products p CROSS APPLY STRING_SPLIT(p.tags, ',') s. This returns one row per comma separated value in the tags column.
Does MySQL have a string split function?
MySQL 8.0 and later does not have a native string split to rows function. The standard workaround is a recursive CTE that iteratively extracts values before each delimiter until the string is exhausted. MySQL 5.x also lacks window functions, making the recursive CTE approach the most portable solution.
How do I split a string in PostgreSQL into rows?
Use regexp_split_to_table with a LATERAL join: SELECT id, TRIM(val) FROM mytable, LATERAL regexp_split_to_table(mycolumn, ',') AS val. PostgreSQL 16 also introduced STRING_TO_TABLE which uses the same syntax as SQL Server’s STRING_SPLIT.
Why do my split results have extra spaces?
Delimited strings commonly include spaces after the delimiter — “tag1, tag2, tag3”. The split function splits on the comma only, leaving the space as part of the next value. Always wrap split output in TRIM() to remove leading and trailing whitespace from every extracted value.
Can I keep the position of each split value in the results?
Yes. SQL Server 2022+ supports an ordinal argument in STRING_SPLIT. Snowflake’s SPLIT_TO_TABLE returns an INDEX column. In other databases you can generate positions using ROW_NUMBER() OVER (PARTITION BY the original row identifier ORDER BY the split value) though this gives rank by value not original position. For true positional extraction from earlier SQL Server versions a recursive CTE with an incrementing counter is the most reliable approach.