SQL Concatenation using Case Statement

Understanding the Problem

In this blog post, we’ll explore how to concatenate data from multiple columns in SQL while handling NULL values. We’ll use two different approaches: one that utilizes a case statement and another that uses a more concise approach with concatenation functions.

Approach 1: Using Case Statement

Let’s start with the first approach, which uses a CASE expression. The question behind this post provides an example table with an ID and five value columns (col1 through col5), some of which contain NULL. The goal is to return one slash-separated string per row that skips the NULLs, so that the result looks like this:

ID	NewCol
1	abc
2	ght/cde
3	atr/dgf/aft
4	asd/rty/tyu/xyz
5	abc/pqr/xyz/rst/rty
6	qwe/rty/ghj/rty/tyu
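
The source table itself is not shown, so here is a minimal sketch of data that would produce the output above. The column types and lengths are assumptions, and the rows are inferred from the expected result on the premise that NULLs only appear after the last filled column.

```sql
-- Hypothetical schema: types and lengths are assumptions, not taken from the question.
CREATE TABLE dbo.Temp (
    ID   int PRIMARY KEY,
    col1 varchar(10) NULL,
    col2 varchar(10) NULL,
    col3 varchar(10) NULL,
    col4 varchar(10) NULL,
    col5 varchar(10) NULL
);

-- Rows reconstructed from the expected output; NULLs are trailing in every row.
INSERT INTO dbo.Temp (ID, col1, col2, col3, col4, col5) VALUES
    (1, 'abc', NULL,  NULL,  NULL,  NULL),
    (2, 'ght', 'cde', NULL,  NULL,  NULL),
    (3, 'atr', 'dgf', 'aft', NULL,  NULL),
    (4, 'asd', 'rty', 'tyu', 'xyz', NULL),
    (5, 'abc', 'pqr', 'xyz', 'rst', 'rty'),
    (6, 'qwe', 'rty', 'ghj', 'rty', 'tyu');
```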

The original query using a case statement looks like this:

SELECT ID,
    NewCol = CASE WHEN col1 IS NOT NULL AND col2 IS NULL THEN col1
                  WHEN col1 IS NOT NULL AND col3 IS NULL THEN col1 + '/' + col2
                  WHEN col1 IS NOT NULL AND col4 IS NULL THEN col1 + '/' + col2 + '/' + col3
                  WHEN col1 IS NOT NULL AND col5 IS NULL THEN col1 + '/' + col2 + '/' + col3 + '/' + col4
                  WHEN col1 IS NOT NULL AND col5 IS NOT NULL THEN col1 + '/' + col2 + '/' + col3 + '/' + col4 + '/' + col5
                  ELSE NULL
             END
FROM dbo.Temp

Each WHEN branch finds the first NULL column and concatenates the columns before it. This works for five columns, but every additional column needs another branch with a longer expression, so the query quickly becomes cumbersome.

Limitations of the Case Statement Approach

While the CASE approach works when there are only a few columns, it has some limitations:

It needs one WHEN branch for each possible number of filled columns, making it harder to read and maintain.
If there are many columns, the query becomes unwieldy.
It assumes NULLs appear only at the end of a row. If a column in the middle is NULL, the first matching branch wins and every value after that NULL is silently dropped.
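
To illustrate that last point, here is a small sketch that runs a shortened, three-column version of the CASE logic against a single made-up row where col2 is NULL but col3 is not. The row and its ID are hypothetical, and the CAST is only there so the inline NULL is typed as a string.

```sql
-- Hypothetical row: col2 is NULL in the middle, col3 still has a value.
SELECT t.ID,
       NewCol = CASE WHEN t.col1 IS NOT NULL AND t.col2 IS NULL THEN t.col1
                     WHEN t.col1 IS NOT NULL AND t.col3 IS NULL THEN t.col1 + '/' + t.col2
                     ELSE t.col1 + '/' + t.col2 + '/' + t.col3
                END
FROM (VALUES (7, 'abc', CAST(NULL AS varchar(10)), 'xyz')) AS t (ID, col1, col2, col3);
-- The first branch matches, so NewCol is 'abc' and 'xyz' is lost.
```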

Approach 2: Using Concatenation Functions

The second approach combines the + operator with COALESCE to achieve the same result more concisely. We’ll assume that col1 always has a value, which simplifies our query:

SELECT ID,
    NewCol = col1 + COALESCE('/' + col2, '') + COALESCE('/' + col3, '') +
             COALESCE('/' + col4, '') + COALESCE('/' + col5, '')
FROM dbo.Temp

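COALESCE returns the first non-NULL value from its list of arguments, or NULL if all of them are NULL. The trick is that concatenating a string with NULL using + yields NULL (under SQL Server’s default CONCAT_NULL_YIELDS_NULL setting). So when col2 is NULL, '/' + col2 is also NULL and COALESCE falls back to the empty string; when col2 has a value, the expression contributes the separator followed by that value.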
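
The behavior of a single COALESCE term can be checked in isolation. This is a minimal sketch that assumes the default CONCAT_NULL_YIELDS_NULL setting is ON; the column aliases are made up for illustration.

```sql
-- One term of the concatenation, evaluated for a NULL and a non-NULL input.
SELECT COALESCE('/' + CAST(NULL AS varchar(10)), '') AS when_null,   -- '' (empty string)
       COALESCE('/' + 'cde', '')                     AS when_value;  -- '/cde'
```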

Advantages of the Concatenation Functions Approach

This approach has several advantages over the case statement method:

Concise code: The query is shorter and easier to read.
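Comparable performance: Do not expect a meaningful speed difference. SQL Server evaluates COALESCE as a CASE expression internally, so the gain here is in readability rather than execution time.
Robust NULL handling: Each NULL column contributes an empty string instead of turning the whole result into NULL, so NULLs are skipped wherever they appear among col2 through col5.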
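
As a quick check of how NULLs are handled, the sketch below runs a three-column version of the COALESCE query against two made-up rows, one with a NULL in the middle and one with a NULL at the end. The rows and IDs are hypothetical.

```sql
-- Hypothetical rows: ID 7 has a NULL in the middle, ID 8 has a trailing NULL.
SELECT t.ID,
       NewCol = t.col1 + COALESCE('/' + t.col2, '') + COALESCE('/' + t.col3, '')
FROM (VALUES
        (7, 'abc', CAST(NULL AS varchar(10)), 'xyz'),
        (8, 'abc', 'def', NULL)
     ) AS t (ID, col1, col2, col3);
-- ID 7 gives 'abc/xyz' and ID 8 gives 'abc/def'.
```

Note that col1 is still concatenated directly, so a NULL in col1 would make the whole result NULL. That is why the query relies on the assumption that col1 always has a value.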

Conclusion

In conclusion, both approaches produce the expected output for the example data, but they differ in how well they scale. The CASE approach spells out every combination explicitly, which is workable for a few columns with trailing NULLs. The COALESCE approach is more concise, grows by one short term per column, and skips NULLs regardless of their position.

When to Use Each Approach

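Here are some general guidelines on when to use each approach. The choice depends on the number of columns and the logic involved, not on the number of rows:

Use CASE:
- Only a few columns, with NULLs guaranteed to be trailing
- When different combinations need different formatting logic
Use COALESCE with concatenation:
- Many nullable columns
- When NULLs can appear in any position and conciseness matters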

Last modified on 2025-04-25