Smarter code to query the same parameter from multiple columns?

Question

We have this very wide table that has 25 diagnosis code columns:

diag1, diag2, … diag25.

If I want to look for diagnosis code '12345' I need to write

diag1 ='12345' or diag2='12345' or diag3='12345'... or diag25='12345'

Is there a way to do something like this:

diag1-diag25 ='12345'

Are you able to redesign your data table? These diagnosis codes should each be within their own row. — iamdave, Oct 04 '18 at 13:10
not right now, maybe in the future after ok from multiple decision makers — Lee Y., Oct 04 '18 at 14:26

score 4 · Answer 1 · answered Oct 04 '18 at 13:10

4

You can write

'12345' IN (diag1, diag2, ..., diag25)

but there is no syntax for diag1-diag25

answered Oct 04 '18 at 13:10

Denis Rubashkin

2,151
1
9
14

WOW did not know this was possible. What about if the parameter was '12345' between '12349' ? – Lee Y. Oct 04 '18 at 14:16
also -- what is this style of querying called? I tried googling WHERE condition in (column list) and no dice – Lee Y. Oct 04 '18 at 14:23
I don't think the style has any particular name - this is just "IN" syntax https://learn.microsoft.com/en-us/sql/t-sql/language-elements/in-transact-sql?view=sql-server-2017 – Denis Rubashkin Oct 04 '18 at 14:54
@LeeY. Your question of *if the parameter was '12345' between '12349'?* is where this approach has limitations. You can do this with a more flexible unpivot approach though. – iamdave Oct 04 '18 at 15:00
@DenisRubashkin , is there a function to return the first column that meets the condition? For example, '12345' in (diag1,diag2,diag3) ; which column has '12345' ? – Lee Y. Oct 05 '18 at 18:33

score 3 · Accepted Answer · answered Oct 04 '18 at 15:04

You might call the generic abilities of XML to your rescue. Not very fast with may rows, but very mighty:

(credits to iamdave for the mockup)

declare @t table(id int,diag1 int,diag2 int,diag3 int,diag4 int,diag5 int,diag6 int,diag7 int,diag8 int,diag9 int,diag10 int);
insert into @t (id, diag5) values(1,12345);
insert into @t (id, diag3) values(2,123);
insert into @t (id, diag8, diag1) values(3,123,12345);
insert into @t (id, diag9, diag2) values(4,345,678);

WITH CreateXML AS
(
    SELECT * 
          ,(
            SELECT *
            FROM @t t2
            WHERE t1.id=t2.id
            FOR XML PATH('codes'),TYPE
           ).query('/codes/*[substring(local-name(),1,4)="diag"]') AllCodes
     FROM @t t1 
)
SELECT * 
FROM CreateXML
WHERE AllCodes.exist('/*[. cast as xs:int? >=12345 and . cast as xs:int? <=12349]')=1;

The query will use a SELECT * to create an XML of all columns. The .query() will pick all elements starting with "diag". Maybe you have to adapt this to your needs.

The .exist() will check, if any value within these elements is between your borders. One match is enough.

Good Luck!

I thought the XML tricks was John Cappelletti's department ;-) This is a good option for searching a range of values. +1. — Zohar Peled, Oct 04 '18 at 15:22
Upvotes to the answer and the comment, because as soon as I started reading, I assumed it was John. — Eric Brandt, Oct 04 '18 at 15:27
@ZoharPeled yeah if there are exactly 25 columns an unpivoting with value would be better. But in cases where you must deal with unknown sets xml is very mighty... — Shnugo, Oct 04 '18 at 15:28

score 2 · Answer 3 · answered Oct 04 '18 at 13:19

An alternative and potentially more flexible solution is to unpivot the data using cross apply:

declare @t table(id int,diag1 int,diag2 int,diag3 int,diag4 int,diag5 int,diag6 int,diag7 int,diag8 int,diag9 int,diag10 int);
insert into @t (id, diag5) values(1,12345);
insert into @t (id, diag3) values(2,123);
insert into @t (id, diag8, diag1) values(3,123,12345);
insert into @t (id, diag9, diag2) values(4,345,678);

select t.id
      ,d.d
from @t as t
    cross apply(values(diag1),(diag2),(diag3),(diag4),(diag5),(diag6),(diag7),(diag8),(diag9),(diag10)) d(d)
where d.d = 12345;

Output:

+----+-------+
| id |   d   |
+----+-------+
|  1 | 12345 |
|  3 | 12345 |
+----+-------+

Upvote from my side. Probably the best way to deal with a known set of column names. And thx for you mockup. I borrowed it for my answer. — Shnugo, Oct 04 '18 at 15:33

GuidoG · Answer 4 · 2018-10-04T13:23:01.560

1

Actually all these diagxx columns should be in a seperate table. So your design should be altered.

If you cannot do that then you can make your query easier en cleaner by using an IN in your where clause.

declare @d table (diag1 varchar(5), diag2 varchar(5), diag3 varchar(5))
insert into @d values ('12345', '23456', '34567'),
                      ('45678', '12345', '56789'),
                      ('45678', '85236', '56789')
select * 
from   @d
where  '12345' in (diag1, diag2, diag3)

This will return

diag1   diag2   diag3   
-----   -----   -----   
12345   23456   34567   
45678   12345   56789

This is as close to your desired syntax as you are going to get

edited Oct 04 '18 at 13:23

answered Oct 04 '18 at 13:10

GuidoG

11,359
6
44
79

I agree. Unfortunately office politics – Lee Y. Oct 04 '18 at 14:16
Well the `where IN ()` workaround will make your work somewhat easier then – GuidoG Oct 04 '18 at 14:30
that might be what I have to do. Though I could not get a between statement in work with the where in () – Lee Y. Oct 04 '18 at 14:32
What do you mean with a between statement ? Can you explain what you want to do – GuidoG Oct 04 '18 at 14:34
something like this: WHERE '12345' between '12349' in (diag1,diag2,diag3) , the real business scenario is about 70 diagnosis codes captured between C00 and C96 – Lee Y. Oct 04 '18 at 14:36
OK I understand. I am not sure if there is a way to do that I will think about it. I am afraid you will need to do `where '12345' in (diag1,..., diag25) or '12346' in (diag1,..., diag25)` and so on – GuidoG Oct 04 '18 at 14:37

Eric Brandt · Answer 5 · 2018-10-04T15:35:51.463

Another way would be to UNPIVOT your diagnosis codes into a single column, then query the resulting data set.

For a full discussion of unpivoting options, see this question. SQL Server : Columns to Rows

With varchar diagnosis codes, BETWEEN might not behave the way you want it to, but you could dump your 70-odd codes into a temp table and join to it.

But something along these lines:

select * from 
(
  select 
    <Any Key Fields You Want To Include>,
    diagColumnName,
    diagCode
  from yourtable
  unpivot
  (
    diagCode
    for diagColumnName in (diag1, diag2....diag25)
  ) as unpiv
) as resultSet
join #tempCodes as t
  on t.diagCode = resultSet.diagCode

Edit: added the join/temp table based on comments.

Edited my answer to account for searching a pile of diagnosis codes. — Eric Brandt, Oct 04 '18 at 15:36

Smarter code to query the same parameter from multiple columns?

5 Answers5