4

I have one table that stores a range of integers in a field, sort of like a print range, (e.g. "1-2,4-7,9-11"). This field could also contain a single number.

My goal is to join this table to a second one that has discrete values instead of ranges.

So if table one contains

1-2,5
9-15
7

And table two contains

1
2
3
4
5
6
7
8
9
10

The result of the join would be

1-2,5   1
1-2,5   2
1-2,5   5
7       7
9-15    9
9-15    10

Working in SQL Server 2008 R2.

Christoph
  • 4,251
  • 3
  • 24
  • 38
  • 7
    Storing (and then working with) data like this is a nightmare in SQL. Is it too late to rework how you are storing this info? – Abe Miessler May 17 '13 at 22:40
  • 1
    Is it possible to have the application code "explode" the string before going to sql? SQL server *can* do this, but it's not ideal. – Matthew May 17 '13 at 22:41
  • Yes, too late to rework. Most of the work with the data happens in the application layer far away from SQL and storing it this way does save a lot of room and make the ORM cleaner. But yes - dealing w/ this in SQL is a pain. – Christoph May 17 '13 at 22:43
  • @Matthew - not able to make change to the application layer at this time - this is more for adhoc reporting needs. – Christoph May 17 '13 at 22:44
  • @Matthew on CLR, if there is no other way to do it in native SQL then yes – Christoph May 17 '13 at 22:45
  • FYI - I already have C# code that given the range and another integer evaluate if the given integer is in the range. Could use this in CLR if needed, but would like to see if SQLish solution is possible first – Christoph May 17 '13 at 22:48
  • 1
    I think the correct computing term is "explode", not "blow". – ErikE May 17 '13 at 23:03
  • By the way, what is the maximum value possible for the numbers, and what is the maximum length of the number-containing string? – ErikE May 17 '13 at 23:05
  • max length of string is 255 char, realistic max for int is < 1000 – Christoph May 17 '13 at 23:14
  • @ErikE: In this particular case, *blow* might be the better term. – Mike Sherrill 'Cat Recall' May 18 '13 at 23:49
  • Can you have a value like '1-2,5,17-35,42-75,37-39,4'? – Mike Sherrill 'Cat Recall' May 18 '13 at 23:54

4 Answers4

7

Use a string split function of your choice to split on comma. Figure out the min/max values and join using between.

SQL Fiddle

MS SQL Server 2012 Schema Setup:

create table T1(Col1 varchar(10))
create table T2(Col2 int)

insert into T1 values
('1-2,5'),
('9-15'),
('7')

insert into T2 values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)

Query 1:

select T1.Col1,
       T2.Col2
from T2
  inner join (
              select T1.Col1,
                     cast(left(S.Item, charindex('-', S.Item+'-')-1) as int) MinValue,
                     cast(stuff(S.Item, 1, charindex('-', S.Item), '') as int) MaxValue
              from T1
                cross apply dbo.Split(T1.Col1, ',') as S
             ) as T1
    on T2.Col2 between T1.MinValue and T1.MaxValue

Results:

|  COL1 | COL2 |
----------------
| 1-2,5 |    1 |
| 1-2,5 |    2 |
| 1-2,5 |    5 |
|  9-15 |    9 |
|  9-15 |   10 |
|     7 |    7 |
Mikael Eriksson
  • 136,425
  • 22
  • 210
  • 281
  • @Matthew the string split part is the same and finding min and max but the join using between is not in your answer. This answer actually shows that there is no need to "explode the numbers" which is what the OP was asking for. – Mikael Eriksson May 18 '13 at 20:53
  • 1
    Yes, I see that you `JOIN`ed on the `BETWEEN` .. it's a good solution – Matthew May 18 '13 at 23:01
  • 1
    I would have solved it almost exactly the same way, down to the `+ '-'` in the `CharIndex`. +1. – ErikE May 20 '13 at 16:36
  • Thank you Mikael - sweet solution! – Christoph May 22 '13 at 23:31
4

Like everybody has said, this is a pain to do natively in SQL Server. If you must then I think this is the proper approach.

First determine your rules for parsing the string, then break down the process into well-defined and understood problems.
Based on your example, I think this is the process:

  1. Separate comma separated values in the string into rows
  2. If the data does not contain a dash, then it's finished (it's a standalone value)
  3. If it does contain a dash, parse the left and right sides of the dash
  4. Given the left and right sides (the range) determine all the values between them into rows

I would create a temp table to populate the parsing results into which needs two columns:
SourceRowID INT, ContainedValue INT

and another to use for intermediate processing:
SourceRowID INT, ContainedValues VARCHAR

Parse your comma-separated values into their own rows using a CTE like this Step 1 is now a well-defined and understood problem to solve:

Turning a Comma Separated string into individual rows

So your result from the source
'1-2,5'
will be:
'1-2'
'5'

From there, SELECT from that processing table where the field does not contain a dash. Step 2 is now a well-defined and understood problem to solve These are standalone numbers and can go straight into the results temp table. The results table should also get the ID reference to the original row.

Next would be to parse the values to the left and right of the dash using CHARINDEX to locate it, then the appropriate LEFT and RIGHT functions as needed. This will give you the starting and ending value.

Here is a relevant question for accomplishing this step 3 is now a well-defined and understood problem to solve:

T-SQL substring - separating first and last name

Now you have separated the starting and ending values. Use another function which can explode this range. Step 4 is now a well-defined and understood problem to solve:

SQL: create sequential list of numbers from various starting points

SELECT all N between @min and @max

What is the best way to create and populate a numbers table?

and, also, insert it into the temp table.

Now what you should have is a temp table with every value in the exploded range.

Simply JOIN that to the other table on the values now, then to your source table on the ID reference and you're there.

Community
  • 1
  • 1
Matthew
  • 10,244
  • 5
  • 49
  • 104
  • Thank you Matthew - I chose Mikael's answer as it didn't require the "SELECT all N between @min and @max" step, but rather used the BETWEEN. Still a great answer +1 – Christoph May 22 '13 at 23:33
0

My suggestion is to add one more field and many more records to your ranges table. Specifically, the primary key would be the integer and the other field would be the range. Records would look like this:

number    range
1         1-2,5
2         1-2,5
3         na
4         na
5         1-2,5

etc

Having said that, this is still rather limiting because a number can only have one range. If you want to be thorough, set up a many to many relationship between numbers and ranges.

Dan Bracuk
  • 20,699
  • 4
  • 26
  • 43
  • Hi Dan, have to work with the tables as is. Also, the ranges can be with much larger numbers (user defined) and I'm not sure this approach (if used as a reference table would scale). – Christoph May 17 '13 at 22:50
0

As far as I can tell you best option is something like below:

Create a table value function that accepts your ranges an converts them to a collection of ints. So 1-3,5 would return:

1
2
3
5

Then use these results to join to other tables. I don't have an exact function to do this at hand, but this one seems like an excellent start.

Abe Miessler
  • 82,532
  • 99
  • 305
  • 486