0

The task is the following: I have a table "numbers" which has only one column "number". There are numbers in this table in range a...b, each specific number appears in table only once. I need to find missing numbers in range a..b. One more thing is that range can be very big, and performance should be still good. I came up with idea to find all number pairs, between which numbers are missing, so we don't have to compare all numbers in this range and improve performance this way. The next query returns all such number pairs:

     select N1.number as num1, MIN(N2.number)as num2 from numbers N1 join numbers N2 on N2.number > N1.number 
                                            and N2.number - N1.number > 1
                                            and not exists (select * from numbers where number > N1.number and number < n2.number)
                                            group by n2.number, n1.number 

The result for set 1 2 3 4 6 8 10 is:

4   6
6   8
8   10

Now I want to list all numbers between, but I stucked, I'm failing to pass these number pairs in other select statement or whatever and make it one query. Any ideas? is it possible at all?

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
KorsaR
  • 536
  • 1
  • 10
  • 26
  • Without some constraints on the problem, it is hard to come up with an efficient solution. After all, the table could contain two values, such as 1 and 1,000,000,000. – Gordon Linoff Jul 03 '14 at 19:52
  • Yes, that's possible situation. But lets leave efficiency, the question actually is about this particular solution - how to list numbers between range pairs and make it all imn one query, and is it possible? – KorsaR Jul 03 '14 at 19:57
  • Do you really need a row for each missing value? Unless there's a function like generate_series in PostgreSQL there's no way to get an efficient solution as Gordon Linoff already mentioned. Getting the range of missing values can be done quite efficiently using some Windowed Aggregate Function instead of your complicated query :-) – dnoeth Jul 03 '14 at 21:11
  • [http://stackoverflow.com/questions/8364247/sql-find-missing-int-values-in-mostly-ordered-sequential-series/8368764#8368764](This) works for Oracle. Maybe it could help you work something out for SQL-Server. – Michael McGriff Jul 03 '14 at 21:13

4 Answers4

0

You have to have a list of ALL numbers to compare against (here's an example for that). Now, if you have those numbers in a table called "all_numbers", you can do something like this:

select a.number from all_numbers a
where a.number not in (select N1.number from numbers N1)

Optionally, you could limit the range of your "missing" numbers by adding filters like:

and a.number between (select min(number) from numbers) 
             and (select max(number) from numbers)

It just depends on how complicated you need it, but you definitely need to start with your list of all relevant numbers.

Community
  • 1
  • 1
SlimsGhost
  • 2,849
  • 1
  • 10
  • 16
  • You can also use a virtual numbers table solution like [this](http://sqlmag.com/sql-server/virtual-auxiliary-table-numbers). – mellamokb Jul 03 '14 at 20:08
  • As discussed here http://sqlperformance.com/2013/01/t-sql-queries/generate-a-set-1 ,creating a number table once instead of computing is better. – Nuri Tasdemir Jul 03 '14 at 20:08
0

I think this is the query you need:

select nstart, 
       nend
  from (select m.startNumber + 1 as nstart,
              (select min(startNumber) - 1 
                 from numbers x 
                 where x.startNumber > m.startNumber) as nend
          from numbers m 
                left join
                   (select startNumber-1 startNumber 
                      from numbers r) r 
                on (m.startNumber = r.startNumber)
         where r.startNumber is null
       ) x
 where nend is not null
 order by nstart

See it here on fiddle: http://sqlfiddle.com/#!2/720b1/4

The important thing to note is the comment made by @GordonLinoff:

Without some constraints on the problem, it is hard to come up with an 
efficient solution. After all, the table could contain two values, 
such as 1 and 1,000,000,000
Jorge Campos
  • 22,647
  • 7
  • 56
  • 87
  • That's pretty much the same what I did, but it still doesn't solve the problem of listing ALL missing numbers – KorsaR Jul 03 '14 at 22:04
0

Thank you all, guys, I kind of combined all solutions you proposed and finally came up with this solution:

    WITH
    L0   AS(SELECT 1 AS c UNION ALL SELECT 1),
    L1   AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
    L2   AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
    L3   AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
    L4   AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
    L5   AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L3 AS B),
    Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)

    select n from Nums as Nmbrs

    inner join

    (select N1.number as num1, MIN(N2.number)as num2 from numbers N1 join numbers N2 on N2.number > N1.number 
                                            and N2.number - N1.number > 1
                                            and not exists (select * from numbers where number > N1.number and number < n2.number)
                                            group by n2.number, n1.number) T on Nmbrs.n>num1 and Nmbrs.n < num2

    order by Nmbrs.n    

Works incredibly fast, i used Fibonacci numbers to test, 1...1 000 000 is processed in 7 seconds, 1...1 000 000 000 in 69 seconds, and what's the most important, it's a single query.

KorsaR
  • 536
  • 1
  • 10
  • 26
0

I think a recursive query using CTE could do good for you:

WITH   CTE (i, sw)
AS     (SELECT 1 AS i, 'SW_00000' -- anchor member
    UNION ALL
    SELECT i + 1, 'SW_' + RIGHT('00000' + cast(i+1 as varchar),5)  -- recursive member
    FROM   cte
    WHERE  i < 5000 -- terminator
   )
SELECT i,sw  -- Display CTE result
FROM   cte
where sw not in (  -- exclude existing values
select... <query existing stuff here> 
)

OPTION (MAXRECURSION 5000); -- default max for recursion = 200
Volker
  • 447
  • 1
  • 5
  • 19