How do I find "holes" in SQL Server tables? [duplicate]

3

I have a table with a column id (primary key, auto-increment with a step of 1).

My application does not allow rows to be deleted, so the query

SELECT id FROM tbl ORDER BY id ASC

should return this:

id
---
1
2
3
4
5
6
7
...

However, some rows have been deleted for some reason, either because a user manipulated the table outside my application or because the database is corrupted. Why these rows disappeared does not interest me at the moment.

When I run the same query now, the result is not what I expect:

id
---
1
2
3
6
10
12
8870
...

How can I write a query to find these gaps? I need to find where each gap starts. For the result above, I need to extract something like this:

id
---
3

The gaps start right after the value 3.

    
asked by anonymous 04.09.2017 / 19:59

3 answers

2
  

Finding the gaps

Here are two efficient solutions taken from the article "Solving Gaps and Islands with Enhanced Window Functions".

The first solution works across several versions of SQL Server.

-- code #1
SELECT col1 + 1 AS rangestart, 
       (SELECT MIN(B.col1)   
          FROM dbo.T1 AS B
          WHERE B.col1 > A.col1) - 1 AS rangeend 
  FROM dbo.T1 AS A
  WHERE NOT EXISTS (SELECT * 
                      FROM dbo.T1 AS B
                      WHERE B.col1 = A.col1 + 1)
        AND col1 < (SELECT MAX(col1) FROM dbo.T1);

In code #1, replace col1 with the name of the column that contains the numbering and T1 with the table name.

The second solution is more efficient and works on SQL Server 2012 and later. It uses the window function LEAD().

-- code #2
WITH C AS (
SELECT col1 AS cur, LEAD(col1) OVER(ORDER BY col1) AS nxt
  FROM dbo.T1
)
SELECT cur + 1 AS rangestart, nxt - 1 AS rangeend
  FROM C
  WHERE nxt - cur > 1;

In code #2, replace col1 with the name of the column that contains the numbering and T1 with the table name.
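
To see what the LEAD() approach produces, here is a minimal sketch run against the ids from the question. The table variable @T1 and the extra columns last_id_before_gap and missing_count are assumptions added for illustration and are not part of the cited article. It assumes SQL Server 2012 or later.

```sql
-- Hypothetical sample data mirroring the ids shown in the question.
DECLARE @T1 TABLE (col1 INT NOT NULL PRIMARY KEY);
INSERT INTO @T1 (col1) VALUES (1), (2), (3), (6), (10), (12), (8870);

-- Pair every value with the next existing value, then keep only the pairs
-- that are more than 1 apart (that is, pairs with a gap between them).
WITH C AS (
    SELECT col1 AS cur, LEAD(col1) OVER (ORDER BY col1) AS nxt
      FROM @T1
)
SELECT cur           AS last_id_before_gap,  -- the value the question asks for
       cur + 1       AS rangestart,          -- first missing id
       nxt - 1       AS rangeend,            -- last missing id
       nxt - cur - 1 AS missing_count        -- number of missing ids
  FROM C
  WHERE nxt - cur > 1
  ORDER BY cur;
```

With these seven ids the query should return four rows: 3 (missing 4 to 5), 6 (missing 7 to 9), 10 (missing 11) and 12 (missing 13 to 8869). The last row of the table never appears, because LEAD() returns NULL for it and the comparison in the WHERE clause is then not true.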

  

What causes the gaps?

There can be several reasons. For small gaps, look for the cause in the application and in direct access to the table by users. For larger gaps (usually multiples of 1000), one possibility is that the cause is tied to the way IDENTITY is implemented in SQL Server. The documentation itself states: "SQL Server might cache identity values for performance reasons and some of the assigned values may be lost during a database failure or server restart. This can result in gaps in the identity value upon insert".
Note the passage "This can result in gaps"!

As a remedy, the same documentation says: "If gaps are not acceptable then the application should use its own mechanism to generate key values". In other words, IDENTITY cannot be relied on to generate consecutive numeric sequences without gaps.

  

Further reading on gaps and islands

For those interested in learning more about the classic issue of gaps and islands, here are some selected articles:

05.09.2017 / 13:04
0

You can use NOT EXISTS:

SELECT t1.*
  FROM tabela t1
 WHERE NOT EXISTS(SELECT 1
                    FROM tabela t2
                   WHERE t2.id -1 = t1.id)
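
Note that the query above also returns the row with the highest id, since nothing follows it. If that row is unwanted, a minimal sketch of a variant that filters it out could look like this. It keeps the names tabela and id from the query above; the MAX filter and the ORDER BY are additions for illustration.

```sql
-- Ids that have no immediate successor, excluding the largest id,
-- which always lacks a successor and does not mark a gap.
SELECT t1.id
  FROM tabela AS t1
 WHERE NOT EXISTS (SELECT 1
                     FROM tabela AS t2
                    WHERE t2.id = t1.id + 1)
   AND t1.id < (SELECT MAX(id) FROM tabela)
 ORDER BY t1.id;
```

Against the ids from the question (1, 2, 3, 6, 10, 12, 8870) this should return 3, 6, 10 and 12.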
    
04.09.2017 / 20:15
0

Another method, which works on any database:

SELECT
  min(id)
FROM tabela
WHERE id+1 NOT IN
  (SELECT id FROM tabela
  )
;

If you need every id that has no successor, use this:

SELECT
  id
FROM tabela
WHERE id+1 NOT IN
  (SELECT id FROM tabela
  )
ORDER BY id;

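Note that these queries only look from the smallest existing id upward. For example, if the table contains the ids 5, 6 and 8, the first query returns 6, not 0, even though id 1 does not exist. The second query also returns the largest id (8 in this example), because nothing follows it.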
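
If gaps before the smallest existing id matter too, one possible sketch is to report 0 as a sentinel whenever id 1 is missing. This assumes the numbering is expected to start at 1, and it uses a SELECT without FROM, which SQL Server accepts but not every database does, so it is less portable than the queries above. The table variable @tabela is hypothetical sample data.

```sql
-- Hypothetical sample data: ids 5, 6 and 8, as in the example above.
DECLARE @tabela TABLE (id INT NOT NULL PRIMARY KEY);
INSERT INTO @tabela (id) VALUES (5), (6), (8);

-- Report 0 when id 1 is missing (assumes numbering should start at 1).
SELECT 0 AS id
 WHERE NOT EXISTS (SELECT 1 FROM @tabela WHERE id = 1)
UNION ALL
-- Every id with no successor, except the largest one.
SELECT id
  FROM @tabela
 WHERE id + 1 NOT IN (SELECT id FROM @tabela)
   AND id < (SELECT MAX(id) FROM @tabela)
ORDER BY id;
```

For the sample ids this should return 0 and 6: a gap right after 0 (ids 1 to 4 are missing) and a gap right after 6 (id 7 is missing).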

    
05.09.2017 / 14:00