Since this is a one-time project, you can create a temporary table with row numbers, and then do the query that compares row N
only to row N+1
(and rely on transitivity of <
for all other rows):
SET @row_num:=0;
INSERT INTO my_temp (row_number, proc_date, ext_id)
SELECT
@row_num:=@row_num+1 as row_number
, proc_date
, ext_id
FROM original_table
ORDER BY proc_date
With row_number
in place, you can search like this:
SELECT *
FROM my_temp a
JOIN my_temp b ON a.row_number = b.row_number+1
WHERE a.ext_id >= b.ext_id
The trick to this query is to identify the next row in the table sorted in ascending order by proc_date
. But that is exactly what row_number+1
means. You may need to create an index on row_number
, or declare it a unique key in order for this query to finish in reasonable time.
I was interested to see roughly how often it happens.
I would do it in a hybrid SQL/Java solution (or used whatever other language you may prefer). Firs, load external IDs alone, ordered on the date, into main memory, i.e.
SELECT ext_id FROM original_table ORDER BY proc_date
Then I would use an O(N*LogN) algorithm for counting the number of inversions. Here is an implementation in Java.