I have a million integers in sorted order and I would like to find the longest subsequence where the difference between consecutive pairs is equal. For example
1, 4, 5, 7, 8, 12
has a subsequence
4, 8, 12
My naive method is greedy and just checks how far you can extend a subsequence from each point. This takes O(n²)
time per point it seems.
Is there a faster way to solve this problem?
Update. I will test the code given in the answers as soon as possible (thank you). However it is clear already that using n^2 memory will not work. So far there is no code that terminates with the input as [random.randint(0,100000) for r in xrange(200000)]
.
Timings. I tested with the following input data on my 32 bit system.
a= [random.randint(0,10000) for r in xrange(20000)]
a.sort()
- The dynamic programming method of ZelluX uses 1.6G of RAM and takes 2 minutes and 14 seconds. With pypy it takes only 9 seconds! However it crashes with a memory error on large inputs.
- The O(nd) time method of Armin took 9 seconds with pypy but only 20MB of RAM. Of course this would be much worse if the range were much larger. The low memory usage meant I could also test it with a= [random.randint(0,100000) for r in xrange(200000)] but it didn't finish in the few minutes I gave it with pypy.
In order to be able to test the method of Kluev's I reran with
a= [random.randint(0,40000) for r in xrange(28000)]
a = list(set(a))
a.sort()
to make a list of length roughly 20000
. All timings with pypy
- ZelluX, 9 seconds
- Kluev, 20 seconds
- Armin, 52 seconds
It seems that if the ZelluX method could be made linear space it would be the clear winner.