Find period in very large sequence of numbers

Question

Is there any possibility to find a cycle length in very large portion of numbers (more than maximum value of bigest integer type in C/C++, let's say 2^20), without involving disk to perform it? The best situation would be to analyze them sequentially, as they arrive from standard input, but I'm pretty sure that is impossible and I need to store them in memory. But I hope I'm wrong. Numbers values are integers, they are coming from standard input.

Example: Input: ( 1 2 3 ... (2^20 triples of 1 2 3) ... 1 2 3) Desired result: 3

EDIT

let's think of cycle as a period (f(x) = f(x+t) for some t) - looking for value of t

let's assume that operating memory is too few to store all numbers (numbers can be more than 2^20) and might be gmp types.

You might have to explain why "3" is the desired result in your example. Why not 1? — Greg Hewgill, Oct 08 '12 at 20:27
I don't think 2^20 would overflow the largest possible integer type. It would hardly overflow even a bog-standard off-the-shelf integer type. — Kerrek SB, Oct 08 '12 at 20:28
@Kerrek SB but can overflow for instance the available memory — deha, Oct 08 '12 at 20:29
@deha 2^20 = 1048576, that might be a problem for microcontrollers, but for desktop stuff it's very little. — Daniel Fischer, Oct 08 '12 at 20:30
Perhaps you might want to define the `cycle` a bit more cleanly. — P.P, Oct 08 '12 at 20:31
but in your example `(1 2 3 1 2 3)` is also a valid cycle, in which the result is 6, and `(1 2 3 1 2 3 1 2 3)` where the answer is 9, or which condition does that break? — lccarrasco, Oct 08 '12 at 20:42

score 2 · Answer 1 · edited May 23 '17 at 11:54

2

This is a well-studied problem, with quite a number of algorithms depending on what exactly your resources and contraints are:

http://en.wikipedia.org/wiki/Cycle_detection

You do not need to store everything in memory (just two pointers/indices in the basic algorithms), but you need to be able to access the sequence multiple times.

See also:

Floyd's cycle-finding algorithm

edited May 23 '17 at 11:54

Community

1
1

answered Oct 08 '12 at 21:11

Soverman

1,135
1
9
16

1

That's not what the OP seems to want. The cycle-finding algorithm assumes that the sequence is, indeed, cyclic - i.e. it is calculated by something similar to the recurrence relation shown on the Wikipedia page you've linked. It searches for the first repeating number in the sequence, which may be pretty different from the length of the period. – Ivan Vergiliev Oct 08 '12 at 21:24
That's just generalizing to having some junk on the beginning of the sequence. In the notation used on the Wiki page you're finding both mu and lambda, and lambda (the cycle length) is what the questioner is looking for. – Soverman Oct 08 '12 at 21:30
1

The point is that this algorithm depends on the fact that `x1 = f(x0), x2 = f(x2)`, etc. - it doesn't work correctly otherwise. Imagine having the sequence `1 2 2 1 2 2` - Floyd's algorithm would report a cycle length of 1 or 2, depending on whether you stop straight away at the 1 or continue to the 2s. Both lengths are obviously wrong. – Ivan Vergiliev Oct 08 '12 at 21:36
@Ivan You're quite right if entries in the cycle are non-distinct. I was assuming each entry in the cycle was distinct as in deha's example, but this is probably a bad assumption. – Soverman Oct 08 '12 at 21:53

score 1 · Answer 2 · answered Oct 08 '12 at 21:04

The standard way to find the period length is to imagine your sequence is a string S over an alphabet of integers, and then find the leftmost position (greater than 0) where S can be found in the string SS (the concatenation of the string S with itself). That is, if S = 'abcabc', you have to find the first position i greater than zero, such that S is found at position i in the string SS = 'abcabcabcabc'. In this case, this is 3, and that's the length of the period.

This can be done any of the string searching algorithms with linear performance, and should work perfectly fine for 2^20 numbers on a modern computer. I doubt that you can avoid saving the numbers in memory though.

Right, unless you _know_ that the string is periodic and a bound for the period length, you must have the entire string, otherwise it could break with the last character. — Daniel Fischer, Oct 09 '12 at 19:41

Find period in very large sequence of numbers

2 Answers2