I have a text file with 100,000 lines. Do you know the fastest way to get one random line from it? Thank you very much!
-
Why don't you benchmark some of the ideas you have? `$line = file($filename)[rand(0, 99999)];` could be one test you run – CᴴᵁᴮᴮʸNᴵᴺᴶᴬ Dec 11 '14 at 14:50
-
You're going to have to loop through it to find the number of lines. See this answer http://stackoverflow.com/a/20537130/1281385 – exussum Dec 11 '14 at 14:51
-
DannyHearnah's suggestion will be faster, but will use more memory. exussum's will use almost no memory, but will be slower. – Flosculus Dec 11 '14 at 14:53
-
Depends on the desired per-line randomness. A trivial abstraction would be to just [`fseek`](http://php.net/fseek) to an arbitrary *byte* position, read a sufficiently large window (8K?), then find the nearest linebreak, and extract the text between that and the next one. – mario Dec 11 '14 at 15:16
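The byte-seek idea from the last comment can be sketched roughly as follows. This is a simplified variant, not mario's exact proposal: instead of reading a fixed 8K window and searching it for linebreaks, it lets `fgets` discard the rest of the line it lands in and then returns the next line. The function name `approxRandomLine` is made up for illustration. Note that this is not uniformly random: a line is more likely to be picked when the line before it is long.

```php
<?php
// Sketch only: approxRandomLine() is a hypothetical helper, not from the thread.
// Picks a line by seeking to a random byte, so lines that follow long lines
// are chosen more often than lines that follow short ones.
function approxRandomLine(string $path): string
{
    $size = filesize($path);
    if ($size === false || $size === 0) {
        throw new RuntimeException("Cannot sample from $path");
    }

    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    // Jump to a random byte and throw away the rest of the line we landed in.
    fseek($handle, random_int(0, $size - 1));
    fgets($handle);

    // The next read is a complete line. If we landed in the last line,
    // there is nothing after it, so wrap around to the first line.
    $line = fgets($handle);
    if ($line === false) {
        rewind($handle);
        $line = fgets($handle);
    }

    fclose($handle);
    return rtrim($line, "\r\n");
}
```

It needs no preprocessing and reads only a couple of lines per call, which is why it may be attractive when the lines are of similar length and exact uniformity does not matter.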
1 Answer
The fastest way would be to build an index: a simple array that holds the byte position in the file at which each line starts. Then choose a random key, look up the position, `fseek` the file to that position, and read the line. You will have to update the index every time the file changes, but if you want to optimize retrieval, that's the way to go.
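A minimal sketch of that idea, assuming the file fits the usual "one record per line" layout. The function names (`buildLineIndex`, `readLineAt`, `randomLine`) are hypothetical, and the index is kept in memory here rather than written to an index file:

```php
<?php
// Sketch only: all function names below are made up for illustration.

// Scan the file once and record the byte offset at which each line starts.
function buildLineIndex(string $path): array
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $offsets = [];
    while (true) {
        $pos = ftell($handle);          // offset before reading the line
        if (fgets($handle) === false) { // no more lines
            break;
        }
        $offsets[] = $pos;
    }

    fclose($handle);
    return $offsets;
}

// Seek straight to a known line start and read that one line.
function readLineAt(string $path, int $offset): string
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    fseek($handle, $offset);
    $line = fgets($handle);
    fclose($handle);

    return $line === false ? '' : rtrim($line, "\r\n");
}

// Pick a random key from the index and fetch the matching line.
function randomLine(string $path, array $offsets): string
{
    if ($offsets === []) {
        throw new InvalidArgumentException('Index is empty');
    }
    return readLineAt($path, $offsets[random_int(0, count($offsets) - 1)]);
}
```

Building the index costs one full pass over the file, but after that each lookup is a single seek and a single line read. To reuse the index across requests it would have to be saved somewhere (for example with `serialize()`) and rebuilt whenever the file changes.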
You can optimize further by splitting the file into ranges (e.g. sharding the data), or by keeping several representations of the file. For example, you could keep a second file with the lines in reverse order, so the last come first, and if your random number is bigger than half the number of lines, you read from the second file.

Maxim Krizhanovsky