I have the following line of code:

ids = random.sample(list(map(int, open(file_path))), 10)

that returns a list of 10 random ints.

How can I speed it up? Is there another way to do this?

  • The way you do it, the whole file is loaded into memory and then sampled. A better way would be to read only specific, randomly chosen lines. – Ma0 Jul 04 '17 at 10:59
  • It depends :) If all the lines are the same length, for example, you can be very quick :) – tymtam Jul 04 '17 at 11:01
  • @Ev.Kounis How can I do this? –  Jul 04 '17 at 11:02
  • A simple trick to speed it up (that does not solve the issue I mentioned above) is to leave the conversion to `int` for **after** the sampling. It does not make sense to convert **all** `str`ings to `int`egers from the get go. – Ma0 Jul 04 '17 at 11:04
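
The two suggestions in these comments could be sketched roughly as follows. This is only an illustration: the function names and the `width` parameter are hypothetical, and the second function assumes that every line, newline included, occupies exactly `width` bytes, which the question does not state.

```python
import os
import random


def sample_then_convert(file_path, k=10):
    """Sample k lines as strings, then convert only those k to int."""
    with open(file_path) as f:
        lines = f.readlines()  # still reads the whole file into memory
    return [int(line) for line in random.sample(lines, k)]


def sample_fixed_width(file_path, width, k=10):
    """Assumes every line, newline included, is exactly `width` bytes long."""
    n_lines = os.path.getsize(file_path) // width
    ids = []
    with open(file_path, "rb") as f:  # binary mode so offsets are in bytes
        for i in random.sample(range(n_lines), k):
            f.seek(i * width)  # jump straight to the start of line i
            ids.append(int(f.read(width)))  # int() ignores the trailing newline
    return ids
```

The first version only saves the cost of the unneeded `int` conversions; the second reads just `k` lines from disk, but it is only valid when the fixed-width assumption really holds.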

1 Answer

It seems to me that the bottleneck is always going to be finding the length of the file, i.e. its number of lines. See "How to get line count cheaply in Python?" for that.
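
For reference, one simple way to get that line count is a single pass over the file. A minimal sketch, where `count_lines` is a hypothetical helper name rather than anything taken from the linked question:

```python
def count_lines(file_path):
    """Return the number of lines by iterating over the file once."""
    with open(file_path, "rb") as f:  # binary mode skips text decoding
        return sum(1 for _ in f)
```

This still reads every byte of the file, so it costs one full pass, but it avoids building a list of all the lines.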

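Then you can do

import random

# length_of_file is the line count found in the previous step
wanted = set(random.sample(range(length_of_file), 10))

with open(file_path) as f:
    # readline() cannot jump to a given line number, so scan the file once
    # and keep only the chosen lines (the result comes out in file order)
    ids = [int(line) for i, line in enumerate(f) if i in wanted]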

or something like that. How does that compare speedwise?

Stael