
What I want to do is write code that has a file (set in the code, not input by the user) and picks a random line from it, whatever it is: a long line, an IP, or even a single word. At the end of the loop, the line should be put into a string so I can use it in other parts of the code.

I tried using `random.choice(lines)` but wasn't sure how to continue from there. After that I tried using:

import random
def random_line(afile):
    line = next(afile)
    for num, aline in enumerate(afile):
        if random.randrange(num + 2): continue
        line = aline
    return line

which, for some reason, also didn't work for me.

Giladiald
  • When you say the code you've shown didn't work, what do you mean? Does it raise an exception? Return garbage? Always return the same thing? – Blckknght Jan 12 '14 at 03:45
  • afile is filename or filehandle ? – James Sapam Jan 12 '14 at 03:51
  • Well, if I understand this correctly, `line` is the string that should get the random line's value, and it doesn't for some reason. Not sure what I'm doing wrong. – Giladiald Jan 12 '14 at 03:54

2 Answers


The last method you posted worked for me. Maybe you are not opening the file correctly. Here is another approach, using `random.choice`:

import random

def random_line(f):
    return random.choice([line for line in f])

f = open("sample.txt", 'r')

print(random_line(f))

Edit:

Another way would be (thanks to @zhangxaochen):

def random_line(f):
    return random.choice(f.readlines())
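
A minimal sketch of using this version end to end, since the question asks for the line to end up in a string. The file name `sample.txt` and its contents are hypothetical and are written to a temporary directory only so the example runs on its own. The `with` block closes the file, and `rstrip` drops the trailing newline:

```python
import os
import random
import tempfile

def random_line(f):
    # readlines() loads every line into a list, which random.choice can index
    return random.choice(f.readlines())

# Hypothetical sample file, created here so the sketch is self-contained
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as out:
    out.write("10.0.0.1\nhello\na much longer line of text\n")

with open(path) as f:
    # Keep the chosen line in a plain string for use elsewhere in the program
    chosen = random_line(f).rstrip("\n")

print(chosen)
```

Note that the file must be opened (and still be open) when `random_line` is called; passing the file name instead of the file object would not work with this function.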
Christian Tapia
  • Short, but expensive if the file is big. It all gets read into memory at once. – mojo Jan 12 '14 at 03:47
  • @senshin +1, should be `afile`, not `f` – zhangxaochen Jan 12 '14 at 03:47
  • OK, so probably I'm not opening the file correctly. Why did you open the file AFTER the random line? – Giladiald Jan 12 '14 at 03:51
  • `def random_line(f)` is just the declaration of the method, there is no problem. Note that I'm opening the file **before** calling `random_line(f)` in the print statement. – Christian Tapia Jan 12 '14 at 03:54
  • ALRIGHT! Now that it works I'm not sure what I did wrong, but that was very simple. Thanks everyone! – Giladiald Jan 12 '14 at 04:00
  • Is it even necessary to use `f.readlines()` vs. `f`? I believe a file is naturally a sequence of lines all by itself. – Mark Ransom Jan 12 '14 at 04:03
  • @MarkRansom: yes, but `random.choice` requires something which can be indexed and which you can call `len` on. (Again, in Python 2 -- can't check 3 right now.) – DSM Jan 12 '14 at 04:05
  • @DSM, I tested it on `xrange` believing it returned a generator, but when I test it on an actual generator function it fails for the exact reason you state. I stand corrected. That means my answer has an advantage over this one. – Mark Ransom Jan 12 '14 at 04:10
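
The point made in the last few comments can be checked with a short sketch. It assumes only the standard library; `lines_gen` and `try_choice` are hypothetical helper names used for illustration. `random.choice` needs an object that supports `len()` and indexing, so a generator or a file-like object fails while a list works:

```python
import io
import random

def lines_gen():
    # A generator: it can be iterated, but has no len() and cannot be indexed
    yield "a\n"
    yield "b\n"

def try_choice(obj):
    # Return the chosen item, or a description of the TypeError that was raised
    try:
        return random.choice(obj)
    except TypeError as exc:
        return "TypeError: %s" % exc

gen_result = try_choice(lines_gen())               # generator: fails
file_result = try_choice(io.StringIO("a\nb\n"))    # file-like object: fails
list_result = try_choice(list(lines_gen()))        # list: works

print(gen_result)
print(file_result)
print(repr(list_result))
```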

Translating another answer of mine from C:

import random

def random_line(afile):
    count = 0
    kept_line = None
    for line in afile:
        # keep the current line with probability 1 / (count + 1)
        if random.randint(0, count) == 0:
            kept_line = line
        count += 1
    return kept_line

Edit: This appears to do the same thing as random.choice. I wonder if they use the same algorithm?

Edit 2: From the comments and a little experimentation, it appears `random.choice` uses a different algorithm, which will be much more efficient if all of the elements are already in memory. That usually isn't the case for a file unless you call `readlines`. The tradeoff is between keeping the entire file in memory and generating n random numbers, one per line.
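
A rough empirical check of the one-pass approach, as a sketch rather than a proof. With n lines, line i (counting from 1) is kept with probability 1/i and then survives each later line j with probability 1 - 1/j, which multiplies out to 1/n for every line. The four-element list, the trial count, and the seed are arbitrary choices for illustration, and `iter(lines)` stands in for a file that can only be read once:

```python
import random
from collections import Counter

def random_line(afile):
    count = 0
    kept_line = None
    for line in afile:
        # Replace the kept line with probability 1 / (count + 1)
        if random.randint(0, count) == 0:
            kept_line = line
        count += 1
    return kept_line

lines = ["a", "b", "c", "d"]
trials = 20000
random.seed(0)  # fixed seed so the run is repeatable

# iter(lines) only supports one forward pass, like reading a file line by line
counts = Counter(random_line(iter(lines)) for _ in range(trials))
for line in lines:
    print(line, counts[line] / float(trials))
```

Each printed frequency should land close to 0.25, while only one line is held in memory at a time.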

Community
Mark Ransom
  • `random.choice` is simply `seq[int(self.random() * len(seq))]`. (Python 2, anyway.) – DSM Jan 12 '14 at 04:04
  • @DSM a sequence can't be indexed directly. You have to take it one element at a time. Try `random.choice(xrange(1000))` for example. – Mark Ransom Jan 12 '14 at 04:07
  • which is why your algorithm isn't the same as `random.choice`. (Although I wouldn't use the word `sequence` there, because a list is a sequence, and obviously can be directly indexed.) – DSM Jan 12 '14 at 04:09
  • @DSM I just reached the same conclusion which I left in a comment to the other answer. – Mark Ransom Jan 12 '14 at 04:11