
I have a large generator (maybe 22^56 or so items this time) in Python 2.7. I'd like to know how many items it will yield so that I can estimate the time to complete the task. Unfortunately, when I tried len() on a list comprehension over it, the whole Python process was killed:

>>> len([i for i in giant_word_list_generator])
Killed: 9
[user@host:~/Documents/work/bin|16:59:28]
$ 

How can I estimate the number of items in the generator for progress estimation? I would be okay with an estimate to the nearest 0.25 order of magnitude (e.g. 250,000,000 or 50,000).

user3.1415927

1 Answer


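You cannot get an estimate from the generator itself, as explained here. A generator produces its items lazily and does not support len(), so counting them means consuming it.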

But if the generator is part of your own code, you can probably add a function to the same class or module that returns an estimate of the total number of items the generator will yield.
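As a minimal sketch of that idea, assume (hypothetically, since the question does not show the generator) that the words are every string of a fixed length over some alphabet. The class name WordGenerator and its total() method are made up for illustration. In this special case the count is exact, and for other generators total() could return a rough estimate instead.

```python
import itertools


class WordGenerator(object):
    """Hypothetical generator of all fixed-length words over an alphabet."""

    def __init__(self, alphabet, length):
        self.alphabet = alphabet
        self.length = length

    def __iter__(self):
        # Lazily yield every word; nothing is stored in memory.
        for letters in itertools.product(self.alphabet, repeat=self.length):
            yield "".join(letters)

    def total(self):
        # Known without iterating: one choice per position.
        return len(self.alphabet) ** self.length


words = WordGenerator("abc", 4)
print(words.total())
```

The point of the design is that the size comes from the parameters that define the generator, not from the generator object, so it costs nothing even when iterating would take far too long.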

For instance, if your generator reads a list of sentences or words from a file, you can estimate the total number of generated items from the size of the file, which you can get at constant computational cost (a single system call).
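The file-size heuristic could look roughly like the sketch below. It assumes one item per line and that the first lines are representative of the rest. The function name estimate_line_count and the sample_lines parameter are hypothetical, not part of any library.

```python
import os


def estimate_line_count(path, sample_lines=1000):
    """Estimate the number of lines in a file from its size in bytes."""
    file_size = os.path.getsize(path)  # no need to read the whole file
    if file_size == 0:
        return 0
    sampled_bytes = 0
    sampled = 0
    with open(path, "rb") as handle:
        # Measure the average line length over a small sample only.
        for line in handle:
            sampled_bytes += len(line)
            sampled += 1
            if sampled >= sample_lines:
                break
    average = float(sampled_bytes) / sampled
    return int(round(file_size / average))
```

Dividing the number of items processed so far by this estimate gives an approximate progress fraction. The estimate is only as good as the sample, so files whose line lengths vary a lot along their length would need a larger or more spread-out sample.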

Depending on your application and data, you can apply a similar heuristic to estimate the total size.

adrin