
I am writing a program that needs to read up to 100,000 lines of integer pairs from sys.stdin and do calculations on them. The whole program, reading the input and performing the calculations on the integers of each line, must run in at most 1 second. The problem is that just going through all the lines of input takes far more than 1 second: with 100,000 lines it takes roughly 10 seconds. Is this performance to be expected for this number of lines?

The input is in the format:

100000 5 100000
72324 563
56487 2252
866 19750
65532 69349
96171 56840
70287 14094
76381 14722
48359 38831
74431 12611
29994 66230
92169 20726
39565 38429
59416 2360
45470 40781
...

The rightmost integer on the first line gives the number of lines that follow. To read this input, I'm using the following code:

import time
from sys import stdin

def read():
    # First line: n, k, q, where q is the number of lines that follow.
    row = stdin.readline().split()
    n, k, q = int(row[0]), int(row[1]), int(row[2])

    # Time only the loop over the q input lines.
    start = time.perf_counter()

    for i in range(q):
        line = stdin.readline().split()
        # Do some calculation on the integers of this line...
    end = time.perf_counter()
    print("Reading time: " + str(end - start))

read()

Am I missing something here? The 1-second limit comes from a school project that computes Q distances between pairs of nodes in a K-ary tree. Thanks in advance.
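
For comparison, here is a minimal sketch of reading the same input with one bulk read instead of one readline() call per line. The helper name read_pairs is made up for illustration, and the sketch assumes every line after the header holds exactly two integers, as in the sample above. It uses an in-memory stream so it runs on its own.

```python
import io


def read_pairs(stream):
    """Read 'n k q' followed by q integer pairs from a binary stream.

    read_pairs is a hypothetical helper, not part of the original program.
    """
    # One bulk read, then split on any whitespace (spaces and newlines).
    tokens = stream.read().split()
    n, k, q = int(tokens[0]), int(tokens[1]), int(tokens[2])
    # Convert only the 2*q tokens that belong to the q pairs.
    values = list(map(int, tokens[3:3 + 2 * q]))
    pairs = list(zip(values[0::2], values[1::2]))
    return n, k, q, pairs


# Small in-memory stand-in for the real input (assumed format).
sample = io.BytesIO(b"100000 5 3\n72324 563\n56487 2252\n866 19750\n")
n, k, q, pairs = read_pairs(sample)
print(n, k, q, pairs)
```

With real input, the same function could be called as read_pairs(sys.stdin.buffer) after importing sys. Whether that fits a given time limit depends on the machine and on how the input is fed to the program.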

  • Way too long, how are you inputting the file? – Veltzer Doron Dec 25 '17 at 11:34
  • @VeltzerDoron I have a file with 100,000 lines. I just copy it and paste it into my command window after running the program. But maybe that's not even remotely equivalent to doing read.py < queries.txt ? – Calle Freme Dec 25 '17 at 11:38
  • 1
    @VeltzerDoron Yes, you are correct. Doing "python read.py < queries.txt" is a LOT quicker. So thanks for that! However, now I'm stuck with the fact that printing out all the results takes up to 3 seconds! I have to output these results as part of the task... I use print(*results, sep="\n") – Calle Freme Dec 25 '17 at 11:50
  • you can't print a hundred thousand lines in under a second. can you print it to a log file perhaps? – Veltzer Doron Dec 25 '17 at 11:54
  • @VeltzerDoron That's what I thought. Reading and doing calculations on 100,000 lines takes about 1.3 seconds, so there I've already passed the time limit... I don't think a log file works. My program is to be submitted through a website that takes it through a number of testcases, and I have to output all my results right back to the website through prints or stdout. – Calle Freme Dec 25 '17 at 12:01
  • Listen, printing is a slow process if you wait for it to finish. What you need to understand is buffering; read this: https://stackoverflow.com/questions/3857052/why-is-printing-to-stdout-so-slow-can-it-be-sped-up – Veltzer Doron Dec 25 '17 at 12:07
  • The only things that are really lightning fast are memory copies, which is basically what happens when you perform asynchronous system calls such as file or network access. – Veltzer Doron Dec 25 '17 at 12:11
  • @VeltzerDoron Nah. Printing isn't slow. Printing *like that* is somewhat slow. – Stefan Pochmann Dec 25 '17 at 12:14
  • @VeltzerDoron And why do you keep your *"you can't print a hundred thousand lines in under a second"* claim up after linking to a question that reports printing a hundred thousand lines in 0.053 seconds? – Stefan Pochmann Dec 25 '17 at 12:29
  • Possible duplicate of [Fastest stdin/out IO in python 3?](https://stackoverflow.com/questions/7982146/fastest-stdin-out-io-in-python-3) – Pavel Dec 25 '17 at 13:06
  • @Pavel I don't think so. That other question is about optimization. This question here is really about not measuring it wrong. – Stefan Pochmann Dec 25 '17 at 13:13
  • That's not printing to a terminal; that's using the system call print, which just copies your string into a system buffer and then dumps it. Printing to the screen can take ages unless it's buffered, which, judging by the time it takes to print, I'm assuming is not the case here. – Veltzer Doron Dec 25 '17 at 13:17
  • BTW, I just timed jupyter printing of 100000 lines and it took about 0.4 secs whereas just accumulating the string for the print command was about a factor of 10 faster – Veltzer Doron Dec 25 '17 at 13:20
  • If @CalleFreme needs the print to the terminal to actually finish before the second is over, then that's impossible in most terminals. – Veltzer Doron Dec 25 '17 at 13:24
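
As a rough illustration of the buffering point discussed in the comments, here is a minimal sketch that builds the whole output as one string and hands it to the stream in a single write() call. The helper name write_results and the sample values are made up for illustration. It writes to an in-memory buffer so it runs on its own.

```python
import io
import sys


def write_results(results, out=sys.stdout):
    """Write one result per line using a single write() call.

    write_results is a hypothetical helper, not part of the original program.
    """
    # Build the full text first so the stream receives one large string.
    out.write("".join(f"{r}\n" for r in results))


# In-memory stand-in for stdout, so the sketch is self-contained.
buffer = io.StringIO()
write_results([3, 1, 4], out=buffer)
print(repr(buffer.getvalue()))
```

Calling write_results(results) with no second argument writes to sys.stdout. How long that takes still depends on where stdout points: a redirected file or pipe usually behaves very differently from a terminal window.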

0 Answers