
I have a simple script that reads values from one CSV file, runs an internal function on each of them (which takes 2-3 seconds per call), and then writes the results into another CSV file.

Here is what it looks like, minus the internal function I referenced.

import csv
import time

pause = 3

with open('input.csv', mode='r') as input_file, \
     open('output.csv', mode='w') as output_file:
    input_reader = csv.DictReader(input_file)
    output_writer = csv.writer(output_file, delimiter=',', quotechar='"',
                               quoting=csv.QUOTE_MINIMAL)
    count = 1
    for row in input_reader:
        # stand-in for the slow internal function mentioned above
        row['new_value'] = "result from function that takes time"
        output_writer.writerow(row.values())
        print('Processed row: ' + str(count))
        count += 1
        time.sleep(pause)

The problem I face is that output.csv remains empty until the whole script has finished executing.

I'd like to access and make use of the file elsewhere whilst this long script runs.

Is there a way I can prevent the delay in writing the values to output.csv?

Edit: here is a dummy input CSV file for the script above:

value
43t34t34t
4r245r243
2q352q352
gergmergre
435q345q35

1 Answer


I think you want to look at the buffering option of open() - that is what controls how often Python flushes to a file.

Specifically, setting open('name', 'wb', buffering=0) reduces buffering to the minimum, but since your output file is opened in text mode you may want something else that makes sense, such as buffering=1 for line buffering.

buffering is an optional integer used to set the buffering policy. Pass 0 to switch buffering off (only allowed in binary mode), 1 to select line buffering (only usable in text mode), and an integer > 1 to indicate the size in bytes of a fixed-size chunk buffer. When no buffering argument is given, the default buffering policy works as follows:

  • Binary files are buffered in fixed-size chunks; the size of the buffer is chosen using a heuristic trying to determine the underlying device’s “block size” and falling back on io.DEFAULT_BUFFER_SIZE. On many systems, the buffer will typically be 4096 or 8192 bytes long.
  • “Interactive” text files (files for which isatty() returns True) use line buffering. Other text files use the policy described above for binary files.

See also How often does python flush to a file?
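
As a minimal sketch, applying this to the script in the question could look like the following, assuming the output file stays in text mode so buffering=1 (line buffering) is allowed; newline='' is the csv module's recommended setting for files it reads or writes:

import csv
import time

pause = 3

with open('input.csv', mode='r', newline='') as input_file, \
     open('output.csv', mode='w', buffering=1, newline='') as output_file:
    input_reader = csv.DictReader(input_file)
    output_writer = csv.writer(output_file, delimiter=',', quotechar='"',
                               quoting=csv.QUOTE_MINIMAL)
    count = 1
    for row in input_reader:
        row['new_value'] = "result from function that takes time"
        # with buffering=1, the completed line is pushed to the file as soon
        # as writerow() writes its terminating newline
        output_writer.writerow(row.values())
        print('Processed row: ' + str(count))
        count += 1
        time.sleep(pause)

Alternatively, calling output_file.flush() after each writerow() has the same effect if you prefer to keep the default buffering.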
