26

I'm reading a CSV file and then writing a new one:

import csv

with open('thefile.csv', 'rb') as f:
    data = list(csv.reader(f))

import collections
counter = collections.defaultdict(int)
for row in data:
    counter[row[11]] += 1

writer = csv.writer(open('/pythonwork/thefile_subset1.csv', 'w'))
for row in data:
    if counter[row[11]] >= 500:
       writer.writerow(row)

For some reason I cannot get csv.writer to close the file. When I open the file elsewhere, it opens as READ ONLY because it says the file is still open.

How do I close thefile_subset1.csv after I am done with it?

BartoszKP
Alex Gordon

6 Answers

37
with open('/pythonwork/thefile_subset1.csv', 'w') as outfile:
    writer = csv.writer(outfile)
    for row in data:
        if counter[row[11]] >= 500:
            writer.writerow(row)
John La Rooy
  • Didn't work for me while continuously writing to a CSV file until a `KeyboardInterrupt` occurs. `flush()` method solved the problem. Checkout the [answer](https://stackoverflow.com/a/76193980/10400598) I provided. – Caglayan DOKME May 07 '23 at 12:45
25

You can break the `open` call out into its own variable so that you can close the file later.

f = open('/pythonwork/thefile_subset1.csv', 'w')
writer = csv.writer(f)
f.close()

csv.writer throws a ValueError if you try to write to a closed file.
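A quick way to see that behaviour (a sketch using an in-memory `io.StringIO` as a stand-in for a real file):

```python
import csv
import io

f = io.StringIO()            # in-memory stand-in for a real file
writer = csv.writer(f)
writer.writerow(["a", "b"])  # works while the file is open
f.close()

try:
    writer.writerow(["c", "d"])  # the underlying file is closed now
except ValueError as e:
    print("ValueError:", e)
```

The writer itself has no `close()` method; it only holds a reference to the file object, so the error surfaces from the file, not from csv.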

Donald Miner
6

Close the file, not the csv writer. To do this, you'll need to open the file first, before instantiating your writer, rather than doing both in one line.

import csv
import collections

with open('thefile.csv', 'rb') as f:
    data = list(csv.reader(f))

counter = collections.defaultdict(int)
for row in data:
    counter[row[11]] += 1

# no f.close() needed here: the with statement above already closed the file

fSubset = open('/pythonwork/thefile_subset1.csv', 'w')
writer = csv.writer(fSubset)
for row in data:
    if counter[row[11]] >= 500:
        writer.writerow(row)

fSubset.close()

Also, I would suggest keeping your imports at the top of the script; note that the with statement already closes the input file for you as soon as its block ends.

Jim Clouse
  • 1
    `f.close()` is unnecessary. Why not use `with open` for `fSubset`? –  Aug 18 '16 at 21:18
  • 1
    I'm working with a class that uses (does not extend!) a csv writer, with the writer being shared across several methods indefinitely. The "with open as" pattern doesn't fit all use cases. Ensuring that there is a way to clean up the open file descriptor is important for a long running process. – Jeff W Feb 13 '20 at 12:46
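For that long-running use case, one option is a small wrapper class that owns the file and exposes an explicit `close()`, optionally doubling as a context manager (the class and method names below are just illustrative):

```python
import csv

class SubsetWriter:
    """Hypothetical wrapper that owns a file and a csv.writer across methods."""

    def __init__(self, path):
        self._f = open(path, "w", newline="")
        self._writer = csv.writer(self._f)

    def write_row(self, row):
        self._writer.writerow(row)

    def close(self):
        # closing the file is what actually releases the descriptor;
        # the csv.writer object needs no cleanup of its own
        self._f.close()

    # optional context-manager support, so `with SubsetWriter(...)` also works
    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
```

This keeps the "with open" ergonomics where they fit, while still giving long-lived code an explicit cleanup path.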
5

Force the writer to clean up:

del writer
James
  • 2
    -1 There's no guarantee that the cleanup action will happen immediately (it depends on which implementation of Python you're using). – Greg Hewgill Jul 27 '10 at 21:00
  • 2
    The [`del` statement](http://docs.python.org/reference/simple_stmts.html#grammar-token-del_stmt) *only* removes the name binding: "Deletion of a name removes the binding of that name from the local or global namespace, depending on whether the name occurs in a global statement in the same code block." It doesn't say anything about when cleanup might happen. – Greg Hewgill Jul 27 '10 at 21:04
  • @James Roth, while this probably will work for CPython, it is a really bad idea. It won't work for Jython for example as `writer` will not be closed until it is gc'd. Python says nothing about when the file would be closed by deleting the reference to `writer`. More importantly though it's disguising the intent, which is to close the file which harms readability – John La Rooy Jul 27 '10 at 21:38
  • oy vey james this is really a horrific idea indeed, horrificable – Alex Gordon Jul 27 '10 at 22:28
  • you might be able to redeem yourself: http://stackoverflow.com/questions/3348460/python-getting-rid-of-extra-line – Alex Gordon Jul 27 '10 at 22:28
  • Haha, okay, okay, I was wrong. This was a lame attempt to push my score over 1000! – James Jul 28 '10 at 15:49
  • 2
    +1 Only answer that fixes my problem. All the others are retroactive (i.e. here's what you should do in the future). Should be noted that you should call `gc.collect()` immediately afterward for manually starting garbage collection. – Muhd Aug 23 '12 at 21:17
0

Although it might hurt your application's performance, flush() is also a solution to this problem.

with open("path/to/file.csv", "w", newline="") as csvFile:
    # ... do your things ...
    # ... write to the CSV file ...

    csvFile.flush()

You can call it either at the end of your writing loop or after each call to your writer's writerow(..) method. In my case, where I continuously write to a CSV file until the program is terminated, this was the only working solution.

By the way, if you wonder what causes this situation, take a look at your operating system's file-caching behaviour: the symptom comes from buffering implemented to improve file I/O performance.
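For the long-running case, a per-row flush keeps the on-disk file current (a sketch; the file name and rows are just examples):

```python
import csv

# flush after every row, so an interrupt loses at most the buffered row
with open("events.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for row in ([1, "start"], [2, "tick"], [3, "stop"]):
        writer.writerow(row)
        f.flush()  # push Python's internal buffer out to the OS immediately
```

Note that flush() hands the data to the operating system; if you need it physically on disk (e.g. across a power loss), you would additionally call os.fsync(f.fileno()).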

Caglayan DOKME
-2

Look at the difference:

with open('thefile.csv', 'rb') as f:
    data = list(csv.reader(f))

vs:

writer = csv.writer(open('/pythonwork/thefile_subset1.csv', 'w'))
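Putting the two styles together, the whole script from the question can be written so that both files are closed automatically (a sketch; `write_subset` and its parameters are just illustrative names):

```python
import csv
import collections

def write_subset(src_path, dst_path, col=11, min_count=500):
    """Keep only rows whose value in column `col` appears >= `min_count` times."""
    # the file is closed as soon as this with block exits
    with open(src_path, newline="") as f:
        data = list(csv.reader(f))

    counter = collections.defaultdict(int)
    for row in data:
        counter[row[col]] += 1

    # likewise closed automatically: no dangling writer, no READ ONLY lock
    with open(dst_path, "w", newline="") as out:
        writer = csv.writer(out)
        for row in data:
            if counter[row[col]] >= min_count:
                writer.writerow(row)
```

The first form binds the file to a name that the with statement can close; the second creates a file object nothing ever closes (until the interpreter happens to garbage-collect it), which is exactly why the output file stayed locked.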
pillmuncher