
I am writing a program with a while loop that writes a huge amount of data to a CSV file. There may be more than 1 million rows.

Considering running time, memory usage, debugging and so on, which of these two options is better?

  1. Open a CSV file, keep it open, and write line by line until all 1 million rows are written.

  2. Open a file, write about 100 lines, close() it, open it again, write about 100 more lines, and so on.

I guess I just want to know: would it take more memory if we keep the file open the whole time? And which one would take longer?

I can't run the code to compare because I'm using a VPN for the code, and testing by trial and error would cost me too much $$. So some rules of thumb would be enough here.

Yu Hao
BigBowl

1 Answer


Each write is passed on to the disk through a small, fixed-size buffer rather than piling up in memory, so I can't see any benefit in closing and reopening the file. The file isn't loaded into memory when it's opened; you essentially get a pointer to the file and then read or write a portion of it at a time.
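As a rough sketch of the two options from the question, assuming Python and its standard `csv` module (the function names, the batch size of 100, and the in-memory demo rows are all made up for illustration), both approaches end up with the same file. The second one just pays for an extra open/close per batch:

```python
import csv
import os
import tempfile


def write_keep_open(path, rows):
    """Option 1: open the file once and write every row through one handle."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow(row)


def write_reopen(path, rows, batch_size=100):
    """Option 2: reopen the file in append mode for each batch of rows."""
    open(path, "w").close()  # start from an empty file
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            with open(path, "a", newline="") as f:
                csv.writer(f).writerows(batch)
            batch = []
    if batch:  # write whatever is left over after the last full batch
        with open(path, "a", newline="") as f:
            csv.writer(f).writerows(batch)


def demo(n=1000):
    """Write the same rows both ways and report whether the files match."""
    rows = [(i, i * i) for i in range(n)]  # small list, for the demo only
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "keep_open.csv")
        b = os.path.join(d, "reopen.csv")
        write_keep_open(a, rows)
        write_reopen(b, rows)
        with open(a, newline="") as fa, open(b, newline="") as fb:
            return fa.read() == fb.read()
```

Since the output is identical, the simpler first version is usually the one to prefer unless there is a separate reason to close the file between batches.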

Edit

To be more explicit: no, opening a large file will not use a large amount of memory. Similarly, writing a large amount of data will not use a large amount of memory, as long as you don't hold on to the data in memory after it has been written to the file.
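One way to avoid holding the data in memory, sketched here under the assumption that the rows can be produced one at a time (`generate_rows` is a hypothetical stand-in for the real while loop), is to feed the writer from a generator so each row can be discarded once it has been written:

```python
import csv


def generate_rows(n):
    """Hypothetical data source: yields one row at a time, never a full list."""
    for i in range(n):
        yield (i, f"value-{i}")


def stream_to_csv(path, rows):
    """Write rows as they are produced and return how many were written."""
    count = 0
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow(row)  # the row is not kept after this call
            count += 1
    return count
```

Calling `stream_to_csv("out.csv", generate_rows(1_000_000))` keeps only the current row and the file buffer in memory, however many rows are written.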

Josh Russo
  • Thank you for answering. Does that mean opening a very large file won't take a lot of memory? – BigBowl Jun 23 '15 at 17:25
  • @BigBowl: Opening a file does not read its whole content into memory. More importantly, POSIX and related operating systems provide a **file descriptor** mechanism, which lets you read from or write to the file much like an I/O stream. – Konrad Talik Jun 23 '15 at 17:35
  • @BigBowl try to look at this for reading big files: http://stackoverflow.com/questions/519633/lazy-method-for-reading-big-file-in-python – Dušan Maďar Jun 23 '15 at 17:35
  • ktalik is correct: simply opening a file will not load it into memory. Only when you then perform a read operation on that file does a portion of it enter memory. – Josh Russo Jun 23 '15 at 17:44
  • I see. Thank you, guys. The link on reading big files is extremely useful. Appreciate that! – BigBowl Jun 23 '15 at 17:56