2

I am new to Python and didn't know about this until now. I have a basic program inside a for loop that requests data from a site and saves it to a text file. But when I checked my task manager, I saw that the memory usage only increases. This might be a problem if I run the program for a long time. Is this standard behaviour for Python, or can I change it? Here is basically what the program looks like:

savefile = open("file.txt", "r+")
for i in savefile:
    # My code goes here
    savefile.write(i)
# end of loop
savefile.close()
Uber

3 Answers

7

Python does not write to the file until you call .close() or .flush(), or until the internal buffer reaches a certain size. This question might help you: How often does python flush to a file?
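
For illustration, a minimal sketch of that behaviour, assuming a made-up data.txt and an explicit buffer size passed to open() (both are arbitrary choices, not anything from the question):

# Writes land in Python's internal buffer until it fills,
# or until flush() or close() is called.
out = open("data.txt", "w", buffering=8192)  # 8 KiB buffer

for i in range(1000):
    out.write(f"line {i}\n")  # goes into the buffer, not necessarily onto disk yet

out.flush()  # push whatever is buffered to the OS now
out.close()  # close() also flushes anything still buffered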

Almog
  • Ahh, great, I just checked it and it was at 18 MB and rising, so that's why I thought that. Will look into flush, thank you sir! – Uber Oct 21 '15 at 21:23
  • 1
    Note that to have it flush to the actual disk you may also have to call `os.fsync()`: http://stackoverflow.com/questions/7127075/what-exactly-the-pythons-file-flush-is-doing – Almog Oct 21 '15 at 21:25
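
Following up on that comment, a small sketch of combining flush() with os.fsync() to push data past both Python's buffer and the operating system's cache (the file name is just illustrative):

import os

with open("data.txt", "w") as out:
    out.write("important line\n")
    out.flush()             # push Python's buffer down to the OS
    os.fsync(out.fileno())  # ask the OS to write its own cache to the disk
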
3

As @Almog said, Python does not write to the file immediately. Instead, the lines you write are held in an internal buffer in RAM until the buffer fills or you call savefile.flush() or savefile.close(), which writes the buffered data out to the file. This would explain the extra memory usage.

Try changing the loop to this:

savefile = open('file.txt', 'r+')
for i in savefile:
    savefile.write(i)
    savefile.flush() #flushes buffer, saving RAM
savefile.close()
Colby Gallup
  • Great example, I love it. It looks like it is steady at 18 MB, so it's not that big of a deal. Though I don't understand why there is a buffer. Is it because RAM is much faster than an HDD, so the data can be written all at once instead of one line at a time? Or is it because some programs write so fast that the HDD can't keep up? – Uber Oct 21 '15 at 21:47
  • 1
    @Uber - It's very inefficient to write to the disk every time you add a line. Overall, it's better for the computer to wait and write everything at once. – Colby Gallup Oct 22 '15 at 01:49
  • Yeah, I looked into it. I will keep it as-is since it doesn't go above 18.5 MB. Thanks for the help! – Uber Oct 22 '15 at 17:46
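
To illustrate the point in the comments above, a rough timing sketch; the file names and line count are arbitrary, and the exact numbers will depend on the machine and disk:

import time

N = 100_000

# Buffered: let Python decide when to write the data out
start = time.perf_counter()
with open("buffered.txt", "w") as f:
    for i in range(N):
        f.write(f"line {i}\n")
buffered_time = time.perf_counter() - start

# Flushed: force a write to the OS on every single line
start = time.perf_counter()
with open("flushed.txt", "w") as f:
    for i in range(N):
        f.write(f"line {i}\n")
        f.flush()
flushed_time = time.perf_counter() - start

print(f"buffered: {buffered_time:.3f}s  flushed per line: {flushed_time:.3f}s")
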
2

There is a better, more Pythonic solution to this:

with open("your_file.txt", "write_mode") as file_variable_name:
    for line in file_name:
        file_name.write(line)
        file_name.flush()

This code flushes the file after each line, and once the block finishes (or an exception is raised) the with statement closes the file automatically.
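
Roughly speaking, the with statement above behaves like this try/finally version, which is why the file is closed even if something inside the loop raises an exception:

savefile = open("your_file.txt", "r+")
try:
    for line in savefile:
        savefile.write(line)
        savefile.flush()
finally:
    savefile.close()  # runs whether the loop finished normally or raised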