I am a beginner and I am getting a MemoryError in my code. The CSV file is big (1.5 GB) and I want to find every " character and replace it with a blank space. The code works on a smaller file, but on this one it raises MemoryError. I read here that I could use the garbage collector and tried to, but failed. How can I solve this problem?

text = open(r"C:\Users\jarze\abc.csv", "r")
text = ''.join([i for i in text]) \
    .replace('"', '')
x = open(r"C:\Users\jarze\abc.csv","w")
x.writelines(text)
x.close()
martineau
Michał
  • How about using sed? https://askubuntu.com/questions/20414/find-and-replace-text-within-a-file-using-commands Or, on Windows, PowerShell: `powershell -Command "(gc myFile.txt) -replace 'foo', 'bar' | Out-File -encoding ASCII myFile.txt"` – balderman Sep 07 '20 at 14:26
  • 3
    Try reading the file in chunks; don't keep the whole file in memory. That's the best practice anyway. – Raman Mishra Sep 07 '20 at 14:26
  • 1
    What's the purpose of `''.join([i for i in text])`? Wouldn't that result in `text` again? – Thomas Weller Sep 07 '20 at 14:27
  • 2
    Does this answer your question? [Lazy Method for Reading Big File in Python?](https://stackoverflow.com/questions/519633/lazy-method-for-reading-big-file-in-python) – Raman Mishra Sep 07 '20 at 14:28
  • Simply speaking you could read and write 100 or so lines at a time instead of the whole file. – Tarik Sep 07 '20 at 14:41
  • 1
    Use the `csv` module to read the file, and process it a row at time. This should allow you to process files of any size assuming you've got the time and disk space. – martineau Sep 07 '20 at 14:47
  • Does this answer your question? [Reading a huge .csv file](https://stackoverflow.com/questions/17444679/reading-a-huge-csv-file) – Tomerikoo Sep 07 '20 at 14:50
  • 1
    It is best to read the file row by row as suggested by others, then write these rows to a new file, and at the end of the process override the original file with the new one – Tomerikoo Sep 07 '20 at 14:51
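
A minimal sketch of the chunked approach suggested in the comments above. The helper name `strip_quotes_chunked` and the 1 MiB chunk size are assumptions made for illustration, not something from the question. Because the character being removed is a single `"`, a chunk boundary can never split a match, so fixed-size chunks are safe here; that would not hold for a multi-character search string.

```python
def strip_quotes_chunked(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path, dropping every double quote."""
    # newline="" keeps line endings exactly as they are in the source file.
    with open(src_path, "r", newline="") as f_in, \
         open(dst_path, "w", newline="") as f_out:
        while True:
            chunk = f_in.read(chunk_size)  # at most chunk_size characters
            if not chunk:  # an empty string means end of file
                break
            f_out.write(chunk.replace('"', ''))
```

Only one chunk is held in memory at a time, so the file size no longer matters. Note that the output goes to a different path than the input.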

1 Answer

This has been answered for the general case here

In summary, Python's file object is already a lazy iterator, which gives you a memory-efficient way of reading a file line by line (see here):

f_out = open(r"C:\Users\jarze\out_file.csv", "w")

# As Tomerikoo indicates, `with` is the preferred way of opening files
# in Python: the file is closed automatically when the block ends.
with open(r"C:\Users\jarze\in_file.csv", "r") as f_in:
    # Iterating over the file reads one line at a time.
    for line in f_in:
        f_out.write(line.replace('"', ''))

# f_out was opened without `with`, so it must be closed explicitly.
f_out.close()
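
If the goal is to update the original file, as Tomerikoo suggests in the comments on the question, one option is to write to a temporary file and swap it in at the end. This is only a sketch: the helper name `strip_quotes_in_place` and the `.tmp` suffix are assumptions for illustration.

```python
import os


def strip_quotes_in_place(path):
    """Remove every double quote from the file at path, line by line."""
    tmp_path = path + ".tmp"  # hypothetical temporary name next to the original
    # One `with` statement opens both files and closes them automatically.
    with open(path, "r", newline="") as f_in, \
         open(tmp_path, "w", newline="") as f_out:
        for line in f_in:  # the file object yields one line at a time
            f_out.write(line.replace('"', ''))
    # Replace the original only after the new file is completely written.
    os.replace(tmp_path, path)
```

Writing to a separate file first means the original is never truncated while it is still being read, which is what happens if the same path is opened for writing too early.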
aerijman
  • @Tomerikoo yes. – aerijman Sep 07 '20 at 15:16
  • Same as input without '"'. What would you expect? – aerijman Sep 07 '20 at 15:18
  • @Tomerikoo I hope that you did not downvote because _you_ did not replace the filename... I am answering Michal, who is a beginner... – aerijman Sep 07 '20 at 15:19
  • Good point. I did not realize that the name is the same. I hope I did not truncate Michal's file... Next time, please be more explicit and thank you! – aerijman Sep 07 '20 at 15:28
  • 1
    Sorry @aerijman I could indeed get to the point faster... ^_^ Now all I will say is that a more pythonic way is to use `with` and do: `with open(r"in_file.csv", "r") as f_in, open(r"out_file.csv","w") as f_out` instead of explicitly using `open` and `close` – Tomerikoo Sep 07 '20 at 15:30
  • I agree with you, and I would have preferred that (as I always use `with`), but to avoid complicating the life of a beginner I prefer not to introduce new options. If you think that my decision is not optimal, I will take your suggestion and change the answer for good. – aerijman Sep 07 '20 at 15:38
  • You are free to put whatever you think in your answer. I just personally think that this philosophy of *do not complicate an innocent beginner* that I hear a lot of people saying is wrong. If there is a certain idiomatic way of doing stuff, beginners should get used to it right from the start. Even if it complicates them a bit (which I don't think is true in this case), it is still better to get used to the right way than learning something "bad" only to change it in the future... – Tomerikoo Sep 07 '20 at 15:42