0

Please do not behead me for my noob question. I have looked at many other questions on Stack Overflow concerning this topic, but haven't found a solution that works as intended.

The problem: I have a fairly large txt file (about 5 MB) that I want to copy into a new file via readlines() or any other built-in string-handling function. For smaller files the following code works (only sketched here):

f = open('C:/.../old.txt', 'r');
n = open('C:/.../new.txt', 'w');
for line in f:
    print(line, file=n);

However, as I found out here (UnicodeDecodeError: 'charmap' codec can't encode character X at position Y: character maps to undefined), internal restrictions of Windows prohibit this from working on larger files. So far, the only solution I came up with is the following:

f = open('C:/.../old.txt', 'r', encoding='utf8', errors='ignore');
n = open('C:/.../new.txt', 'a');
for line in f:
    print(line, file=sys.stderr) and append(line, file='C:/.../new.txt');   

f.close();
n.close();

But this doesn't work. I do get a new.txt file, but it is empty. So, how do I iterate through a long txt file and write every line into a new txt file? Is there a way to read sys.stderr as the source for the new file (I actually have no idea what sys.stderr is)? I know this is a noob question, but I don't know where to look for an answer anymore.

Thanks in advance!

hab
  • `print(line, file=sys.stderr) and append(line, file='C:/.../new.txt')`: the second part of this statement will never be executed, because `print()` returns `None`, which is falsy, so `and` short-circuits before reaching it. – AChampion Mar 01 '17 at 05:17

4 Answers

1

There is no need to use print(); just write to the file directly:

with open('C:/.../old.txt', 'r') as f, open('C:/.../new.txt', 'w') as n:
    n.writelines(f)

However, it sounds like you may have an encoding issue, so make sure that both files are opened with the correct encoding. If you provide the error output perhaps more help can be provided.
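To illustrate what opening both files with an explicit encoding could look like, here is a minimal sketch. The function name `copy_text_file`, its parameters, and the UTF-8 defaults are assumptions for illustration; the real encoding of the source file is not known from the question and would have to be substituted.

```python
def copy_text_file(src, dst, src_encoding="utf-8", dst_encoding="utf-8"):
    """Copy a text file line by line, decoding and encoding explicitly.

    The default encodings are assumptions; pass the file's real encoding.
    """
    # Naming the encoding on both sides avoids falling back to the
    # platform default (for example a 'charmap' codec on Windows).
    with open(src, "r", encoding=src_encoding) as f, \
         open(dst, "w", encoding=dst_encoding) as n:
        for line in f:
            # Each line keeps its trailing newline, so write() is enough.
            n.write(line)
```

If the source encoding is guessed wrong, this will still raise UnicodeDecodeError, which at least shows that the guess needs to change rather than silently dropping characters.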

BTW: Python doesn't use ; as a line terminator. It can be used to separate two statements if you want to put them on the same line, but this is generally considered bad form.

AChampion
  • Yes, it might be an encoding issue. I get the "UnicodeDecodeError: 'charmap' codec can't decode ..." However, I have tried many possible encodings, none of which have worked. Sorry for using ; . I might have mixed something up with Java here. :D – hab Mar 01 '17 at 05:25
  • Do you still have this issue if you don't write to the console? – AChampion Mar 01 '17 at 05:55
  • Yes, the new file stays empty, no matter what. – hab Mar 01 '17 at 05:57
0

You can redirect standard output to a file, as in my code below. I successfully copied a 6 MB text file this way.

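import sys

# Redirect standard output to the target file.
bigoutput = open("bigcopy.txt", "w")
sys.stdout = bigoutput
with open("big.txt", "r") as biginput:
    # Iterate over the file directly instead of loading all lines at once.
    for bigline in biginput:
        # print() adds its own newline, so strip the one from the input.
        print(bigline.replace("\n", ""))
# Restore the real standard output before closing the file.
sys.stdout = sys.__stdout__
bigoutput.close()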
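A variation on the same idea, sketched with `contextlib.redirect_stdout` from the standard library, which puts `sys.stdout` back automatically when the block ends. The function name `copy_via_print` and the UTF-8 default are assumptions for illustration, not part of the original answer.

```python
import contextlib


def copy_via_print(src, dst, encoding="utf-8"):
    """Copy src to dst by printing each line while stdout is redirected."""
    with open(src, "r", encoding=encoding) as biginput, \
         open(dst, "w", encoding=encoding) as bigoutput, \
         contextlib.redirect_stdout(bigoutput):
        for bigline in biginput:
            # end="" because each line already carries its own newline.
            print(bigline, end="")
```

This does not fix a wrong encoding by itself; it only keeps the redirection from leaking into the rest of the program.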
Dongeon Kim
  • Doesn't work. I still get the UnicodeDecodeError. But thanks anyway. Your code looks like it should work, though. I think it actually might be a decoding problem. – hab Mar 01 '17 at 05:24
  • @hab Oh, I'm sorry to hear that. If you don't mind, can you upload your text file so that I can give you some advice? – Dongeon Kim Mar 01 '17 at 11:03
0

Why don't you just use the shutil module and copy the file?
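For reference, a minimal sketch of that approach. `shutil.copyfile` copies the file byte for byte, so no text decoding happens and the encoding of the file does not matter. The wrapper name `copy_bytes` and the file names in the usage comment are hypothetical.

```python
import shutil


def copy_bytes(src, dst):
    """Copy src to dst byte for byte; no text decoding takes place."""
    # shutil.copyfile returns the destination path.
    return shutil.copyfile(src, dst)


# Hypothetical usage:
# copy_bytes("old.txt", "new.txt")
```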

sureshvv
  • I don't just want to copy the txt file; I also want to build lists and tables from the data. I want to come up with a simple analysis tool as a starting point. – hab Mar 01 '17 at 05:30
0

You can try this code; it works for me.

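# Open both files in binary mode: writing str lines to a "wb" file
# raises TypeError in Python 3, and binary mode skips decoding entirely.
with open("file_path/../large_file.txt", "rb") as f:
    with open("file_path/../new_file", "wb") as new_f:
        new_f.writelines(f.readlines())
        new_f.close()
    f.close()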
  • You don't need to close a file if you use `with open()`. – Huy Vo Mar 01 '17 at 05:43
  • It's basically the same solution that others have provided. But why are you using "wb" when writing into the new file? I couldn't find this in the Python documentation. – hab Mar 01 '17 at 05:43
  • I used "wb" to open the file in binary mode. Since the file in my code has no specific extension, it may be a text file or a binary file. – pravin lanjile Mar 27 '17 at 06:45