I'm currently having an issue with my Python 3 code.

 replace_line('Products.txt', line, tenminus_str)

This is the line I'm trying to get working with UTF-8. However, when I try to do this the way I did with other lines, I get errors such as "no attribute" ones, and when I try to add, for example...

.decode("utf8")

...to the end of it, I still get errors saying that it is using ASCII. I also tried other methods that worked with other lines, such as adding io. in front of open and adding a comma followed by

encoding = 'utf8'

The function that I am using for replace_line is:

def replace_line(file_name, line_num, text):
    lines = open(file_name, 'r').readlines()
    lines[line_num] = text
    out = open(file_name, 'w')
    out.writelines(lines)
    out.close()

How would I fix this issue? Please note that I'm very new to Python and not advanced enough to do debugging well.

EDIT: Different fix to this question than 'duplicate'

EDIT 2: I have another error with the function now.

  File "FILELOCATION", line 45, in refill
    replace_line('Products.txt', str(line), tenminus_str)
  File "FILELOCATION", line 6, in replace_line
    lines[line_num] = text
TypeError: list indices must be integers, not str

What does this mean and how do I fix it?

Name not Found
  • Show us your stacktrace, show us your data – Falmarri Nov 15 '16 at 23:59
  • What do you mean? – Name not Found Nov 16 '16 at 00:02
  • use utf_8_sig, instead of utf8, your file might start with bom – YOU Nov 16 '16 at 00:02
  • .decode('utf_8_sig') – YOU Nov 16 '16 at 00:05
  • `decode('utf8', errors='ignore')` – Peter Wood Nov 16 '16 at 00:07
  • Possible duplicate of [UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 1](http://stackoverflow.com/questions/10561923/unicodedecodeerror-ascii-codec-cant-decode-byte-0xef-in-position-1) – Peter Wood Nov 16 '16 at 00:08
  • @PeterWood It's still giving me the same error for some reason. I don't think it's a duplicate (but don't take my word for it), as it worked in other lines for other things. – Name not Found Nov 16 '16 at 00:09
  • `io.open("Products.txt","r", encoding = 'utf8')`, if that's what you mean by "what's the encoding". And what do you mean by specifying it with both open calls? I'm a newbie to Python and haven't come across many terms yet. It's already a with statement, I think: `with io.open("Products.txt","r", encoding = 'utf8') as f: for word in f.readlines():` – Name not Found Nov 16 '16 at 00:17
  • Instead of `out = open(file_name, 'w')` you can do `with open(file_name, 'w', encoding='utf-8') as out:` and then indent the following `out.writelines(lines)` line. At the end of the `with` block the file gets closed, and if there are IO problems with the file the `with` makes sure the file gets closed. – PM 2Ring Nov 16 '16 at 00:26
  • You may find this article helpful: [Pragmatic Unicode](http://nedbatchelder.com/text/unipain.html), which was written by SO veteran Ned Batchelder. – PM 2Ring Nov 16 '16 at 00:31
  • Thanks, I'll look at that tomorrow morning. However, I'm still having the same error; hopefully a morning mind is better than a tired one. – Name not Found Nov 16 '16 at 00:36
  • "stacktrace" means the error message and all the lines before it - it helps narrow down *exactly* where the error is occurring. – Mark Ransom Nov 16 '16 at 04:16

4 Answers

Change your function to:

def replace_line(file_name, line_num, text):
    with open(file_name, 'r', encoding='utf8') as f:
        lines = f.readlines()
    lines[line_num] = text
    with open(file_name, 'w', encoding='utf8') as out:
        out.writelines(lines)

encoding='utf8' will decode your UTF-8 file correctly.

with automatically closes the file when its block is exited.

Since your file started with \xef, it likely has a UTF-8-encoded byte order mark (BOM) at the beginning. The above code will keep the BOM on output. If you don't want it, use utf-8-sig as the input encoding, and it will be removed automatically when the file is read.
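
To make the BOM behaviour concrete, here is a minimal sketch, not a definitive implementation. It assumes a throwaway file in a temporary directory whose contents (`apple`, `banana`) are made up for illustration; only `replace_line` and the two encoding names come from the answer above.

```python
import os
import tempfile


def replace_line(file_name, line_num, text):
    # Read with utf-8-sig so a leading BOM, if present, is dropped.
    with open(file_name, 'r', encoding='utf-8-sig') as f:
        lines = f.readlines()
    lines[line_num] = text
    # Write back as plain UTF-8 (no BOM is added).
    with open(file_name, 'w', encoding='utf8') as out:
        out.writelines(lines)


# Hypothetical sample file: UTF-8 BOM followed by two lines.
path = os.path.join(tempfile.mkdtemp(), 'Products.txt')
with open(path, 'wb') as f:
    f.write(b'\xef\xbb\xbfapple\nbanana\n')

# Plain utf8 keeps the BOM as the character U+FEFF in the first line.
with open(path, encoding='utf8') as f:
    first_utf8 = f.readline()

# utf-8-sig strips it.
with open(path, encoding='utf-8-sig') as f:
    first_sig = f.readline()

# Replace the second line (index 1) and inspect the result.
replace_line(path, 1, 'cherry\n')
with open(path, 'rb') as f:
    result_bytes = f.read()
with open(path, encoding='utf8') as f:
    result_lines = f.readlines()
```

Reading with utf-8-sig is also safe for files that have no BOM, so it is a reasonable default when you are unsure how the file was saved.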

Mark Tolonen

The codecs module is just what you need:

import codecs
def replace_line(file_name, line_num, text):
    f = codecs.open(file_name, 'r', encoding='utf-8')
    lines = f.readlines()
    lines[line_num] = text
    f.close()
    w = codecs.open(file_name, 'w', encoding='utf-8')
    w.writelines(lines)
    w.close()
Enix
  • No need for `codecs` in Python 3, `open` also supports the `encoding` parameter. – Mark Ransom Nov 16 '16 at 04:17
  • @MarkRansom thank you for pointing out. actually, i am a developer of python2... :) – Enix Nov 16 '16 at 04:33
  • I have another error with the function now. `File "LOCATION", line 45, in refill replace_line('Products.txt', str(line), tenminus_str)` `File "LOCATION", line 6, in replace_line lines[line_num] = text` `TypeError: list indices must be integers, not str` What does this mean and how do I fix it? – Name not Found Nov 16 '16 at 13:26
  • @NamenotFound The second parameter, `line_num`, should be an integer indicating which line you want to replace. You passed a `str` to the function, so it fails with that error. You should call the function like `replace_line("Products.txt", 1, tenminus_str)`, which means you want to replace the second line with the string `tenminus_str`. – Enix Nov 16 '16 at 15:35
  • No need for `codecs` in Python 2, either. `io.open` is present in both Python 2 and Python 3 and works like Python 3's built-in `open`. – Mark Tolonen Sep 01 '21 at 02:14
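
The `TypeError` discussed in the comments above can be reproduced on a plain list. The following is a small sketch only: the list contents and the variable `line` holding the number as a string are assumptions made for illustration, not the asker's actual data.

```python
# Hypothetical file contents, as readlines() would return them.
lines = ['apple\n', 'banana\n', 'cherry\n']

# Hypothetical: the line number arrived as a string, e.g. via str(line).
line = '1'

# Indexing a list with a str raises TypeError.
error_message = None
try:
    lines[line] = 'blueberry\n'
except TypeError as exc:
    error_message = str(exc)

# Converting to int (or never calling str() on it) makes the index valid.
lines[int(line)] = 'blueberry\n'
```

In the call from the traceback, that would mean passing `line` (or `int(line)`) rather than `str(line)` as the second argument to `replace_line`.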

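To handle encoding problems, you can try adding the following settings at the top of your script. Note that this only works in Python 2: `reload` is not a built-in and `sys.setdefaultencoding` does not exist in Python 3.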


import sys
reload(sys)
sys.setdefaultencoding('utf-8')
Type = sys.getfilesystemencoding()
luyishisi
  • No. [Why sys.setdefaultencoding will break code](https://anonbadger.wordpress.com/2015/06/16/why-sys-setdefaultencoding-will-break-code/) – Mark Tolonen Nov 16 '16 at 08:06

Try adding encoding='utf8' when you open a file for reading:

with open("../file_path", encoding='utf8') as f:
    contents = f.read()  # your code goes here
tausif