I wrote a script that reads a .cnf file, analyses its contents and then prints some results (via print()). To read the .cnf file, I use the following lines:

with open('config.cnf') as f:
    file_content = f.read()

If I run this in the Spyder environment (Python 3.6), everything works fine: the script reads config.cnf, performs the operations and prints the results. If I run the exact same script on Linux (with config.cnf located in the same directory), the following error message is shown:

Traceback (most recent call last):
  File "Conf2Monit_V2.py", line 45, in <module>
    file_content = f.read()
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 29834: invalid start byte

I use the following command:

python3 myScript.py 

I am new to Python AND to Linux, so please don't be fed up if this is some basic mistake. Thank you.

Akash Mahapatra
Joey
  • Have you checked that the line endings of the config file you are trying to read are encoded the same way on Windows and Linux? Maybe this points you in the right direction: https://en.wikipedia.org/wiki/Newline#Representation – Aarkon Apr 29 '19 at 08:55
  • Well, I ran another Python script of mine on my Linux machine and it works just fine. It also reads a file via f.read(). The only difference is that the working script reads a .nmap file while the broken one reads a .cnf file. So this should mean that the problem is not in the encoding, right? – Joey Apr 29 '19 at 09:01
  • I don't know how nmap-files are encoded, but plain text files tend to contain the line endings the creating operating system prefers. You could try to feed a file to your script that you created on Linux and where you don’t paste in any line breaks. – Aarkon Apr 29 '19 at 09:13
  • Are you also running this program under Python 3 on Windows, or under Python 2? Because it seems to me that under Python 3 you should get the same error on both platforms. – BoarGules Apr 29 '19 at 09:16
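
A likely explanation for the difference between the two platforms: when open() is called without an encoding argument, Python 3 falls back to the locale's preferred encoding, which is commonly cp1252 on Western European Windows installations and UTF-8 on most Linux systems. The sketch below prints that default and shows how the byte 0x96 from the traceback decodes in each case. That the file was saved as Windows-1252 (cp1252) is an assumption, not something the thread confirms.

```python
import locale

# Encoding that open() falls back to when no encoding= argument is given.
default_encoding = locale.getpreferredencoding(False)
print("default encoding:", default_encoding)

offending = b"\x96"  # the byte reported in the traceback

# Assumption: the file was saved as Windows-1252, where 0x96 is an en dash.
as_cp1252 = offending.decode("cp1252")
print("as cp1252:", repr(as_cp1252))

# In UTF-8, 0x96 is only valid as a continuation byte, so decoding it
# on its own fails with "invalid start byte".
try:
    offending.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as exc:
    utf8_ok = False
    print("as utf-8:", exc)
```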

2 Answers

I'm guessing the issue could be related to the historical differences in line endings between platforms.

If that is the case, please give this a try.

Otherwise, you can try the Linux iconv command.

Can you check the encoding of the characters in the file by running the below command in your Linux environment:

file -i <filename>

Check whether the output contains something like charset=utf-8.

If not, there is a way to translate the encoding to UTF-8, as shown here.

It explains how to convert the encoding of a file ("input.txt") from the ISO-8859-2 code set to UTF-8 (or ASCII) and store the result as "output.txt". Note that iconv writes to standard output, so the result has to be redirected into the output file:

iconv -f ISO-8859-2 -t UTF-8 input.txt > output.txt

So, if the file under consideration is input.txt, the steps you might want to follow are:

  1. file -i input.txt
    and let's say its output is something like
    input.txt: text/plain; charset=iso-8859-2
  2. iconv -f ISO-8859-2 -t UTF-8 input.txt > output.txt
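
The same conversion can be sketched in Python, which may help when iconv is not available or when file reports a charset that iconv does not accept. The function name convert_to_utf8 and the default source encoding cp1252 are assumptions made for illustration; the real source encoding has to be known or guessed, and the call raises UnicodeDecodeError if the guess does not fit the bytes in the file.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="cp1252"):
    """Re-encode the text file `src` as UTF-8 and write it to `dst`.

    Hypothetical helper: `source_encoding` is an assumption about how
    the file was originally saved and must be supplied by the caller.
    Returns the number of characters converted.
    """
    # Work on raw bytes so that line endings are passed through unchanged.
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
    return len(text)
```

For the example in the steps above, the call would be convert_to_utf8("input.txt", "output.txt", source_encoding="iso-8859-2").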
Akash Mahapatra
  • I get "Unknown-8bit" as charset. Now if I try iconv -f unknown8bit -t utf8 Input.cnf Output.txt, I just get an error message – Joey Apr 29 '19 at 12:11

The solution was simple:

I opened the file in the Windows Editor (Notepad). Under "Save as...", I was able to change the encoding setting at the bottom from "ASCII" to "UTF-8". Then I transferred the file back to my Linux system, et voilà: it worked.
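
An alternative that leaves the file untouched is to tell Python which encoding to use when opening it. This is only a sketch: the helper name read_config is made up for illustration, and cp1252 is an assumption about how Windows saved the file.

```python
def read_config(path, encoding="cp1252"):
    """Return the text of `path`, decoded with an explicit encoding.

    Hypothetical helper; the default of cp1252 is an assumption about
    how the file was saved, not something read from the file itself.
    """
    # Passing encoding= makes the result independent of the platform default.
    with open(path, encoding=encoding) as f:
        return f.read()
```

With this helper, the two lines from the question would become file_content = read_config('config.cnf').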

Joey