# 'charmap' codec can't decode byte 0x8d in position 1148

Question

I want to read several .txt documents, but I get an error on this line:

lyrics = "".join(f.readlines())

The error is:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 1148: character maps to <undefined>

How can I fix this? Any help would be appreciated.

My function is:

import os
import re

def read_lyrics():
    reg1 = re.compile(r"\.txt$")
    reg2 = re.compile(r"([0-9]+)\.txt")
    reg3 = re.compile(r".*_([0-9])\.txt")
    reg4 = re.compile(r"\[.+\]")
    reg5 = re.compile(r"info\.txt")
    lyrics_dictionary = {}
    # iterate over all directories and load every song (txt file)
    for i in os.listdir():
        if os.path.isdir(i):
            for path, sub, items in os.walk(i):
                if any([reg1.findall(item) for item in items]):
                    for item in items:
                        if reg5.findall(item):
                            continue
                        if reg3.findall(item):
                            num = ["0" + reg3.findall(item)[0]]
                            name = "_".join(path.split("/") + num)
                        else:
                            name = "_".join(path.split("/") + reg2.findall(item))

                        print("The path is: ", path)
                        print("The item is: ", item)

                        with open(os.path.join(path, item), "r") as f:
                            print("The file path is: ", f)
                            # this is the line that raises the UnicodeDecodeError
                            lyrics = "".join(f.readlines())

                            lyrics = reg4.subn("", lyrics)[0]
                            lyrics_dictionary[name] = lyrics
    return lyrics_dictionary

So when you tried putting `'charmap'+codec+can't+decode+byte+0x8d+in+position+1148` into a search engine, and looked at [the results](https://duckduckgo.com/?t=ffsb&q=%27charmap%27+codec+can%27t+decode+byte+0x8d+in+position+1148&ia=web), and tried the advice in the results, what happened? — Karl Knechtel, Jan 30 '21 at 09:53
Your `open` call is `open(os.path.join(path,item),"r") as f`. In Python 3 that would open a file with the default encoding of UTF-8. But you are getting an error message about a charmap encoding, which suggests to me that you might be running this code in Python 2. If you are, your `print()` calls will put parens in the output, as `('The path is', '...')`. — BoarGules, Jan 30 '21 at 10:32

score 1 · Accepted Answer · edited Jan 30 '21 at 09:55

When you call open() without an encoding argument, Python falls back to a default encoding, and that default most likely does not match your file. Try passing the encoding explicitly, for example: with open(os.path.join(path,item),"r",encoding='utf8'). Or, if you can, check which encoding was used to write the file.
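As a rough sketch of that suggestion, the helper below reads a file with an explicit encoding and, if decoding still fails, re-reads it with undecodable bytes replaced by U+FFFD so they are easy to spot. The name `read_text` and the UTF-8 guess are assumptions made for illustration, not something taken from the question; the real files may use a different encoding. On Windows the default encoding is often cp1252, which Python reports as 'charmap' and which has no character assigned to byte 0x8d.

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a whole text file using an explicit encoding.

    `read_text` is a hypothetical helper; "utf-8" is only a guess
    at what the lyrics files actually use.
    """
    try:
        with open(path, "r", encoding=encoding) as f:
            return f.read()
    except UnicodeDecodeError:
        # The guess was wrong for at least one byte. Re-read and replace
        # each undecodable byte with U+FFFD instead of raising.
        with open(path, "r", encoding=encoding, errors="replace") as f:
            return f.read()


# Small demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "song.txt")
    with open(sample, "wb") as f:
        f.write("caf\u00e9 [Chorus]\n".encode("utf-8"))
    print(repr(read_text(sample)))

# Reproducing the error from the question: cp1252 cannot decode 0x8d.
try:
    b"\x8d".decode("cp1252")
except UnicodeDecodeError as exc:
    print(exc)
```

In the function from the question, the `open(...)` call and `"".join(f.readlines())` could then be swapped for a single `lyrics = read_text(os.path.join(path, item))`. If U+FFFD characters show up in the result, the encoding guess is probably wrong and worth revisiting.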

Try checking the answers to this post; one of them might help you.

edited Jan 30 '21 at 09:55 by Karl Knechtel

answered Jan 30 '21 at 09:50 by Noga K

When the question is answered by an existing Stack Overflow post like this, please do not add your own answer - instead, vote to close the question as a duplicate. – Karl Knechtel Jan 30 '21 at 09:56
