df = pd.read_csv(r"D:\jihyun0115\desktop\dataframe01.csv")

After running this, I keep getting an error. I have already tried encoding='UTF-8', 'euc-kr', and 'cp949', as well as sep=',', "r", etc. Do you know what would solve this? Please help.

Error message: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

  • What are the first few bytes of the file if you `open` and `read` it *as binary*? We already know the first byte is 0xff. My best guess from that is UTF-16 little-endian, whose byte-order mark is encoded as 0xff 0xfe. – Karl Knechtel Aug 24 '22 at 01:20
  • @S4RUUL that will definitely not help; Pandas handles that sort of thing for you, and anyway your example is still assuming an encoding that was already tried. The problem is to know which encoding to use. – Karl Knechtel Aug 24 '22 at 01:23
  • There are many encodings that can accept any input data and convert to string, such as `iso-8859-1` - but they may not interpret the data *correctly*, and so they could cause problems for reading the CSV structure. It's at least worth a shot; looking at the results sometimes offers another clue. – Karl Knechtel Aug 24 '22 at 01:26
  • @KarlKnechtel thank you so much! i tried UTF-16 with sep='\t' and it's working :) – Chloe Baek Aug 24 '22 at 01:51
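
For anyone landing here with the same error, the "read it as binary" suggestion from the comments can be sketched roughly as follows. This is only an illustration, not something posted in the thread: the helper names `guess_encoding_from_bom` and `sniff_file` are made up, and it only recognises files that begin with a byte-order mark. A leading 0xff 0xfe is the UTF-16 little-endian BOM, and Python's `utf-16` codec reads the BOM itself to pick the byte order.

```python
import codecs

# Known byte-order marks. The UTF-32 entries come first because the
# UTF-32 LE BOM (ff fe 00 00) starts with the UTF-16 LE BOM (ff fe).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding_from_bom(head: bytes):
    """Return a codec name if `head` starts with a known BOM, else None.

    Hypothetical helper: it cannot identify encodings that carry no BOM
    (for example cp949 or euc-kr).
    """
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None


def sniff_file(path, n=4):
    """Read the first `n` bytes of `path` in binary mode and guess from them."""
    with open(path, "rb") as f:
        head = f.read(n)
    return head, guess_encoding_from_bom(head)
```

If `sniff_file` returns `"utf-16"` for the CSV, that name can be passed as `encoding=` to `pd.read_csv`, together with `sep='\t'` if the file turns out to be tab-separated, which is the combination reported to work above. If it returns `None`, the file has no BOM and the encoding has to be found some other way.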
