
I am using Python 3.8.5.

I have the following row in an Excel file:

SystemId    URL             FIELD           VALUE
794         google.com/sear Organic_Price   F CFA 20

The VALUE cell contains F CFA 20, which gives me this error:

could not convert string to float: 'f\u202fcfa20'

when I run df_price_all['VALUE'] = df_price_all['VALUE'].astype(float).
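
A minimal sketch of the failure and one possible workaround, assuming the cell really holds the string quoted in the error message. `extract_number` is a hypothetical helper, not something from pandas, and the regular expression only covers plain integers and decimals:

```python
import re
from typing import Optional

# Hypothetical helper: find the first integer or decimal number in a string.
_NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")

def extract_number(text: str) -> Optional[float]:
    """Return the first number found in text as a float, or None."""
    match = _NUMBER.search(text)
    return float(match.group()) if match else None

raw = "f\u202fcfa20"  # the value quoted in the error message

try:
    float(raw)  # float() only accepts numeric text, so this raises
except ValueError as exc:
    print("float() failed:", exc)

print(extract_number(raw))  # 20.0
```

With a DataFrame, the same helper could be applied per cell, for example `df_price_all['VALUE'].astype(str).map(extract_number)`, before any numeric processing.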

I tried to print this value in the Python terminal, but I get:

[screenshot of terminal output]

Why are the other weird characters not getting printed?
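
One way to see what is actually in the string, sketched under the assumption that it matches the value from the error message: `ascii()` escapes every non-ASCII character, and `unicodedata.name()` reports what each code point is. U+202F is a kind of space, which may be why a terminal shows it as blank rather than as a visible symbol.

```python
import unicodedata

value = "f\u202fcfa20"  # assumed to match the string in the error message

# ascii() escapes non-ASCII characters instead of rendering them.
print(ascii(value))

# List every character with its code point and official Unicode name.
for ch in value:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
```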

Also, the default encoding is assumed to be UTF-8, but can we verify this in Python 3?
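
A short sketch of how the different "default" encodings can be inspected with the standard library. Which one matters depends on the operation, and the values printed by the last two lines vary by operating system and environment:

```python
import locale
import sys

# Default used by str.encode() and bytes.decode(); 'utf-8' in Python 3.
print(sys.getdefaultencoding())

# Locale-based default that open() uses in text mode when no encoding is given.
print(locale.getpreferredencoding(False))

# Encoding used when print() writes to the terminal (may be None if redirected).
print(getattr(sys.stdout, "encoding", None))
```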

martineau
Azima
  • I don't think MS Excel is using `utf-8` encoding but rather `cp1252`. This answer might help: https://stackoverflow.com/a/15502713/42346 – mechanical_meat Nov 11 '21 at 02:37
  • Why are you expecting `float(randomtext)` to not fail anyway? – tripleee Nov 11 '21 at 06:54
  • There is no single default encoding scheme. Python usually uses the operating system default, which is UTF-8 on most common operating systems but not on Windows. If you know the encoding, you should force that encoding when reading (so you will not have surprises on other computers). – Giacomo Catenazzi Nov 11 '21 at 17:08
  • Always put the full error message (starting at the word "Traceback") in the question (not in a comment) as text (not a screenshot, not a link to an external portal). It contains other useful information. – furas Nov 11 '21 at 17:16
  • How did you create this Excel file? Maybe you should instead fix the code that creates the file. – furas Nov 11 '21 at 17:18
  • If you want to get the value `20`, then `split(" ")` the text and take only the last element. – furas Nov 11 '21 at 17:20
  • The file is encoded in UTF-8, but Excel is using a legacy (ANSI) encoding, probably Windows-1252. Write the file with UTF-8 w/ BOM instead (`encoding='utf-8-sig'` in Python) and Excel will read it properly. Excel expects the BOM signature to use UTF-8. – Mark Tolonen Nov 11 '21 at 18:01
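
The encoding advice in the comments above can be sketched as follows, assuming the data is exchanged as a CSV file rather than a native .xlsx workbook. The file name and sample rows are made up for illustration. Writing with `encoding='utf-8-sig'` puts a BOM at the start of the file, and passing the encoding explicitly when reading avoids depending on the platform default:

```python
import csv
import os
import tempfile

# Made-up sample rows, loosely modelled on the row in the question.
rows = [
    ["SystemId", "FIELD", "VALUE"],
    ["794", "Organic_Price", "F\u202fCFA 20"],
]

path = os.path.join(tempfile.mkdtemp(), "prices.csv")  # hypothetical file name

# Write as UTF-8 with a BOM signature.
with open(path, "w", encoding="utf-8-sig", newline="") as f:
    csv.writer(f).writerows(rows)

# Check that the file starts with the three BOM bytes.
with open(path, "rb") as f:
    has_bom = f.read(3) == b"\xef\xbb\xbf"
print(has_bom)

# Read it back with an explicit encoding; 'utf-8-sig' strips the BOM.
with open(path, encoding="utf-8-sig", newline="") as f:
    read_back = list(csv.reader(f))
print(read_back == rows)
```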
