I have a data frame as taken from SDSS database. Example data is here.
I want to remove the character 'b' from data['class']
. I tried
data['class'] = data['class'].replace("b','')
But I am not getting the result.
I have a data frame as taken from SDSS database. Example data is here.
I want to remove the character 'b' from data['class']
. I tried
data['class'] = data['class'].replace("b','')
But I am not getting the result.
You're working with byte strings. You might consider str.decode
:
data['class'] = data['class'].str.decode('utf-8')
Further explanation:
df = pd.DataFrame([b'123']) # create dataframe with b'' element
Now we can call
df[0].str.decode('utf-8') # returns a pd.series applying decode on str succesfully
df[0].decode('utf-8') # tries to decode the series and throws an error
Basically what you are doing with .str() is applying it for all elements. It could also be written like this:
df[0].apply(lambda x: x.decode('utf-8'))