
I have a DataFrame column whose values are in degrees and include the degree sign:

42.9377º
42.9368º
42.9359º
42.9259º
42.9341º

I want to replace the degree symbol with the digit 0.

I tried using regex and str.replace, but I can't figure out the exact Unicode character.

The source xls has it as º

The error shows it as an obelus ÷

Printing the DataFrame shows it as ?

The exact position of the degree sign varies with the rounding of the decimals, so I can't replace it by exact string position.

HDunn

2 Answers


Use str.replace:

df['a'] = df['a'].str.replace('º', '0')
print (df)
          a
0  42.93770
1  42.93680
2  42.93590
3  42.92590
4  42.93410

# Check the hex code point of the character
print("{:02x}".format(ord('º')))
ba

df['a'] = df['a'].str.replace(u'\xba', '0')
print (df)
          a
0  42.93770
1  42.93680
2  42.93590
3  42.92590
4  42.93410
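
If it is unclear which character the column actually holds, it may help to list every non-ASCII character together with its code point and Unicode name. The following is a minimal sketch, not part of the original answer: the helper `describe_non_ascii` and the sample data are hypothetical. It also assumes that the column might contain either U+00BA (the character used above, which Unicode names MASCULINE ORDINAL INDICATOR) or the look-alike U+00B0 (DEGREE SIGN), and replaces both.

```python
import unicodedata

import pandas as pd

# Hypothetical sample data: two values use U+00BA, one uses U+00B0.
df = pd.DataFrame({'a': ['42.9377\u00ba', '42.9368\u00ba', '42.9359\u00b0']})


def describe_non_ascii(values):
    """Map each non-ASCII character found to (code point, Unicode name)."""
    found = {}
    for value in values:
        for ch in value:
            if ord(ch) > 127:
                code_point = 'U+{:04X}'.format(ord(ch))
                found[ch] = (code_point, unicodedata.name(ch, 'UNKNOWN'))
    return found


chars = describe_non_ascii(df['a'])
print(chars)

# Replace either look-alike character with the digit 0.
df['a'] = df['a'].str.replace('[\u00ba\u00b0]', '0', regex=True)
print(df)
```

Writing the characters as `\u00ba` and `\u00b0` escapes keeps the script independent of the source file's encoding.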

Alternatively, extract the float with a regex and append '0':

df['a'] = df['a'].str.extract(r'(\d+\.\d+)', expand=False) + '0'
print (df)
          a
0  42.93770
1  42.93680
2  42.93590
3  42.92590
4  42.93410

Or, if the last character of every value is º, you can use str indexing:

df['a'] = df['a'].str[:-1] + '0'
print (df)
          a
0  42.93770
1  42.93680
2  42.93590
3  42.92590
4  42.93410
jezrael

If you know that it's always the last character you could remove that character and append a "0".

s = "42.9259º"

s = s[:-1]+"0"

print(s) # 42.92590
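
If some values might not end with the symbol, slicing unconditionally would cut off a real digit. A minimal sketch of a guarded variant follows; the function name `replace_trailing_mark` and the choice to accept both U+00BA and U+00B0 are assumptions for illustration, not part of the original answer.

```python
# Characters treated as a "degree" mark (assumption: either look-alike may occur).
DEGREE_LIKE = ('\u00ba', '\u00b0')


def replace_trailing_mark(value, marks=DEGREE_LIKE, fill='0'):
    """Swap a trailing degree-like character for `fill`; otherwise return value unchanged."""
    # str.endswith accepts a tuple of suffixes to test.
    if value.endswith(marks):
        return value[:-1] + fill
    return value


print(replace_trailing_mark('42.9259\u00ba'))
print(replace_trailing_mark('42.9259'))
```

On a pandas column the same function could be applied element by element, for example with `df['a'].map(replace_trailing_mark)`.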
Mike Scotty