
I'm reading several spreadsheets into a data frame and there is an ID field that is a string in some spreadsheets and a number in others. I've converted it into a string, which is the data type I need, but I'm ending up with some IDs that have a ".0" at the end. How do I remove the decimal and zero?

Example: ID number 805096730.0 should be 805096730

Dread

2 Answers


Use astype with replace:

import pandas as pd

df = pd.DataFrame({'ID':[805096730.0,805096730.0]})

# Raw string plus the $ anchor, so only a trailing ".0" is removed
df['ID'] = df['ID'].astype(str).replace(r'\.0$', '', regex=True)
print (df)
          ID
0  805096730
1  805096730
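
When the column mixes strings and floats, as can happen after combining several spreadsheets, the same idea can be sketched as below. The sample values are made up for illustration, and `.str.replace` is used here as an alternative to `replace`. The `$` anchor matters: without it, an ID such as 'A10.05' would lose the ".0" in the middle.

```python
import pandas as pd

# Hypothetical sample: IDs as they might look after concatenating sheets,
# some already strings and some read as floats.
df = pd.DataFrame({'ID': ['805096730', 805096731.0, 'A10.05']}, dtype=object)

# Convert every value to str, then strip only a trailing ".0".
df['ID'] = df['ID'].astype(str).str.replace(r'\.0$', '', regex=True)
print(df['ID'].tolist())
```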

Or pass the dtype parameter when reading the file, so the IDs are read as strings from the start:

df = pd.read_excel(file, dtype={'ID':str})
jezrael

Check the type of your numbers before converting them to strings. They appear to be floats rather than integers. If so, convert them to integers first:

import pandas as pd

s = pd.Series([123.0, 456.0])
s = s.astype('int64')  # vectorized float-to-integer conversion
print(s)

0    123
1    456
dtype: int64

Then convert the integers to strings:

s = s.astype(str)
print(s.iloc[1])

456
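
One caveat: converting to a plain integer dtype raises an error if the column contains missing values (NaN). A minimal sketch of one way around that, assuming the IDs are whole numbers, is to go through the nullable "Int64" dtype and then the "string" dtype, which keeps missing entries as missing. The sample values below are illustrative only.

```python
import pandas as pd

# Hypothetical sample with a missing ID (illustrative values only).
ids = pd.Series([123.0, None, 456.0])

# "Int64" (capital I) is the nullable integer dtype;
# "string" keeps the missing entry as <NA> instead of the text "nan".
ids = ids.astype('Int64').astype('string')
print(ids.tolist())
```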
Boris M