For example, I have a dataframe that looks like this:

category  numbers
a         100
b         200
c         200

I want to add a column that shows each row's share of the total as a percentage (with the percent sign). This is what I've tried:

df['percentage'] = str(100 * df['numbers']/df['numbers'].sum()) + '%'

However, this puts the string representation of the entire Series into every row:

category  numbers  percentage
a         100      0 20.00 1 40.00 2 40.00 Name: numbers, dtype: float64%
b         200      0 20.00 1 40.00 2 40.00 Name: numbers, dtype: float64%
c         200      0 20.00 1 40.00 2 40.00 Name: numbers, dtype: float64%

What can I do to make the column read 20%, 40%, 40%?

  • Don't use `str(...)` but `df['numbers'].div(df['numbers'].sum()).mul(100).astype(str).add('%')` – mozway Dec 24 '22 at 07:24
  • If the `%` is only needed for display, that's fine. Otherwise, converting the column to `str` is bad practice, since you may want to do further arithmetic on the percentage values. It is better to format the numbers as strings at display time than to store them as strings in the DataFrame. – imraklr Dec 24 '22 at 07:36
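
To illustrate imraklr's point, here is a minimal sketch that keeps the column numeric and only adds the `%` when the frame is rendered as text. The use of `DataFrame.to_string` with a `formatters` mapping is one possible way to do this, not something proposed in the thread, and the name `rendered` is just illustrative.

```python
import pandas as pd

df = pd.DataFrame({"category": ["a", "b", "c"], "numbers": [100, 200, 200]})

# Keep the percentage as a float so it stays usable in later arithmetic.
df["percentage"] = 100 * df["numbers"] / df["numbers"].sum()

# Append the "%" sign only while rendering; the stored values are unchanged.
rendered = df.to_string(index=False, formatters={"percentage": "{:.0f}%".format})
print(rendered)
```

After this, `df["percentage"]` is still a `float64` column, so sums, comparisons, and sorting behave numerically.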

1 Answer

As mozway already commented, the key is to use `.astype(str)`, which turns the value of each individual cell into a string, whereas `str()` gives you the string representation of the Series as a whole. That single string is then assigned to every row, which is the output you saw.

>>> import pandas as pd
>>> df = pd.DataFrame({"numbers": [100, 200, 200]})
>>> tmp = 100 * df.numbers / df.numbers.sum()
>>> tmp
0    20.0
1    40.0
2    40.0
Name: numbers, dtype: float64
>>> df["percentage"] = tmp.astype(str) + "%"
>>> df
   numbers percentage
0      100      20.0%
1      200      40.0%
2      200      40.0%
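
If you want exactly `20%` rather than `20.0%`, as in the question, one option is to format each value yourself instead of relying on `.astype(str)`. This is a sketch using `Series.map` with a format string; the zero-decimal format `"{:.0f}%"` and the name `share` are my choices for illustration, so adjust the precision to taste.

```python
import pandas as pd

df = pd.DataFrame({"numbers": [100, 200, 200]})
share = 100 * df["numbers"] / df["numbers"].sum()

# Format every value with zero decimal places and append "%".
df["percentage"] = share.map("{:.0f}%".format)
print(df)
```

Note that the format string rounds, so values such as 33.333 would be shown as `33%`.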
fsimonjetz