
Let's say I have a DataFrame like this:

df = pd.DataFrame({'col1':[0.2, 0.3, .5], 'col2':['a', 'b', 'c']})

And I want to obtain a third column col3 which would be something like:

{'col3':['20% a', '30% b', '50% c']}

Is there any way of solving this without iterating over each row of the DataFrame?

jpp
m33n

1 Answer


This is one way.

import pandas as pd

df = pd.DataFrame({'col1':[0.2, 0.3, .5], 'col2':['a', 'b', 'c']})

df['col3'] = (df['col1']*100).astype(int).apply(str) + '% ' + df['col2']

print(df)

   col1 col2   col3
0   0.2    a  20% a
1   0.3    b  30% b
2   0.5    c  50% c
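
One caveat with the integer cast: `astype(int)` truncates rather than rounds, so a value whose product with 100 lands just below a whole number (0.29 is a likely example) can come out one percent low. A minimal sketch, assuming rounding to the nearest whole percent is what is wanted; the 0.29 row is added here purely for illustration and is not part of the original data:

```python
import pandas as pd

# Illustrative data: 0.29 is an assumed extra value, chosen to show float noise
df = pd.DataFrame({'col1': [0.2, 0.29, 0.5], 'col2': ['a', 'b', 'c']})

# Round first, then cast, so floating-point noise is not truncated downward
pct = (df['col1'] * 100).round().astype(int).astype(str)
df['col3'] = pct + '% ' + df['col2']

print(df)
```

This stays vectorised; the only change from the line above is the extra `.round()` before the cast.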

As @JonClements points out, you can use a lambda with string formatting, but I have an allergy to them; they're only good in small doses:

df['col3'] = df.apply(lambda r: f'{r.col1 * 100:.0f}% {r.col2}', axis=1)
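
A possible middle ground, sketched here as an assumption rather than something benchmarked in this thread: format only the numeric column with Python's built-in `%` format specifier via `Series.map`, then concatenate the strings column-wise. This avoids the row-wise `apply` over the whole DataFrame.

```python
import pandas as pd

df = pd.DataFrame({'col1': [0.2, 0.3, .5], 'col2': ['a', 'b', 'c']})

# '{:.0%}' multiplies by 100, rounds to 0 decimals and appends '%'
pct = df['col1'].map('{:.0%}'.format)
df['col3'] = pct + ' ' + df['col2']

print(df)
```

The format specifier rounds instead of truncating, so no explicit integer cast is needed.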
jpp
  • But haven't we learned that using `apply(str)` is [faster](https://stackoverflow.com/questions/49371629/converting-a-series-of-ints-to-strings-why-is-apply-much-faster-than-astype)? :-) – pault Apr 25 '18 at 16:21
  • @pault, Yes, that's true :). Still haven't got a C-level explanation though! – jpp Apr 25 '18 at 16:22
  • One could also do: `df.apply(lambda r: f'{r.col1 * 100}% {r.col2}', 1)` (slightly slower though) – Jon Clements Apr 25 '18 at 16:23