I'm learning some new pandas techniques and trying to fine-tune the output.

Here's my code.

import pandas as pd
import numpy as np

dogs = np.random.choice(['labrador', 'poodle', 'pug', 'beagle', 'dachshund'], size=50_000)
smell = np.random.randint(1,100, size=50_000)
# Build from a dict so 'smell' stays numeric (np.array([dogs, smell]) would coerce it to strings)
df = pd.DataFrame({'dog': dogs, 'smell': smell})

So far so simple.

    dog         smell
0   poodle      83
1   labrador    3
2   poodle      86
3   dachshund   31
4   labrador    16
... ... ...

I then created a one-liner to count each breed using .value_counts().

I normalised the counts with the normalize parameter, multiplied by 100 to get percentages, and then chained .to_frame() and .round().

print(f"{(df.value_counts('dog', normalize=True, )*100).to_frame().round(2)}") 

               0
dog             
beagle     20.04
poodle     20.03
labrador   19.98
dachshund  19.98
pug        19.97

It's almost there, but is there a simple way to extend this one-liner so that each value is followed by a percentage symbol, like this?

               0
dog             
beagle     20.04%
poodle     20.03%
labrador   19.98%
dachshund  19.98%
pug        19.97%
elksie5000
    See https://stackoverflow.com/questions/20937538/how-to-display-pandas-dataframe-of-floats-using-a-format-string-for-columns – Galo do Leste Jan 18 '23 at 00:57

1 Answer

The following one-liners work!

  • Using a lambda function:
print(f"{((df.value_counts('dog', normalize=True, )*100).to_frame().round(2)).iloc[:,0].apply(lambda x: str(x) + '%')}")
  • Change type:
print(f"{((df.value_counts('dog', normalize=True, )*100).to_frame().round(2)).iloc[:,0].astype(str) + '%'}")
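One thing to watch with both one-liners: str() drops trailing zeros, so a value of 20.1 comes out as 20.1% rather than 20.10%. If that matters, a format string is one possible alternative. The following is only a sketch: the seed, the sample size and the column name 'share' are assumptions for illustration, not part of the question.

```python
import numpy as np
import pandas as pd

# Small reproducible stand-in for the question's data (seed and size are assumptions).
rng = np.random.default_rng(0)
breeds = ['labrador', 'poodle', 'pug', 'beagle', 'dachshund']
df = pd.DataFrame({'dog': rng.choice(breeds, size=1_000)})

# Percentages as floats, one row per breed.
pct = df['dog'].value_counts(normalize=True) * 100

# Format each value to exactly two decimals and append '%'.
# The result holds strings, so keep `pct` around for any further maths.
formatted = pct.map('{:.2f}%'.format).to_frame('share')  # 'share' is a hypothetical column name

print(formatted)
```

The formatting is deliberately the last step: once the percent sign is attached the column is text, so sorting or summing should be done on the float Series first.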

Hope this helps!

tmc