I'm learning some new pandas techniques and trying to fine-tune the output.
Here's my code.
import pandas as pd
import numpy as np
dogs = np.random.choice(['labrador', 'poodle', 'pug', 'beagle', 'dachshund'], size=50_000)
smell = np.random.randint(1,100, size=50_000)
df = pd.DataFrame(data= np.array([dogs, smell]).T, columns= ['dog', 'smell'])
So far so simple.
         dog smell
0     poodle    83
1   labrador     3
2     poodle    86
3  dachshund    31
4   labrador    16
..       ...   ...
Then I created a one-liner to count each breed using .value_counts.
I normalised the counts with the normalize parameter, multiplied by 100 to get percentages, and then chained .to_frame() and .round(2).
print(f"{(df.value_counts('dog', normalize=True, )*100).to_frame().round(2)}")
               0
dog
beagle     20.04
poodle     20.03
labrador   19.98
dachshund  19.98
pug        19.97
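Unpacked, the one-liner does roughly the following. This is only a sketch: the seeded generator and the intermediate names (shares, percent, table) are made up for illustration, and it calls value_counts on the 'dog' column rather than on the whole DataFrame.

```python
import numpy as np
import pandas as pd

# The fixed seed is an assumption added so the run is repeatable.
rng = np.random.default_rng(0)
breeds = ['labrador', 'poodle', 'pug', 'beagle', 'dachshund']
df = pd.DataFrame({'dog': rng.choice(breeds, size=50_000)})

shares = df['dog'].value_counts(normalize=True)  # fractions that sum to 1
percent = shares * 100                           # scale to the 0-100 range
table = percent.to_frame().round(2)              # one-column DataFrame, 2 decimals

print(table)
```

At this point every value is still a float, which is why no symbol appears in the printed result.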
The one-liner's output is almost there, but is there a simple way to extend it so that each value is followed by a percentage symbol, like this?
                0
dog
beagle     20.04%
poodle     20.03%
labrador   19.98%
dachshund  19.98%
pug        19.97%
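One approach that might work, sketched here on the assumption that it's fine for the values to become strings rather than floats, is to map a format string over the percentages before calling .to_frame(). The seeded generator and the name formatted are illustrative, not part of the code above.

```python
import numpy as np
import pandas as pd

# The fixed seed is an assumption added so the run is repeatable.
rng = np.random.default_rng(0)
breeds = ['labrador', 'poodle', 'pug', 'beagle', 'dachshund']
df = pd.DataFrame({'dog': rng.choice(breeds, size=50_000)})

formatted = (
    df['dog'].value_counts(normalize=True)
    .mul(100)                  # fractions -> percentages
    .map('{:.2f}%'.format)     # each float -> a string with 2 decimals and a '%'
    .to_frame()
)

print(formatted)
```

The trade-off is that the column now holds text, so it can no longer be summed or compared numerically; keeping the numeric frame around and formatting only for display avoids that.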