I'm having trouble visualising the correlation between one variable and another. My brain is dead on this one.

I have a dataset listing the uptake of a sport across a number of cities based on data from individuals.

So the data looks a bit like:

Sport City
Sport1 city1
Sport2 city2
Sport2 city2
Sport1 city2

What I want to determine is whether certain sports are more popular in particular cities, i.e. from the above we can see that Sport2 is more popular in city2. How can I visualise or list this in Python?

  • You can use group by and count/sum, possible duplicate of: https://stackoverflow.com/questions/39922986/pandas-group-by-and-sum – dee cue Aug 19 '21 at 01:29
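A minimal sketch of the group-by-and-count idea from the comment above, assuming the data is loaded into a pandas DataFrame with columns named `sport` and `city` (the lowercase column names and the variable names are assumptions for illustration):

```python
import pandas as pd

# Sample data copied from the question: one row per individual.
df = pd.DataFrame(
    {
        "sport": ["sport1", "sport2", "sport2", "sport1"],
        "city": ["city1", "city2", "city2", "city2"],
    }
)

# Count the rows for each (city, sport) pair.
counts = df.groupby(["city", "sport"]).size()

# Spread the sports into columns; pairs that never occur become 0.
counts_table = counts.unstack(fill_value=0)
print(counts_table)
```

Each row of `counts_table` is a city and each column a sport, so reading across a row shows which sport has the most participants in that city.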

1 Answer

Maybe try a pivot table?

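import pandas as pd

# One row per individual, as in the question
df = pd.DataFrame(
    {'sport': ['sport1', 'sport2', 'sport2', 'sport1'],
     'city': ['city1', 'city2', 'city2', 'city2']}
)
# Count rows per sport/city pair; pairs that never occur become 0
df.pivot_table(index='sport', columns='city', aggfunc=len, fill_value=0)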

Output is

        city1  city2
sport1      1      1
sport2      0      2
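
If cities have very different numbers of respondents, raw counts can be misleading, so it may help to compare shares instead. The sketch below uses `pd.crosstab`, which should produce the same kind of count table as the pivot table, and its `normalize` argument to turn counts into per-city proportions. The variable names `counts` and `shares` are only illustrative.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame(
    {
        "sport": ["sport1", "sport2", "sport2", "sport1"],
        "city": ["city1", "city2", "city2", "city2"],
    }
)

# Raw counts: rows are sports, columns are cities.
counts = pd.crosstab(df["sport"], df["city"])

# Share of each city's respondents per sport; each column sums to 1.
shares = pd.crosstab(df["sport"], df["city"], normalize="columns")

print(counts)
print(shares)
```

For a picture rather than a table, something like `shares.T.plot(kind="bar", stacked=True)` should give one stacked bar per city, assuming matplotlib is installed.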
jdel3