I have a list called 'gender', and I counted the occurrences of each value with Counter:

gender = ['2',
          'Female,',
          'All Female Group,',
          'All Male Group,',
          'Female,',
          'Couple,',
          'Mixed Group,'....]

from collections import Counter

gender_count = Counter(gender)
gender_count 
Counter({'2': 1,
     'All Female Group,': 222,
     'All Male Group,': 119,
     'Couple,': 256,
     'Female,': 1738,
     'Male,': 2077,
     'Mixed Group,': 212,
     'NA': 16})

I want to put this dict into a pandas DataFrame. I used pd.Series (following the question "Convert Python dict into a dataframe"):

s = pd.Series(gender_count, name='gender count')
s.index.name = 'gender'
s.reset_index()

This displays the DataFrame I want, but I don't know how to save the result of these steps as a pandas DataFrame. I also tried DataFrame.from_dict():

s2 = pd.DataFrame.from_dict(gender_count, orient='index')

But this creates a DataFrame with the gender categories as the index.

Eventually I want to use the gender categories and their counts for a pie chart.

Lisadk

3 Answers

Skip the intermediate step and count the values directly with pandas:

gender = ['2',
          'Female',
          'All Female Group',
          'All Male Group',
          'Female',
          'Couple',
          'Mixed Group']

pd.value_counts(gender)

Female              2
2                   1
Couple              1
Mixed Group         1
All Female Group    1
All Male Group      1
dtype: int64
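
If a two-column DataFrame is wanted rather than a Series, the counts can be reshaped in the same chain. This is a minimal sketch, assuming a reasonably recent pandas; the column names 'gender' and 'count' are illustrative choices, and the list is the shortened sample from this answer. It uses Series.value_counts() because the top-level pd.value_counts is deprecated in recent pandas releases.

```python
import pandas as pd

# Shortened sample list from the answer above.
gender = ['2', 'Female', 'All Female Group', 'All Male Group',
          'Female', 'Couple', 'Mixed Group']

# Count each value, name the index 'gender', then move it into a column.
df = (pd.Series(gender)
        .value_counts()
        .rename_axis('gender')
        .reset_index(name='count'))

print(df)
```

The result has one row per category, with 'gender' and 'count' as ordinary columns.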
piRSquared
In [21]: df = pd.Series(gender_count).rename_axis('gender').reset_index(name='count')

In [22]: df
Out[22]:
              gender  count
0                  2      1
1  All Female Group,    222
2    All Male Group,    119
3            Couple,    256
4            Female,   1738
5              Male,   2077
6       Mixed Group,    212
7                 NA     16
MaxU - stand with Ukraine
  • I used your code, but this gives me the error message 'str' object is not callable, for 'gender' and 'count'. – Lisadk Mar 29 '17 at 16:53
  • @Lisadk: You're likely using an older version of pandas. See the output of `pd.__version__`. – root Mar 29 '17 at 18:40
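
For an older pandas where rename_axis or reset_index(name=...) is not available in this form, one possible workaround is to build the DataFrame straight from the Counter's (key, value) pairs. This is a sketch, not part of the original answer; the counts are copied from the question, and the column names are an assumption chosen to match the output above.

```python
from collections import Counter

import pandas as pd

# Counts copied from the question.
gender_count = Counter({'2': 1,
                        'All Female Group,': 222,
                        'All Male Group,': 119,
                        'Couple,': 256,
                        'Female,': 1738,
                        'Male,': 2077,
                        'Mixed Group,': 212,
                        'NA': 16})

# Each (category, count) pair becomes one row; columns are named explicitly.
df = pd.DataFrame(list(gender_count.items()), columns=['gender', 'count'])

print(df)
```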

What about just:

s = pd.DataFrame(gender_count)
ℕʘʘḆḽḘ
  • Since 'gender_count' is a dict, this does not work (ValueError: If using all scalar values, you must pass an index). EDIT: when you put in index=[0], it gives me the categories of gender as the columns. – Lisadk Mar 29 '17 at 16:39
  • This gives me the gender categories as the columns. I would like 'gender' and 'count' as the columns. – Lisadk Mar 29 '17 at 16:52
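
To get 'gender' and 'count' as columns from the dict itself, one option is to keep DataFrame.from_dict with orient='index' and then move the index into a column. This is a sketch under assumptions: it needs a pandas version whose from_dict accepts the columns argument, and the column names are illustrative. The counts are copied from the question.

```python
from collections import Counter

import pandas as pd

# Counts copied from the question.
gender_count = Counter({'2': 1,
                        'All Female Group,': 222,
                        'All Male Group,': 119,
                        'Couple,': 256,
                        'Female,': 1738,
                        'Male,': 2077,
                        'Mixed Group,': 212,
                        'NA': 16})

# orient='index' puts the categories in the index and the counts in one column.
df = pd.DataFrame.from_dict(gender_count, orient='index', columns=['count'])

# Name the index and turn it into a regular 'gender' column.
df = df.rename_axis('gender').reset_index()

print(df)
```

From there, something like `df.set_index('gender')['count'].plot.pie()` should draw the pie chart, assuming matplotlib is installed.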