
The screenshot below shows a dataframe that contains a string value in each cell. I would like to create a new dataframe from it with 3 columns: 'Very interested', 'Somewhat interested', and 'Not interested'. I don't know how to transform the original df into this new one. I tried counting the values that meet a condition such as 'Very interested' and putting them into a new df, but the numbers don't seem right.

I would appreciate any help here. Thank you.

(Screenshot: Df to transform)

EDIT: Here is also the code to reproduce a dataframe similar to the one in the screenshot:

import pandas as pd

df = pd.DataFrame({1: ['Very interested', 'Not interested', 'Somewhat interested', 'Very interested', 'Not interested', 'Somewhat interested'], 2: ['Very interested', 'Not interested', 'Somewhat interested', 'Very interested', 'Not interested', 'Somewhat interested'], 3: ['Very interested', 'Not interested', 'Somewhat interested', 'Very interested', 'Not interested', 'Somewhat interested'], 4: ['Very interested', 'Not interested', 'Somewhat interested', 'Very interested', 'Not interested', 'Somewhat interested'], 5: ['Very interested', 'Not interested', 'Somewhat interested', 'Very interested', 'Not interested', 'Somewhat interested'], 6: ['Very interested', 'Not interested', 'Somewhat interested', 'Very interested', 'Not interested', 'Somewhat interested']},
                 index=['Big Data','Data Analysis','Data Journalism', 'Data Visualization', 'Deep Learning', 'Machine Learning'])

As for the desired output, it should be something like this:

(Screenshot: Desired output)

Miguel 2488
  • Could you include your expected output dataframe in your post? – rahlf23 Sep 12 '18 at 14:26
  • Please read up on [how to ask a good pandas question](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). Your code contains no usable input (because you pasted an image), no expected output, and shows no research effort. – DSM Sep 12 '18 at 14:36
  • @rahlf23 Sorry, I just edited the question and added what you were asking for. – Miguel 2488 Sep 13 '18 at 08:04

1 Answer

I think you need to reshape with melt and then get the counts with GroupBy.size and Series.unstack:

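df = (df.rename_axis('val')                               # name the index so it becomes a 'val' column
        .reset_index()                                    # move the row labels into that column
        .melt('val', var_name='a', value_name='b')        # long format: one row per original cell
        .groupby(['val','b'])                             # group by row label and cell value
        .size()                                           # count the cells in each group
        .unstack(fill_value=0))                           # cell values become columns, missing counts are 0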
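
To see what the reshape does, here is a minimal sketch on the small sample frame suggested in the comments ('vi', 'si' and 'ni' stand for the three answers). The names small, long_form and counts are only for illustration and are not part of the original code:

```python
import pandas as pd

# Small sample frame: 3 topics (rows) x 4 respondents (columns).
small = pd.DataFrame({1: ['vi', 'ni', 'ni'],
                      2: ['vi', 'ni', 'vi'],
                      3: ['vi', 'si', 'si'],
                      4: ['si', 'vi', 'vi']},
                     index=['dv', 'ml', 'das'])

# Step 1: long format, one row per original cell (3 * 4 = 12 rows).
long_form = (small.rename_axis('val')
                  .reset_index()
                  .melt('val', var_name='a', value_name='b'))

# Step 2: count each (topic, answer) pair, then pivot the answers into columns.
counts = long_form.groupby(['val', 'b']).size().unstack(fill_value=0)

print(counts)
```

Each row of counts sums to the number of original columns (4 here), which is a quick sanity check if the numbers "don't seem right". The pair ('dv', 'ni') never occurs in the data, so it only shows up as 0 because of fill_value=0.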

Another solution, applied to the original df, uses stack and then gets the counts with SeriesGroupBy.value_counts and Series.unstack:

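df = (df.stack()                     # Series indexed by (row label, column label)
        .groupby(level=0)            # group by the original row label
        .value_counts()              # count each distinct value per row label
        .unstack(fill_value=0))      # values become columns, missing counts are 0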
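
Note that unstack returns the new columns in sorted order ('Not interested', 'Somewhat interested', 'Very interested'). If the column order from the question is wanted, one option is a reindex at the end. This is only a sketch on a made-up sample; the names sample, labels and counts are assumptions for illustration:

```python
import pandas as pd

# Column order requested in the question.
labels = ['Very interested', 'Somewhat interested', 'Not interested']

# Made-up sample: 3 topics (rows) x 4 respondents (columns).
sample = pd.DataFrame(
    {1: ['Very interested', 'Not interested', 'Not interested'],
     2: ['Very interested', 'Not interested', 'Very interested'],
     3: ['Very interested', 'Somewhat interested', 'Somewhat interested'],
     4: ['Somewhat interested', 'Very interested', 'Very interested']},
    index=['Data Visualization', 'Machine Learning', 'Data Analysis'])

counts = (sample.stack()                    # Series indexed by (topic, respondent)
                .groupby(level=0)           # group by topic
                .value_counts()             # count each answer per topic
                .unstack(fill_value=0)      # answers become columns
                .reindex(columns=labels, fill_value=0))  # fix the column order

print(counts)
```

The reindex also guarantees that all three columns exist, even if one of the answers never appears in the data.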
jezrael
  • Hi @jezrael, thank you very much!! That was exactly what I was looking for. I'm glad to see someone understood what I was asking :) – Miguel 2488 Sep 13 '18 at 07:21
  • @Miguel2488 - There is a problem in your question: the data cannot be copied. You can improve your question by adding `df = pd.DataFrame({1: ['vi', 'ni', 'ni'], 2: ['vi', 'ni', 'vi'], 3: ['vi', 'si', 'si'], 4: ['si', 'vi', 'vi']}, index=['dv','ml','das'])` - feel free to modify it ;) – jezrael Sep 13 '18 at 07:24
  • All right, I'm editing it – Miguel 2488 Sep 13 '18 at 07:57