0

I have a dataframe of this type (only there are like 15 columns)

         electronic_used    how_it_works    what_it_says     how_it_looks
         smartphone           right          right           right
         computer             wrong          wrong           wrong
         smartphone           wrong          wrong           wrong
         computer             right          right           right

What I want is to organize it so that the answers are like these:

         electronic_used    how_it_works    what_it_says     how_it_looks
         smartphone_right          1          1                 1
         computer_right            2          2                 2
         smartphone_wrong          2          1                 1
         computer_wrong            1          3                 2

I am not sure how to do this, but basically I want to organize it by the electronic_used and counting how many right and wrongs there were for each category.

Any help would be appreciated!

aware
  • 37
  • 6
  • It might be a good idea to make the title more specific than "organizing columns". E.g. how about "Counting dataframe values by category in multiple columns"? – Bill Jul 19 '20 at 23:59
  • In fact, I think this might be a duplicate of this question: [How to get value counts for multiple columns at once in Pandas DataFrame?](https://stackoverflow.com/q/32589829/1609514) – Bill Jul 20 '20 at 00:01

3 Answers3

1

You can do melt before crosstab

s=df.melt('electronic_used')
new=pd.crosstab(s['electronic_used']+s['value'],s['variable']).reset_index()
new
variable            row_0  how_it_looks  how_it_works  what_it_says
0           computerright             1             1             1
1           computerwrong             1             1             1
2         smartphoneright             1             1             1
3         smartphonewrong             1             1             1
BENY
  • 317,841
  • 20
  • 164
  • 234
  • I get this when I try the second line: TypeError: can only concatenate str (not "float") to str – aware Jul 20 '20 at 02:20
0

After melting use pivot_table:


df1 = df.melt(id_vars = 'electronic_used')
df1.assign(electronic_used = df1.electronic_used + '_' + df1.value, value = 1)\
  .pivot_table(columns = 'variable',index = 'electronic_used', values = 'value', aggfunc = 'sum')\
  .reset_index() 

result:

variable   electronic_used  how_it_looks  how_it_works  what_it_says
0           computer_right             1             1             1
1           computer_wrong             1             1             1
2         smartphone_right             1             1             1
3         smartphone_wrong             1             1             1

Please note how new values of column electronic_used are changed by concatenating columns:

electronic_used = df1.electronic_used + '_' + df1.value

ipj
  • 3,488
  • 1
  • 14
  • 18
  • I get this when I try the second line: TypeError: can only concatenate str (not "float") to str – aware Jul 20 '20 at 02:23
  • Maybe in Your dataset columns values are also numeric so the error occurs. When running code on the data as You provided there is no error on my side. – ipj Jul 20 '20 at 08:47
0

Since there are only two possible values, 'right' or 'wrong', you can use a groupby and sum for both boolean comparisons.

data = """
 smartphone           right          wrong           wrong
 computer             wrong          wrong           wrong
 smartphone           right          right           wrong
 computer             right          right           right
"""
columns = ["electronic_used", "how_it_works", "what_it_says", "how_it_looks"]
df = pd.DataFrame(np.array(data.split()).reshape((4, 4)), columns=columns)
df = df.set_index('electronic_used')
right_counts = (df == "right").astype('int').groupby('electronic_used').sum()
wrong_counts = (df == "wrong").astype('int').groupby('electronic_used').sum()
print(right_counts)
print(wrong_counts)

Output:

                 how_it_works  what_it_says  how_it_looks
electronic_used                                          
computer                    1             1             1
smartphone                  2             1             0
                 how_it_works  what_it_says  how_it_looks
electronic_used                                          
computer                    1             1             1
smartphone                  0             1             2
Bill
  • 10,323
  • 10
  • 62
  • 85
  • When I do it like this, it does order things in the right way, but the count is 0 for all of them which is not true... I don't understand why it doesn't work. – aware Jul 20 '20 at 02:18
  • I added the full code and changed some of the data so you can see how it works. – Bill Jul 21 '20 at 16:47
  • Perfect! I wasn't doing something right initially but it worked in the end, thank you! – aware Jul 21 '20 at 18:43