
I have a dataframe in which every row holds the number of purchased items followed by the item names, one per column. If a row has fewer items than there are columns, the remaining columns contain NaN.

   Count  Column1  Column2  Column3  Column4
 0     1        a      NaN      NaN      NaN
 1     3        c        a        b      NaN
 2     2        e        b      NaN      NaN
 3     4        b        c        d        f

I need a dataframe that has the items as column labels and True or False as values, depending on whether the item was present in that row.

   Count       a        b        c        d        e        f
 0     1    True    False    False    False    False    False
 1     3    True     True     True    False    False    False
 2     2   False     True    False    False     True    False
 3     4   False     True     True     True    False     True

I have no idea how to get this.

Edit: Found a solution that works for me:

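import pandas as pd
from mlxtend.preprocessing import TransactionEncoder

# Build one list of item names per row, dropping the NaN padding
dataset = df.drop('Count', axis=1).T.apply(lambda x: x.dropna().tolist()).tolist()
te = TransactionEncoder()
te_ary = te.fit(dataset).transform(dataset)
# One boolean column per item (the Count column is not carried over)
df = pd.DataFrame(te_ary, columns=te.columns_)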

Francesca
  • Does this answer your question? [Renaming columns in Pandas](https://stackoverflow.com/questions/11346283/renaming-columns-in-pandas) – kennyvh Jun 15 '21 at 16:03

2 Answers


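Try set_index + stack to reshape, then pd.get_dummies, then sum over index level 0. The sum(level=0) shortcut was removed in pandas 2.0, so use groupby(level=0) instead (sort=False keeps the original row order):

pd.get_dummies(df.set_index('Count').stack()).groupby(level=0, sort=False).sum().astype(bool).reset_index()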
   Count      a      b      c      d      e      f
0      1   True  False  False  False  False  False
1      3   True   True   True  False  False  False
2      2  False   True  False  False   True  False
3      4  False   True   True   True  False   True
Henry Ecker

Try melt followed by pd.crosstab:

dfm = df.melt('Count')
pd.crosstab(dfm['Count'], dfm['value']).astype(bool).reset_index()

Output:

value  Count      a      b      c      d      e      f
0          1   True  False  False  False  False  False
1          2  False   True  False  False   True  False
2          3   True   True   True  False  False  False
3          4  False   True   True   True  False   True
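
One caveat: grouping on Count merges rows that happen to share the same Count value, and the result comes back sorted by Count. If each original row should stay separate, a possible variant is to group on the row index instead. The sketch below is only an illustration under that assumption; the sample frame is rebuilt from the question, and the names `long`, `flags` and `result` are made up for this example.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Count": [1, 3, 2, 4],
    "Column1": ["a", "c", "e", "b"],
    "Column2": [np.nan, "a", "b", "c"],
    "Column3": [np.nan, "b", np.nan, "d"],
    "Column4": [np.nan, np.nan, np.nan, "f"],
})

# Keep the original row label while melting, so each row is its own group
long = (
    df.reset_index()
      .melt(id_vars=["index", "Count"], value_name="item")
      .dropna(subset=["item"])
)

# One boolean column per item, indexed by the original row label
flags = pd.crosstab(long["index"], long["item"]).astype(bool)

# Rows without any item would be missing from the crosstab; restore them as all False
flags = flags.reindex(df.index, fill_value=False)

# Put Count back in front, in the original row order
result = df[["Count"]].join(flags)
print(result)
```

For the sample data this keeps the rows in their original order (Count 1, 3, 2, 4) rather than sorting them.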
Scott Boston