I have a dataset of products with their units and prices. I want to work on the prices that fall into a single category.

For example, the data looks like this:

CATEGORY   UNIT    PRICE
Gloves     pair    50
Gloves     pack    100
Gloves     unit    80
Comb       set     150
Comb       pack    100

Given the above data, the rows can be split into two bins, Gloves and Comb, which in turn contain 3 and 2 sub-bins respectively: Gloves - (pair, pack, unit), Comb - (set, pack).

I did find some useful answers, but they only covered 1-D data. How can I do this for data like mine?

EDIT: The groupby link was not quite helpful because it showed grouping on 2 columns, whereas I need grouping across 3 columns in my case (CATEGORIES->UNITS->PRICE).

kunal

1 Answer

You can do a groupby on CATEGORY and then collect the UNIT values of each group into a list:

df.groupby('CATEGORY')['UNIT'].apply(list).reset_index()

  CATEGORY                UNIT
0     Comb         [set, pack]
1   Gloves  [pair, pack, unit]

df.groupby('CATEGORY')['UNIT'].apply(list).reset_index().values

array([['Comb', list(['set', 'pack'])],
       ['Gloves', list(['pair', 'pack', 'unit'])]], dtype=object)
gold_cy
  • What could I do for the prices? Applying `groupby` on `unit` alone would pool all prices (irrespective of category), and that would be wrong. – kunal Apr 25 '19 at 11:41
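
One possible way to handle the prices, offered as a sketch rather than as part of the answer above: group on both CATEGORY and UNIT at once, so prices are only pooled within a single (category, unit) pair. The DataFrame below is rebuilt from the sample in the question, and the names `prices`, `gloves` and `nested` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "CATEGORY": ["Gloves", "Gloves", "Gloves", "Comb", "Comb"],
    "UNIT": ["pair", "pack", "unit", "set", "pack"],
    "PRICE": [50, 100, 80, 150, 100],
})

# Group on both keys: each (CATEGORY, UNIT) pair gets its own list of prices,
# so "pack" under Gloves is never mixed with "pack" under Comb.
prices = df.groupby(["CATEGORY", "UNIT"])["PRICE"].apply(list)

# Prices for one category only, keyed by unit.
gloves = prices.loc["Gloves"].to_dict()

# Nested view: CATEGORY -> UNIT -> list of prices.
nested = {
    category: group.groupby("UNIT")["PRICE"].apply(list).to_dict()
    for category, group in df.groupby("CATEGORY")
}

print(prices)
print(gloves)
print(nested)
```

If a summary per pair is wanted instead of the raw prices, `apply(list)` can be swapped for an aggregation such as `.mean()` or `.agg(["min", "max"])` on the same two-key groupby.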