I have the following dataframe:

Item Type Value
item1 A1 11
item1 A1 12
item2 A1 21
item2 A1 22
item3 A1 31
item3 A1 11
item4 A1 12
item4 A1 21
item5 A2 22
item5 A2 31

How can I count the number of unique items where Type is A1? In the example above, the answer should be 4.

I was thinking of something like this:

df['Type']=='A1'
list=df['Item'].unique()
occurance=list.str.len()

Is there a better, easier way to do this?

Kimchi
  • Do not overwrite the built-in `list`. Also, what is difficult about the way you are currently doing it? – It_is_Chris Nov 16 '21 at 21:20
  • Use [Boolean Indexing](https://pandas.pydata.org/docs/user_guide/10min.html#boolean-indexing) and [nunique](https://pandas.pydata.org/docs/reference/api/pandas.Series.nunique.html): `df.loc[df['Type'] == 'A1', 'Item'].nunique()`, as recommended in [this answer](https://stackoverflow.com/a/45760042/15497888) by [Scott Boston](https://stackoverflow.com/users/6361531/scott-boston) – Henry Ecker Nov 16 '21 at 21:30
  • Or [groupby nunique](https://pandas.pydata.org/docs/reference/api/pandas.core.groupby.SeriesGroupBy.nunique.html): `df.groupby('Type')['Item'].nunique()`, like [this answer](https://stackoverflow.com/a/15411596/15497888) by [Dan Allan](https://stackoverflow.com/users/1221924/dan-allan), if you want the unique item counts for all types. – Henry Ecker Nov 16 '21 at 21:31

3 Answers

Use:

df.groupby('Type')['Item'].nunique()

Output:

Type
A1    4
A2    1

Only A1:

df.groupby('Type')['Item'].nunique()['A1']

Output: 4
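
To try this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the groupby approach. The variable names `counts` and `a1_count` are only illustrative, and the DataFrame construction is an assumption about how the data was loaded.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Item": ["item1", "item1", "item2", "item2", "item3",
             "item3", "item4", "item4", "item5", "item5"],
    "Type": ["A1"] * 8 + ["A2"] * 2,
    "Value": [11, 12, 21, 22, 31, 11, 12, 21, 22, 31],
})

# Number of distinct items per type, as a Series indexed by Type.
counts = df.groupby("Type")["Item"].nunique()

# Look up a single type by its label.
a1_count = counts.loc["A1"]

print(counts.to_dict())  # {'A1': 4, 'A2': 1}
print(a1_count)          # 4
```

Grouping computes the count for every type at once, which is convenient if you need more than just A1.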

mozway

len(df.loc[df["Type"]=="A1"]["Item"].unique())
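
A closely related variant, suggested in the comments under the question, selects the rows and the column in a single `.loc` call and uses `nunique()`. The sketch below runs it on the question's data; the small series `s` is a hypothetical example added only to show how the two spellings treat missing values.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Item": ["item1", "item1", "item2", "item2", "item3",
             "item3", "item4", "item4", "item5", "item5"],
    "Type": ["A1"] * 8 + ["A2"] * 2,
    "Value": [11, 12, 21, 22, 31, 11, 12, 21, 22, 31],
})

# Boolean mask for the rows of interest, then one .loc call
# that picks those rows and the Item column together.
mask = df["Type"] == "A1"
n_items = df.loc[mask, "Item"].nunique()
print(n_items)  # 4

# Hypothetical series with a missing value (not part of the question's data).
s = pd.Series(["a", None, "a"])
print(s.nunique())      # 1, missing values are skipped by default
print(len(s.unique()))  # 2, the missing value is counted as its own entry
```

On the question's data both spellings give 4; they only differ when the column contains missing values.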
intedgar

You can use this one-liner: `df[df['Type']=='A1'][['Item']].drop_duplicates()`

This returns a DataFrame of the distinct items rather than a number; its length is the count, and you can convert the output into a list if you need it.
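
As a rough sketch of those two follow-up steps on the question's data (the names `unique_items`, `n_items`, and `item_list` are illustrative, not part of the original answer):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Item": ["item1", "item1", "item2", "item2", "item3",
             "item3", "item4", "item4", "item5", "item5"],
    "Type": ["A1"] * 8 + ["A2"] * 2,
    "Value": [11, 12, 21, 22, 31, 11, 12, 21, 22, 31],
})

# One-column DataFrame holding each A1 item once.
unique_items = df[df["Type"] == "A1"][["Item"]].drop_duplicates()

# The count is the number of remaining rows.
n_items = len(unique_items)

# Plain Python list of the distinct items.
item_list = unique_items["Item"].tolist()

print(n_items)    # 4
print(item_list)  # ['item1', 'item2', 'item3', 'item4']
```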

Yashar Ahmadov