0

Each Asset ID has two types of problems ("RSL Critical Deviation" Or "RSL and TX Power Deviation") and we need to count the repetition for each of them within the same asset ID

Asset    ID    Categorization Tier 3
4053     0001  RSL Critical Deviation
4054     0001  RSL and TX Power Deviation
3342     0005  RSL and TX Power Deviation
3343     0005  RSL and TX Power Deviation
3344     0005  RSL and TX Power Deviation
3345     0005  RSL and TX Power Deviation
3346     0005  RSL and TX Power Deviation
4363     0040  RSL and TX Power Deviation
4055     0046  RSL Critical Deviation
4056     0046  RSL Critical Deviation

Result should be

Asset ID  Categorization Tier 3     Count 
0001      RSL Critical Deviation        1
          RSL and TX Power Deviation    1
0005      RSL Critical Deviation        0
          RSL and TX Power Deviation    5
eyllanesc
  • 235,170
  • 19
  • 170
  • 241

1 Answers1

0
df.groupby(['ID', 'Categorization']).size()

A groupby statement in SQL can be applied to multiple columns. Such holds in Pandas too. Pandas way of achieving SQL's COUNT aggregation is to use either sizeor count. Their difference was answered at this SO question.

Sıddık Açıl
  • 957
  • 8
  • 18
  • Welcome to Stack Overflow! While this code may solve the question, [including an explanation](//meta.stackexchange.com/q/114762) of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please [edit] your answer to add explanations and give an indication of what limitations and assumptions apply. – double-beep Jun 23 '19 at 14:39
  • @double-beep I expanded my answer. Thank you. – Sıddık Açıl Jun 23 '19 at 14:47