I have a dataset with the following format:

             query_phone         Day   Actor      ObjGrp
0              495393475  2017-09-21   Joana      din
1              676793475  2017-09-21   marta      ver
2              806494953  2017-09-21   joao       hav
3              595243631  2017-09-21   mark       din
4              444709531  2017-09-25   caty       ver
5              447159403  2017-09-25   rodin      tug
6              762976443  2017-09-25   rodin      tug
7              865853581  2017-09-25   john       han
8              441331962  2017-09-25   van        ver
9              261331962  2017-09-25   van        ver
10             455924196  2017-09-25   david      wog

I should add that the dataframe has 80,000 rows.

I want to plot its distribution. By that I mean a line that, for each combination of phone, day, actor and objgrp, tells me how many times that combination appears, so that I can spot repetitive behaviours.

Does anyone know how? All the plotting methods I have found not only reject string types but also don't let me use the quantity as the y-axis.

Thank you,

Mariana
  • If I got it right, you want the frequency of every entry in each column. Is that correct? Or do you want to count how many times the same row repeats in the file? – Chicrala Dec 13 '18 at 13:09
  • Thank you for answering. I want the frequency of each unique combination, but the only way I have to identify a row is by its index. In this case, just one of the indices of that combination would appear, together with how many times the combination appears in total. – Mariana Dec 13 '18 at 13:19
  • What does your expected output look like? – Scott Boston Dec 13 '18 at 14:35

1 Answer

There is a discussion in this post where @DSM shows how to concatenate the rows that share a value in a given column, as the original post author suggested:

pd.concat(g for _, g in df.groupby("ID") if len(g) > 1)

This assumes that you loaded your data as a pandas DataFrame.

If you look at the groupby documentation, you will see that you can group by more than one column. If I understood your question correctly, that gives you one group per unique combination, so you can easily see the repetitions and study their frequency.
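As a minimal sketch of that idea, assuming the column names shown in the question: group by several columns, take the size of each group, and sort by that count. The helper names `combination_counts` and `plot_counts` are made up for this illustration, the sample rows are copied from the question, and the plotting part assumes matplotlib is installed.

```python
import pandas as pd

# Sample rows copied from the question; the real frame has about 80,000 rows.
df = pd.DataFrame({
    "query_phone": [495393475, 676793475, 806494953, 595243631, 444709531,
                    447159403, 762976443, 865853581, 441331962, 261331962,
                    455924196],
    "Day": ["2017-09-21"] * 4 + ["2017-09-25"] * 7,
    "Actor": ["Joana", "marta", "joao", "mark", "caty", "rodin", "rodin",
              "john", "van", "van", "david"],
    "ObjGrp": ["din", "ver", "hav", "din", "ver", "tug", "tug", "han",
               "ver", "ver", "wog"],
})

keys = ["query_phone", "Day", "Actor", "ObjGrp"]


def combination_counts(frame, columns):
    """Return one row per unique combination of `columns` with its count.

    Hypothetical helper for this example, not part of pandas.
    """
    return (
        frame.groupby(columns)
             .size()                        # number of rows in each group
             .reset_index(name="count")     # back to a flat DataFrame
             .sort_values("count", ascending=False)
             .reset_index(drop=True)
    )


def plot_counts(counts):
    """Plot the counts as a line, most frequent combination first.

    Hypothetical helper; assumes matplotlib is available.
    """
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(range(len(counts)), counts["count"])
    ax.set_xlabel("combination (sorted by frequency)")
    ax.set_ylabel("quantity")
    return ax


counts = combination_counts(df, keys)

# Only the combinations that occur more than once.
repeated = counts[counts["count"] > 1]

# plot_counts(counts)  # uncomment to draw the line plot
```

Because the x-axis is just the position of each combination after sorting, the string columns never have to be passed to the plotting function; the y-axis is the quantity. If the phone number makes every combination unique, grouping by a subset such as `["Day", "Actor", "ObjGrp"]` may show the repetitions more clearly.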

Chicrala