
I have a dataframe as shown below, and I want to apply group-by and count operations to it to get the count of each category, in a pydatatable way.

Here is a sample frame containing different programming languages:

from datatable import dt, f, by

prog_lang_dt = dt.Frame({"languages": ['html','R','R','html','R','javascript','R','javascript','html']})

Here is the code I'm trying to use for the group and count operations:

prog_lang_dt[:,:,by(f.languages)]

Is there a count-specific function I can use in the j position of DT[i, j, by]?

Pasha
myamulla_ciencia

1 Answer


The count() function can be used to find the number of rows in each group:

from datatable import dt, f, by, count

prog_lang_dt = dt.Frame(languages= ['html', 'R', 'R', 'html', 'R', 'javascript',
                                    'R', 'javascript', 'html'])
prog_lang_dt[:, count(), by(f.languages)]

produces

   | languages   count
-- + ----------  -----
 0 | R               4
 1 | html            3
 2 | javascript      2

[3 rows x 2 columns]

Although it is not needed for your example, count can also take a column as an argument, in which case it reports the number of non-missing entries in that column.

Pasha