How to do a count with a unique key that is made up of several columns?

Question

In Python 3 with pandas, I have this DataFrame:

autores_naodeputados.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 0 entries
Data columns (total 19 columns):
IdAutor             0 non-null object
IdDocumento         0 non-null object
NomeAutor           0 non-null object
codigo_unico        0 non-null object
nome_deputado       0 non-null object
uf                  0 non-null object
nome_completo       0 non-null object
sequencial          0 non-null object
cpf                 0 non-null object
nome_urna           0 non-null object
partido_eleicao     0 non-null object
situacao            0 non-null object
AnoLegislativo      0 non-null object
CodOriginalidade    0 non-null object
DtEntradaSistema    0 non-null datetime64[ns]
DtPublicacao        0 non-null datetime64[ns]
Ementa              0 non-null object
IdNatureza          0 non-null object
NroLegislativo      0 non-null object
dtypes: datetime64[ns](2), object(17)
memory usage: 0.0+ bytes

It is a database on authorship of legislative projects. Column "NomeAutor" is the name of the politician.

The column "NroLegislativo" is the sequential number that the project receives in the year.

The "CodOriginalidade" column holds another sequential code, which not all project types have.

The column "IdNatureza" is the code that indicates which type of process (law, amendment, etc.).

Column "AnoLegislativo" is the year the project was submitted.

Combining these four fields (NroLegislativo, CodOriginalidade, IdNatureza, AnoLegislativo) gives me a unique key that distinguishes the projects of each politician.

Is there a way to count how many unique keys each politician has, that is, how many projects each person has?

-/-

A sample of the rows looks like:

autores_projetos[['NomeAutor', 'NroLegislativo', 'CodOriginalidade', 'IdNatureza', 'AnoLegislativo']].head(5).to_dict()
{'NomeAutor': {0: 'Vaz de Lima',
  1: 'Edmir Chedid',
  2: 'Roberto Engler',
  3: 'Campos Machado',
  4: 'Célia Leão'},
 'NroLegislativo': {0: '9', 1: '9', 2: '9', 3: '9', 4: '9'},
 'CodOriginalidade': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan},
 'IdNatureza': {0: '5', 1: '5', 2: '5', 3: '5', 4: '5'},
 'AnoLegislativo': {0: '2015', 1: '2015', 2: '2015', 3: '2015', 4: '2015'}}

I need to know something like:

NomeAutor
Gil Lancaster            386
Itamar Borges            200
Campos Machado           189
Carlos Giannazi          189
Cezinha de Madureira     165
Afonso Lobato            152
Mauro Bragato            149
...

That output came from a groupby:

autores_deputados.groupby("NomeAutor").NroLegislativo.count().sort_values(ascending=False)

But as I said above, in my case the unique key is made up of several fields.

Thanks, but can nunique be used with multiple columns at the same time? — Reinaldo Chaves, Aug 16 '18 at 14:00
Yes, use `df = autores_deputados.groupby("NomeAutor").nunique()` — jezrael, Aug 16 '18 at 14:03
Thanks, but with `df = autores_projetos.groupby("NomeAutor").nunique()` it only counts how many different authors there are. I need to know how many different keys (made up of 4 columns) exist for each author — Reinaldo Chaves, Aug 16 '18 at 14:10
No, it groups by author name, and for each group it finds out the number of unique rows. (so, basically the number of unique values of the remaining 4 columns combined) — Rohith, Aug 16 '18 at 15:48
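
One possible way to count the combined keys, sketched on made-up toy data (the rows below are hypothetical, not taken from the real dataset): drop rows that repeat the same author plus four-column key, then count the remaining rows per author. As far as I know, `groupby(...).nunique()` counts distinct values in each column separately, so the sketch also computes that for comparison. It relies on `drop_duplicates` treating missing `CodOriginalidade` values as equal to each other.

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows, only to illustrate the idea
autores_projetos = pd.DataFrame({
    "NomeAutor": ["Campos Machado", "Campos Machado", "Campos Machado",
                  "Campos Machado", "Célia Leão", "Célia Leão"],
    "NroLegislativo": ["9", "9", "10", "11", "9", "9"],
    "CodOriginalidade": [np.nan, np.nan, np.nan, np.nan, np.nan, "1"],
    "IdNatureza": ["5", "5", "5", "5", "5", "5"],
    "AnoLegislativo": ["2015", "2015", "2015", "2015", "2015", "2015"],
})

# The four columns that together identify one project
key_cols = ["NroLegislativo", "CodOriginalidade", "IdNatureza", "AnoLegislativo"]

# Keep one row per (author, key) combination, then count rows per author
projetos_por_autor = (
    autores_projetos
    .drop_duplicates(subset=["NomeAutor"] + key_cols)
    .groupby("NomeAutor")
    .size()
    .sort_values(ascending=False)
)
print(projetos_por_autor)

# For comparison: distinct values per column, counted independently
por_coluna = autores_projetos.groupby("NomeAutor")[key_cols].nunique()
print(por_coluna)
```

In this toy data the second author has two projects that differ only in `CodOriginalidade`, so no single column's distinct count equals the number of combined keys, while the `drop_duplicates` version counts both.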
