
I am new to R. I have some data from local elections in Mexico and I want to determine how many votes each party had in each municipality.

Here is an example of the data (every variable from PRI onwards is a political party; NOM_MUN is the name of the municipality):

head(Campeche)

# A tibble: 6 x 14
  CABECERA_DISTRITAL        CIRCUNSCRIPCION NOMBRE_ESTADO NOM_MUN    PRI   PAN MORENA   PRD  PVEM    PT    MC
  <chr>                               <dbl> <chr>         <chr>    <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
1 SAN FRANCISCO DE CAMPECHE               3 CAMPECHE      CAMPECHE   153   137     43     5     6     9     7
2 SAN FRANCISCO DE CAMPECHE               3 CAMPECHE      CAMPECHE   109   113     52    15     9     4     5
3 SAN FRANCISCO DE CAMPECHE               3 CAMPECHE      CAMPECHE   169   154     33    14    12     5     6
4 SAN FRANCISCO DE CAMPECHE               3 CAMPECHE      CAMPECHE  1414  1474    415   154    73    62    53
5 SAN FRANCISCO DE CAMPECHE               3 CAMPECHE      CAMPECHE   199   238     88    25    17    11    12
6 SAN FRANCISCO DE CAMPECHE               3 CAMPECHE      CAMPECHE   176   197     60    15     7    13    11
# … with 3 more variables: NVA_ALIANZA <dbl>, PH <dbl>, ES <dbl>

tail(Campeche)

  CABECERA_DISTRITAL CIRCUNSCRIPCION NOMBRE_ESTADO NOM_MUN     PRI   PAN MORENA   PRD  PVEM    PT    MC
  <chr>                        <dbl> <chr>         <chr>     <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
1 SABANCUY                         3 CAMPECHE      CARMEN       83    74     21     7     0     3     1
2 SABANCUY                         3 CAMPECHE      CARMEN       68    47     28     5     3     4     1
3 SABANCUY                         3 CAMPECHE      CARMEN       56    72     16     1     0     1     1
4 SEYBAPLAYA                       3 CAMPECHE      CHAMPOTON    90   147      3     2     4     1     3
5 SEYBAPLAYA                       3 CAMPECHE      CHAMPOTON   141   161     39    30     4     9    15
6 SEYBAPLAYA                       3 CAMPECHE      CHAMPOTON    84    77      1     6     0     0     3
# … with 3 more variables: NVA_ALIANZA <dbl>, PH <dbl>, ES <dbl>

The data are disaggregated by electoral section, and each municipality contains more than one section. What I want is the total votes for each political party by municipality.

This is what I have been doing, but I believe there is a more concise way that does not require naming every party and that can be reused for other municipalities with different sets of parties.

library(dplyr)
# Sum each party's votes within every municipality
results_Campeche <- Campeche %>%
  group_by(NOM_MUN) %>%
  summarize(PRI = sum(PRI), PAN = sum(PAN), PRD = sum(PRD), MORENA = sum(MORENA),
            PVEM = sum(PVEM), PT = sum(PT), MC = sum(MC), NVA_ALIANZA = sum(NVA_ALIANZA),
            PH = sum(PH), ES = sum(ES), .groups = "drop")

head(results_Campeche)

  NOM_MUN      PRI   PAN   PRD MORENA  PVEM    PT    MC NVA_ALIANZA    PH    ES
  <chr>      <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>       <dbl> <dbl> <dbl>
1 CALAKMUL    4861  5427   290    198    70   109    84         236     9    53
2 CALKINI     9035  1326   319  11714   684   194   282        4537    41   262
3 CAMPECHE   39386 32574  4394  11639  2211  2033  1451        4656  1995  4681
4 CANDELARIA  6060 11982    98    209    38    73   135          73    21    21
5 CARMEN     25252 38239  2505   9314  1164   708   712        1124   742   838
6 CHAMPOTON  16415  8500  3212   5387   457   636  1122        1034   203   340

Quinoba
  • Try `results_Campeche <- Campeche %>% group_by(NOM_MUN) %>% summarize(across(PRI:ES, sum))` or `summarise(across(where(is.numeric), sum))`. (This requires dplyr 1.0.0 or greater, since `across` was introduced in May 2020.) – Jon Spring Mar 10 '21 at 07:44
  • BTW, it's most helpful to include example data in exactly the format you have it. The easiest way is often to use code like `dput(head(Campeche))`. This will generate code which recreates the data structure exactly. – Jon Spring Mar 10 '21 at 07:46
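
The `across()` suggestion in the comments can be sketched as a small reusable function. This is only an illustration, not a tested solution for the full data: the `votes` tibble holds a few of the rows shown in the question with just three parties, and `sum_votes_by_mun()` and its `first_party` argument are hypothetical names introduced here. It assumes, as in the question, that every column from the first party onwards is a party. Selecting columns by position this way also avoids a side effect of `where(is.numeric)`, which would sum CIRCUNSCRIPCION as well.

```r
library(dplyr)

# Illustrative subset of the rows shown in the question (three parties only).
votes <- tibble(
  NOM_MUN = c("CAMPECHE", "CAMPECHE", "CARMEN", "CARMEN", "CHAMPOTON", "CHAMPOTON"),
  CIRCUNSCRIPCION = 3,
  PRI    = c(153, 109, 83, 68, 90, 141),
  PAN    = c(137, 113, 74, 47, 147, 161),
  MORENA = c(43, 52, 21, 28, 3, 39)
)

# Hypothetical helper: treats every column from `first_party` to the last
# column as a party and sums each of them within every municipality.
sum_votes_by_mun <- function(data, first_party = "PRI") {
  party_cols <- names(data)[match(first_party, names(data)):ncol(data)]
  data %>%
    group_by(NOM_MUN) %>%
    summarise(across(all_of(party_cols), sum), .groups = "drop")
}

results <- sum_votes_by_mun(votes)
print(results)
```

Because the party columns are found from the data rather than typed out, the same call should work for another state's table with a different set of parties, as long as the party columns still come last and the municipality column is still called NOM_MUN.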
