I want to do a time series analysis with fixed effects. My current dataset has about 500 observations of party manifestos. For most parties I have 2-6 manifestos. I now want to delete the couple of parties with only 1 manifesto from my dataset. How can this be done?

  • hard to answer without sample data. – Wimpel Jul 07 '21 at 07:42
  • Well, I thought there might be some sort of command like "delete if frequency of partyname is <2" :D I didn't think you would need to see data for that. – Newbie1994 Jul 07 '21 at 07:45
  • There most certainly is. But how to implement it, is based on your data. Is it a vector, matrix, data.frame, data.table? Character, numeric or factor? Are you working within the tidyverse, data.table or base R? Do you want to delete the selected rows, or just filter them out? And so on and so on... That is why there is a faq: https://stackoverflow.com/a/5963610/6356278 – Wimpel Jul 07 '21 at 08:00
  • Right, sorry for not being more specific; it was my first time posting here. I actually found a way to do it within the tidyverse in a different thread. In case someone who is still wondering stumbles upon this thread: library(dplyr); new.dataset <- dataset %>% group_by(Variable) %>% filter(n() > 1) %>% ungroup() – Newbie1994 Jul 07 '21 at 08:01
  • Before seeing your comment I produced an answer based on the same approach -- see below. – Andy Eggers Jul 07 '21 at 08:13

1 Answer

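This dplyr-based approach groups the data by partyname with group_by(), uses n() to count the rows in each group, and then applies filter() to that count. The toy example below keeps only groups with more than 4 rows; for the question as asked, the condition would be n > 1.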

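library(tidyverse)
set.seed(1234)  # make the random sample reproducible
n <- 20
tibble(partyname = sample(c("blue", "red", "green"), size = n, replace = TRUE), x = rnorm(n)) %>% 
  group_by(partyname) %>% 
  mutate(n = n()) %>%  # add the number of rows per party as column n
  filter(n > 4)        # keep only parties with more than 4 rows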
#> # A tibble: 16 x 3
#> # Groups:   partyname [2]
#>    partyname        x     n
#>    <chr>        <dbl> <int>
#>  1 red        0.0183     11
#>  2 red        0.705      11
#>  3 green      0.868       5
#>  4 red        0.00501    11
#>  5 red       -0.0376     11
#>  6 green      0.724       5
#>  7 red       -0.497      11
#>  8 red        0.0114     11
#>  9 red        0.00986    11
#> 10 green      0.678       5
#> 11 red        1.03       11
#> 12 red       -1.73       11
#> 13 red       -2.20       11
#> 14 red        0.543      11
#> 15 green      0.163       5
#> 16 green      1.24        5

Created on 2021-07-07 by the reprex package (v2.0.0)
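
Applied to the question as asked, the same idea might look like the sketch below. The data frame `manifestos` and its columns `partyname` and `year` are made up for illustration, since the real dataset was not posted, so the names would need to be adapted. The second half shows a base R variant using ave(), in case dplyr is not available.

```r
library(dplyr)

# Hypothetical stand-in for the manifesto data; `manifestos`, `partyname`
# and `year` are assumed names, not taken from the original dataset.
manifestos <- data.frame(
  partyname = c("A", "A", "A", "B", "C", "C", "D"),
  year      = c(2002, 2006, 2010, 2006, 2002, 2010, 2010)
)

# Keep only parties with more than one manifesto, then drop the grouping
# so later steps operate on the whole table again.
multi <- manifestos %>%
  group_by(partyname) %>%
  filter(n() > 1) %>%
  ungroup()

# Base R variant: ave() returns, for every row, the number of rows
# that share that row's partyname.
party_n <- ave(seq_along(manifestos$partyname), manifestos$partyname, FUN = length)
multi_base <- manifestos[party_n > 1, ]
```

Here filter(n() > 1) is used directly, without the intermediate mutate(), so no extra count column is left in the result. The ungroup() call is optional but avoids surprises if the filtered data is summarised or modelled afterwards.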

Andy Eggers