I want to do a time series analysis with fixed effects. My current dataset has about 500 observations of party manifestos. For most parties I have 2-6 manifestos. I now want to delete the couple of parties with only 1 manifesto from my dataset. How can this be done?
-
hard to answer without sample data. – Wimpel Jul 07 '21 at 07:42
-
Well, I thought there might be some sort of command like "delete if frequency of partyname is <2" :D I didn't think you would need to see the data for that – Newbie1994 Jul 07 '21 at 07:45
-
There most certainly is. But how to implement it depends on your data. Is it a vector, matrix, data.frame, or data.table? Character, numeric, or factor? Are you working within the tidyverse, data.table, or base R? Do you want to delete the selected rows, or just filter them out? And so on and so on... That is why there is a FAQ: https://stackoverflow.com/a/5963610/6356278 – Wimpel Jul 07 '21 at 08:00
-
Right, sorry for not being more specific, it was my first time posting here. I have now found a way to do it within the tidyverse in a different thread. In case someone who is still wondering stumbles upon this thread: library(dplyr); new.dataset <- dataset %>% group_by(Variable) %>% filter(n() > 1) %>% ungroup() – Newbie1994 Jul 07 '21 at 08:01
-
Before seeing your comment I produced an answer based on the same approach -- see below. – Andy Eggers Jul 07 '21 at 08:13
1 Answer
With this dplyr-based approach, we do a group_by on partyname, followed by n() to get the number of rows per group, and then filter() on that.
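library(tidyverse)

set.seed(1234)  # make the random sample reproducible
n <- 20

# Simulate 20 rows, count the rows per party, and keep parties with more than 4 rows
tibble(partyname = sample(c("blue", "red", "green"), size = n, replace = TRUE), x = rnorm(n)) %>%
  group_by(partyname) %>%
  mutate(n = n()) %>%  # n() is the number of rows in the current group
  filter(n > 4)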
#> # A tibble: 16 x 3
#> # Groups:   partyname [2]
#>    partyname        x     n
#>    <chr>        <dbl> <int>
#>  1 red        0.0183     11
#>  2 red        0.705      11
#>  3 green      0.868       5
#>  4 red        0.00501    11
#>  5 red       -0.0376     11
#>  6 green      0.724       5
#>  7 red       -0.497      11
#>  8 red        0.0114     11
#>  9 red        0.00986    11
#> 10 green      0.678       5
#> 11 red        1.03       11
#> 12 red       -1.73       11
#> 13 red       -2.20       11
#> 14 red        0.543      11
#> 15 green      0.163       5
#> 16 green      1.24        5
Created on 2021-07-07 by the reprex package (v2.0.0)
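For the case in the question (dropping parties that appear only once), the same idea can be applied with a threshold of 1. This is only a minimal sketch: the data frame `manifestos` and its `year` column are made-up illustrations, not the asker's data, and the base R variant with ave() is an alternative technique rather than part of the dplyr approach above.

```r
library(dplyr)

# Hypothetical toy data: one row per manifesto (names and values are illustrative)
manifestos <- data.frame(
  partyname = c("blue", "blue", "red", "green", "green", "green", "yellow"),
  year      = c(2010, 2014, 2012, 2009, 2013, 2017, 2015)
)

# dplyr: keep only parties that have more than one manifesto
kept <- manifestos %>%
  group_by(partyname) %>%
  filter(n() > 1) %>%
  ungroup()

# Base R alternative: count rows per party with ave(), then subset
counts    <- ave(seq_along(manifestos$partyname), manifestos$partyname, FUN = length)
kept_base <- manifestos[counts > 1, ]
```

In this toy data, "red" and "yellow" each have a single row, so both are dropped and five rows remain. Calling ungroup() at the end avoids carrying the grouping into later steps such as a fixed-effects model.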

Andy Eggers
-
Does my answer solve your problem? If so, please designate it as the correct answer. – Andy Eggers Jul 07 '21 at 08:24