
I have a data frame from which I want to remove singleton observations based on column b. I need code that removes a row when its value in column b appears only once in that column.

I need the solution to apply to a wide variety of tables, so it must rely only on whether a row's value in column b is a singleton. Some tables may have multiple singletons: the third example table below has 3 singleton rows.

I tried this code, but it gives me an empty table:

data_no_singleton <- filter(data, !table(data$b == 1))

Here are 3 example data frames that the code should work on:

data <- data.frame(a = c("OP2775iia","OP2775iib","OP2958i_a","OP2958i_b","OP2958iia","OP3023iia","OP3023iib"),
                    b = c("WAT","WAT","PAV","SAV","SAV","PAV","COM"),
                    c = c(10.9,12,5.6,1.23,8.99,45.6,30.2))

data <- data.frame(a = c("OP2775iia","OP2775iib","OP2958i_a","OP2958i_b","OP2958iia","OP3023iia","OP3023iib"),
                    b = c("SAV","SAV","SAV","WAT","COM","COM","COM"),
                    c = c(10.9,12,5.6,1.23,8.99,45.6,30.2))

data <- data.frame(a = c("OP2775iia","OP2775iib","OP2958i_a","OP2958i_b","OP2958iia","OP3023iia","OP3023iib"),
                    b = c("KAL","MOU","MOU","SAV","SAV","PAV","COM"),
                    c = c(10.9,12,5.6,1.23,8.99,45.6,30.2))

I need a table in which every row whose value in column b is a singleton has been removed.

NelsonGon
KahBak

2 Answers


You can do this in base R with

# Count how often each value of b occurs
TAB <- table(data$b)
# Keep only the rows whose b value occurs more than once
data[TAB[as.character(data$b)] > 1, ]
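
Since the question asks for something that applies to many tables, the same idea could be wrapped in a small function. The following is only a sketch: the name `remove_singletons` and its `col` argument are made up for illustration, and it uses `duplicated()` in place of `table()` to find values that occur more than once.

```r
# Hypothetical helper (not from the answers above): drop every row whose
# value in column `col` occurs only once in that column.
remove_singletons <- function(df, col = "b") {
  x <- df[[col]]
  # A value occurs more than once if it is a duplicate when scanning
  # from the top or from the bottom of the column.
  keep <- duplicated(x) | duplicated(x, fromLast = TRUE)
  df[keep, , drop = FALSE]
}

# Third example table from the question (KAL, PAV and COM are singletons)
data <- data.frame(a = c("OP2775iia","OP2775iib","OP2958i_a","OP2958i_b","OP2958iia","OP3023iia","OP3023iib"),
                   b = c("KAL","MOU","MOU","SAV","SAV","PAV","COM"),
                   c = c(10.9,12,5.6,1.23,8.99,45.6,30.2),
                   stringsAsFactors = FALSE)

result <- remove_singletons(data, "b")
result$b  # "MOU" "MOU" "SAV" "SAV"
```

Passing the column name as a string keeps the helper usable on tables where the relevant column is not called b.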
NelsonGon
G5W

We can use

library(dplyr)
data %>% 
    group_by(b) %>% 
    filter(n() > 1)
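
Note that the pipeline above returns a tibble that is still grouped by b. If an ungrouped result is preferred, one possible variant is sketched below. It assumes a reasonably recent dplyr in which `add_count()` accepts a `name` argument; the column name `n_b` is a temporary helper chosen here for illustration.

```r
library(dplyr)

# First example table from the question (COM is the only singleton)
data <- data.frame(a = c("OP2775iia","OP2775iib","OP2958i_a","OP2958i_b","OP2958iia","OP3023iia","OP3023iib"),
                   b = c("WAT","WAT","PAV","SAV","SAV","PAV","COM"),
                   c = c(10.9,12,5.6,1.23,8.99,45.6,30.2),
                   stringsAsFactors = FALSE)

out <- data %>%
  add_count(b, name = "n_b") %>%  # n_b: how often each row's b value occurs
  filter(n_b > 1) %>%             # keep only rows whose b value is not a singleton
  select(-n_b)                    # drop the temporary count column

out$b  # "WAT" "WAT" "PAV" "SAV" "SAV" "PAV"
```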
akrun