I have a list of 130 dataframes, each with 27 columns and a factor column with 2 levels. I want to remove the duplicated rows in each dataframe, based on 3 columns, for one factor level only, keeping all rows in the other factor level, including their duplicates.
I have sorted all the dataframes according to the factor levels and then I tried to remove the duplicated rows only for the first factor level.
The list is called `x`, and I index the dataframes in the list with `x[[i]]`, with `i` running from 1 to 130.
The column called `temp` in every dataframe contains 2 factor levels, either `0` or `1`. The 130 dataframes have been ordered with level `0` first and then level `1`.
for (i in 1:130)
{
  # Create 2 factor levels called `0` and `1` in the column called `temp`
  # (the index position of the `temp` column is `24`)
  x[[i]]$temp <- factor(x[[i]]$temp, levels = c(0, 1))
  # Order each dataframe by level: level = 0 first, then level = 1
  x[[i]] <- x[[i]][order(x[[i]]$temp), ]
  # This removes duplicates based on columns 2, 27 and 25 across ALL rows,
  # but I want to perform this only for the first factor level (level = 0)
  x[[i]] <- x[[i]][!duplicated(x[[i]][c(2, 27, 25)]), ]
}
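
One possible way to restrict the de-duplication to a single level is sketched below. It computes `duplicated()` on the level-0 rows alone and leaves every other row marked as "not a duplicate". The helper name `dedupe_one_level`, its arguments, and the toy data (3 columns instead of 27) are assumptions made for illustration, not part of my real data.

```r
# Hypothetical helper: drop duplicated rows only within one level of `temp`,
# keeping every row (duplicates included) that belongs to the other level.
dedupe_one_level <- function(df, key_cols, level = "0") {
  # Rows that belong to the level we want to de-duplicate
  in_level <- !is.na(df$temp) & as.character(df$temp) == level
  # Start with "nothing is a duplicate", then flag duplicates among the
  # rows of that level only, so the result does not depend on row order
  dup <- rep(FALSE, nrow(df))
  dup[in_level] <- duplicated(df[in_level, key_cols, drop = FALSE])
  df[!dup, , drop = FALSE]
}

# Toy stand-in for the real list (assumed data, 2 dataframes instead of 130)
toy <- data.frame(
  a    = c(1, 1, 2, 1, 1),
  b    = c("u", "u", "v", "u", "u"),
  temp = factor(c(0, 0, 0, 1, 1), levels = c(0, 1))
)
x_toy <- list(toy, toy)

# Apply the helper to every dataframe in the list
x_toy <- lapply(x_toy, dedupe_one_level, key_cols = c("a", "b"))
```

On the real list, the equivalent call would presumably be `x <- lapply(x, dedupe_one_level, key_cols = c(2, 27, 25))`, assuming those column positions are the same in all 130 dataframes.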