Best way to delete duplicates from a list of lists in R

Question

I have a list of lists containing names, and I want an efficient way to delete duplicates; that is, I only want to keep the first occurrence of each "word" (for example, "hey"). This is the code I'm using right now:


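# Return the elements of a that do not occur in b
del <- function(a, b) a[!a %in% b]

for (i in seq_along(mcli)) {
  temp <- mcli[[i]]
  # Remove the words of the i-th element everywhere, then restore the i-th element
  mcli <- lapply(mcli, del, b = temp)
  mcli[[i]] <- temp
}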

The variable mcli is the list of lists:

mcli<-list()
mcli[[1]]<-c("hey","hou")
mcli[[2]]<-c("yei","hou")
mcli[[3]]<-c("yei","hey")

So the variable will be:

> mcli
[[1]]
[1] "hey" "hou"

[[2]]
[1] "yei" "hou"

[[3]]
[1] "yei" "hey"

This leaves empty elements inside the list of lists (here the third one, since all of its words already appear in earlier lists), so at the end I run:

mcli<-Filter(Negate(function(X){length(X)==0}),mcli)

The result should be:

> mcli
[[1]]
[1] "hey" "hou"

[[2]]
[1] "yei"

Thank you in advance.

EDIT: SOLVED

MacOS · Answer 1 · 2020-05-22T18:29:51.357

How about

mcli <- list()
mcli[[1]] <- c("hey","hou")
mcli[[2]] <- c("yei","hou")
mcli[[3]] <- c("yei","hey")


delete.duplicates.from.list.of.lists <- function(list.of.lists) {
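    # One row per value, tagged with the index of the list it came from
    df <- data.frame(
      list.names = rep.int(seq_along(list.of.lists), times = lengths(list.of.lists)),
      list.values = unlist(list.of.lists)
    )

    # Keep only the first occurrence of each value
    df <- df[!duplicated(df$list.values), ]
    # Regroup by original index; lists with no remaining values are dropped
    list.of.lists.without.duplicates <- split(df$list.values, df$list.names)
    # Values may be factors in R versions before 4.0, so convert back to character
    list.of.lists.without.duplicates <- lapply(list.of.lists.without.duplicates, as.character)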

    return(list.of.lists.without.duplicates)
}

mcli <- delete.duplicates.from.list.of.lists(mcli)
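
The same idea can be sketched without the intermediate data frame: flatten the list, record which element each value came from, and split the first occurrences back by that index. The name dedupe_across is made up for illustration, and lengths() assumes R 3.2.0 or later.

```r
# Hypothetical helper: keep only the first occurrence of each value across all vectors
dedupe_across <- function(x) {
  flat <- unlist(x, use.names = FALSE)    # all values, in order
  idx  <- rep(seq_along(x), lengths(x))   # index of the element each value came from
  keep <- !duplicated(flat)               # TRUE for first occurrences only
  # Elements with nothing left never appear in idx[keep], so split() omits them
  unname(split(flat[keep], idx[keep]))
}

mcli <- list(c("hey", "hou"), c("yei", "hou"), c("yei", "hey"))
res <- dedupe_across(mcli)
```

For the example data, res should be the two-element list shown in the question. Because split() only creates groups for indices that still have values, no separate step to drop empty elements is needed.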

score 0 · Accepted Answer · answered May 22 '20 at 18:07

It looks like this thread, answered by @akrun, covers the question:

Remove duplicated elements from list

To adapt it to your code:

unmcli <- unlist(mcli)
res <- Map('[', mcli, relist(!duplicated(unmcli), skeleton = mcli))

And then you could remove the third element as you described.
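
Put together with the clean-up step, a minimal end-to-end sketch might look like this. The variable name keep is introduced here for readability, and subsetting with lengths() is one possible stand-in for the Filter() call in the question (it assumes R 3.2.0 or later).

```r
mcli <- list(c("hey", "hou"), c("yei", "hou"), c("yei", "hey"))

# Flatten the list and flag the first occurrence of every word
unmcli <- unlist(mcli)
# Reshape the logical flags so they have the same structure as mcli
keep <- utils::relist(!duplicated(unmcli), skeleton = mcli)

# Subset each vector with its own logical mask
res <- Map(`[`, mcli, keep)

# Drop the elements that ended up empty
res <- res[lengths(res) > 0]
```

On the example data this should leave c("hey", "hou") and "yei", matching the expected result in the question.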
