I have a column in a table, as shown below:

    Col1
    ========================
    "No","No","No","No","No"
    "No","No","No"
     Yes
     No
    "Yes","Yes","Yes","Yes"
    "Yes","No","Yes","Yes"

I am trying to remove the duplicate `No` and `Yes` values within each row and create a column like this:

    Col1
    ========================
     No
     No
     Yes
     No
     Yes
     Yes, No

I started with

    kickDuplicates <- c("No", "Yes")
    # create a list of vectors of place names
    broken <- strsplit(Table1$Col1, ",")
    # paste each broken vector of place names back together,
    # kicking out duplicated instances of the chosen names
    Table1$Col1 <- sapply(broken, FUN = function(x)
      paste(x[!duplicated(x) | !x %in% kickDuplicates], collapse = ", "))

But this is not working: I get the same original column, with the duplicates still in it. Can anybody tell me where I am going wrong?

c("\"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\"", 
"\"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"Yes\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\"", 
"\"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\"", 
"\"No\", \"No\"", "\"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\", \"No\"", 
"No")
Veerendra Gadekar
bison2178

1 Answer

I think this will work as your final line:

    Table1$Col1 <- sapply(broken, function(x) paste(unique(x), collapse = ","))

Because I am a fan of the `functional` package, here is an equivalent:

    library(functional)  # provides Compose() and Curry()
    sapply(broken, Compose(unique, Curry(paste, collapse = ",")))
Matthew Lundberg
  • Don't think so. Did you see the `dput`? Pretty messed up data set. I wonder where they got it from. – David Arenburg Jun 06 '15 at 22:13
  • @MatthewLundberg, David: Actually, I changed the `strsplit(Table1$Col1, ",")` to `strsplit(Table1$Col1, ",\\s*")`, used what Matthew suggested, and it worked. – bison2178 Jun 06 '15 at 22:21
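
For anyone reading later, here is a minimal sketch of how the pieces from the answer and the last comment might fit together. The vector `col1` is a shortened, illustrative stand-in for `Table1$Col1`, shaped like the `dput` output in the question, not the real data. Stripping the literal double quotes with `gsub` is an extra step assumed here to match the desired output in the question; it was not part of the accepted fix.

```r
# Illustrative stand-in for Table1$Col1 (assumption: same shape as the dput,
# i.e. quoted values separated by a comma and a space).
col1 <- c("\"No\", \"No\", \"No\"",
          "\"No\", \"No\", \"Yes\", \"No\"",
          "No")

# Split on a comma plus any following whitespace, so that no piece keeps a
# leading space. With a plain "," split, "\"No\"" and " \"No\"" are different
# strings and unique() cannot collapse them.
broken <- strsplit(col1, ",\\s*")

dedup <- sapply(broken, function(x) {
  # Assumed extra step: drop the literal double quotes around each value.
  x <- gsub("\"", "", x, fixed = TRUE)
  # Keep the first occurrence of each value and glue the rest back together.
  paste(unique(x), collapse = ", ")
})

dedup
# [1] "No"      "No, Yes" "No"
```

`unique()` keeps values in order of first appearance, so a row that starts with `Yes` would come out as `Yes, No` instead.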