
I have a list of many data frames. Each data frame contains duplicate columns, and I would like to keep only the unique columns in each one. I have tried several approaches, including the code below, but I keep getting errors. The code I am currently using is shown first, followed by the structure of the first data frame in my list. I appreciate any help.

x  <- lapply(dataFiles, function(x){
  for(i in 1:length(colnames(dataFiles)))
  dataFiles[[!duplicated(dataFiles[[i]])]]
}
)



str(dataFiles[[1]])
'data.frame':   20381 obs. of  10 variables:
 $ FILEID    : chr  "ACSSF" "ACSSF" "ACSSF" "ACSSF" ...
 $ FILETYPE  : num  2.01e+08 2.01e+08 2.01e+08 2.01e+08 2.01e+08 ...
 $ STUSAB    : chr  "ny" "ny" "ny" "ny" ...
 $ CHARITER  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ SEQUENCE  : int  1 1 1 1 1 1 1 1 1 1 ...
 $ LOGRECNO  : int  3391 3392 3393 3394 3395 3396 3397 3398 3399 3400 ...
 $ B00001_001: int  212 215 278 246 235 NA 225 522 213 262 ...
 $ B00002_001: int  108 124 126 105 122 NA 108 105 104 140 ...
 $ LOGRECNO  : int  3391 3392 3393 3394 3395 3396 3397 3398 3399 3400 ...
 $ GEOID     : chr  "14000US36001000100" "14000US36001000200" "14000US36001000300" "14000US36001000401" ...
David Arenburg
user3067851

1 Answer


Here is a simple example:

tmp <- data.frame(seq(10), seq(10), rnorm(10))
colnames(tmp) <- c("A","A","B")

l <- list(tmp, tmp)

lapply(l, function(x) x[,!duplicated(colnames(x))])

Or, as noted by @agstudy, you could use `unique`, which selects each column name once:

lapply(l, function(x) x[,unique(colnames(x))])
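One caveat with both forms: if only a single column survives, `x[, keep]` collapses the result to a plain vector. Here is a minimal sketch that passes `drop = FALSE` so the result is always a data frame. The helper name `drop_dup_cols` is hypothetical, and the toy list merely stands in for the asker's `dataFiles`.

```r
# Hypothetical helper (the name is an assumption, not part of the answer above):
# keep the first occurrence of each column name and always return a data frame.
drop_dup_cols <- function(df) {
  df[, !duplicated(colnames(df)), drop = FALSE]
}

# Toy stand-in for the asker's `dataFiles` list.
tmp <- data.frame(seq(10), seq(10), rnorm(10))
colnames(tmp) <- c("A", "A", "B")
dataFiles <- list(tmp, tmp)

# Apply the helper to every data frame in the list.
cleaned <- lapply(dataFiles, drop_dup_cols)
colnames(cleaned[[1]])

# A data frame whose two columns share one name keeps a single column
# and is still a data frame thanks to drop = FALSE.
one <- setNames(data.frame(1:3, 1:3), c("A", "A"))
single <- drop_dup_cols(one)
```

Note that this compares column names only, so two columns with the same name but different contents would still be reduced to the first one.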
cdeterman
  • unique will return every column once, !duplicated will return columns that appear exactly once, no?? – latorrefabian Jan 21 '16 at 20:20
  • @latorrefabian no. `x = c(1, 1, 2); x[!duplicated(c(1, 1, 2))]`. (Gives `#[1] 1 2`). See also `?duplicated`, [How do I remove ALL duplicates so that none are left?](http://stackoverflow.com/a/13763299/903061) – Gregor Thomas Jan 21 '16 at 20:24
  • @cdeterman That was too simple. I really didn't think along those lines, since I'm working with a list of data frames. Thanks. – user3067851 Jan 21 '16 at 20:39
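To make the distinction raised in the comments concrete, here is a small sketch contrasting the two behaviours on a plain vector. The variable names are illustrative only.

```r
x <- c(1, 1, 2)

# Keeps the first occurrence of each value; this is what the answer relies on.
first_only <- x[!duplicated(x)]

# Drops every value that occurs more than once, scanning from both ends;
# this is the "remove ALL duplicates" case from the linked question.
no_repeats <- x[!(duplicated(x) | duplicated(x, fromLast = TRUE))]
```

Here `first_only` holds `1 2`, while `no_repeats` holds only `2`.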