
I'm not sure how to word this correctly. I have two datasets: one with all variables (Total) and one with the common variables (Common). I want to generate a new dataset from the columns that are in Total but not in Common. For example, if Total has columns A, B, C, D, E, F, G, H and Common has A, B, C, D, I want a new dataset with the remaining columns E, F, G, H. Would the drop function work in this case? I have over 300 columns, so I can't simply look to see which ones are missing from the Common dataset. I would need a loop of some sort to go through columns 1-300, determine which are not in my Common dataset, and create a new dataset from those "missing" columns.

  • Please read [How to make a great R reproducible example?](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) and edit your question – pogibas Jul 12 '18 at 15:15

1 Answer


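You can use setdiff() to get the variable names that belong to one dataset but not to the other. No loop is needed. Note that setdiff(x, y) returns the elements of x that are not in y, so the argument order matters.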

# Total dataset w/300 variables: A, B, C, D, E, F, G, X8, X9, X10, ..., X300
Total <- data.frame(matrix(1:3000, ncol = 300, dimnames = list(NULL,  c(LETTERS[1:7], paste0("X", 8:300)))))

# Common Dataset w/6 variables: A, B, C, D, Foo, Bar
Common <- data.frame(matrix(1:60, ncol = 6, dimnames = list(NULL,  c(LETTERS[1:4], "Foo", "Bar"))))

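# Dataset with the remaining columns:
# those that belong to Total but not to Common (E, F, G, X8, ..., X300)
# drop = FALSE keeps a data frame even if only one column is left
new1 <- Total[, setdiff(names(Total), names(Common)), drop = FALSE]

# Dataset with the missing columns:
# those that belong to Common but not to Total (Foo, Bar)
new2 <- Common[, setdiff(names(Common), names(Total)), drop = FALSE]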
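For comparison, here is a minimal sketch of the same idea using a logical mask built with %in% instead of setdiff(). It uses toy data shaped like the example in the question (Total with A to H, Common with A to D); the names keep and remaining are made up for illustration.

```r
# Toy data matching the question's example: Total has A..H, Common has A..D.
Total  <- data.frame(matrix(1:80, ncol = 8, dimnames = list(NULL, LETTERS[1:8])))
Common <- data.frame(matrix(1:40, ncol = 4, dimnames = list(NULL, LETTERS[1:4])))

# Logical mask: TRUE for every column of Total whose name is absent from Common.
keep <- !(names(Total) %in% names(Common))

# drop = FALSE keeps the result a data frame even if only one column remains.
remaining <- Total[, keep, drop = FALSE]

names(remaining)
# "E" "F" "G" "H"
```

Both approaches compare column names only, not column contents, so a column that shares a name but holds different values in the two datasets is still treated as common.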
Artem