I have a large table containing thousands of entries queried from a database, with a structure similar to Table 1 in the image below. Among rows that are duplicates of each other, I would like to keep only the one with the highest value of Var1, as shown in Table 2. The situation is similar to the one described in an earlier question on this forum, "remove duplicates based on one column and keep last entry". Selecting the rows with a simple for loop works, but it takes a long time to run. Is there a faster, more elegant way of handling this in R?
Table1 <- structure(list(Var1 = 1001:1009, Var2 = c("AAA", "BBB", "CCC",
"AAA", "DDD", "BBB", "AAA", "EEE", "DDD"), Var3 = c(95L, 100L,
90L, 95L, 85L, 100L, 95L, 45L, 85L), Var4 = c("mg", "kg", "pg",
"mg", "mg", "kg", "mg", "mg", "mg")), .Names = c("Var1", "Var2",
"Var3", "Var4"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-9L), spec = structure(list(cols = structure(list(Var1 = structure(list(), class = c("collector_integer",
"collector")), Var2 = structure(list(), class = c("collector_character",
"collector")), Var3 = structure(list(), class = c("collector_integer",
"collector")), Var4 = structure(list(), class = c("collector_character",
"collector"))), .Names = c("Var1", "Var2", "Var3", "Var4")),
default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"))
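
To make the goal concrete, here is a minimal base R sketch of one vectorised alternative to the loop. It assumes that duplicates are identified by Var2 alone and that the row to keep is the one with the largest Var1; the names `tbl`, `sorted`, and `result` are illustrative only, and `tbl` is a plain data.frame stand-in with the same values as the sample data.

```r
# Stand-in for the sample data (plain data.frame, same columns and values)
tbl <- data.frame(
  Var1 = 1001:1009,
  Var2 = c("AAA", "BBB", "CCC", "AAA", "DDD", "BBB", "AAA", "EEE", "DDD"),
  Var3 = c(95L, 100L, 90L, 95L, 85L, 100L, 95L, 45L, 85L),
  Var4 = c("mg", "kg", "pg", "mg", "mg", "kg", "mg", "mg", "mg"),
  stringsAsFactors = FALSE
)

# Sort so that the highest Var1 comes first within the whole table
sorted <- tbl[order(tbl$Var1, decreasing = TRUE), ]

# duplicated() flags every occurrence of a Var2 value after the first,
# so negating it keeps the highest-Var1 row for each Var2
result <- sorted[!duplicated(sorted$Var2), ]

# Restore ascending Var1 order for readability
result <- result[order(result$Var1), ]
print(result)
```

If duplicates should instead be defined by several columns together (for example Var2, Var3, and Var4), `duplicated()` can be given a data frame of those columns, such as `sorted[c("Var2", "Var3", "Var4")]`, in place of the single vector.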