I am working with the R programming language. I have the following table that is located on a server:
age=18:29
height=c(76.1,77,78.1,78.2,78.8,79.7,79.9,81.1,81.2,81.8,82.8,83.5)
gender=c("M","F","M","M","F","F","M","M","F","M","F","M")
testframe = data.frame(age=age,height=height,height2=height,gender=gender,gender2=gender)
head(testframe)
age height height2 gender gender2
1 18 76.1 76.1 M M
2 19 77.0 77.0 F F
3 20 78.1 78.1 M M
4 21 78.2 78.2 M M
5 22 78.8 78.8 F F
6 23 79.7 79.7 F F
My goal is to remove columns from the above tables that are identical, even if they have different names. Thus, the end product would look like this:
age height gender
1 18 76.1 M
2 19 77.0 F
3 20 78.1 M
4 21 78.2 M
5 22 78.8 F
6 23 79.7 F
My Problem: I want to remove these duplicate columns from the table directly on the server. I have been reading about the "dbplyr" library in R, and I am not sure if this is able to do this. I tried the two following methods:
Method 1:
library(RODBC)
library(sqldf)
library(dplyr)
library(dbplyr)
library(odbc)
con = odbcConnect("some name", uid = "some id", pwd = "abc")
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(con, testframe)
final_1 = testframe %>%
dplyr::select(., which(as.list(.) %>%
duplicated %>%
`!`))
Method 2:
mtcars2 <- tbl(con, "testframe")
final_2 = mtcars2 %>%
dplyr::select(., which(as.list(.) %>%
duplicated %>%
`!`))
In the above examples, the data ("testframe") is in my local environment, but I tried to use the copy_con
statement in an attempt to see if these statements would work if the data was on a server.
My Question: The code from the above examples seemed to have worked - but I am not sure if the above code can only perform these statements on data in the local environment - or it can also perform these same statements on data located on a server. Can someone please comment on this?
Thanks!