Here is a way.
1. Column names
The column names are read with reaLines
, setting n = 1
, in order to read just the columns headers line. Then scan
with sep = ","
will break the line into column names.
library(sqldf)
col_names <- readLines(tmpfile, n = 1)
col_names
#[1] "mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb"
tc <- textConnection(col_names)
col_names <- scan(tc, sep = ",", what = character())
close(tc)
2. Data
The 4th column is "hp
. Read only that one with read.csv.sql
. The SQL statement is put together with sprintf
.
col_names[4]
#[1] "hp"
SQL <- sprintf("select %s from file", col_names[4])
SQL
#[1] "select hp from file"
hp <- read.csv.sql(tmpfile, sql = SQL)
str(hp)
#'data.frame': 6718464 obs. of 1 variable:
# $ hp: int 110 110 93 110 175 105 245 62 95 123 ...