1

I have read several other posts about how to import CSV files with read.csv while skipping specific columns. However, all the examples I found had very few columns, so it was easy to do something like:

 columnHeaders <- c("column1", "column2", "column_to_skip")
 columnClasses <- c("numeric", "numeric", "NULL")
 data <- read.csv(fileCSV, header = FALSE, sep = ",",
                  col.names = columnHeaders, colClasses = columnClasses)

I have 201 columns, without column labels. I would like to skip the last column. How would it be possible to do this without naming all the other columns to keep? Many thanks.

R. Schifini
dede
  • 1
    What about `columnClasses <- c(rep("numeric", 200), "NULL")`? – S Rivero Nov 15 '17 at 17:20
  • 1
    Or just read all the columns in and then eliminate the columns you don't want afterwards? `data <- read.csv("../CAASPP_clustering/ca2016_1_csv_v3.zip"); data_trimmed <- data[,1:(ncol(data)-1)]` – leerssej Nov 15 '17 at 17:23
  • For your column names you can use: `columnHeaders<- c(sprintf("column%d", 1:200))` – S Rivero Nov 15 '17 at 17:28
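
The rep() idea from the comments can be sketched without hard-coding the number 200. This is only an illustration under assumptions: the temporary file and the names `csv_path`, `n_cols` and `col_classes` are made up for the demo. It uses count.fields() to find the number of columns, leaves every column except the last as NA (so read.csv guesses the type), and marks the last one "NULL" so it is skipped.

```r
# Hypothetical demo file: 3 rows, 4 unlabeled columns; the last is unwanted.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("1,2.5,10,99", "2,3.5,20,98", "3,4.5,30,97"), csv_path)

# Number of columns, taken as the widest line in the file.
n_cols <- max(count.fields(csv_path, sep = ",", quote = "\""))

# NA = let read.csv infer the type; "NULL" = skip the column entirely.
col_classes <- c(rep(NA, n_cols - 1), "NULL")

data <- read.csv(csv_path, header = FALSE, colClasses = col_classes)
str(data)
```

Note that count.fields() scans the whole file, so for very large files it may be cheaper to count the fields of the first line only.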

2 Answers

0

It's a bit hacky, but I usually read in a small number of rows of the dataset, then use sapply(data, class) to find the column types and set the last one to "NULL".

data <- read.table("test.csv", sep = ",", nrows = 100)
colClasses <- sapply(data, class)
colClasses[length(colClasses)] <- "NULL"

Then you can pass this vector as the colClasses argument of your read.csv() call.
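
Put together, the whole approach might look like the sketch below. The temporary file and the names `csv_path`, `sample_rows` and `col_classes` are assumptions made for the demo, not part of the original data.

```r
# Hypothetical demo file: 5 rows, 4 unlabeled columns; the last is unwanted.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("1,2.5,a,99", "2,3.5,b,98", "3,4.5,c,97",
             "4,5.5,d,96", "5,6.5,e,95"), csv_path)

# Step 1: read only a few rows to infer the column classes.
sample_rows <- read.table(csv_path, sep = ",", nrows = 3)
col_classes <- sapply(sample_rows, class)

# Step 2: mark the last column as "NULL" so it is skipped.
col_classes[length(col_classes)] <- "NULL"

# Step 3: read the full file with the inferred classes.
data <- read.csv(csv_path, header = FALSE, colClasses = col_classes)
str(data)
```

One thing to watch for: classes inferred from the sample are applied to the whole file, so a column that looks like "integer" in the first rows but holds decimals further down will likely make the full read fail. Increasing nrows, or changing such entries to "numeric", should avoid that.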

Andrew Haynes
0

You can just read in all the data and then eliminate the offenders afterwards.

data <- read.csv("../CAASPP_clustering/ca2016_1_csv_v3.zip")
data_trimmed <- data[,1:(ncol(data)-1)]
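
An equivalent way to write the trimming step is a negative index, which reads as "everything except the last column". The small data frame below is made up purely for illustration; drop = FALSE is added so the result stays a data frame even if only one column remains.

```r
# Hypothetical stand-in for the data read from the CSV file.
data <- data.frame(V1 = 1:3, V2 = c(2.5, 3.5, 4.5), V3 = c("x", "y", "z"))

# Drop the last column by position.
data_trimmed <- data[, -ncol(data), drop = FALSE]
names(data_trimmed)
```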

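If you prefer to screen the columns by class programmatically, you could do something like this (no column has class "NULL" once the data has been read, so replace "NULL" with the class you want to exclude):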

class_list <- lapply(data, class)
chosen_cols <- names(class_list[class_list != "NULL"])
data_trimmed <- data[chosen_cols]
leerssej