I have a simple CSV file called "test.csv" with the following content:
colA,colB,colC
1,"x",12
2,"y",34
3,"z",56
Let's say I want to skip colA and read in only colB and colC. I need a general way to do this because I have lots of files to read, and while colB and colC always have the same names, colA is sometimes called something else altogether.
According to the read_csv documentation, one way to accomplish this is to pass a named list for col_types and only name the columns you want to keep:
read_csv('test.csv', col_types = list(colB = col_character(), colC = col_numeric()))
Since colA is not mentioned, it should be dropped from the output. However, the resulting data frame still contains it:
Source: local data frame [3 x 3]
  colA colB colC
1    1    x   12
2    2    y   34
3    3    z   56
Am I doing something wrong, or is the read_csv documentation incorrect? According to the help file:
If a list, it must contain one "collector" for each column. If you only want to read a subset of the columns, you can use a named list (where the names give the column names). If a column is not mentioned by name, it will not be included in the output.
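In case it helps anyone reproduce this, here is a minimal self-contained sketch that recreates the file and runs the same kind of call. The temporary path and the names path and df are just my choices for this example. I have written col_double() for colC on the assumption that it is available across readr versions; col_numeric() is what I used in the call above.

```r
library(readr)

# Recreate test.csv in a temporary directory (location chosen only for this sketch)
path <- file.path(tempdir(), "test.csv")
writeLines(c(
  'colA,colB,colC',
  '1,"x",12',
  '2,"y",34',
  '3,"z",56'
), path)

# Name only the columns I want to keep; colA is deliberately left out.
# col_double() stands in for col_numeric() here (assumption, see above).
df <- read_csv(path, col_types = list(colB = col_character(), colC = col_double()))

# Check which columns actually ended up in the result
names(df)
```

If the help text is accurate, I would expect names(df) to list only colB and colC; on my machine colA is still there.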