?read.csv describes check.names= in a similar fashion:
check.names: logical. If 'TRUE' then the names of the variables in the
data frame are checked to ensure that they are syntactically
valid variable names. If necessary they are adjusted (by
'make.names') so that they are, and also to ensure that there
are no duplicates.
The default (check.names=TRUE) is there so that you can write dat$<column-name>. A column named if breaks that: dat$if fails with Error: unexpected 'if' in "dat$if", so check.names=TRUE changes the name to something the parser will not trip over. Note, though, that dat[["if"]] will work even when dat$if will not.
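As a minimal sketch of the reserved-word case (the column names if and b and the inline CSV text are made up for illustration, as are the names dat1 and dat2):

```r
# Default: check.names = TRUE passes the header through make.names(),
# which appends a dot to names that match R keywords.
dat1 <- read.csv(text = "if,b\n1,2")
names(dat1)
# [1] "if." "b"

# With check.names = FALSE the header is kept verbatim.
dat2 <- read.csv(text = "if,b\n1,2", check.names = FALSE)
names(dat2)
# [1] "if" "b"

# dat2$if is a syntax error, but both of these return the column:
dat2[["if"]]
dat2$`if`
```

Backtick quoting is the other way around the parser problem if you would rather keep using $.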
If you are wondering if check.names=FALSE is ever a bad thing, then imagine this:
dat <- read.csv(text = "a,a\n2,3")
dat
#   a a.1
# 1 2   3
dat <- read.csv(text = "a,a\n2,3", check.names = FALSE)
dat
#   a a
# 1 2 3
In the second case, how does one access the second column by name? dat$a returns only the first column (2). However, if you don't need $ or [[ and can instead index the columns with a logical vector, then dat[, colnames(dat) == "a"] does return both of them.
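To make the duplicate-name case concrete, here is a small sketch that re-creates the same dat so it stands alone; the variable names first and both are only illustrative:

```r
dat <- read.csv(text = "a,a\n2,3", check.names = FALSE)

# $ and [[ stop at the first column named "a".
first <- dat[["a"]]

# A logical mask over the column names selects every matching column.
both <- dat[, colnames(dat) == "a"]
ncol(both)

# Indexing by position is another way to reach the second column.
dat[[2]]
```

If the columns are only ever addressed by position or by a mask like this, duplicate names are harmless; they become a problem once code relies on $ or [[ to find a column by name.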