It really depends on when you want to use your variable "labels". While doing your data analysis, you definitely want to keep your short, concise variable names, otherwise you end up in a scenario of
lm(Sex of Participant ~ `Year of Participation`, data=data)
which is not valid syntax, and a heck of a bother to type again and again and agian (whops, typos!).
And when you've finished your analysis, your boss asks you to rename the age "label" to "Participant age", and there goes the analysis until you've searched and replaced every occurrence of the previous variable name.
So, the case should be clear for keeping concise variable names during coding (and you are not arguing against this in your question).
I am guessing you want variable labels for presentation. How to apply variable labels depends entirely on how you are presenting your data. I'll give a few examples.
Output to console:
> data
age sex year
1 12 1 1998
2 14 0 1997
3 16 1 1994
In this case I would store the labels in a named vector, which also defines the order of the columns:
labels <- c(age='Age of participant', sex="Sex of Participant", year="Year of Participation")
present <- data[,names(labels)]
colnames(present) <- labels
> present
Age of participant Sex of Participant Year of Participation
1 12 1 1998
2 14 0 1997
3 16 1 1994
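If the same relabeling is needed in several places, it could be wrapped in a small helper. The following is only a sketch: `apply_labels` is a hypothetical name (not a function from base R or any package), and the `data` frame is rebuilt here from the console output shown above.

```r
# Example data, reconstructed from the console output above
data <- data.frame(age = c(12, 14, 16), sex = c(1, 0, 1), year = c(1998, 1997, 1994))

labels <- c(age = "Age of participant", sex = "Sex of Participant", year = "Year of Participation")

# Hypothetical helper: pick and order columns by names(labels),
# then replace the short names with the display labels
apply_labels <- function(df, labels) {
  unknown <- setdiff(names(labels), colnames(df))
  if (length(unknown) > 0) {
    stop("No such column(s): ", paste(unknown, collapse = ", "))
  }
  out <- df[, names(labels), drop = FALSE]
  colnames(out) <- unname(labels)
  out
}

present <- apply_labels(data, labels)
print(present)
```

The original `data` keeps its short names, so the analysis code is unaffected; only the copy made for display carries the long labels.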
Plotting data:
plot(data[,c('age','year')])
Want to print proper labels? Use xlab and ylab:
plot(data[,c('age','year')], xlab='Age of participant', ylab='Year of participation')
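The axis titles can also be looked up from the same kind of named vector used for console output, so each label is written only once. This is a minimal sketch; the `xvar` and `yvar` variables are introduced purely for illustration.

```r
# Example data, reconstructed from the console output above
data <- data.frame(age = c(12, 14, 16), sex = c(1, 0, 1), year = c(1998, 1997, 1994))

labels <- c(age = "Age of participant", year = "Year of participation")

# Illustrative variables holding the short column names to plot
xvar <- "age"
yvar <- "year"

# [[ ]] returns the label text for the given short name
plot(data[[xvar]], data[[yvar]], xlab = labels[[xvar]], ylab = labels[[yvar]])
```

If the boss asks for a different wording, only the `labels` vector changes.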
Plotting data using ggplot2:
Again, the axis labels are polish and are applied separately:
ggplot(data, aes(x=age, y=year)) + geom_point() + labs(x='Age of participant', y='Year of participation')
And if you wanted to make a really small plot, perhaps you would scoot in a newline (\n) to break the label into two lines.
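Rather than typing the \n by hand, the line break could be inserted with base R's `strwrap()`. This is a sketch of one way to do it; the width of 15 is an arbitrary choice for this example.

```r
label <- "Year of participation"

# strwrap() splits the text into pieces shorter than `width` characters;
# paste() then joins the pieces with newlines
wrapped <- paste(strwrap(label, width = 15), collapse = "\n")

# Example data, reconstructed from the console output above
data <- data.frame(age = c(12, 14, 16), year = c(1998, 1997, 1994))

# The wrapped label is drawn on two lines
plot(data$age, data$year, xlab = "Age of participant", ylab = wrapped)
```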
Formatted tables using xtable:
This is actually the same approach as with "output to console".
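As a rough sketch, assuming the xtable package is installed: rename the columns of a copy, then hand that copy to `xtable()`. The `digits = 0` and `include.rownames = FALSE` settings are choices made for this example only.

```r
library(xtable)  # assumes the xtable package is installed

# Example data, reconstructed from the console output above
data <- data.frame(age = c(12, 14, 16), sex = c(1, 0, 1), year = c(1998, 1997, 1994))

labels <- c(age = "Age of participant", sex = "Sex of Participant", year = "Year of Participation")

# Same steps as for console output: select, order and rename a copy
present <- data[, names(labels)]
colnames(present) <- unname(labels)

# digits = 0 prints the numbers without decimals
tab <- xtable(present, digits = 0)
print(tab, include.rownames = FALSE)
```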
Conclusion:
I hope I have convinced you that this question does not have a trivial answer: variable labels "are not a thing" in R, because their application differs so widely.
The renaming example does support the case for having labels. There is, however, no structure for carrying this metadata throughout an R analysis, as many functions from hordes of packages routinely strip input data.frames of their attributes.
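A small base R illustration of how easily such metadata gets lost. The attribute name "label" is an arbitrary choice for this sketch, and ordinary row subsetting stands in for the many functions that drop attributes.

```r
# Example data, reconstructed from the console output above
data <- data.frame(age = c(12, 14, 16), sex = c(1, 0, 1), year = c(1998, 1997, 1994))

# Attach a label to one column as an attribute
attr(data$age, "label") <- "Age of participant"
attr(data$age, "label")     # the label is stored on the column

# Plain row subsetting rebuilds each column and drops the custom attribute
older <- data[data$age > 12, ]
attr(older$age, "label")    # NULL: the label did not survive
```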
You are more than welcome to ask a new question here on Stack Overflow when you have a specific use case in mind for displaying labels for variables.