I am subsetting a data frame by selecting specific columns. In the example data below, one column name (80% height) contains a space.

df<-data.frame(x1=c(1020,2053,1840,3301,2094),
           x11=c(816,1642.4,1472,2640.8,1675.2),
           x2=c(584,187,746,177,483))
names(df)<-c("height","80% height","length")

I want to select the first two columns. The question "Extracting specific columns from a data frame" has a helpful discussion of, and solutions for, extracting specific columns from a data frame. Following it, I tried several approaches, including

library(dplyr)
df %>% subset(df, select=c(height,"80% height"))

but they all fail with the following error

Error in subset.data.frame(., df, select = c(height, "80% height")) : 
'subset' must be logical

I want to get this resolved. Thank you for your help!

T Richard

1 Answer

We can use backticks to select columns with non-syntactic names, i.e. names that contain spaces or special characters, or that don't start with a letter

subset(df, select = c(height, `80% height`))

-output

#   height 80% height
#1   1020      816.0
#2   2053     1642.4
#3   1840     1472.0
#4   3301     2640.8
#5   2094     1675.2
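If backticks feel awkward, base R bracket indexing with quoted names is another option. A minimal sketch, assuming the same df as in the question; the names first_two, pct and pct2 are only illustrative:

```r
# Rebuild the example data from the question
df <- data.frame(x1 = c(1020, 2053, 1840, 3301, 2094),
                 x11 = c(816, 1642.4, 1472, 2640.8, 1675.2),
                 x2 = c(584, 187, 746, 177, 483))
names(df) <- c("height", "80% height", "length")

# Inside [ ] the names are ordinary character strings, so no backticks are needed
first_two <- df[, c("height", "80% height")]

# A single column: [[ ]] takes a string, while $ needs backticks
pct  <- df[["80% height"]]
pct2 <- df$`80% height`
```

Quoted strings work here because `[` looks columns up by name, whereas `subset()` evaluates the select expression as code, which is why it needs backticks for such a name.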

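Also, there is no need to pass df twice when piping. df %>% subset(df, ...) supplies df as the first argument and then df again as the subset argument, which must be logical, hence the error. We can use the select function from dplyr instead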

library(dplyr)
df %>%
     select(height, `80% height`)

-output

#   height 80% height
#1   1020      816.0
#2   2053     1642.4
#3   1840     1472.0
#4   3301     2640.8
#5   2094     1675.2
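When the column names are held in a character vector, for example because they come from user input, dplyr can select by string as well. A minimal sketch, assuming the same df as in the question; wanted, by_string and by_tick are illustrative names:

```r
library(dplyr)

# Rebuild the example data from the question
df <- data.frame(x1 = c(1020, 2053, 1840, 3301, 2094),
                 x11 = c(816, 1642.4, 1472, 2640.8, 1675.2),
                 x2 = c(584, 187, 746, 177, 483))
names(df) <- c("height", "80% height", "length")

# Column names stored as plain strings (no backticks required)
wanted <- c("height", "80% height")

# all_of() tells select() to treat the vector as column names
by_string <- df %>% select(all_of(wanted))

# Equivalent selection using backticks
by_tick <- df %>% select(height, `80% height`)
```

Both calls should return the same two-column data frame. all_of() raises an error if any of the listed names is missing from df.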

It may also be better to remove the spaces and prepend a letter to column names that start with a number. clean_names from janitor does this

library(janitor)
df %>%
    clean_names()
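To see what the cleaned names look like before selecting, one can store the result and inspect it. A minimal sketch, assuming the same df as in the question and that janitor is installed; cleaned and first_two are illustrative names, and the exact replacement names depend on janitor's rules, so they are printed here, not hard-coded:

```r
library(janitor)

# Rebuild the example data from the question
df <- data.frame(x1 = c(1020, 2053, 1840, 3301, 2094),
                 x11 = c(816, 1642.4, 1472, 2640.8, 1675.2),
                 x2 = c(584, 187, 746, 177, 483))
names(df) <- c("height", "80% height", "length")

# clean_names() returns a copy of df with syntactic column names
cleaned <- clean_names(df)

# Inspect the new names, then select by position or by the printed names
print(names(cleaned))
first_two <- cleaned[, 1:2]
```

After cleaning, the columns can be referenced without backticks in subset(), select() and `$`.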
akrun