I have a data frame `c1` with a column named `region`.

> sum(is.na(c1$region))
[1] 2

> class(c1$region)
[1] "factor"

However, when I build the same reference as a string with paste(), is.na() no longer finds the missing values:

> f1 <- paste("c1", "$", "region", sep = "")

> f1
[1] "c1$region"

> sum(is.na(f1))
[1] 0

I tried as.name(f1) and as.symbol(f1); both convert f1 to class "name". noquote(f1) only changes the class of the length-one character vector to "noquote".

> f2<-as.name(f1)

> f2
`c1$region`

> sum(is.na(f2))
[1] 0
Warning message:
In is.na(f2) : is.na() applied to non-(list or vector) of type 'symbol'

> class(f2)
[1] "name"

I want to build the reference to c1$region from strings, retain its class ("factor"), and still be able to use it in queries such as sum(is.na(f2)). Please help.

Ujjwal

  • See [**here**](http://stackoverflow.com/a/18228613/1478381) for more information on why you can't do what you are trying to do. – Simon O'Hanlon Sep 03 '14 at 13:32
  • Although it is not a popular method, `sum(is.na(eval(parse(text = f1))))` will work with your `f1` – David Arenburg Sep 03 '14 at 13:53
  • Building variable names and indexing as if they were plain strings is a terrible idea in R. Are you sure this is really required? Seems like you should be using proper `list()` objects to store your related data rather than a bunch of different variables in your global environment. – MrFlick Sep 03 '14 at 13:54
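
To make the comments above concrete, here is a minimal sketch of why `f1` behaves differently from `c1$region`. The contents of `c1` are made up for illustration (the question does not show them); only the two missing values and the factor class are taken from the question.

```r
# Hypothetical data standing in for the asker's c1 (two missing regions)
c1 <- data.frame(region = factor(c("east", "west", NA, "east", NA)))

f1 <- paste("c1", "$", "region", sep = "")

# f1 is a length-one character vector holding text, not the column itself,
# so is.na() only checks whether that one string is missing
is_text <- is.character(f1)
na_in_text <- sum(is.na(f1))

# parse() turns the text into an expression; eval() then evaluates c1$region
col_from_text <- eval(parse(text = f1))
na_in_column <- sum(is.na(col_from_text))

# The same column fetched by name, without building code from strings
col_by_name <- c1[["region"]]

print(class(col_from_text))
print(na_in_column)
print(identical(col_from_text, col_by_name))
```

The `eval(parse(text = ...))` route works, but indexing by name with `[[` reaches the same column and is usually easier to read and debug.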

1 Answer

I'm not 100% sure I understand what you are trying to do, but maybe this will help:

c1 <- data.frame(region=c(letters[1:3], NA))
clust <- 1
variable <- "region"

f1 <- get(paste0("c", clust))[[variable]]   # <--- key step
class(f1)
# [1] "factor"
sum(is.na(f1))
# [1] 1

In the key step, we use `get` to fetch the correct cluster data frame by its name, supplied as a character string, and then we use `[[`, which, unlike `$`, accepts a character variable to specify the column we want.
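
If the same lookup is needed for several clusters, the key step could be wrapped in a small function. This is only a sketch: `count_na()` is a hypothetical helper, the second data frame `c2` is invented for illustration, and the environment is passed explicitly so that `get` searches where the data frames were created.

```r
# Hypothetical cluster data frames named c1, c2, ... as in the question
c1 <- data.frame(region = factor(c("a", "b", "c", NA)))
c2 <- data.frame(region = factor(c("a", NA, NA, "b")))

# count_na() is an illustrative helper, not part of base R
count_na <- function(clust, variable, envir) {
  df <- get(paste0("c", clust), envir = envir)  # look up the data frame by name
  sum(is.na(df[[variable]]))                    # [[ accepts a character column name
}

# Environment in which c1 and c2 were defined
env <- environment()

# One NA count per cluster number
na_counts <- vapply(1:2, count_na, integer(1), variable = "region", envir = env)
print(na_counts)
```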

BrodieG

  • @Ujjwal, note MrFlick's comment. I just got lazy and decided against going on a diatribe, but if you're programmatically referring to variables in your workspace (as is the case with the `cX` variables here), you should really consider having them in a list. – BrodieG Sep 03 '14 at 14:18
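
A rough sketch of the list-based layout suggested here, assuming the cluster data frames can be stored together. The list name `clusters` and the contents of both data frames are hypothetical.

```r
# Hypothetical clusters kept in one named list instead of c1, c2, ... in the workspace
clusters <- list(
  c1 = data.frame(region = factor(c("a", "b", "c", NA))),
  c2 = data.frame(region = factor(c("a", NA, NA, "b")))
)

clust <- 1
variable <- "region"

# Both lookups use [[ with character names; no get() or parse() is needed
f1 <- clusters[[paste0("c", clust)]][[variable]]
na_in_f1 <- sum(is.na(f1))

# The same query for every cluster at once
na_by_cluster <- vapply(clusters, function(df) sum(is.na(df[[variable]])), integer(1))

print(class(f1))
print(na_in_f1)
print(na_by_cluster)
```

With this layout, adding another cluster means adding a list element, and the per-cluster query needs no changes.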