I have to work on a large database with many variables to generate or modify, which means a lot of code to write. I'm used to the Stata environment, where everything you do is "inside" the database.
I'd like to break free from the "database$variable" syntax and be able to use a simple "variable" syntax.
An example of what I want to do: I have my database with the "age" variable and I want to recode it.
agecat1 <- floor(age)
agecat1[agecat1 < 10] <- NA
agecat1[agecat1 >= 17] <- NA
describe(agecat1)
Of course, this code sample fails because R cannot find the "age" variable.
To make it work, I can either attach my database before running it (which works well, for this first part), or write it as follows (but that is exactly what I want to avoid):
agecat1 <- floor(db$age)
agecat1[agecat1 < 10] <- NA
agecat1[agecat1 >= 17] <- NA
describe(agecat1)
And this is where I reach the limit of attach(): my new variable "agecat1" is NOT in my database. It is now an independent object that won't be affected by anything I later do to my database (removing rows with NA, for example).
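To make the limitation concrete, here is a minimal sketch on a made-up six-row data frame (the name "db" matches my code, but the values are invented for illustration and are not my real data). It only shows that a variable created after attach() lives outside the database and falls out of sync with it:

```r
# Hypothetical toy data frame standing in for my real database
db <- data.frame(age = c(8.5, 10.2, 12.9, 16.4, 17.0, 21.3))

attach(db)                       # columns of db become visible by name
agecat1 <- floor(age)            # created in the workspace, not inside db
agecat1[agecat1 < 10] <- NA
agecat1[agecat1 >= 17] <- NA
detach(db)

"agecat1" %in% names(db)         # FALSE: db itself was never modified

# Drop a row from the database; agecat1 is not affected by this
db <- db[db$age >= 10, , drop = FALSE]
nrow(db)                         # 5 rows left in db
length(agecat1)                  # still 6 values, no longer aligned with db
```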
So if I want my variable to be included in my DB, I need to write:
db$agecat1 <- floor(db$age)
db$agecat1[db$agecat1 < 10] <- NA
db$agecat1[db$agecat1 >= 17] <- NA
describe(db$agecat1)
And I'm back to square one: even if I use attach(), I still have to use this painful db$variable syntax.
I read the post about attach(): Peter Ellis suggests attach() as a good way to reproduce a "Stata-like" environment, but Brian Diggs explains my problem very well. The alternatives offered (with() and data=) only apply to a single call and need to be repeated for each function (if I understood correctly), so they are even more tedious than what I want to avoid.
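For reference, this is roughly how I understand those two alternatives, again on an invented six-row data frame (the values and the choice of xtabs() as a function accepting data= are only assumptions for the sake of the example). In both cases "db" still has to be named in every single call:

```r
# Hypothetical toy data frame standing in for my real database
db <- data.frame(age = c(8.5, 10.2, 12.9, 16.4, 17.0, 21.3))

# with(): columns are visible by name inside the call, but the result
# must still be assigned back through db$...
db$agecat1 <- with(db, floor(age))
db$agecat1 <- with(db, ifelse(agecat1 < 10 | agecat1 >= 17, NA, agecat1))

# data=: only available in functions that accept it, and db is repeated
# in each call
counts <- xtabs(~ agecat1, data = db)
counts
```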
Is there any way to work "inside" my database?