2

I've just started learning R and am curious about attribute propagation.

I'd like to annotate a vector with custom values (I'm assuming attributes are the right way to do this?), which I've been able to do fairly easily. (Giving background in case this is an X-Y question)

The problem begins when I start manipulating these vectors - I'd like these custom annotations to propagate, or at the very least, have a well defined set of rules for annotation propagation/loss.

I've done some research on this, including this other SO question, that addresses the subsetting function in particular, but I'd like to generalize it a bit further:

  1. What is the complete list of functions that do not propagate values, or
  2. how do I find this out?
  3. Is there a better way to accomplish what I'm doing?

The goal is to apply these annotations, call arbitrary (as much as possible) R functions on the data, and ensure the attributes are maintained. Data frames in particular are of importance here as well.

Thanks


2 Answers

2

I think you need to adopt the practice of making the "custom values" into data columns rather than using attributes. Calling this an X-Y problem is not terribly specific, but it hints at the notion that you have positional, numeric data and you want to have character data registered by row. This is exactly what dataframes are designed to support.
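
As a rough illustration of the column approach (the data and the column names `value` and `label` are invented for this sketch), the annotation stays aligned with its row through ordinary subsetting and sorting:

```r
# Hypothetical data: measurements with a per-row annotation column
df <- data.frame(
  value = c(2.5, 1.0, 4.2),
  label = c("control", "treated", "treated"),
  stringsAsFactors = FALSE
)

# Subsetting and reordering keep each label attached to its value
treated <- df[df$label == "treated", ]
sorted  <- df[order(df$value), ]
sorted$label
```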

Just wrapping c() around a vector is enough to strip its attributes (other than names), so the class and attributes are fairly fragile. A dataframe is a list, so this suggestion does not really contradict flodel's suggestion.
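
A minimal sketch of that fragility in base R; the attribute name `units` is made up for the example:

```r
x <- 1:5
attr(x, "units") <- "cm"        # hypothetical annotation

after_c      <- attributes(c(x))     # NULL: c() drops everything except names
after_subset <- attributes(x[1:3])   # NULL: subsetting drops it as well
after_times  <- attributes(x * 2L)   # kept: arithmetic with a plain scalar copies x's attributes
```

Inspecting `attributes()` on the result like this is also a quick way to check how any particular function treats your annotations.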

IRTFM
  • thanks, yeah I decided to go with a dataframe (data.table to be specific) after using the information here and my own experiments. sorry i can't upvote –  Oct 24 '13 at 19:30
  • So I told you why you were having problems and you basically followed my advice to put the values in a more durable data structure rather than storing in attributes, but you "cannot upvote". Oh well. It's not like I need the points. – IRTFM Oct 24 '13 at 19:40
  • apologies - as I stated in my other comment I don't have enough rep. It was indeed your comment that allowed me to research the underlying facts better and gain a better understanding, thank you –  Oct 25 '13 at 13:45
0

Even simple addition can destroy attributes. When both operands carry the same attribute, the first operand's value takes precedence, so in this next example only x's attributes remain:

x <- 1:5
attr(x, "foo") <- letters[1:3]

y <- 6:10
attr(y, "foo") <- letters[4:6]

x + y
## [1]  7  9 11 13 15
## attr(,"foo")
## [1] "a" "b" "c"

As DWin said, they are fragile; probably too fragile for what you want.


To expand on flodel's point, a common approach that will robustly propagate everything is to use a list with a class attribute.

Models returned by lm are a typical example of this. The output is too big to show here, but if you do unclass on an lm object, you'll see that it is just a list.

model <- lm(Sepal.Length ~ Sepal.Width + Species, iris)
unclass(model)

Then you can write S3 methods for whichever generic functions you need, so that they know how to handle your new class.
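
Here is a small sketch of that idea. The class name `annotated` and the fields `values` and `note` are invented for illustration, and only subsetting and the operator group generic are covered, so treat it as a starting point rather than a complete implementation:

```r
# Constructor: a list holding the data plus a note, tagged with a class
annotated <- function(values, note) {
  structure(list(values = values, note = note), class = "annotated")
}

# Subsetting method: subset the data and carry the note along
`[.annotated` <- function(x, i) {
  annotated(x$values[i], x$note)
}

# Group generic for operators such as +, -, *, ==:
# apply the operator to the underlying data and keep the note
# of the first annotated operand
Ops.annotated <- function(e1, e2) {
  op <- get(.Generic)
  if (missing(e2)) {                       # unary operators such as -a
    return(annotated(op(e1$values), e1$note))
  }
  note <- if (inherits(e1, "annotated")) e1$note else e2$note
  v1 <- if (inherits(e1, "annotated")) e1$values else e1
  v2 <- if (inherits(e2, "annotated")) e2$values else e2
  annotated(op(v1, v2), note)
}

a <- annotated(1:5, "cm")
b <- (a + 1L)[2:3]   # the note survives both the addition and the subsetting
b$note
b$values
```

Any function without a method for the class will still see a plain list, so you would add methods (or an accessor that returns `values`) for each operation you care about.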

Richie Cotton
  • Thank you for your answer - I decided to go with a data.table in the end. However, there are some further issues that I'll delve into in another question. Thanks! (sorry I can't upvote, not enough rep) –  Oct 24 '13 at 19:31