
I don't understand why R gives me a warning about "Longer object length is not a multiple of shorter object length"

I have this object which is generated by doing an aggregate over an xts series giving the weekday median:

u <- aggregate(d, list(Ukedag = format(index(d),"%w")), median)

1 314.0
2 282.5
3 270.0
4 267.0
5 240.5

Then I try to apply this to my original xts series, which looks like this (only a lot longer)

head(d)
2009-01-02 116
2009-01-05 256
2009-01-06 286

Using:

coredata(d) <- coredata(d) - u[format(index(d),"%w")];

Which results in a warning.

The intent is to subtract the weekday median. It appears to work despite the warning, but what should I worry about?

Revised solution: Attempt 2

apply.daily(d, function(x) coredata(x) - u[format(index(x), "%w")] )

I did indeed have a serious error. This doesn't give any warnings and I tested it by doing:

apply.daily(d, function(x) u[format(index(x), "%w")] )

Then checking some dates, and it appeared that it was in alignment with the calendar.

tovare
  • May be better suited to stackoverflow. – Shane Aug 03 '10 at 09:25
  • Yes, I was a bit unsure about that. From one perspective R is a scriptable tool for doing statistics, but it is also a programming language. – tovare Aug 03 '10 at 09:38
  • This question at least involves a data problem, but this will be an ongoing cause for confusion with potentially no right answer. http://meta.stats.stackexchange.com/questions/1/how-to-answer-r-questions – Shane Aug 03 '10 at 09:46
  • Feel free to bounce me to stackoverflow :-) Is that something that can be done automatically? – tovare Aug 03 '10 at 09:49
  • No, let's just leave it here (IMO). – Shane Aug 03 '10 at 09:55

1 Answer


Yes, this is something that you should worry about. Check the length of your objects with nrow(). R can auto-replicate objects so that they're the same length if they differ, which means you might be performing operations on mismatched data.
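A minimal sketch of the recycling behaviour described above, using base R only. The vectors `long` and `short` are made-up stand-ins for the raw and aggregated series, not the asker's data.

```r
# Made-up values for illustration: a 7-element and a 5-element vector.
long  <- c(10, 20, 30, 40, 50, 60, 70)
short <- c(1, 2, 3, 4, 5)

# 7 is not a multiple of 5, so R recycles `short` as 1 2 3 4 5 1 2
# and raises a warning. Capture the warning text instead of printing it.
msg <- tryCatch(long - short, warning = function(w) conditionMessage(w))
print(msg)

# The subtraction still returns a result, but elements 6 and 7 were
# paired with the recycled values 1 and 2.
diff_recycled <- suppressWarnings(long - short)
print(diff_recycled)

# Comparing lengths up front makes the mismatch visible.
print(c(long = length(long), short = length(short)))
```

The result has the length of the longer vector, which is why such code can appear to work even though the later elements are matched against the wrong values.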

In this case you have an obvious flaw in that you're subtracting aggregated data from raw data. These will definitely be of different lengths. I suggest that you merge them as time series (using the dates), then fill forward with na.locf() (from zoo), then do your subtraction. Otherwise merge them by truncating the original dates to the same interval as the aggregated series. Just be very careful that you don't drop observations.
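One way to get an aggregated value for every raw observation before subtracting can be sketched in base R. This is not the merge-and-fill route suggested above but a swapped-in alternative using `ave()`, which returns the group statistic expanded to the length of the input. The `dates` and `calls` vectors are hypothetical, chosen so that the weekday medians reproduce the values shown in the question.

```r
# Hypothetical data: two weeks of weekday call counts (not the asker's series).
dates <- as.Date("2009-01-05") + c(0:4, 7:11)   # Mon-Fri, two weeks running
calls <- c(300, 280, 270, 260, 240, 328, 285, 270, 274, 241)

# Weekday key: "1" (Monday) through "5" (Friday), one per observation.
wd <- format(dates, "%w")

# ave() computes the median within each weekday group and returns a
# vector as long as `calls`, so both operands have the same length.
weekday_median <- ave(calls, wd, FUN = median)

# Element-wise subtraction; no recycling is involved.
adjusted <- calls - weekday_median

print(data.frame(dates, wd, calls, weekday_median, adjusted))
```

Because the median is looked up per observation, the lengths match by construction, and printing the columns side by side makes it easy to check alignment against the calendar.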

Lastly, as some general advice as you get started: look at the result of your computations to see if they make sense. You might even pull them into a spreadsheet and replicate the results.

Brian Diggs
Shane
  • Thank you. I'm not sure I follow you completely. u is supposed to be shorter because it's weekdays; since this is for a call center, I have more people calling on Mondays than Fridays. For some events, like a letter being sent, I have an effect I think might be additive (i.e. 1% of people receiving letters make a call). So if letter A is received on a Monday I get a different total number of calls than if it's received on a Friday. – tovare Aug 03 '10 at 09:44
  • How can you subtract a vector of length 10 from a vector of length 15 (for instance)? R will simply repeat the first 5 elements of the shorter vector, which would not be what you want. So you need to merge them and carry forward the aggregated values. – Shane Aug 03 '10 at 09:54
  • Thank you, I think I am missing some basic insight. To me my revised solution above looks almost the same as my initial solution, but an A-B test showed calendar misalignment, which is consistent with what you made me aware of. – tovare Aug 03 '10 at 10:19