0

I have the following table:

 Date     Country  Class Value   
6/1/2010   USA      A     45
6/1/2010  Canada    A     23 
6/1/2010  Brazil    B     65
9/1/2010   USA      B     47
9/1/2010  Canada    A     98
9/1/2010  Brazil    B     25
12/1/2010  USA      B     14 
12/1/2010 Canada    A     79
12/1/2010 Brazil    A     23
3/1/2011   USA      A     84 
3/1/2011  Canada    B     77
3/1/2011  Brazil    A     43
6/1/2011   USA      A     45
6/1/2011  Canada    A     23 
6/1/2011  Brazil    B     65
9/1/2011   USA      B     47
9/1/2011  Canada    A     98
9/1/2011  Brazil    B     25
12/1/2011  USA      B     14 
12/1/2011 Canada    A     79
12/1/2011 Brazil    A     23
3/1/2012   USA      A     84 
3/1/2012  Canada    B     77
3/1/2012  Brazil    A     43

In column "Date" years are divided by the following months - March, June, September and December. I need to group months from June to March as a Fiscal Year and by each Fiscal Year calculate the mean of column "Value" by "Country" and "Class". Could anybody help me to do that?

I am trying to do that using data.table but get the error:

d=data[,list(Val=mean(Value,na.rm=T)),by=list(Country,Class,
  Period.grp=cut(Period,list(6/1/2010,3/1/2011,6/1/2011,3/1/2012,
  6/1/2012,3/1/2013,6/1/2013,3/1/2014)))]  

Error in cut.default(Period, list(6/1/2010, 3/1/2011, 6/1/2011, 3/1/2012, : 'x' must be numeric

Thank you!

user45415631
  • 175
  • 2
  • 11
  • I'm pretty sure this has been asked and answered before ... what searching have you done? – IRTFM Nov 01 '14 at 00:37
  • So is jan-1-2010 supposed to be in FY 2010 or FY 2009? – IRTFM Nov 01 '14 at 00:59
  • The problem with your approach is that there is no cut() function for the data-type you are creating. Those are going to be characters if they don't get parsed as division. Notice that list(6/1/2010) returns a decimal fraction. Those are not R Dates. Also there is no `Period` column in the dataframe. – IRTFM Nov 01 '14 at 04:14
  • @BondedDust Can I do smth like that to split the Dates by Fiscal Year:p=data$Date;p=as.character(p); p[p =="6/1/2010" & p=="9/1/2010" & p=="12/1/2010" & p=="3/1/2011"] <- 'FY1011' p[p =="6/1/2011" & p=="9/1/2011" & p=="12/1/2011" & p=="3/1/2012"] <- 'FY1112' – user45415631 Nov 01 '14 at 04:26
  • Why not convert to R Dates? Then you can use the various Date-oriented functions. – IRTFM Nov 01 '14 at 04:41
  • @BondedDust But the approach above is interesting for me. Do you know why Dates are not changed by FY1011 and FY1112 in my table. – user45415631 Nov 01 '14 at 13:54
  • 1) The values in your breaks objects are numeric. Look at the output of `list(6/1/2010)`. 2) the ordering of character variables will make months of "12" come before the other months since characters are sorted by lexical rules and "1" is less than "3", "6, "or "9". – IRTFM Nov 01 '14 at 15:25

2 Answers2

1

I found the answer I was thinking I had written, but it's actually a bit different

# this should "shift" the year calculation 3 months and provide quarter
c('Q1','Q2','Q3','Q4')[ 1+((as.POSIXlt(dates)$mon+3) %/% 3)%%4]

This then pastes the FY with the quarter shifted 6 months, but you may need to adjust because your year specification was ambiguous about "naming the year":

dat$FY_Q <- paste( 1900+as.POSIXlt( dat$dates )$year+
                        1*(as.POSIXlt( dat$dates )$mon %in% 7:12) ,  
  c('Q1','Q2','Q3','Q4')[ 1+((as.POSIXlt(dat$dates)$mon-6) %/% 3)%%4] 
  , sep="-")

dat
        Date Country Class Value      dates    FY_Q
1   6/1/2010     USA     A    45 2010-06-01 2010-Q4
2   6/1/2010  Canada     A    23 2010-06-01 2010-Q4
3   6/1/2010  Brazil     B    65 2010-06-01 2010-Q4
4   9/1/2010     USA     B    47 2010-09-01 2011-Q1
5   9/1/2010  Canada     A    98 2010-09-01 2011-Q1
6   9/1/2010  Brazil     B    25 2010-09-01 2011-Q1
snipped---------

So now do a tapply or aggregate by FY_Q and Country:

> with( dat, aggregate(Value, list(FY_Q, Country),FUN=mean)  )
   Group.1 Group.2  x
1  2010-Q4  Brazil 65
2  2011-Q1  Brazil 25
3  2011-Q2  Brazil 23
4  2011-Q3  Brazil 43
5  2011-Q4  Brazil 65
6  2012-Q1  Brazil 25
7  2012-Q2  Brazil 23
8  2012-Q3  Brazil 43
9  2010-Q4  Canada 23
10 2011-Q1  Canada 98
11 2011-Q2  Canada 79
12 2011-Q3  Canada 77
13 2011-Q4  Canada 23
14 2012-Q1  Canada 98
15 2012-Q2  Canada 79
16 2012-Q3  Canada 77
17 2010-Q4     USA 45
18 2011-Q1     USA 47
19 2011-Q2     USA 14
20 2011-Q3     USA 84
21 2011-Q4     USA 45
22 2012-Q1     USA 47
23 2012-Q2     USA 14
24 2012-Q3     USA 84

See: Format date-time as seasons in R? for a somewhat similar problem and solutions.

Community
  • 1
  • 1
IRTFM
  • 258,963
  • 21
  • 364
  • 487
0

Try:

> dat$fiscal = rep(2011:2015,each=12, length.out=nrow(dat))
> 
> aggregate(Value~Country+Class+fiscal, data=dat, mean)
   Country Class fiscal    Value
1   Brazil     A   2011 33.00000
2   Canada     A   2011 66.66667
3      USA     A   2011 64.50000
4   Brazil     B   2011 45.00000
5   Canada     B   2011 77.00000
6      USA     B   2011 30.50000
7   Brazil     A   2012 33.00000
8   Canada     A   2012 66.66667
9      USA     A   2012 64.50000
10  Brazil     B   2012 45.00000
11  Canada     B   2012 77.00000
12     USA     B   2012 30.50000

For FY2011 etc:

dat$fiscal = paste0('FY',rep(2011:2015,each=12, length.out=nrow(dat)))
rnso
  • 23,686
  • 25
  • 112
  • 234