I have a data set like the following:

Date            Country    Item    Qty    Value
15-04-2014      SE         08888   2      20
28-04-2014      SE         08888   2      20
05-05-2014      SE         08888   6      80

I want to sum the Qty and Value columns for rows dated before 1 May, and the aggregated row (the sum) should be labelled 1 May.

I tried ddply, but it sums all the values regardless of the dates.

ddply(se, .(se$Item), summarize, Qty = sum(se$Qty), Value = sum(se$Value))

I also tried subsetting by date, without success.

se$Date <- as.Date(as.character(se$Date))
se_q <- subset(se,se$Date <= 01-05-2014)

Date         Country Item     Qty    Value
0015-04-20   SE      08888    2      20
0028-04-20   SE      08888    2      20
0005-05-20   SE      08888    6      80

How can I add the date condition to the code? Or is there another way to do this?

Thank you

1 Answer

You could use dplyr, for example:

library(dplyr)

# %>% replaces the older %.% operator, which has been removed from dplyr
df %>%
    filter(Date <= as.Date("2014-05-01")) %>%
 #  group_by(Item) %>%                       # add this line if you need to group by Item (it will then appear in the output)
    summarize(Date = as.Date("2014-05-01"), Qty = sum(Qty), Value = sum(Value))

#        Date Qty Value
#1 2014-05-01   4    40

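The problem in your subset is that you are not telling R that 01-05-2014 is a Date: unquoted, it is evaluated as the arithmetic expression 1 - 5 - 2014. In addition, as.Date() without a format argument does not read "15-04-2014" as day-month-year, which is why your dates come out as 0015-04-20.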
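
For reference, here is a minimal base R sketch of the same idea, applied to the subset from the question. The data frame is rebuilt by hand from the sample rows, and storing Item as character (to keep the leading zero) as well as the name cutoff are assumptions made for illustration, not something taken from the original data.

```r
# Sample data rebuilt from the question; Item kept as character (assumption)
se <- data.frame(
  Date    = c("15-04-2014", "28-04-2014", "05-05-2014"),
  Country = "SE",
  Item    = "08888",
  Qty     = c(2, 2, 6),
  Value   = c(20, 20, 80),
  stringsAsFactors = FALSE
)

# Unquoted, this is arithmetic (1 - 5 - 2014), not a date
01-05-2014
# [1] -2018

# Parse the strings explicitly as day-month-year
se$Date <- as.Date(se$Date, format = "%d-%m-%Y")

# Compare against a real Date object (cutoff is a hypothetical name)
cutoff <- as.Date("2014-05-01")
se_q <- subset(se, Date <= cutoff)

# Sum Qty and Value per Item, then label the result with the cutoff date
out <- aggregate(cbind(Qty, Value) ~ Item, data = se_q, FUN = sum)
out$Date <- cutoff
out   # one row for item 08888 with Qty 4 and Value 40
```

Use `Date < cutoff` instead if rows dated exactly 1 May should be excluded.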

talat
  • Thanks a lot for your prompt reply :-) – user3206981 May 29 '14 at 12:52
  • I ran the code and got the following message: Date Qty 1 2014-05-01 10 Warning message: In filter_impl(.data, dots(...), environment()) : Incompatible methods ("Ops.factor", "Ops.Date") for "<=" – user3206981 May 29 '14 at 12:53
  • Did you convert your `Date` column to `Date` format? Try `df$Date <- as.Date(as.character(df$Date), format="%d-%m-%Y")` – talat May 29 '14 at 12:54
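
The warning in the comments suggests the Date column was still a factor when it was compared with a Date. Below is a small sketch of the conversion proposed in the last comment, assuming the column was read in as a factor (as older `read.csv()` defaults would do); the toy data frame is made up from the sample rows in the question.

```r
# A Date column stored as a factor (assumed starting point)
df <- data.frame(
  Date = factor(c("15-04-2014", "28-04-2014", "05-05-2014")),
  Qty  = c(2, 2, 6)
)
class(df$Date)   # "factor"

# Go through character first, and state the day-month-year format explicitly
df$Date <- as.Date(as.character(df$Date), format = "%d-%m-%Y")
class(df$Date)   # "Date"

# The comparison now works on Dates, so only the two April rows are summed
qty_before <- sum(df$Qty[df$Date <= as.Date("2014-05-01")])
qty_before       # 4
```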