id   product            id2     year   cost 
 1  biscuits    202-55-3041     2017      2 
 2  biscuits    903-36-9457     2014      2 
 3  biscuits    938-33-7254     2014      2 
 4  biscuits    739-29-5963     2017      2 
 5  biscuits    731-49-5483     2017      2 
 6  biscuits    892-15-2567     2018      2 
 7  biscuits    518-79-7674     2017      2 
 8  biscuits    305-63-7908     2017      2 

This is my current data set; the data frame is called `total1`.

I am a beginner in R and was wondering whether there is a way to add up the cost of the product by year. For example:

In 2017 there were 10 biscuits sold

In 2018 there were 8 biscuits sold

I am trying to determine which is the least profitable year in terms of biscuits sold.

I apologise if this is answered elsewhere; if it is, please direct me there. Thank you.

Darren Tsai
bob

1 Answer

Assuming that the number of sold items is stored in column cost, here's a simple solution using tapply:

tapply(total1$cost, total1$year, sum)
2014 2017 2018 
   4   10    2 
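To try this without the full data set, here is a minimal sketch that rebuilds only the eight rows shown in the question (the `id` and `id2` columns are left out, and the real `total1` presumably has more rows). The name `yearly` is just an illustrative choice.

```r
# Rebuild the eight sample rows from the question (id and id2 omitted).
total1 <- data.frame(
  product = rep("biscuits", 8),
  year    = c(2017, 2014, 2014, 2017, 2017, 2018, 2017, 2017),
  cost    = rep(2, 8)
)

# Sum cost within each year; tapply returns a named array keyed by year.
yearly <- tapply(total1$cost, total1$year, sum)
print(yearly)
```

The names of the result are the years as character strings, so a single total can be looked up with `yearly["2017"]`.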

Another simple solution uses aggregate (edit: the code is simplified thanks to @Darren Tsai's comment):

aggregate(cost ~ year, total1, sum)
  year cost
1 2014    4
2 2017   10
3 2018    2
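Since the question asks for the least profitable year, one possible follow-up is to pick the row with the smallest total using `which.min`. This is only a sketch on the eight sample rows from the question; `totals` and `least` are illustrative names, and `which.min` returns just the first year if several years tie.

```r
# Rebuild the eight sample rows from the question (id and id2 omitted).
total1 <- data.frame(
  product = rep("biscuits", 8),
  year    = c(2017, 2014, 2014, 2017, 2017, 2018, 2017, 2017),
  cost    = rep(2, 8)
)

# One row per year with the summed cost.
totals <- aggregate(cost ~ year, data = total1, FUN = sum)

# Row with the smallest yearly total (first one in case of ties).
least <- totals[which.min(totals$cost), ]
print(least)  # 2018 for this eight-row sample
```

On the full data set the year returned may of course differ from the one in this small sample.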
Chris Ruehlemann
  • I don't suppose you know how to turn this information into a graph? – bob Jan 20 '20 at 18:02
  • Would a barchart be helpful? – Chris Ruehlemann Jan 20 '20 at 18:05
  • yeah it would, if you know how? – bob Jan 20 '20 at 18:08
  • Very simple: 1. store the sums in an object: `df <- tapply(total1$cost, total1$year, sum)`; 2. pass it to `barplot`: `barplot(df, xlab = "Year", ylab = "Number of sales")` – Chris Ruehlemann Jan 20 '20 at 18:08
  • Curious to see that the answer was voted down but accepted by the OP ;) – Chris Ruehlemann Jan 20 '20 at 18:10
  • The formula part should be simplified to `cost ~ year` because you have set `data = total1` in `aggregate`. – Darren Tsai Jan 20 '20 at 18:11
  • Okay, rather than downvoting the answer you could have suggested I improve the code along this line. But thanks anyway. – Chris Ruehlemann Jan 20 '20 at 18:13
  • Thanks Chris appreciate the help! – bob Jan 20 '20 at 18:15
  • 1
    @ChrisRuehlemann Yes it doesn't deserve a downvote. I did that for two reasons. (1) There may exist other products except for biscuits, so the summation should be based on `product` and `year`. Of course, the OP emphasizes "based on the year" and doesn't specify whether there exist other products or not. I think an assumption based on `product` and `year` will be more general and reasonable. i.e. `aggregate(cost ~ product + year, total1, sum)` (2) This question obviously is duplicate and should be closed. You can flag it and don't need to answer it again. – Darren Tsai Jan 20 '20 at 18:57
  • Do feel free to click the upward arrow if the answer was useful to your query. Also note @Darren Tsai's perfectly reasonable point about there potentially being other products than just biscuits and, if that is the case, the code he is proposing. – Chris Ruehlemann Jan 20 '20 at 22:45
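
The two suggestions from the comments (grouping by `product` as well as `year`, and drawing a bar chart) can be sketched together as follows. The question only shows biscuits, so the single "crackers" row here is made up purely to show the effect of the extra grouping variable; `extra`, `sales`, `by_both` and `yearly` are illustrative names.

```r
# Eight sample rows from the question (id and id2 omitted).
total1 <- data.frame(
  product = rep("biscuits", 8),
  year    = c(2017, 2014, 2014, 2017, 2017, 2018, 2017, 2017),
  cost    = rep(2, 8)
)

# HYPOTHETICAL extra row, not in the original data, to show a second product.
extra <- data.frame(product = "crackers", year = 2017, cost = 3)
sales <- rbind(total1, extra)

# One row per product/year combination that occurs in the data.
by_both <- aggregate(cost ~ product + year, data = sales, FUN = sum)
print(by_both)

# Bar chart of the biscuit totals per year, as suggested in the comments.
yearly <- tapply(total1$cost, total1$year, sum)
barplot(yearly, xlab = "Year", ylab = "Number of sales")
```

With only one product in the data, `cost ~ product + year` gives the same totals as `cost ~ year`, just with an extra `product` column.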