0
Date        Profit        
2018-01-18  3024490

2018-01-17  3985331

2018-01-16  5987905 

2018-01-15  5742019 

2018-01-14  4645206

2018-01-13  4766255 

I have data table like above, want to know profit difference and growth rate between days ultimately.

So I think I had to make a new column called previous day profit, bring previous day's data into new column and aggregate between two columns. I mean two days data should be in the same row for example) 2018-1-1 profit / 2018-1-2 profit should be in the same row to compare the prices between 2 days.

Please help me!

user2554330
  • 37,248
  • 4
  • 43
  • 90
Justina
  • 11
  • 1
  • Edit your question and show the results you want. – Gordon Linoff Jan 19 '18 at 02:34
  • Hello, welcome to StackOverflow! This question has been asked a few times before (if I understand you correctly) so I'm just going to link you to another anwser [here](https://stackoverflow.com/questions/34651295/calculate-the-difference-between-rows) – Mist Jan 19 '18 at 03:10

2 Answers2

2

A fully worked example using dplyr showing the steps.

library(dplyr)
data = read.table(header = T, 
                  text="Date Profit        
                  2018-01-18  3024490
                  2018-01-17  3985331
                  2018-01-16  5987905 
                  2018-01-15  5742019 
                  2018-01-14  4645206
                  2018-01-13  4766255")

data = 
  data %>% 
  arrange(Date) %>% 
  mutate(PreviousProfit = lag(Profit)) %>% 
  mutate(Difference = Profit - PreviousProfit)

data
#        Date  Profit PreviousProfit Difference
#1 2018-01-13 4766255             NA         NA
#2 2018-01-14 4645206        4766255    -121049
#3 2018-01-15 5742019        4645206    1096813
#4 2018-01-16 5987905        5742019     245886
#5 2018-01-17 3985331        5987905   -2002574
#6 2018-01-18 3024490        3985331    -960841
Hlynur
  • 335
  • 2
  • 7
Andrew Chisholm
  • 6,362
  • 2
  • 22
  • 41
1

This is what the lag() function in the dplyr package is ideal for.

Let's first create your dataframe, df, using tribble from the tibble package.

library(tibble)
df <- tribble(
       ~Date,  ~Profit,        
"2018-01-18",  3024490,
"2018-01-17",  3985331,
"2018-01-16",  5987905,
"2018-01-15",  5742019,
"2018-01-14",  4645206,
"2018-01-13",  4766255)

Now we can load dplyr and get to work. Basically, the thing you have to do is create a new column using mutate and make that column a lag of Profit. But before we do that we arrange the dataframe according to the Date column, so that the lag is actually referring to the previous day when it refers to the previous row.

library(dplyr)
df <- df %>% 
  arrange(Date) %>% 
  mutate(prev_day_profit = lag(Profit))

Which results in the following when we print the data.frame df.

> df
# A tibble: 6 x 3
        Date  Profit prev_day_profit
       <chr>   <dbl>           <dbl>
1 2018-01-13 4766255              NA
2 2018-01-14 4645206         4766255
3 2018-01-15 5742019         4645206
4 2018-01-16 5987905         5742019
5 2018-01-17 3985331         5987905
6 2018-01-18 3024490         3985331

Hope that helps.

Hlynur
  • 335
  • 2
  • 7