1

I have the following dataframe structure

Company Name stock price Date
HP           1.2         10/05/2020
HP           1.4         11/05/2020
APPL         1.1         05/03/2020
APPL         1.2         06/03/2020
FB           5           15/08/2020
FB           5.2         16/08/2020
FB           5.3         17/08/2020

...and so on for multiple companies and their stock prices for different dates.

I wish to calculate daily returns for each stock and I am trying to figure out the code to iterate this dataframe for each company. I.e. when we are done with APPL we start again over for FB by setting the first row to N/A since we don't have returns to compare with, and so on as shown below.

Company Name stock price Date          Daily Returns
HP           1.2         10/05/2020    N/A
HP           1.4         11/05/2020    0.2
APPL         1.1         05/03/2020    N/A
APPL         1.2         06/03/2020    0.1
FB           5           15/08/2020    N/A
FB           5.2         16/08/2020    0.2
FB           5.3         17/08/2020    0.1

Is there a more efficient solution to tackle this than extracting a list of unique company names and then cycling through each of them to perform this calculation?

viperjonis
  • 40
  • 1
  • 7

1 Answers1

1

You should use dplyr for this kind of tasks:

library(dplyr)

df %>% 
  arrange(Company_Name, Date) %>% 
  group_by(Company_Name) %>% 
  mutate(Daily_Returns = stock_price - lag(stock_price)) %>% 
  ungroup()

This returns

  Company_Name stock_price Date       Daily_Returns
  <chr>              <dbl> <chr>              <dbl>
1 HP                   1.2 10/05/2020        NA    
2 HP                   1.4 11/05/2020         0.2  
3 APPL                 1.1 05/03/2020        NA    
4 APPL                 1.2 06/03/2020         0.100
5 FB                   5   15/08/2020        NA    
6 FB                   5.2 16/08/2020         0.200
7 FB                   5.3 17/08/2020         0.100
  • First we order the data by Company_Name and Date
  • Then we group it by Company_Name, so every calculation starts over again for a new company
  • Then we calculate the daily returns by substracting the former day (here we use lag)

Data

structure(list(Company_Name = c("HP", "HP", "APPL", "APPL", "FB", 
"FB", "FB"), stock_price = c(1.2, 1.4, 1.1, 1.2, 5, 5.2, 5.3), 
    Date = c("10/05/2020", "11/05/2020", "05/03/2020", "06/03/2020", 
    "15/08/2020", "16/08/2020", "17/08/2020")), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -7L), spec = structure(list(
    cols = list(Company_Name = structure(list(), class = c("collector_character", 
    "collector")), stock_price = structure(list(), class = c("collector_double", 
    "collector")), Date = structure(list(), class = c("collector_character", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1L), class = "col_spec"))
Martin Gal
  • 16,640
  • 5
  • 21
  • 39