2

I’ve a very unusual solution to the problem below. I wonder is there a better way to approach this?

The problem: There are 2 data sets stored in two different data frames. 1 of them is a daily balance of product a , while the other is a weekly balance of product b. Due to a request, i need to perform a “union” on the two data frames in order to do a time series visualization. Since product b does not have daily data, the balance will be hold static until the next update.

How do i “expand” table of product b to have the same length as product a table while holding the balance static for days which product b does not have the data?

Table of product A

Data frame name: TableA

Date (yyyy-mm-dd) / Ccy/ Product /Balance
2019-12-01 / USD / A / 100
2019-12-01 / GBP / A / 100
2019-12-02 / USD / A / 200
2019-12-02 / GBP / A / 150
2019-12-03 / USD / A / 180
2019-12-03 / GBO / A / 210

Table of product B

Data frame name: TableB

Date (yyyy-mm-dd) / Ccy/ Product /Balance
2019-12-01 / USD / B / 100
2019-12-01 / SGD / B / 100
2019-12-03 / USD / B / 180
2019-12-03 / SGD / B / 210

My quirky solution below. Do let me know your thoughts and critique. THANK YOU in advance!!

library(tidyverse)

unique_dates <- unique(tableA$date)
expanded_tableB <- unique_dates%>%left_join(tableB, by=c(Date=Date))%>% fill(-c(“date”))%>% fill(-c(“date”),.direction=c(“up”))

union_table <- bind_rows(tableA,expanded_tableB)
PKumar
  • 10,971
  • 6
  • 37
  • 52
Sean Yau
  • 25
  • 2
  • To clarify, for product "B", you have 2 rows (`Ccy` of `USD` and `SGD`) for each date. Did you want 2 additional balances added for date 2019-12-02 for each of `USD` and `SGD`? – Ben Feb 23 '20 at 16:36
  • Hi Ben. Thank you. Yes, 2 additional lines for COB 2019-12-02 from product B – Sean Yau Feb 25 '20 at 05:53

1 Answers1

0

Here is one approach. If you need additional rows added for missing dates, for both Ccy of SGD and USD for product "B", then would first group_by Ccy. Use complete to fill in dates based on the first and last table found in Table A (based on what you described I believe). Then finally bind_rows to add Table A.

library(tidyverse)

tableB %>%
  group_by(Ccy) %>%
  complete(Date = seq.Date(min(tableA$Date), max(tableA$Date), by = "day")) %>%
  fill(Product, Balance) %>%
  bind_rows(tableA)

Output

# A tibble: 12 x 4
# Groups:   Ccy [4]
   Ccy   Date       Product Balance
   <chr> <date>     <chr>     <dbl>
 1 SGD   2019-12-01 B           100
 2 SGD   2019-12-02 B           100
 3 SGD   2019-12-03 B           210
 4 USD   2019-12-01 B           100
 5 USD   2019-12-02 B           100
 6 USD   2019-12-03 B           180
 7 USD   2019-12-01 A           100
 8 GBP   2019-12-01 A           100
 9 USD   2019-12-02 A           200
10 GBP   2019-12-02 A           150
11 USD   2019-12-03 A           180
12 GBO   2019-12-03 A           210

Data

tableA <- data.frame(
  Date = as.Date(c("2019-12-01", "2019-12-01", "2019-12-02", "2019-12-02", "2019-12-03", "2019-12-03")),
  Ccy = c("USD", "GBP", "USD", "GBP", "USD", "GBO"),
  Product = "A",
  Balance = c(100, 100, 200, 150, 180, 210),
  stringsAsFactors = F
)

tableB <- data.frame(
  Date = as.Date(c("2019-12-01", "2019-12-01", "2019-12-03", "2019-12-03")),
  Ccy = c("USD", "SGD", "USD", "SGD"),
  Product = "B",
  Balance = c(100, 100, 180, 210),
  stringsAsFactors = F
)
Ben
  • 28,684
  • 5
  • 23
  • 45
  • Thank you so much Ben! The data that i have from table A will not have weekend and holiday dates. Do you know if there’s any tweaks that i could do? – Sean Yau Feb 25 '20 at 05:56
  • You could filter weekdays through various approaches, see [similar question here](https://stackoverflow.com/questions/6009351/remove-weekend-data-in-a-dataframe). You can also filter based on holidays, if you have a separate list of holiday dates, or by other means. For example, one library `RQuantLib` can check if dates are on holidays or give a list of holidays between dates for various calendars/countries. I don't have much experience with this, but does look feasible. – Ben Feb 25 '20 at 14:46
  • Thank you so much Ben! – Sean Yau Feb 26 '20 at 15:10