-2

I currently am importing two tables (in the most basic form) that appear as such

Table 1
State Month Account           Value
NY    Jan   Expected Sales    1.04
NY    Jan   Expected Expenses 1.02

Table 2
State Month Account    Value
NY    Jan   Sales      1,000
NY    Jan   Customers  500
NY    Jan   F Expenses 1,000
NY    Jan   V Expenses 100

And my end goal is to create a 3rd data frame that includes the values of the first two rows and calculates a 4th column based off of functions

NextYearExpenses = (t2 F Expenses + t2 V Expenses)* t1 Expected Expenses
NextYearSales = (t2 sales) * t1 Expected Sales

So my desired output is as followed

State Month New Account Value
NY    Jan   Sales       1,040
NY    Jan   Expenses    1,122

I am relatively new to R and I think ifelse statements might be my best bet. I have tried merging the tables and calculating with simple column functions but with no real progress.

Any suggestions?

Mike
  • 3,797
  • 1
  • 11
  • 30
Vic
  • 1
  • 1
  • 1
    Please share your data and a reproducible example so people can help, what have you tried so far? https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Mike Oct 31 '18 at 19:38
  • Thanks for this. I will try to reformat my question and provide the necessary info. – Vic Oct 31 '18 at 19:46

3 Answers3

2

You may need to do some data wrangling but nothing out of the ordinary

require(dplyr)
Table1<-tibble(State=c("NY","NY"), Month=c("Jan","Jan"), Account=c("Expected Sales", "Expected Expenses"), Value=c(1.04,1.02))

Table2<-tibble(State=c("NY","NY","NY","NY"), Month=c("Jan","Jan","Jan","Jan"), Account=c("Sales", "Customers", "F Expenses","V Expenses"), Value=c(1000,500,1000,100))

First thing I do is rename the accounts to have a common name, i.e. expenses, this is going to help me to merge later on to Table1

Table2$Account[Table2$Account=="F Expenses"]<-"Expenses"
Table2$Account[Table2$Account=="V Expenses"]<-"Expenses"

then I use the group_by function and group by State, Month and Account and do the sum

Table2 <- Table2 %>% group_by(State, Month,Account) %>% 
summarise(Tot_Value=sum(Value)) %>% ungroup()
head(Table2)

## State Month Account   Tot_Value
##  <chr> <chr> <chr>         <dbl>
## 1 NY    Jan   Customers       500
## 2 NY    Jan   Expenses       1100
## 3 NY    Jan   Sales          1000

then something similar with the renaming for the accounts in table 1

Table1$Account[Table1$Account=="Expected Sales"]<-"Sales"
Table1$Account[Table1$Account=="Expected Expenses"]<-"Expenses"

Merge into a third table, Table 3

Table3<- left_join(Table1,Table2)

use mutate to do the needed operation

Table3 <- Table3 %>% mutate(Value2=Value*Tot_Value)
head(Table3)

## # A tibble: 2 x 6
##   State Month Account  Value Tot_Value Value2
##   <chr> <chr> <chr>    <dbl>     <dbl>  <dbl>
## 1 NY    Jan   Sales     1.04      1000   1040
## 2 NY    Jan   Expenses  1.02      1100   1122
Naus
  • 99
  • 10
1

Here's what I did with dplyr and tidyr. First I combined your initial tables with rbind into a single long format table. Since you have unique identifiers for each of the Account values, these don't need to be separate tables. Next I group_by State and Month to group these assuming eventually you'll have a variety of states/months. Next I summarise based on the values of Account that you specified and created two new columns. Finally to get it into the long format that you want I used gather from tidyr to go from wide format to long format. You can separate these commands into smaller chunks by deleting after the %>% to get a better idea of what each step does.

library(dplyr)
library(tidyr)
rbind(df,df2) %>%
  group_by(State,Month) %>%
  summarise(Expenses = (Value[which(Account == "F Expenses")] + Value[which(Account == "V Expenses")]) * Value[which(Account == "Expected Expenses")],
            Sales = Value[which(Account == "Sales")] * Value[which(Account == "Expected Sales")]) %>%
  gather(New_Account,Value, c(Expenses,Sales))


# A tibble: 2 x 4
# Groups:   State [1]
#  State Month New_Account Value
#  <chr> <chr> <chr>       <dbl>
#1 NY    Jan   Expenses     1122
#2 NY    Jan   Sales        1040
jasbner
  • 2,253
  • 12
  • 24
1

I'd recommend checking out the concept of "tidy data", as there are some real challenges with working on data with the structure you currently have. E.g. creating t3 should only take 2-3 lines of code, all of this is just to work around your data architecture:

library(tidyverse)

t1 <- data.frame(State = rep("NY", 2),
                 Month = rep(as.Date("2018-01-01"), 2),
                 Account = c("Expected Sales", "Expected Expenses"),
                 Value = c(1.04, 1.02),
                 stringsAsFactors = FALSE)

t2 <- data.frame(State = rep("NY", 4),
                 Month = rep(as.Date("2018-01-01"), 4),
                 Account = c("Sales", "Customers", "F Expenses", "V Expenses"),
                 Value = c(1000, 500, 1000, 100),
                 stringsAsFactors = FALSE)

t3 <- t2 %>% 
  spread(Account, Value) %>% 
  inner_join({
    t1 %>% 
      spread(Account, Value)
  }, by = c("State" = "State", "Month" = "Month")) %>% 
  mutate(NewExpenses = (`F Expenses` + `V Expenses`) * `Expected Expenses`,
         NewSales = Sales * `Expected Sales`) %>% 
  select(State, Month, Sales = NewSales, Expenses = NewExpenses) %>% 
  gather(Sales, Expenses, key = `New Account`, value = Value)
Jordo82
  • 796
  • 4
  • 14