0

I have a dataframe that looks like this: It only shows MTD: 12 and 11 and for the years it only shows 2014 and 2015.

testing

   MONTH         YEAR  Client   Revenue     Col1              Col2      Col3       Col4
     MTD: 12     2014     A.    3203821   651404245        477505485 73.304018  6.709496
     MTD: 11     2014     A.    2052195   484668751        334657718 69.048751  6.132221
     MTD: 11     2014     B.    1668958   533246253        305447319 57.280725  5.463978
     MTD: 12     2015     C.    1524561  3931961251        237707315  6.045515  6.413607
     MTD: 12     2015     D.    1506985   171367458        114716359 66.941740 13.136618
     MTD: 12     2014     C.    1380678   436264756        250622785 57.447406  5.508990


testing2<-split(testing, testing$Client)

sapply(testing2, ifelse(testing2$MONTH=="MTD:12", (insert row below that shows % change of every column with previous year), testing2)

Then I want to unsplit and put it back together.

Jaap
  • 81,064
  • 34
  • 182
  • 193
R Guru
  • 171
  • 1
  • 12
  • 2
    Please consider reading up on [ask] and how to produce a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – Heroka Dec 30 '15 at 19:41
  • Ok apologies, I will recreate this to make it reproducible – R Guru Dec 30 '15 at 20:17
  • 1
    Don't recreate it, edit your question. Include your own efforts in solving this. – Heroka Dec 30 '15 at 20:20
  • Don't. Do. This. Putting two different types of data in a column is considered bad practise and will give you problems later. It's better to create a 'change'-column instead. – Jaap Dec 30 '15 at 21:05
  • I need to split it though because the percent changes has to be by Client – R Guru Dec 30 '15 at 21:27

1 Answers1

1

Maybe this will help.

# sample data frame for testing code
testing <- data.frame(
  MONTH = c("MTD: 12", "MTD: 12", "MTD: 11", "MTD: 12", "MTD: 12", "MTD: 12"), 
  YEAR = c(2012, 2013, 2014, 2015, 2013, 2014), 
  Client = c("A.", "A.", "A.", "B.", "B.", "B."), 
  Revenue = c(320, 205, 166L, 152, 150, 138),
  Col1 = c(651, 485, 533, 3932, 171, 436), 
  Col2 = c(478, 335, 305, 238, 115, 251), 
  Col3 = c(73, 69, 57, 6, 67, 57), 
  Col4 = c(6.7, 6.1, 5.5, 6.4, 13.1, 5.5)
)

# subset just the month=12 rows
test12 <- testing[testing$MONTH=="MTD: 12", ]
test12 <- test12[order(test12$Client, test12$YEAR), ]

# define a function to calculate percent change
pctchange <- function(x) {
  L <- length(x)
  c(NA, 100 * (x[-1] - x[-L]) / x[-L])
}

# calculate percent change for all columns, by client
change <- apply(test12[, c("Revenue", "Col1", "Col2", "Col3", "Col4")], 2,
  function(y) unlist(tapply(y, test12$Client, pctchange)))
change <- data.frame(change)
names(change) <- paste0("d", names(change))
test12b <- cbind(test12[, c("MONTH", "YEAR", "Client")], change)

# merge back with monthly data
merge(testing, test12b, all=TRUE)
Jean V. Adams
  • 4,634
  • 2
  • 29
  • 46
  • One question to your answer which did work for me. How do I split it up now so that the percent change is under each Client. So essentially I am trying to have a separate table for each client with the first two rows being the monthly values and then I want to add the percent change under each column for each client. @Jean V. Adams – R Guru Jan 06 '16 at 19:11
  • I suggest you post that as a new question, along with a reproducible example. See http://stackoverflow.com/a/5963610/2140956 – Jean V. Adams Jan 06 '16 at 22:33