0

Continuation to this question: Add row in R dataframe by unique factor in column showing percent change by Month

testing <- data.frame(
  MONTH = c("MTD: 12", "MTD: 12", "MTD: 11", "MTD: 12", "MTD: 12", "MTD: 12"), 
  YEAR = c(2012, 2013, 2014, 2015, 2013, 2014), 
  Client = c("A.", "A.", "A.", "B.", "B.", "B."), 
  Revenue = c(320, 205, 166L, 152, 150, 138),
  Col1 = c(651, 485, 533, 3932, 171, 436), 
  Col2 = c(478, 335, 305, 238, 115, 251), 
  Col3 = c(73, 69, 57, 6, 67, 57), 
  Col4 = c(6.7, 6.1, 5.5, 6.4, 13.1, 5.5)
)

# subset just the month=12 rows
test12 <- testing[testing$MONTH=="MTD: 12", ]
test12 <- test12[order(test12$Client, test12$YEAR), ]

# define a function to calculate percent change
pctchange <- function(x) {
  L <- length(x)
  c(NA, 100 * (x[-1] - x[-L]) / x[-L])
}

# calculate percent change for all columns, by client
change <- apply(test12[, c("Revenue", "Col1", "Col2", "Col3", "Col4")], 2,
  function(y) unlist(tapply(y, test12$Client, pctchange)))
change <- data.frame(change)
names(change) <- paste0("d", names(change))
test12b <- cbind(test12[, c("MONTH", "YEAR", "Client")], change)

# merge back with monthly data
merge(testing, test12b, all=TRUE)

So after running this code you get a list that has been splitted by client.

I want to run the following code that will essentially remove the 2nd row if the number of rows for that factor (the client) is greater than 2.

I tried this already and it didn't work:

testing<-ifelse(length(splitresult)>2,splitresult[-2,],splitresult)

Ultimate goal out of all of this:

1) To just get the percent change of the last year with the previous year and don't show the inbetween stuff like NA for the previous year. But if it is a new client I do want the NA there to specify it is a new client. So that is why I tried out the code above which didn't work.

2) I want to reorder the clients in the split by Revenue in MTD: 12 2014.

splitlist[order(sapply(splitlist, function(x) (x[["Revenue"]])))]

(Didn't work: assume splitlist is the name of the list)

If anyone can help me with either question it would be extremely helpful. Thanks!

Community
  • 1
  • 1
R Guru
  • 171
  • 1
  • 12
  • Have you tried to use the `dplyr` package? – Jubbles Jan 07 '16 at 17:24
  • I haven't. I think I need to use sapply for the 2nd part of the question but not sure how to go about it with the month/year involved. I have edited the question to show you what I have tried @Jubbles – R Guru Jan 07 '16 at 17:26

1 Answers1

4

I think the plyr package would help here. For example, instead of the last line of code where you use ifelse, you could try

library(plyr)
out = ddply(splitresult, "Client", function(x){
  if(dim(x)[1] > 2) x = x[-2,]
  return(x)
})

Here, x is a client-specific data frame, and out is the result of combining the rows of a bunch of client-specific data frames.

You might also check out lubridate, which will make dates and times easier to deal with. As mentioned in the comments, dplyr would also be helpful, as would the rest of the "Hadleyverse" of packages for cleaning and plotting data. The solutions to your questions 1 and 2, along with the whole process of cleaning and summarizing, will be much cleaner and easier with the right tools.

landau
  • 5,636
  • 1
  • 22
  • 50
  • Thank you. Very helpful. The only thing I need to figure out now from this is to separate it into different dataframes by client by Revenue of MTD: 2015. Then good to go. I will ask that in a different post though if that is preferred. @landau – R Guru Jan 07 '16 at 17:43
  • Sure thing. If you change `ddply` to `dlply` above, you'll have a list of client-specific data frames in `out` instead of a single data frame. Semantically, I'm not sure what you mean when you talk about separating by Revenue of MTD: 2015. Usually, I think of separating data frames according to features such as date, not levels such as December 2015. – landau Jan 07 '16 at 17:50
  • So when I make that list, it is ordered alphabetically by Client. I want to order the list of clients by Revenue of MTD 12: 2015 – R Guru Jan 07 '16 at 18:24
  • This will list the Clients by revenue – R Guru Jan 07 '16 at 18:25