0

this is a simple question, but I am not proficient in automation in R. So I'm asking this.

I have a dataframe like this:

x<-c(2,5,1,1,23,1,23,51,36,85,24,1,645,945,
2,8,124,4,35,6,71,45,1,5,12,52,764,8,6,234,
98,234,456,6,1,452,32,176,75,31)
x<-matrix(x,ncol=5)
x<-data.frame(x)
names(x)<-c("car","bike","bus","train","walk")

so in the end you'd get a dataframe like this:

car    bike    bus    train    walk
 2      36     124     12      456
 5      85       4     52        6
 1      24      35    764        1
 1       1       6      8      452
23     645      71      6       32
 1     945      45    234      176
23       2       1     98       75
51       8       5    234       31

So I want to calculate the growth of each transportation:

x$car_growth<-((diff(x$car)/x$car)*100)

however, the first value will be empty, so I need to arrange it such that it will go into the right row. To do that, I make a new df

k<-data.frame(seq(from=1,to=8)
names(k)<-c("seq")
k2<-data.frame(x$car_growth)
k2$seq<-seq(from=2, to=9)
k3<-merge(k2, k, by="seq", all=TRUE)
k3<-k3[1:8,]
x2<-cbind(x,k3)
x2$seq<-NULL

So in the end you will get something like this

car    bike    bus    train    walk   car_growth
 2      36     124     12      456       NA
 5      85       4     52        6      150
 1      24      35    764        1     -400
 1       1       6      8      452        0
23     645      71      6       32     2200 
 1     945      45    234      176    -2200
23       2       1     98       75     2200
51       8       5    234       31     48.7

These are only 1 process for item car.

I wanna repeat this for bike, bus, train, and walk.

Of course the original data that I have is MUCHHH bigger and longer than this. I just want to learn how to automate all these processes into a few lines short script like excel VBA

Thanks for all your suggestions.

MT32
  • 677
  • 1
  • 11
  • 24

2 Answers2

1

With base R you could do, in case you agree that when car drops from 5 to 1 it decreases with 80%, not with 400%:

x <- cbind(x,
rbind(NA,
apply(x, 2, function(i) diff(i) / i[-8] * 100)
))

colnames(x)[6:10] <- paste0(colnames(x[6:10]), "_growth")

  car bike bus train walk car_growth bike_growth bus_growth train_growth
1   2   36 124    12  456         NA          NA         NA           NA
2   5   85   4    52    6  150.00000   136.11111  -96.77419    333.33333
3   1   24  35   764    1  -80.00000   -71.76471  775.00000   1369.23077
4   1    1   6     8  452    0.00000   -95.83333  -82.85714    -98.95288
5  23  645  71     6   32 2200.00000 64400.00000 1083.33333    -25.00000
6   1  945  45   234  176  -95.65217    46.51163  -36.61972   3800.00000
7  23    2   1    98   75 2200.00000   -99.78836  -97.77778    -58.11966
8  51    8   5   234   31  121.73913   300.00000  400.00000    138.77551
  walk_growth
1          NA
2   -98.68421
3   -83.33333
4 45100.00000
5   -92.92035
6   450.00000
7   -57.38636
8   -58.66667

Or with dplyr:

library(dplyr)
x %>% 
  mutate_all(funs(c(NA, diff(.)) / lag(.) * 100)) %>% 
  bind_cols(x, .) %>% 
  rename_at(vars(ends_with("1")), function(i) gsub("1", "_growth", i))
Lennyy
  • 5,932
  • 2
  • 10
  • 23
0

this should do it

library(dplyr)
x<-c(2,5,1,1,23,1,23,51,36,85,24,1,645,945,
     2,8,124,4,35,6,71,45,1,5,12,52,764,8,6,234,
     98,234,456,6,1,452,32,176,75,31)
x<-matrix(x,ncol=5)
x<-data.frame(x)
names(x)<-c("car","bike","bus","train","walk")

give your growth function a name

get_growth <- function(column)(diff(column)/column)*100

use mutate_all to map the function over all columns and append "growth" to the name. I use lag the shift the result down one row.

mutate_all(x,funs(growth = lag(get_growth(.))))

hope this helps!!

Bertil Baron
  • 4,923
  • 1
  • 15
  • 24