I'm working with the European Social Survey and have a separate dataframe for each country. All of these dataframes have the same structure; only the values of each variable differ. I would like to create a new variable in each dataset that is equal to the sum of several other variables. Is there a way to write a function that does this for every dataframe?

What I have done before is simply create a new column with: Data$new <- Data$old1 + Data$old2...etc. However, when working with several variables across several datasets this seems rather inefficient, and I'm quite sure there must be an easier way. I just don't know what to google.

Example:

I have two dataframes, A and B:

A1 <- c(1,2,3,4,5)
A2 <- c(6,7,8,9,10)
A <- data.frame(A1, A2)
B1 <- c(10,12,13,15,24)
B2 <- c(23,24,25,45,65)
B <- data.frame(B1, B2)

What I want to do is create, for each dataframe, a new column that is equal to the sum of the other two. Usually I would do it like this:

A$A3 <- A$A1 + A$A2
B$B3 <- B$B1 + B$B2

However, doing this across several dataframes with a large number of variables seems like an inefficient way of doing it. Since the names of the variables are the same across the dataframes, is there a better way to write a function that looks for the given variables and creates the new one?

Eric Nilsen
  • Look at `aggregate`/`merge`/`*_join`/`summarise`. Not too sure without sample data. You can improve your question as [stated here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – NelsonGon Aug 21 '19 at 12:54
  • Thanks for the help! I tried to improve the question a little with an example, but I must admit to being a bit new to all of this. Hope the edit clarifies things somewhat. – Eric Nilsen Aug 21 '19 at 13:16
  • Is this what you need? `rowSums(A)`? – NelsonGon Aug 21 '19 at 13:26
  • Well, nearly. I want the sum of some variables, but not all. Maybe my example was a bit poor, since that would actually have solved the problem there. Let's say there was a third column (A$A3 / B$B3): I would want the sum of columns 1 and 2, and NOT 3, in both dataframes. Maybe I should try to rewrite the whole question. – Eric Nilsen Aug 21 '19 at 13:31
  • Check my answer below which does so for specified columns. Also your data sets do not have the same names. One has Bx, another Ax. – NelsonGon Aug 21 '19 at 13:31

2 Answers

We can create a helper auto_add:

auto_add <- function(df, col_a, col_b){
  df$total <- rowSums(df[c(col_a,col_b)])
  df
}
auto_add(A,"A1","A2")

For many data sets, if the target columns are known, we can pass them as a single argument and apply the helper to a list of data frames:

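auto_add <- function(df, target_cols){
  # target_cols can be column positions or column names
  df$total <- rowSums(df[target_cols])
  df
}
# apply the helper to every data frame in the list
lapply(list(A, B), auto_add, target_cols = 1:2)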

Result:

[[1]]
  A1 A2 total
1  1  6     7
2  2  7     9
3  3  8    11
4  4  9    13
5  5 10    15

[[2]]
  B1 B2 total
1 10 23    33
2 12 24    36
3 13 25    38
4 15 45    60
5 24 65    89
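
Following up on the comments: if the data frames share column names and only some columns should be summed, the same idea can be sketched with names instead of positions. This is a minimal sketch under stated assumptions: the column names x1, x2, x3, the helper name add_total, and the list name country_data are all hypothetical and not taken from the question.

```r
# Hypothetical data: the same three column names in every data frame
A <- data.frame(x1 = 1:5, x2 = 6:10, x3 = 11:15)
B <- data.frame(x1 = c(10, 12, 13, 15, 24),
                x2 = c(23, 24, 25, 45, 65),
                x3 = 0)

# add_total is an illustrative helper, not part of any package
add_total <- function(df, target_cols, new_col = "total") {
  # fail early if a requested column is absent from this data frame
  missing_cols <- setdiff(target_cols, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  # sum only the requested columns, row by row
  df[[new_col]] <- rowSums(df[target_cols])
  df
}

# A named list keeps track of which result belongs to which data set
country_data <- list(A = A, B = B)
country_data <- lapply(country_data, add_total, target_cols = c("x1", "x2"))

country_data$A$total
```

Keeping the data frames in a named list means the results can be accessed as country_data$A, country_data$B, and so on, and x3 is left out of the sum in both.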
NelsonGon

An option with map/dplyr, which sums all columns of each data frame:

library(tidyverse)
map(mget(c("A", "B")),  ~ .x %>% 
                            mutate(Total = reduce(., `+`)))
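
If only some columns should go into the sum, one possible variant is to select them before reducing. This is a sketch rather than a definitive version: the column names x1, x2, x3 and the vector cols are hypothetical, and it assumes the dplyr and purrr packages are installed.

```r
library(dplyr)
library(purrr)

# Hypothetical data: the same three column names in every data frame
A <- data.frame(x1 = 1:5, x2 = 6:10, x3 = 11:15)
B <- data.frame(x1 = c(10, 12, 13, 15, 24),
                x2 = c(23, 24, 25, 45, 65),
                x3 = 0)

# Columns to include in the sum (x3 is deliberately left out)
cols <- c("x1", "x2")

result <- map(list(A = A, B = B), function(df) {
  # keep only the chosen columns, then add them together element-wise
  mutate(df, Total = reduce(select(df, all_of(cols)), `+`))
})

result$A$Total
```

An explicit function is used here instead of the formula shorthand so that it is clear which data frame the column selection refers to.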
akrun