2

Let's say I have a sample data frame (df):

id col1 col2 col3 col4 col5 col6
 1   2    3    2    6    2    8
 2   3    2    4    1    3    2 
 3   4    2    9    7    8    7 
 4   7    6    3    2    9    2

Now I am trying to add the columns two at a time and store each sum in a new column, i.e. col1+col2, col3+col4, col5+col6.

Desired output:

id col1 col2 col3 col4 col5 col6 t_1 t_3 t_5
 1   2    3    2    6    2    8    5   8   10
 2   3    2    4    1    3    2    5   5   5
 3   4    2    9    7    8    7    6   16  15
 4   7    6    3    2    9    2    13  5   11

I wrote the following code:

for(i in c(1, 3, 5)){
paste('df$t', i, sep= '_') <- as.numeric(df[, i]) + as.numeric(df[, i+1])
}

but I get the following error:

Error in paste("df$t", i, sep = "_") <- as.numeric(df[, : target of assignment expands to non-language object

Am I doing something wrong here?

scoa
Dheeraj Singh
  • What you are doing wrong is that `paste('df$t', i, sep= '_')` returns the *character vector* of length one `"df$t_i"`, not the *object* `df$t_i`. You could use `assign`, for instance, to assign a variable programmatically. – scoa Aug 13 '15 at 11:35
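
To illustrate the comment above, here is a minimal sketch. The toy `df` and the variable `nm` are made up for the example and are not from the question. It shows that `paste()` only builds a string, while `[[<-` accepts a string as a column name, which is one way to assign to a column whose name is computed at run time.

```r
# Toy data frame, assumed for illustration only
df <- data.frame(id = 1:2, col1 = c(2L, 3L), col2 = c(3L, 2L))

nm <- paste("t", 1, sep = "_")  # builds the string "t_1", not a column reference
class(nm)                        # "character"

# `[[<-` takes the column name as a string, so this assignment is valid
df[[nm]] <- df$col1 + df$col2
df
```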

3 Answers

2

Based on the expected output, we can drop the first 'id' column of 'df1', subset the alternating columns, add the two resulting datasets (which have the same dimensions), and assign the result to new columns in the original dataset.

df1[paste('t', c(1,3,5), sep="_")] <-  df1[-1][c(TRUE, FALSE)]+
                              df1[-1][c(FALSE, TRUE)]
df1
#   id col1 col2 col3 col4 col5 col6 t_1 t_3 t_5
#1  1    2    3    2    6    2    8   5   8  10
#2  2    3    2    4    1    3    2   5   5   5
#3  3    4    2    9    7    8    7   6  16  15
#4  4    7    6    3    2    9    2  13   5  11

Just for clarity, the first step is removing the first column (df1[-1]); then we subset every other column using the logical vector c(TRUE, FALSE), which is recycled across all the remaining columns.

df1[-1][c(TRUE, FALSE)]
#  col1 col3 col5
#1    2    2    2
#2    3    4    3
#3    4    9    8
#4    7    3    9

Similarly, we subset the other set of alternating columns.

df1[-1][c(FALSE, TRUE)]
#  col2 col4 col6
#1    3    6    8
#2    2    1    2
#3    2    7    7
#4    6    2    2

Both subsets have the same dimensions, so we can simply use + to add them element by element and get the output columns.

 df1[-1][c(TRUE, FALSE)]+df1[-1][c(FALSE, TRUE)]
 #  col1 col3 col5
 #1    5    8   10
 #2    5    5    5
 #3    6   16   15
 #4   13    5   11
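
As a cross-check of the recycling step, the same column pairs can be picked with explicit integer positions instead of a recycled logical vector. This is only a sketch; the names `odd`, `even`, and `sums` are introduced here for illustration and do not appear in the answer.

```r
df1 <- data.frame(id = 1:4,
                  col1 = c(2L, 3L, 4L, 7L), col2 = c(3L, 2L, 2L, 6L),
                  col3 = c(2L, 4L, 9L, 3L), col4 = c(6L, 1L, 7L, 2L),
                  col5 = c(2L, 3L, 8L, 9L), col6 = c(8L, 2L, 7L, 2L))

odd  <- seq(2, ncol(df1), by = 2)  # positions of col1, col3, col5
even <- odd + 1                    # positions of col2, col4, col6

# Adding two data frames of equal dimensions works element by element;
# the result keeps the column names of the left operand
sums <- df1[odd] + df1[even]
sums
```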

data

df1 <- structure(list(id = 1:4, col1 = c(2L, 3L, 4L, 7L), col2 = c(3L, 
2L, 2L, 6L), col3 = c(2L, 4L, 9L, 3L), col4 = c(6L, 1L, 7L, 2L
), col5 = c(2L, 3L, 8L, 9L), col6 = c(8L, 2L, 7L, 2L)), .Names = c("id", 
"col1", "col2", "col3", "col4", "col5", "col6"), class = "data.frame",
row.names = c(NA, -4L))
akrun
  • It would be great if you could explain what you did. I understand what df1[-1] does, but I can't follow the whole thing. Also, is it possible to do this using a for loop? – Dheeraj Singh Aug 13 '15 at 11:45
  • @DheerajSingh Updated with some explanations. Hope it helps – akrun Aug 13 '15 at 11:52
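
For the for-loop variant asked about in the comments, one possible sketch is shown below. It is not part of the answer above; the helper variables `left` and `right` are hypothetical. Looking the columns up by name also sidesteps the fact that position 1 in the data frame is `id`, not `col1`.

```r
df1 <- data.frame(id = 1:4,
                  col1 = c(2L, 3L, 4L, 7L), col2 = c(3L, 2L, 2L, 6L),
                  col3 = c(2L, 4L, 9L, 3L), col4 = c(6L, 1L, 7L, 2L),
                  col5 = c(2L, 3L, 8L, 9L), col6 = c(8L, 2L, 7L, 2L))

for (i in c(1, 3, 5)) {
  left  <- paste0("col", i)      # e.g. "col1"
  right <- paste0("col", i + 1)  # e.g. "col2"
  # `[[<-` accepts the new column name as a string
  df1[[paste("t", i, sep = "_")]] <- df1[[left]] + df1[[right]]
}
df1
```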
1

This will do...

df$t_1 <- df$col1 + df$col2
df$t_3 <- df$col3 + df$col4
df$t_5 <- df$col5 + df$col6

You don't need to run a loop.

Gaurav
0

I think it is worth mentioning another approach, by Tyler Rinker in this post, adapted to this problem. We create a list of column pairs and pass it to lapply. Finally, we combine the original data frame (df1) and the resulting matrix (df2).

n <- ncol(df1)
ind <- split(2:n, rep(2:n, each = 2, length = n - 1))
df2 <- do.call(cbind, lapply(ind, function(i) rowSums(df1[, i])))
cbind(df1, df2)

Output:

  id col1 col2 col3 col4 col5 col6  2  3  4
1  1    2    3    2    6    2    8  5  8 10
2  2    3    2    4    1    3    2  5  5  5
3  3    4    2    9    7    8    7  6 16 15
4  4    7    6    3    2    9    2 13  5 11
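
The new columns in this output are named 2, 3 and 4 after the split groups. If the t_1, t_3, t_5 names from the question are wanted, one option (a sketch, not part of the original approach; `res` is a name introduced here) is to rename the list before calling lapply:

```r
df1 <- data.frame(id = 1:4,
                  col1 = c(2L, 3L, 4L, 7L), col2 = c(3L, 2L, 2L, 6L),
                  col3 = c(2L, 4L, 9L, 3L), col4 = c(6L, 1L, 7L, 2L),
                  col5 = c(2L, 3L, 8L, 9L), col6 = c(8L, 2L, 7L, 2L))

n <- ncol(df1)
ind <- split(2:n, rep(2:n, each = 2, length = n - 1))

# The list names become the column names of the cbind-ed matrix
names(ind) <- paste("t", c(1, 3, 5), sep = "_")

df2 <- do.call(cbind, lapply(ind, function(i) rowSums(df1[, i])))
res <- cbind(df1, df2)
res
```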
mpalanco