data frame
I would like the function to take the difference between the 5th and 4th column. Then, the 4th and 3rd column and then last the 3rd and 2nd column.
We can do
cbind(df[1], df[3:5] - df[2:4])
# id Value2 Value3 Value4
#1 A234 5 NA NA
#2 B345 5 0 5
#3 C500 5 -10 NA
df[3:5] - df[2:4] works because element-wise arithmetic is well-defined in R between two data frames of the same dimensions. In particular, the result of DF1 - DF2 inherits the column names of the first data frame, DF1.
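As a small illustration of the name-inheritance rule, here is a sketch with two toy data frames. DF1, DF2 and their column names are made up for this example and are not taken from the question.

```r
# Toy data frames, invented for illustration only
DF1 <- data.frame(a = 1:3, b = 4:6)
DF2 <- data.frame(x = c(1L, 1L, 1L), y = c(2L, 2L, 2L))

# Element-wise subtraction; both data frames must have the same dimensions
res <- DF1 - DF2

# The result carries the column names of the left-hand operand
names(res)
# [1] "a" "b"
```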
We can also use negative indexing:
df0 <- df[-1] ## drop "id" column
cbind(df[1], df0[-1] - df0[-length(df0)])
# id Value2 Value3 Value4
#1 A234 5 NA NA
#2 B345 5 0 5
#3 C500 5 -10 NA
caveat:
Since a data frame may store different types of data in different columns, I advise checking the class of each column before taking differences; otherwise the arithmetic may be invalid. With your example data frame, we can do
sapply(df, class)
# id Value Value2 Value3 Value4
#"character" "integer" "integer" "integer" "integer"
So taking differences between the last 4 columns is valid.
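If you want the check and the differencing in one place, something like the following might do. This is only a sketch: diff_adjacent is a hypothetical helper name, the toy data frame is invented (it is not the data frame from the question), and it uses is.numeric for the check, which returns FALSE for factors and character columns.

```r
# Hypothetical helper: difference between each column and the one before it.
# Stops early if any column is not numeric.
diff_adjacent <- function(d) {
  stopifnot(is.data.frame(d), length(d) >= 2L)
  ok <- vapply(d, is.numeric, logical(1))
  if (!all(ok)) {
    stop("non-numeric column(s): ", paste(names(d)[!ok], collapse = ", "))
  }
  d[-1] - d[-length(d)]
}

# Toy data, invented for illustration
toy <- data.frame(id = c("a", "b"),
                  v1 = c(1L, 2L), v2 = c(4L, 8L), v3 = c(9L, 10L))

# Drop the "id" column before differencing, then bind it back
out <- cbind(toy[1], diff_adjacent(toy[-1]))
```

Calling diff_adjacent(toy) without dropping the id column stops with an error rather than returning a meaningless result.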
Here is another example with the iris dataset:
sapply(iris, class)
#Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# "numeric" "numeric" "numeric" "numeric" "factor"
The last column is a "factor", which cannot be used in arithmetic.
Note that we use class rather than mode for type checking on each data frame column, as it does a more comprehensive check. See this Q & A for more explanation.
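To see why mode alone can mislead here, compare the two on the factor column of iris (a quick check, using only the built-in dataset):

```r
f <- iris$Species

class(f)
# [1] "factor"

# A factor is stored as integer codes, so mode reports "numeric"
mode(f)
# [1] "numeric"
```

Going by mode, this column would look safe for arithmetic even though it is not.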
matrix
A matrix can only hold a single type of data. Use mode to check the data type and ensure that arithmetic is valid. For example, you can't do arithmetic on "character" data.
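A short sketch of that failure case, with a made-up character matrix M. Converting with mode(M) <- "numeric" is one option, assuming the strings really do represent numbers.

```r
# Toy character matrix, invented for illustration
M <- matrix(c("1", "2", "3", "4"), 2, 2)

mode(M)
# [1] "character"

# Arithmetic on character data is an error; try() captures it
res <- try(M - M, silent = TRUE)
inherits(res, "try-error")
# [1] TRUE

# Convert in place; the dimensions are preserved
mode(M) <- "numeric"
```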
Suppose we have a "numeric" matrix:
set.seed(0)
A <- round(matrix(runif(25), 5, 5), 2)
A
# [,1] [,2] [,3] [,4] [,5]
#[1,] 0.90 0.20 0.06 0.77 0.78
#[2,] 0.27 0.90 0.21 0.50 0.93
#[3,] 0.37 0.94 0.18 0.72 0.21
#[4,] 0.57 0.66 0.69 0.99 0.65
#[5,] 0.91 0.63 0.38 0.38 0.13
mode(A)
#[1] "numeric"
We can use the following to take the difference between column 2 and column 1, column 3 and column 2, etc.:
A[, -1, drop = FALSE] - A[, -ncol(A), drop = FALSE]
# [,1] [,2] [,3] [,4]
#[1,] -0.70 -0.14 0.71 0.01
#[2,] 0.63 -0.69 0.29 0.43
#[3,] 0.57 -0.76 0.54 -0.51
#[4,] 0.09 0.03 0.30 -0.34
#[5,] -0.28 -0.25 0.00 -0.25
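As an alternative that is not part of the approach above: diff works along the rows of a matrix, so transposing before and after should give the same column differences. A minimal sketch that rebuilds the same A and compares the two results:

```r
set.seed(0)
A <- round(matrix(runif(25), 5, 5), 2)

# Indexing approach from above
d1 <- A[, -1, drop = FALSE] - A[, -ncol(A), drop = FALSE]

# diff() takes differences between consecutive rows, hence the transposes
d2 <- t(diff(t(A)))

all.equal(d1, d2)
# [1] TRUE
```

The indexing version avoids two transposes, so it is probably the better choice for large matrices.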