The problem is that function definitions, such as that of "sweep", are often explained in a way that offers little further information. "Sweep" "sweeps out" things. I don't know what that means, and have little chance of finding out by reading the definition.
- Possible duplicate of [How to use the 'sweep' function](https://stackoverflow.com/questions/3444889/how-to-use-the-sweep-function) – kstew Aug 02 '19 at 22:32
- @kstew, that question is about how it is used, but this question is about what it does. – G. Grothendieck Aug 03 '19 at 02:56
2 Answers
The `sweep()` function iterates through a matrix by row (`MARGIN = 1`) or column (`MARGIN = 2`) and performs some operation that you want (defined by `FUN`), taking the values in `STATS` as input. As such, it is useful for performing some operation (`FUN`) across rows/columns with different inputs.
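One way to picture this is as an explicit loop. The sketch below is only an illustration and is not how `sweep()` is implemented internally; `sweep_by_hand` is a hypothetical helper made up for this explanation, and it assumes a plain 2-d matrix with `MARGIN` equal to 1 or 2.

```r
# Hypothetical helper (not part of base R): mimics sweep() for a 2-d matrix
# and MARGIN = 1 or 2 by looping over rows or columns.
sweep_by_hand <- function(x, MARGIN, STATS, FUN = "-") {
  FUN <- match.fun(FUN)
  out <- x
  if (MARGIN == 1) {
    # one STATS value per row
    for (i in seq_len(nrow(x))) out[i, ] <- FUN(x[i, ], STATS[i])
  } else {
    # one STATS value per column
    for (j in seq_len(ncol(x))) out[, j] <- FUN(x[, j], STATS[j])
  }
  out
}

m <- matrix(1:6, nrow = 2)
by_col <- sweep_by_hand(m, 2, c(10, 20, 30), "+")  # 10, 20, 30 added to columns 1, 2, 3
by_row <- sweep_by_hand(m, 1, c(100, 200), "+")    # 100, 200 added to rows 1, 2
```

Comparing `by_col` with `sweep(m, 2, c(10, 20, 30), "+")` should show the same numbers.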
Using d.b.'s example in the comments, suppose you have a 3x3 matrix full of 0s:
m <- matrix(0, 3, 3)
m
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0
[3,]    0    0    0
If you want to add 7 to the first column, 3 to the second column, and 11 to the third column (passed to `STATS`):
sweep(x = m, MARGIN = 2, STATS = c(7, 3, 11), FUN = "+")
     [,1] [,2] [,3]
[1,]    7    3   11
[2,]    7    3   11
[3,]    7    3   11
or you can do it by row (`MARGIN = 1`):
sweep(x = m, MARGIN = 1, STATS = c(7, 3, 11), FUN = "+")
     [,1] [,2] [,3]
[1,]    7    7    7
[2,]    3    3    3
[3,]   11   11   11
Thus, the `sweep()` function is most useful when you want to apply a function with a different value for each row or column of your matrix. (Note: You can also apply the function across cells with `MARGIN = 1:2`.)
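As a small sketch of the `MARGIN = 1:2` case: here `STATS` needs one value per cell, so it is given as a matrix with the same dimensions as the input. The matrix `s` is made up purely for illustration.

```r
m <- matrix(0, 3, 3)
s <- matrix(1:9, 3, 3)  # illustrative values: one STATS entry per cell, filled by column

# With MARGIN = 1:2 each cell of m is paired with the matching cell of s
res <- sweep(m, MARGIN = 1:2, STATS = s, FUN = "+")
```

Since `m` is all zeros, `res` holds the same numbers as `s`, which for this operation is the same as writing `m + s` directly.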

If the operation given in the 4th argument is minus, then sweep subtracts the third argument from each row or column of the first argument.
A special case may make it clearer. We will use the built-in 6x2 `BOD` data frame and use subtraction for the 4th argument. Then `sweep(BOD, 1, v, "-")` subtracts the vector `v` from each column of `BOD`, giving an identical result to `BOD - cbind(v, v)` (which can be simplified to just `BOD - v`). Similarly, `sweep(BOD, 2, u, "-")` subtracts the vector `u` from each row of `BOD`, giving an identical result to `BOD - rbind(u, u, u, u, u, u)`.
In detail, we provide several equivalences for each of the two cases.
# MARGIN = 1 case. These each give identical results.
v <- rowMeans(BOD) # any vector of length nrow(BOD) would work
sweep(BOD, 1, v, "-")
BOD - cbind(v, v)
BOD - matrix(v, nrow(BOD), ncol(BOD))
cbind(BOD[1] - v, BOD[2] - v)
BOD - v
In words, for the `MARGIN = 2` case it subtracts the vector `u` from each row.
# MARGIN = 2 case. These each give identical results except as noted.
u <- colMeans(BOD) # any vector of length ncol(BOD) will work
sweep(BOD, 2, u, "-")
BOD - rbind(u, u, u, u, u, u)
BOD - matrix(u, nrow(BOD), ncol(BOD), byrow = TRUE)
rbind(BOD[1, ] - u, BOD[2, ] - u, BOD[3, ] - u, BOD[4, ] - u,
BOD[5, ] - u, BOD[6, ] - u)
mapply("-", BOD, u) # matrix rather than data.frame
scale(BOD, scale = FALSE) # matrix rather than data.frame
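A quick way to check a couple of these equivalences numerically is sketched below. The variable names (`by_row`, `same_row`, and so on) are made up for this check, and the comparison deliberately ignores the data.frame versus matrix difference and any extra attributes, so it only confirms that the numbers agree.

```r
v <- rowMeans(BOD)
u <- colMeans(BOD)

by_row <- sweep(BOD, 1, v, "-")  # v subtracted from each column
by_col <- sweep(BOD, 2, u, "-")  # u subtracted from each row

# as.matrix() and unname() strip the data.frame class and the dimnames,
# so only the numeric values are compared.
same_row <- all.equal(
  unname(as.matrix(by_row)),
  unname(as.matrix(BOD - matrix(v, nrow(BOD), ncol(BOD))))
)
same_col <- all.equal(
  unname(as.matrix(by_col)),
  unname(scale(BOD, scale = FALSE)),
  check.attributes = FALSE  # scale() adds a "scaled:center" attribute
)
```

Both `same_row` and `same_col` should come back as `TRUE`, and the column means of `by_col` should be zero up to floating-point error, since the column means were subtracted.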
