
I have a data frame with several columns containing different areas by location. In another column I have a density value. I would like to know how to create a new table (ideally through a loop) with each area multiplied by the density. My data frame looks like:

X   Area1  Area2  Area3  Area4 Density
A   10.1    12     20     25    0.04
B   4.2     7.3    30     35    0.05
C   5.3     9.6    10     15    0.07
D   0.2     0.3    2      3     0.01

I have seen a similar question, "Multiply many columns by a specific other column in R with data.table?", but cannot figure out how to adapt it to my data. Many thanks :)

user3489562
  • If `dt` is your *data.table*: `dt[, 2:5 := lapply(.SD, '*', Density), .SDcols = 2:5][]`? – Jaap Oct 13 '17 at 11:19
  • @Jaap, all they need to do is make a copy, the rest is duplicate, no? – talat Oct 13 '17 at 11:32
  • @docendodiscimus yes, but added an answer because it wasn't clear to OP apparently; I've also added a link to the standard `copy` question which makes it more complete imo – Jaap Oct 13 '17 at 11:37

1 Answer


If dt is your data.table then

dtcopy <- copy(dt)
dtcopy[, 2:5 := lapply(.SD, '*', Density), .SDcols = 2:5][]

gives the following result:

> dtcopy
   X Area1 Area2 Area3 Area4 Density
1: A 0.404 0.480  0.80  1.00    0.04
2: B 0.210 0.365  1.50  1.75    0.05
3: C 0.371 0.672  0.70  1.05    0.07
4: D 0.002 0.003  0.02  0.03    0.01

Your original data.table dt remains unchanged, because the copy function made an independent copy:

> dt
   X Area1 Area2 Area3 Area4 Density
1: A  10.1  12.0    20    25    0.04
2: B   4.2   7.3    30    35    0.05
3: C   5.3   9.6    10    15    0.07
4: D   0.2   0.3     2     3    0.01

See also "Understanding exactly when a data.table is a reference to (vs a copy of) another data.table" for why you need to use copy.
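
To illustrate why the copy matters, here is a minimal sketch (the names `alias` and `indep` and the two-row table are made up for illustration). Plain assignment with `<-` does not duplicate a data.table, so an update with `:=` through the second name also changes the first, whereas a table made with `copy()` is unaffected:

```r
library(data.table)

# Small illustrative table (hypothetical subset of the question's data)
dt <- data.table(X = c("A", "B"),
                 Area1 = c(10.1, 4.2),
                 Density = c(0.04, 0.05))

alias <- dt        # no copy is made: both names refer to the same table
indep <- copy(dt)  # an independent copy

# Update an existing column by reference through the alias
alias[, Area1 := Area1 * Density]

dt$Area1     # changed as well, because alias and dt are the same object
indep$Area1  # still the original values
```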


Used data:

library(data.table)
dt <- fread('X   Area1  Area2  Area3  Area4 Density
             A   10.1    12     20     25    0.04
             B   4.2     7.3    30     35    0.05
             C   5.3     9.6    10     15    0.07
             D   0.2     0.3    2      3     0.01')
Jaap
  • Thank you. Can I ask you to clarify what the .SD and .SDcols are please? – user3489562 Oct 13 '17 at 11:31
  • @user3489562 `.SD` stands for **S**ubset of **D**ata, and `.SDcols` specifies which columns `.SD` contains, i.e. the columns the function is applied to; in this case I used column positions for that, but you can also use column names – Jaap Oct 13 '17 at 11:33
  • ok if I use what you have suggested I get the following error message: unused argument (.SDcols = 2:5) – user3489562 Oct 13 '17 at 11:38
  • @user3489562 Do you have `data.table` loaded? Do you have the latest version (v1.10.4)? – Jaap Oct 13 '17 at 11:39
  • @Jaap - I have checked and I am actually using a data frame, not a data table, which is probably why this isn't working. Is there a way to do it on a data frame? – user3489562 Oct 13 '17 at 12:03
  • @user3489562 If `df` is your dataframe: `df[2:5] <- lapply(df[2:5], '*', df$Density)` – Jaap Oct 13 '17 at 12:15
  • @Jaap thank you :) – user3489562 Oct 13 '17 at 12:28
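
For a plain data frame, the approach from the comments can be sketched in base R as below. This is only a sketch under some assumptions: the name `df_new` and the use of `grep()` to pick the area columns by name (rather than by position `2:5`) are illustrative choices, not part of the original answer. Assigning into a separate object leaves the original `df` untouched, since base R data frames are copied on modification.

```r
# Data from the question as a base R data frame
df <- data.frame(
  X       = c("A", "B", "C", "D"),
  Area1   = c(10.1, 4.2, 5.3, 0.2),
  Area2   = c(12, 7.3, 9.6, 0.3),
  Area3   = c(20, 30, 10, 2),
  Area4   = c(25, 35, 15, 3),
  Density = c(0.04, 0.05, 0.07, 0.01)
)

# Select the area columns by name instead of by position
area_cols <- grep("^Area", names(df), value = TRUE)

# Work on a new table so that df itself is not modified
df_new <- df
df_new[area_cols] <- lapply(df[area_cols], "*", df$Density)

df_new
```

No explicit loop is needed here: `lapply` applies the multiplication to each selected column in turn.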