Fill data frame by column with for loop

Question

I created an empty data frame with 11 columns and 15 rows and subsequently named the columns.

L_df <- data.frame(matrix(ncol = 11, nrow = 15))
names(L_df) <- paste0("L_por", 0:10)

w <- c(0.2, 0.4, 0.6, 0.8, 1, 1.2, 1.4, 1.6, 1.8, 2, 2.2, 2.4, 2.6,  2.8, 3)
wu <- 0
L <- 333.7
pm <- c(2600, 2574, 2548, 2522, 2496, 2470, 2444, 2418, 2392, 2366,  2340)

The data frame looks like this:

head(L_df)
  L_por0 L_por1 L_por2 L_por3 L_por4 L_por5 L_por6 L_por7 L_por8 L_por9 L_por10
1     NA     NA     NA     NA     NA     NA     NA     NA     NA     NA      NA
2     NA     NA     NA     NA     NA     NA     NA     NA     NA     NA      NA
3     NA     NA     NA     NA     NA     NA     NA     NA     NA     NA      NA
4     NA     NA     NA     NA     NA     NA     NA     NA     NA     NA      NA
5     NA     NA     NA     NA     NA     NA     NA     NA     NA     NA      NA
6     NA     NA     NA     NA     NA     NA     NA     NA     NA     NA      NA

Now, I would like to fill the data frame by column, based on a formula. I tried to express this with a nested for loop:

for (i in 1:ncol(L_df)) {
  pm_tmp <- pm[i]
  col_tmp <- colnames(L_df)[i]
  for (j in 1:nrow(L_df)) {
    w_tmp <- w[j]
    L_por_tmp <- pm_tmp*L*((w_tmp-wu)/100)
    col_tmp[j] <- L_por_tmp
  }
}

For each column, I iterate over a predefined vector pm of length 11. For each row, I iterate over a predefined vector w of length 15 (the same w values repeat in every column).

Example: First, select pm[1] for the first column. Second, select w[j] for each row j in the first column. Store the result of the formula in L_por_tmp and use it to fill the first column from row 1 to row 15. The whole procedure should then start over for the second column (with pm[2]), again with w[j] for each row, and so on. wu and L are fixed in the formula.

R executes the code without an error. When I check the tmp values, they are correct. However, the data frame remains empty: L_df does not get filled. I would like to solve this with a loop, but if you have other solutions, I am happy to hear them! I get the impression there might be a smoother way of doing this. Cheers!

The error is simple enough, there is no assign method for `ncol()`. Are you confusing it with `colnames()`? — AkselA, Dec 09 '17 at 13:56
Can you edit your question to include the values for `w`, `wu`, `L`, and `pm`? — duckmayr, Dec 09 '17 at 14:42
@AkselA You're right. I edited the code so that it can access each column name `col_tmp`. No error subsequently but the data frame is still empty (i.e. filled with `NA`). Any idea on how to proceed? — Arne Brandschwede, Dec 09 '17 at 14:43
@duckmayr Sure, see the edited post. It includes the values now. — Arne Brandschwede, Dec 09 '17 at 14:47

duckmayr · Accepted Answer · 2017-12-09T14:56:28.753

Solution

L_df <- data.frame(sapply(pm, function(x) x * L * ((w - wu) / 100)))
names(L_df) <- c("L_por0", "L_por1", "L_por2", "L_por3", "L_por4", "L_por5",
                 "L_por6", "L_por7", "L_por8", "L_por9", "L_por10")
L_df
     L_por0    L_por1    L_por2    L_por3    L_por4    L_por5    L_por6    L_por7
1   1735.24  1717.888  1700.535  1683.183  1665.830  1648.478  1631.126  1613.773
2   3470.48  3435.775  3401.070  3366.366  3331.661  3296.956  3262.251  3227.546
3   5205.72  5153.663  5101.606  5049.548  4997.491  4945.434  4893.377  4841.320
4   6940.96  6871.550  6802.141  6732.731  6663.322  6593.912  6524.502  6455.093
5   8676.20  8589.438  8502.676  8415.914  8329.152  8242.390  8155.628  8068.866
6  10411.44 10307.326 10203.211 10099.097  9994.982  9890.868  9786.754  9682.639
7  12146.68 12025.213 11903.746 11782.280 11660.813 11539.346 11417.879 11296.412
8  13881.92 13743.101 13604.282 13465.462 13326.643 13187.824 13049.005 12910.186
9  15617.16 15460.988 15304.817 15148.645 14992.474 14836.302 14680.130 14523.959
10 17352.40 17178.876 17005.352 16831.828 16658.304 16484.780 16311.256 16137.732
11 19087.64 18896.764 18705.887 18515.011 18324.134 18133.258 17942.382 17751.505
12 20822.88 20614.651 20406.422 20198.194 19989.965 19781.736 19573.507 19365.278
13 22558.12 22332.539 22106.958 21881.376 21655.795 21430.214 21204.633 20979.052
14 24293.36 24050.426 23807.493 23564.559 23321.626 23078.692 22835.758 22592.825
15 26028.60 25768.314 25508.028 25247.742 24987.456 24727.170 24466.884 24206.598
      L_por8    L_por9   L_por10
1   1596.421  1579.068  1561.716
2   3192.842  3158.137  3123.432
3   4789.262  4737.205  4685.148
4   6385.683  6316.274  6246.864
5   7982.104  7895.342  7808.580
6   9578.525  9474.410  9370.296
7  11174.946 11053.479 10932.012
8  12771.366 12632.547 12493.728
9  14367.787 14211.616 14055.444
10 15964.208 15790.684 15617.160
11 17560.629 17369.752 17178.876
12 19157.050 18948.821 18740.592
13 20753.470 20527.889 20302.308
14 22349.891 22106.958 21864.024
15 23946.312 23686.026 23425.740

Explanation

The sapply() function iterates over a vector in a way that is more idiomatic in R than an explicit loop. Here we iterate over pm and apply your formula once per element. Because R is vectorised, each call works on all of w at once and returns a vector of length 15, so we get 11 vectors of length 15, which sapply() binds into a 15 x 11 matrix. Wrapping that in data.frame() returns the data frame you want, and then we add the column names.
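As a cross-check on the sapply() approach, the same table can also be built in a single call with outer(). This is a swapped-in alternative, not what the answer uses. A minimal sketch, assuming the w, wu, L, and pm values from the question; L_mat and L_alt are illustrative names:

```r
# Values copied from the question
w  <- c(0.2, 0.4, 0.6, 0.8, 1, 1.2, 1.4, 1.6, 1.8, 2, 2.2, 2.4, 2.6, 2.8, 3)
wu <- 0
L  <- 333.7
pm <- c(2600, 2574, 2548, 2522, 2496, 2470, 2444, 2418, 2392, 2366, 2340)

# sapply() returns a matrix with one column per element of pm
L_mat <- sapply(pm, function(x) x * L * ((w - wu) / 100))

# outer() forms every product of a w term and a pm term:
# rows follow w, columns follow pm
L_alt <- as.data.frame(outer((w - wu) / 100, pm) * L)
names(L_alt) <- paste0("L_por", seq_along(pm) - 1)

dim(L_mat)                                   # rows x columns
all.equal(unname(as.matrix(L_alt)), L_mat)   # compare the two results
```

Both routes avoid the nested loop; which one reads better is largely a matter of taste.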

NOTE: Applying a function to every element of a vector with an apply() family function has somewhat different implications from iterating with a for loop. In your case, I think sapply() is easier and more understandable. For more information on when you need a loop and when something like apply is better, see for example this discussion from Hadley Wickham's Advanced R book.

Yes, a smooth solution that works nice! In a further step, is there a way to use `L_df` in another formula to fill up a new data frame `D_df` with the same dimension as `L_df`? `lambda*((172800*k*Isf)/L_df)^0.5` should fill `D_df` in the same manner as above: column by column. `lambda` is a vector of length 16 that repeats for each of the 11 columns; `k` and `Isf` are fixed. Each `L_df` element at the mth row and nth column should be used in the calculation for the mth row and nth column entry of `D_df`. Any suggestion on how to write that with sapply? Currently reading up on the apply family. — Arne Brandschwede, Dec 10 '17 at 15:20
@ArneBrandschwede If the dimensions of `L_df` and `D_df` should be equal, and `length(lambda) == nrow(L_df)`, then `((172800*k*Isf)/L_df)^0.5` should work by itself, without any need for `apply()`; in fact, it would be faster than using `apply()` because of how R's vectorized operations work. Is `lambda` of length 16 or 15? — duckmayr, Dec 10 '17 at 15:43
`lambda` is of length 15. Was a typo. It works by itself, indeed. Just did it and no need for `apply()` or a loop. In addition, wrapped the formula in `data.frame()` so that R stores the new object as a data frame. Cheers! — Arne Brandschwede, Dec 10 '17 at 15:58
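The vectorised follow-up from these comments can be sketched as below. The values of lambda, k, and Isf are made up purely for illustration (the thread does not give them); only the formula and the 15 x 11 shape come from the discussion.

```r
# Values copied from the question
w  <- c(0.2, 0.4, 0.6, 0.8, 1, 1.2, 1.4, 1.6, 1.8, 2, 2.2, 2.4, 2.6, 2.8, 3)
wu <- 0
L  <- 333.7
pm <- c(2600, 2574, 2548, 2522, 2496, 2470, 2444, 2418, 2392, 2366, 2340)

L_df <- data.frame(sapply(pm, function(x) x * L * ((w - wu) / 100)))
names(L_df) <- paste0("L_por", 0:10)

# Hypothetical inputs, NOT from the thread
lambda <- (1:15) / 10   # one value per row of L_df
k      <- 2
Isf    <- 50

# Arithmetic on a data frame is element-wise. A vector of length nrow(L_df)
# is recycled down each column, so lambda[m] is paired with row m everywhere.
D_df <- lambda * ((172800 * k * Isf) / L_df)^0.5

# Spot check one cell against the scalar formula
m <- 4; n <- 7
all.equal(D_df[m, n], lambda[m] * sqrt(172800 * k * Isf / L_df[m, n]))
```

If lambda had a length other than nrow(L_df), the recycling would no longer line up with the rows, which is why the length-15 correction above matters.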

score 1 · Answer 2 · answered Dec 09 '17 at 14:54

You made only a small mistake and were almost there. Here is your loop, edited:

for (i in 1:ncol(L_df)) {
  pm_tmp <- pm[i]
  col_tmp <- colnames(L_df)[i]
  for (j in 1:nrow(L_df)) {
    w_tmp <- w[j]
    L_por_tmp <- pm_tmp*L*((w_tmp-wu)/100)
    L_df[j, col_tmp] <- L_por_tmp  ## assign into the data frame itself, using [row, column] indexing
  }
}

Output:

Printing just the first few rows:

head(L_df, 3)
     L_por0    L_por1    L_por2    L_por3    L_por4    L_por5    L_por6    L_por7    L_por8    L_por9   L_por10
1   1735.24  1717.888  1700.535  1683.183  1665.830  1648.478  1631.126  1613.773  1596.421  1579.068  1561.716
2   3470.48  3435.775  3401.070  3366.366  3331.661  3296.956  3262.251  3227.546  3192.842  3158.137  3123.432
3   5205.72  5153.663  5101.606  5049.548  4997.491  4945.434  4893.377  4841.320  4789.262  4737.205  4685.148
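To make the fix above more concrete, here is a small sketch of why the original inner loop left L_df untouched, followed by a loop that needs only one level because the formula is vectorised over w. It assumes the values from the question; the single-loop form is a variation on this answer, not a quote from it.

```r
# Values copied from the question
w  <- c(0.2, 0.4, 0.6, 0.8, 1, 1.2, 1.4, 1.6, 1.8, 2, 2.2, 2.4, 2.6, 2.8, 3)
wu <- 0
L  <- 333.7
pm <- c(2600, 2574, 2548, 2522, 2496, 2470, 2444, 2418, 2392, 2366, 2340)

L_df <- data.frame(matrix(ncol = 11, nrow = 15))
names(L_df) <- paste0("L_por", 0:10)

# What the original code did: col_tmp only holds a column NAME, so
# col_tmp[j] <- value grows that local character vector and never
# writes to L_df
col_tmp <- colnames(L_df)[1]
col_tmp[2] <- 3470.48
col_tmp            # a character vector, unrelated to the data frame
all(is.na(L_df))   # the data frame is still empty here

# One loop over columns: each pass fills all 15 rows at once
for (i in seq_along(pm)) {
  L_df[[i]] <- pm[i] * L * ((w - wu) / 100)
}

head(L_df, 3)
```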
