Update columns in a data frame based on another data frame using a loop

Question

This is similar to this question, but I need to do it for 1000 data frames.

I created 1000 data frames using the code below:

df <- replicate(1000, sensitivity.score, simplify = FALSE)
names(df) <- paste("score.rand", 1:length(df), sep = "")
list2env(df, envir = .GlobalEnv)

And the first data frame looks like this:

head(score.rand1)
                     Binomial S1 S2 S3 S4 S5 S6   S7   S8   S9
1    Astacoides betsileoensis  H  L  L  L  H  L <NA>    L    L
2        Astacoides caldwelli  H  L  L  L  H  L <NA>    H <NA>
3        Astacoides crosnieri  L  L  L  L  H  L <NA> <NA>    L
4     Astacoides granulimanus  L  L  L  L  H  L    H <NA>    L
5           Astacoides hobbsi  H  L  H  L  H  L <NA> <NA>    L
6 Astacoides madagascariensis  H  L  L  L  H  L <NA> <NA>    L

I created another 1000 data frames using the code below:

mydf <- data.frame(Binomial, S4, S5, S6)
lst <- replicate(1000, mydf[sample(nrow(mydf)),] , simplify = FALSE)
names(lst) <- paste("rand.val", 1:length(lst), sep = "")
list2env(lst , envir = .GlobalEnv)

that look like this:

head(rand.val1)
                  Binomial       S4       S5   S6
229  Euastacus girurmulayn 46.63442 3.399884 39.0
168 Distocambarus crockeri 15.76044 6.322875 34.7
235      Euastacus jagabar 46.63442 3.399884 40.6
163        Cherax robustus 44.04395 3.108239 42.5
506  Procambarus ortmannii 88.58447 4.422301 24.0
392    Pacifastacus fortis 30.40509 5.860764 42.0

I need to replace columns S4, S5 and S6 of the 'score.rand1' data frame with columns S4, S5 and S6 of the 'rand.val1' data frame, matching rows on 'Binomial'. The same goes for 'score.rand2' with 'rand.val2', 'score.rand3' with 'rand.val3', and so on for all 1000 data frames.

You are looking for a `merge` function, something along the lines of `merge(score.rand1, rand.val1, by = "Binomial")`. — Roman Luštrik, Mar 27 '17 at 07:28
Thank you. But how can I do this for 1000 dataframes. I have tried using for loops but failed :( — Tiny_hopper, Mar 27 '17 at 07:31
This is why R has object `list` into which you store your data.frames and then work on the entire set using `apply` family of functions. If you have data.frames "littered" in your workspace, you'll have to manually scrape it (using `ls()`) and then retrieve them using `get()`. — Roman Luštrik, Mar 27 '17 at 07:41
That's really easier to work on lists. Why do you want to "unlist" your dfs to .Global? — utubun, Mar 27 '17 at 07:48
Possible duplicate of http://stackoverflow.com/questions/8091303/simultaneously-merge-multiple-data-frames-in-a-list — zx8754, Mar 27 '17 at 07:49
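
The comments above recommend keeping the data frames in lists rather than in the global environment. If the objects already exist in the workspace, one way to gather them back is `mget()` with explicitly built names. This is only a minimal sketch, assuming the naming scheme from the question (`score.rand1`, `rand.val1`, and so on); the toy objects and the list names `score_list` and `rand_list` are made up for illustration. Building the names with `paste0()` instead of relying on `ls()` avoids alphabetical ordering, which would place `score.rand10` before `score.rand2`.

```r
# Toy stand-ins for the workspace objects created by list2env() in the question
# (three of each instead of 1000; the contents are made up for illustration).
n <- 3
for (i in seq_len(n)) {
  assign(paste0("score.rand", i),
         data.frame(Binomial = c("a", "b"), S4 = c("L", "H")),
         envir = .GlobalEnv)
  assign(paste0("rand.val", i),
         data.frame(Binomial = c("b", "a"), S4 = c(i, i + 0.5)),
         envir = .GlobalEnv)
}

# Collect the objects back into named lists, in numeric order
score_list <- mget(paste0("score.rand", seq_len(n)), envir = .GlobalEnv)
rand_list  <- mget(paste0("rand.val", seq_len(n)), envir = .GlobalEnv)

names(score_list)
```

With both lists in matching order, element `i` of one list can be paired with element `i` of the other.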

score 1 · Accepted Answer · answered Mar 27 '17 at 10:57

Keep your data sets in two separate lists (here `scores` and `rand`). You can then use `mapply` to step through both lists in parallel: for each pair, remove the redundant columns from the score data frame and merge it with the matching data frame of new values. That would look something like this:

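combined <- mapply(function(score_df, rand_df) {
    # Drop the columns that will be replaced by the values in rand_df
    score_df$S4 <- NULL
    score_df$S5 <- NULL
    score_df$S6 <- NULL
    # Join the remaining score columns to the new S4-S6 values by species name
    merge(x = score_df, y = rand_df, by = "Binomial")
}, scores, rand, SIMPLIFY = FALSE)  # SIMPLIFY = FALSE keeps a list of data frames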
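
As a self-contained illustration of the approach above, here is a minimal sketch on made-up toy data. The species names, the values, and the helper name `replace_cols` are assumptions for the example, not the asker's data. `Map()` is used here; it behaves like `mapply()` with `SIMPLIFY = FALSE`, so the result is always a list of data frames.

```r
set.seed(1)

# Toy score table: categorical scores per species (made-up values)
sensitivity.score <- data.frame(
  Binomial = c("sp A", "sp B", "sp C"),
  S1 = c("H", "L", "L"),
  S4 = c("L", "L", "H"),
  S5 = c("H", "H", "L"),
  S6 = c("L", "H", "L"),
  stringsAsFactors = FALSE
)

# Toy numeric values to swap in for S4-S6 (made-up values)
mydf <- data.frame(
  Binomial = c("sp A", "sp B", "sp C"),
  S4 = c(46.6, 15.8, 44.0),
  S5 = c(3.4, 6.3, 3.1),
  S6 = c(39.0, 34.7, 42.5),
  stringsAsFactors = FALSE
)

# Two lists of equal length: identical score tables and row-shuffled value tables
scores <- replicate(3, sensitivity.score, simplify = FALSE)
rand   <- replicate(3, mydf[sample(nrow(mydf)), ], simplify = FALSE)

replace_cols <- c("S4", "S5", "S6")

combined <- Map(function(score_df, rand_df) {
  # Keep only the score columns that are not being replaced
  kept <- score_df[, setdiff(names(score_df), replace_cols), drop = FALSE]
  # Join by species name, so the shuffled row order of rand_df does not matter
  merge(kept, rand_df, by = "Binomial")
}, scores, rand)

combined[[1]]
```

Each element of `combined` holds `Binomial`, the untouched score columns, and the numeric S4, S5 and S6 values taken from the corresponding element of `rand`.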

SolomonRoberts


Thank you. The problem is now solved. I first removed the columns from the data frames in list 'df' and then applied the following code: `combined.list <- mapply(function(x, y) merge(x, y, by = "Binomial", all = T), x = df, y = lst, SIMPLIFY = F)` – Tiny_hopper Mar 28 '17 at 05:16
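
The `all` argument set in the comment above turns the merge into a full outer join: species that appear in only one of the two data frames are kept, with `NA` in the columns that come from the other one. Without it, `merge()` keeps only species found in both. A minimal sketch with made-up data follows (the names `x`, `y`, `inner` and `outer` are illustrative). It spells out `TRUE` rather than `T`, since `T` is an ordinary variable that can be reassigned.

```r
# Made-up data: "sp C" is missing from y, "sp D" is missing from x
x <- data.frame(Binomial = c("sp A", "sp B", "sp C"), S1 = c("H", "L", "L"))
y <- data.frame(Binomial = c("sp A", "sp B", "sp D"), S4 = c(46.6, 15.8, 30.4))

inner <- merge(x, y, by = "Binomial")              # only species in both
outer <- merge(x, y, by = "Binomial", all = TRUE)  # every species from either side

nrow(inner)  # 2
nrow(outer)  # 4
```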
