I want to add a new column whose values depend on the combination of values in other columns.

For example, let's say I have a data frame like the one below:

library(dplyr)
library(minpack.lm)
library(broom)
No  =  c(replicate(1,rep(letters[1:6],each=10)))
ACME <- as.character(rep(rep(c(78,110),each=10),times=3))
ARGON <- as.character(rep(rep(c(256,320,384),each=20),times=1))
V <- rep(c(seq(2,40,length.out=5),seq(-2,-40,length.out=5)),times=1)
DQ0 = c(replicate(2, sort(runif(10,0.001,1))))
direc <- rep(rep(c("North","South"),each=5),times=6)

df <- data.frame(No,ACME,ARGON,V,DQ0,direc)


>df
    No ACME ARGON     V        DQ0 direc
1    a   78   256   2.0 0.07532351 North
2    a   78   256  11.5 0.13785481 North
3    a   78   256  21.0 0.27397961 North
4    a   78   256  30.5 0.44296243 North
5    a   78   256  40.0 0.45721902 North
6    a   78   256  -2.0 0.68077463 North
7    a   78   256 -11.5 0.68764283 North
8    a   78   256 -21.0 0.76284209 North
9    a   78   256 -30.5 0.81040056 North
10   a   78   256 -40.0 0.95336230 North
11   b  110   256   2.0 0.04190305 South
12   b  110   256  11.5 0.17484353 South
13   b  110   256  21.0 0.22409319 South
----------------

I fit this df using the nlsLM function from the minpack.lm package

->fit part

nls_fit=nlsLM(DQ0~ifelse(df$direc=="North"&V<J1, exp((-t_pw)/f0*exp(-del1*(1-V/J1)^2)),1)*ifelse(df$direc=="South"&V>J2, exp((-t_pw)/f0*exp(-del2*(1-V/J2)^2)),1)
            ,data=df,start=c(del1=1,J1=15,del2=1,J2=-15),trace=T) 

After fitting, I want to create a new data frame df_new with a new column called adress

  df_new<- df%>%
  group_by(No)%>%
  do(data.frame(model=tidy(nls_fit)))%>% # this step holds the fitting result; it yields the "model.term" and "model.estimate" columns, which I rename in the next step
  select_("delta"="model.term","value"= "model.estimate")%>%
  filter(delta%in%c("del1","del2"))%>% #**I filter some fitting parameters**
  mutate(adress=interaction(ACME,ARGON))%>% #this part is not working  
  ungroup

I am getting an error which says

Error: incompatible size (%d), expecting %d (the group size) or 1

In the end, without the mutate part, I get this kind of output:

df_new

    No delta    value
1   a  del1 1.479056
2   a  del2 1.016404
3   b  del1 1.479056
4   b  del2 1.016404
5   c  del1 1.479056
6   c  del2 1.016404
7   d  del1 1.479056
8   d  del2 1.016404
9   e  del1 1.479056
10  e  del2 1.016404
11  f  del1 1.479056
12  f  del2 1.016404

I wish to get something like this;

    No delta  value    adress
1   a  del1 1.479056   78.256
2   a  del2 1.016404   78.256
3   b  del1 1.479056  110.256
4   b  del2 1.016404  110.256
5   c  del1 1.479056   78.320
6   c  del2 1.016404   78.320
7   d  del1 1.479056  110.320
8   d  del2 1.016404  110.320
9   e  del1 1.479056   78.384
10  e  del2 1.141958   78.384
11  f  del1 1.019201  110.384
12  f  del2 1.141958  110.384
Jaap
Alexander
  • Where does `nls_fit` come from? Please include the packages you used. – Jaap Aug 20 '15 at 08:43
  • @Jaap Do you want me to add the fitting part? `nls_fit` comes from the `minpack.lm` package. I fitted some of the columns of `df` and excluded them here since they are not relevant to the problem. I put the output `df_new` here. – Alexander Aug 20 '15 at 08:46
  • @Jaap Ok I attached the relevant packages. – Alexander Aug 20 '15 at 08:49
  • It's always best to post a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610). Including code that can't be reproduced will not help in getting an answer. So, it would be nice if you included the `nls_fit` object as well. – Jaap Aug 20 '15 at 08:52
  • @Jaap Thanks, I understand. Please check the problem again. I had tried to avoid including the fitting part because it still has a problem of its own, but I added it at your suggestion. I hope you can give me a clue :) Please check this out: [link](http://stackoverflow.com/questions/32107224/conditional-nls-fitting-with-dplyrbroom) – Alexander Aug 20 '15 at 09:04
  • The code you've provided to create `df` throws an error because it's looking for `direc` and can't find it. Would you please add a line to create that object to fit your specs? – ulfelder Aug 20 '15 at 09:50
  • I made something up for `direc`, and now I'm getting an error when I try to run `nlsLM()` that is unrelated to the structure of `direc`. PLEASE make sure your reproducible example is actually reproducible before asking other people to spend their time trying to help you. – ulfelder Aug 20 '15 at 09:58
  • You have all kinds of parameters in your model which are not in the Q. As @ulfelder said: make it reproducible. – Jaap Aug 20 '15 at 10:47
  • @ulfelder I am so sorry, `direc` is added. – Alexander Aug 20 '15 at 11:42
  • @Jaap I understand your insistence on a reproducible example, and if you check my other questions you will see I always provide one. Only this time I thought it would not be needed. – Alexander Aug 20 '15 at 11:43
  • Ok, I understand. But in that case you could have omitted quite some info from the question, as it is not needed. You are actually asking how to join the two dataframes. See my answer for a solution. – Jaap Aug 20 '15 at 12:38

1 Answer

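What you actually want is a join between df_new and df. This assumes df already contains the adress column, for example from df$adress <- interaction(df$ACME, df$ARGON). You can do the join with, for example, data.table: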

library(data.table) #v1.9.5+
setDT(df_new)[df, adr:=adress, on="No"]

If you want to do it with the latest version from CRAN, you can do:

setDT(df_new, key="No")[setDT(df, key="No"), adr:=adress]

Both give the following result:

> dt_new
    No delta    value     adr
 1:  a  del1 1.479056  78.256
 2:  a  del2 1.016404  78.256
 3:  b  del1 1.479056 110.256
 4:  b  del2 1.016404 110.256
 5:  c  del1 1.479056  78.320
 6:  c  del2 1.016404  78.320
 7:  d  del1 1.479056 110.320
 8:  d  del2 1.016404 110.320
 9:  e  del1 1.479056  78.384
10:  e  del2 1.016404  78.384
11:  f  del1 1.479056 110.384
12:  f  del2 1.016404 110.384
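
For readers who want to run the join end to end, here is a minimal sketch, assuming a data.table version that supports `on=`. The `df` and `df_new` objects below are toy stand-ins rebuilt from the values shown in the question, and `lookup` is a hypothetical helper name, not something from the original code. The label is built with `interaction()` and converted to character so the result is a plain string column.

```r
library(data.table)

# Toy stand-ins for the question's objects (values copied from the question)
df <- data.frame(
  No    = rep(letters[1:6], each = 10),
  ACME  = as.character(rep(rep(c(78, 110), each = 10), times = 3)),
  ARGON = as.character(rep(c(256, 320, 384), each = 20))
)
df_new <- data.frame(
  No    = rep(letters[1:6], each = 2),
  delta = rep(c("del1", "del2"), times = 6),
  value = rep(c(1.479056, 1.016404), times = 6)
)

# Hypothetical helper: one row per No carrying the ACME.ARGON label
lookup <- unique(
  as.data.table(df)[, .(No, adress = as.character(interaction(ACME, ARGON)))]
)

# Update join: add the label from lookup to every matching row of dt_new
dt_new <- as.data.table(df_new)
dt_new[lookup, adr := i.adress, on = "No"]
dt_new
```

Reducing `df` to one row per `No` before joining keeps the update unambiguous, since each `No` then maps to exactly one label.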

An approach with dplyr:

df_new2 <- df %>% select(No, adress) %>% group_by(No) %>% 
  summarise(adr = unique(adress)) %>% 
  left_join(df_new, ., by="No")

which gives the same result:

> identical(df_new2, setDF(df_new))
[1] TRUE
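
As for why the original `mutate()` fails: after `do()` and `select_()` the grouped data most likely contains only `No`, `delta` and `value`, so `ACME` and `ARGON` are looked up in the global environment, where they are length-60 vectors, while each group has only 2 rows. A self-contained sketch of the dplyr route, assuming toy stand-ins for `df` and `df_new` built from the question's values (`lookup` is a hypothetical name):

```r
library(dplyr)

# Toy stand-ins for the question's objects (values copied from the question)
df <- data.frame(
  No    = rep(letters[1:6], each = 10),
  ACME  = as.character(rep(rep(c(78, 110), each = 10), times = 3)),
  ARGON = as.character(rep(c(256, 320, 384), each = 20))
)
df_new <- data.frame(
  No    = rep(letters[1:6], each = 2),
  delta = rep(c("del1", "del2"), times = 6),
  value = rep(c(1.479056, 1.016404), times = 6)
)

# Build the label on the original data, where ACME and ARGON still exist,
# then keep one row per No
lookup <- df %>%
  mutate(adress = as.character(interaction(ACME, ARGON))) %>%
  distinct(No, adress)

# Attach the label to the fit results
df_new2 <- df_new %>% left_join(lookup, by = "No")
df_new2
```

Here `distinct()` plays the role of the `group_by()` plus `summarise(unique())` step above; either form should give one label per `No`.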

Note: I used the development version of data.table

Jaap
  • Thanks a lot. Can we also do it inside `df_new` with mutate? Besides, why am I getting the same fitting results for all groups, even though they are replicated in the reproducible example? – Alexander Aug 20 '15 at 12:41
  • I mean del1 and del2 should differ. – Alexander Aug 20 '15 at 12:43
  • @aoronbarlow Added a `dplyr` approach. I'm not sure what you mean by "del1 and del2 should differ". They are the same in the resulting dataframe/datatable because the join is only on `No`. It would be impossible to join also on `delta`, as that variable is not part of `df`. – Jaap Aug 20 '15 at 13:36
  • Thanks for adding the dplyr approach too. I'm sorry for the misunderstanding. What I mean by "they should differ" is that the fitting results from the `tidy` function give the same del1 and del2 values for each group `a:f`. If you have an idea, can you comment on this question? [link](http://stackoverflow.com/questions/32107224/conditional-nls-fitting-with-dplyrbroom) – Alexander Aug 21 '15 at 00:15
  • Now I realized that `dt_new <- setDT(df_new)[df, adr:=adress, on="No"]` is not working. It says **Error in `[.data.table`(setDT(df_new), df, `:=`(adr, adress), on = "No") : unused argument (on = "No")** – Alexander Aug 21 '15 at 01:23
  • In addition, there are now many data frames (`df_new2`, `dt_new`). What I mean by "inside of dplyr" is doing everything inside of `df_new`. – Alexander Aug 21 '15 at 01:25
  • @aoronbarlow The error is probably due to the fact that you are not using the [latest version of `data.table`](https://github.com/Rdatatable/data.table/wiki/Installation) (see also the comment behind the `library(data.table)` call & the last line of my answer). The `on =` argument was introduced in `v1.9.5`, which is not on CRAN yet. Furthermore, I now added an option which should work with the latest version from CRAN as well. – Jaap Aug 21 '15 at 05:18
  • Well, I couldn't download the latest version of `data.table` because of this error: **Error in curl::curl_fetch_memory(url, handle = handle) : Couldn't resolve host name** – Alexander Aug 21 '15 at 05:44
  • @aoronbarlow Strange, I have no problem with it. Which code did you use? – Jaap Aug 21 '15 at 06:04
  • I used the link that you provided. `install_github("Rdatatable/data.table", build_vignettes = FALSE)` – Alexander Aug 21 '15 at 06:08
  • @aoronbarlow I just tried it again and it is working for me. Which version of R are you using? – Jaap Aug 21 '15 at 06:14
  • R version 3.1.3 (2015-03-09) – Alexander Aug 21 '15 at 06:19
  • @aoronbarlow I use 3.2.1. Else than that I have no clue what the source of that error might be. Have a look at [this](http://stackoverflow.com/questions/31311872/error-while-using-install-github-devtools-timeout-issue) and [this](http://stackoverflow.com/questions/31293325/r-install-github-fails) question for example. If you still can't get it working, you might consider posting a new question about this specific installation problem (be sure to include the output of your `sessionInfo()` then). – Jaap Aug 21 '15 at 06:33
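
On the side question of identical estimates in every group: in the question's pipeline, `do(data.frame(model=tidy(nls_fit)))` tidies the same, already fitted `nls_fit` object once per group, so every `No` gets the same numbers. To get per-group estimates, the model would have to be fitted inside `do()` on each group's data. The sketch below illustrates that pattern only; it swaps in a simple `lm()` as a stand-in because the original `nlsLM` formula depends on `t_pw` and `f0`, which are not defined in the question. The seed and the data are made up for illustration.

```r
library(dplyr)
library(broom)

# Made-up data with the same grouping layout as the question's df
set.seed(1)
df <- data.frame(
  No  = rep(letters[1:6], each = 10),
  V   = rep(c(seq(2, 40, length.out = 5), seq(-2, -40, length.out = 5)), times = 6),
  DQ0 = runif(60, 0.001, 1)
)

# Fit one model per group: `.` is the current group's data inside do().
# lm() is a stand-in for the question's nlsLM() call.
per_group <- df %>%
  group_by(No) %>%
  do(tidy(lm(DQ0 ~ V, data = .))) %>%
  ungroup()

per_group
```

With the real model, the `lm()` call would be replaced by the `nlsLM()` call using `data = .`, and with bare column names in the formula rather than `df$direc`.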