I am trying to merge two datasets:

  1. se_lif_1, with columns SE_NO and TOT_CV_LIF
  2. ext_merchant_account, with 76 columns, including:

    "SE_NO","SEIMS_INDUS_DS_CD","CUR_MER_STA_CD","CLNT_MAN_CHAN_CD","MER_SETUP_DT"

I am using the code below:

se_lif_2 <- merge(se_lif_1,
     ext_merchant_account[(ext_merchant_account$CUR_MER_STA_CD %in%
     c("A","R")) & (ext_merchant_account$CLNT_MAN_CHAN_CD %in% 
     c("I","X", " ")) & (ext_merchant_account$MER_SETUP_DT < 
     S_date),"SEIMS_INDUS_DS_CD"],by = "SE_NO" )

But I get the following error:

"Error in fix.by(by.y, y) : 'by' must specify a uniquely valid column"

S_date is an object of class "Date" holding a single value. I use it in one of the logical conditions that subset the ext_merchant_account data frame.

I also checked the class of the "by" variable, and it is the same in both datasets (integer).

I also tried using by.x and by.y but got the same error. Could anyone point out what I am doing wrong in this code?

Thanks in advance for your help.

Cheers,

Amit

  • Please show a reproducible example http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – akrun Mar 25 '15 at 03:31
  • Do you have a `SE_NO` column in the output of `ext_merchant_account[(ext_merchant_account$CUR_MER_STA_CD %in% c("A","R")) & (ext_merchant_account$CLNT_MAN_CHAN_CD %in% c("I","X", " ")) & (ext_merchant_account$MER_SETUP_DT < S_date),"SEIMS_INDUS_DS_CD"]`? Just looking at the code, it seems that you select only the `"SEIMS_INDUS_DS_CD"` column. – akrun Mar 25 '15 at 03:51
  • I think you can use `by.x='SE_NO', by.y='SEIMS_INDUS_DS_CD'` if both columns represent the same 'SE_NO' – akrun Mar 25 '15 at 03:59
  • Yes, but I was getting the same error even after selecting all the columns – Amit Mishra Mar 25 '15 at 04:02
  • It was just a guess. Without example data or the str() of both datasets, it is hard to say what might be wrong in your code – akrun Mar 25 '15 at 04:03
  • @akrun: Thanks a lot. There was a syntax error earlier when I tried to select all columns from the dataset "ext_merchant_account". I am now able to merge the datasets properly using the code below: `merge(se_lif_1,ext_merchant_account[(ext_merchant_account$CUR_MER_STA_CD %in% c("A","R")) & (ext_merchant_account$CLNT_MAN_CHAN_CD %in% c("I","X"," ")) & (ext_merchant_account$MER_SETUP_DT < S_date),c("SE_NO","SEIMS_INDUS_DS_CD")], by = c('SE_NO'))` – Amit Mishra Mar 25 '15 at 05:14
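
The exchange above can be reproduced on a small scale. The sketch below uses invented toy data (the values, and the helper names `keep`, `one_col`, `two_cols`, and `failed`, are assumptions for illustration, not the asker's real tables). It suggests why the original call fails: selecting a single column by name returns a plain vector, so `merge()` has no `SE_NO` column on the right-hand side to join on.

```r
# Toy stand-ins for the real tables (all values are invented for illustration).
se_lif_1 <- data.frame(SE_NO = 1:3, TOT_CV_LIF = c(10, 20, 30))
ext_merchant_account <- data.frame(
  SE_NO             = 1:4,
  SEIMS_INDUS_DS_CD = c("RETAIL", "TRAVEL", "FOOD", "RETAIL"),
  CUR_MER_STA_CD    = c("A", "R", "C", "A"),
  CLNT_MAN_CHAN_CD  = c("I", "X", "I", " "),
  MER_SETUP_DT      = as.Date(c("2014-01-01", "2014-06-01",
                                "2014-03-01", "2015-06-01")),
  stringsAsFactors  = FALSE
)
S_date <- as.Date("2015-01-01")

# Build the row filter once so the subsetting call stays readable.
keep <- ext_merchant_account$CUR_MER_STA_CD %in% c("A", "R") &
  ext_merchant_account$CLNT_MAN_CHAN_CD %in% c("I", "X", " ") &
  ext_merchant_account$MER_SETUP_DT < S_date

# Selecting one column by name drops the result to a plain vector,
# so there is no SE_NO column left for merge() to join on.
one_col <- ext_merchant_account[keep, "SEIMS_INDUS_DS_CD"]
is.data.frame(one_col)

# Capture the error message instead of stopping the script.
failed <- tryCatch(
  merge(se_lif_1, one_col, by = "SE_NO"),
  error = function(e) conditionMessage(e)
)

# Keeping the key column in the selection gives merge() something to match.
two_cols <- ext_merchant_account[keep, c("SE_NO", "SEIMS_INDUS_DS_CD")]
se_lif_2 <- merge(se_lif_1, two_cols, by = "SE_NO")
```

Note that adding `drop = FALSE` to the single-column selection would keep a data frame, but it would still lack `SE_NO`, so the key column has to be part of the selection either way.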

1 Answer

My guess is that you failed to select SE_NO in your code. To use merge with a single by argument, both data frames need to share a common variable. Most likely, one data frame is missing SE_NO.

The merge code should look like:

newDF <- merge(df1, df2, by = "SE_NO")

To rename a variable you may want to use:

names(df1)[1] <- "SE_NO"  # 1 is the position of the column to rename
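
As a rough sketch of how one might check for a shared key before merging, the example below uses hypothetical frames `df1` and `df2`; the column name `MERCHANT_ID` and all values are assumptions made up for illustration. It shows two options: renaming the key by name rather than by position, or leaving the names alone and passing `by.x` and `by.y`.

```r
# Hypothetical example frames; df1 and df2 stand in for the real data,
# and MERCHANT_ID is an invented column name.
df1 <- data.frame(SE_NO = c(1L, 2L, 3L), TOT_CV_LIF = c(10, 20, 30))
df2 <- data.frame(MERCHANT_ID = c(2L, 3L, 4L),
                  SEIMS_INDUS_DS_CD = c("A1", "B2", "C3"))

# Columns the two frames have in common; merge() joins on these by default.
# An empty result means there is no shared key yet.
shared <- intersect(names(df1), names(df2))

# Option 1: rename by name rather than by position, then merge on the key.
df2_renamed <- df2
names(df2_renamed)[names(df2_renamed) == "MERCHANT_ID"] <- "SE_NO"
merged_a <- merge(df1, df2_renamed, by = "SE_NO")

# Option 2: leave the names alone and state each key explicitly.
merged_b <- merge(df1, df2, by.x = "SE_NO", by.y = "MERCHANT_ID")
```

Renaming by name is less fragile than renaming by position, since it does not depend on the key being the first column.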
User7598
  • But I was getting the same error even when selecting all the columns from the dataset "ext_merchant_account". I will share the table structure and sample data soon. – Amit Mishra Mar 25 '15 at 04:01