I have two dataframes with unequal number of columns. I would like to subtract intensity values, column-wise (i.e, sample-wise), of rows of df2 from df1. My conditions are:
- In df1 there are multiple rows for peptide sequence (pep_seq) and their corresponding intensities per sample (int_sam) for every gene (gene_nm). The same gene appears multiple times, i.e, occupies several rows.
- In df2 the genes (rows) appears only once with its corresponding intensity values
- Therefore, df1 is much longer than df2 (e.g., 55000 rows vs 6000 rows)
- The number of intensity columns (int_samp) can be many. I have 3 in this example
Dataframe 1
pep_seq = c("aaaaaaaaa", "ababababba", "dfsfsfsfds", "xbbcbcncncc", "fbbdsgffhhh", "dggdgdgegeggerr",
"dfgthrgfgf", "wegregegg", "egegegergewge", "sfngegebser", "qegqeefbew", "qegqetegqt",
"qwtqtewr", "etghsfrgf", "sfsdfbdfbergeagaegr", "wasfqertsdfaefwe")
int_samp_1 = c("2421432", "24242424", "NA", "4684757849", "NA", "10485040", "NA",
"6849400", "40300", "NA", "NA", "NA", "556456466", "4646456466", "246464266", "4564242646")
int_samp_2 = c("NA", "5342353", "14532556", "43566", "46367367", "768769769", "797899", "NA", "NA", "NA",
"686899", "7898979", "678568", "NA", "68886", "488")
int_samp_3 = c("11351", "NA", "NA", "NA", "1354151345", "1351351354", "314534", "1535", "3145354", "4353455",
"324535", "3543445", "34535", "34535534", "NA", "NA")
gene_nm = c("A", "A", "A", "A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C", "C", "C")
df_1 = cbind.data.frame(pep_seq, int_samp_1, int_samp_2, int_samp_3, gene_nm)
Dataframe 2
int_samp_1a = c("2421432", "24242424", "NA")
int_samp_2a = c("NA", "5342353", "14532556")
int_samp_3a = c("11351", "NA", "NA")
gene_nm.a = c("A", "B", "C")
df_2 = cbind.data.frame(gene_nm.a, int_samp_1a, int_samp_2a, int_samp_3a)
Please suggest.