I am working with two large data sets and am trying to use mapply() to get iterative functions working.
The goal is to take each data point, column by column, from Data_1 and compare it against both data points in the corresponding column of Data_2. So Data_1[1,1] will be compared against Data_2[1,1] and Data_2[2,1] only. To be clear, the data1 column in Data_1 will only be compared against the dataA elements in Data_2, so there is no cross-column comparison.
Data_1: NxM
data1 data2 data3 data4
-0.710003 -0.714271 -0.709946 -0.713645
-0.710458 -0.715011 -0.710117 -0.714157
-0.71071 -0.714048 -0.710235 -0.713515
-0.710255 -0.713991 -0.709722 -0.713972
Data_2: PxQ
dataA dataB dataC dataD
-0.71097 -0.714059 -0.70928 -0.714059
-0.710343 -0.714576 -0.709338 -0.713644
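To make the column pairing concrete, here is a minimal sketch in R. It assumes the two tables are held as data frames named Data_1 and Data_2, and that "compare" means an absolute difference. The helper compare_fun is hypothetical and only stands in for whatever comparison is actually needed.

```r
# Sample data from the tables above, held as data frames (an assumption).
Data_1 <- data.frame(
  data1 = c(-0.710003, -0.710458, -0.71071, -0.710255),
  data2 = c(-0.714271, -0.715011, -0.714048, -0.713991),
  data3 = c(-0.709946, -0.710117, -0.710235, -0.709722),
  data4 = c(-0.713645, -0.714157, -0.713515, -0.713972)
)
Data_2 <- data.frame(
  dataA = c(-0.71097, -0.710343),
  dataB = c(-0.714059, -0.714576),
  dataC = c(-0.70928, -0.709338),
  dataD = c(-0.714059, -0.713644)
)

# Hypothetical placeholder comparison: absolute difference of two values.
compare_fun <- function(a, b) abs(a - b)

# A data frame is a list of columns, so mapply() walks Data_1 and Data_2
# in parallel: data1 with dataA, data2 with dataB, and so on.
# outer() then compares every element of one column with every element
# of its partner column, without an explicit loop.
result <- mapply(
  function(x, y) outer(x, y, FUN = compare_fun),
  Data_1, Data_2,
  SIMPLIFY = FALSE
)
```

In this sketch result is a list with one matrix per column pair, named after the columns of Data_1. Row 1 of result$data1 holds the comparison of Data_1[1,1] against Data_2[1,1] and Data_2[2,1].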
I had earlier written a for()/while() loop-based algorithm, but the run time was too long because the original data is large. Then I moved to apply()-based logic, but I still had loops within the function I was calling, so that didn't speed up the code. Based on my earlier question, I am trying to figure out a better way to do this with mapply().
The part I am not able to visualize is the column-to-row comparison and how mapply() will iterate over it. How can I use mapply() or lapply() to get this done efficiently?
Any suggestions will be helpful, thanks.