Forgive me if this is a blatantly obvious question; I am a beginner R user eager to learn.
I have a data frame of 4 columns and roughly 1.5 million rows of coordinate information, where each row represents a specific location. I would like to pass these data to a function containing a series of if/else statements that determine which region of a larger box each location falls in. For example, a point can be in the center, along the edge of the box (within 1.5 inches of it), inside the box but neither on the edge nor in the center, or outside the box.
Each if statement tests whether a point is in a specified area and, if it is, puts a '1' in the corresponding row of another data frame.
Here is a visualization of what I am trying to do:
Take this location data from a data frame called 'dimensions':
sz_top | sz_bot | px     | pz    |
3.526  | 1.615  | -1.165 | 3.748 |
Run it through these statements (the real statements are much longer), where the 'else' condition means the point is outside the box completely:
if(in center) else if(on edge) else if(in box, but not in center or on edge) else
When the program finds which condition is true, it puts a 1 in ANOTHER data frame called 'call' in the corresponding column (these columns are columns 50-53). This is what the row would look like in the event the code found the point was in the center:
center | edge | other_in | out |
1      | 0    | 0        | 0   |
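To make the intended output concrete, here is a minimal sketch of how the if / else if / else chain could be expressed on whole columns at once, so that row i of the result always corresponds to row i of the input. The logical vectors in_center, on_edge and in_box are made-up placeholders standing in for the real tests, and result is a hypothetical name.

```r
# Hypothetical test results, one element per row of the input data.
# In practice each vector would come from one of the real conditions.
in_center <- c(TRUE,  FALSE, FALSE, FALSE)
on_edge   <- c(FALSE, TRUE,  FALSE, FALSE)
in_box    <- c(TRUE,  TRUE,  TRUE,  FALSE)

# Mirror the if / else if / else chain: each later column only applies
# when all earlier conditions are FALSE for that row.
result <- data.frame(
  center   = as.integer(in_center),
  edge     = as.integer(!in_center & on_edge),
  other_in = as.integer(!in_center & !on_edge & in_box),
  out      = as.integer(!in_center & !on_edge & !in_box)
)

# Every row should end up with exactly one 1.
stopifnot(all(rowSums(result) == 1))
```

Because the vectors are all the same length and in the same row order, no explicit row counter is needed; a data frame like result could then be assigned into the four target columns (for example columns 50-53) in one step.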
One thing to note that could improve efficiency is that the coordinates are also contained in the 'calls' data frame in columns 22, 23, 26, and 27, but I moved them to 'dimensions' because it was easier for me to work with. This can definitely be changed.
I am now very unclear on how to proceed from here. I have all my if/else statements written, but I do not understand how my program will know which row it is on, so that it can mark the corresponding row with the result of the tests.
Please let me know if you would like any more information from me.
Thanks!
EDIT:
Here is a sample of the 'dimensions' data frame:
sz_top sz_bot px pz
1 3.526 1.615 -1.165 3.748
2 3.29 1.647 -0.412 1.9
3 3.29 1.647 -1.213 1.352
4 3.565 1.75 -1.041 2.419
5 3.565 1.75 -0.357 1.776
6 3.565 1.75 0.838 0.834
7 3.541 1.724 -1.619 3.661
8 3.541 1.724 -2.498 2.421
9 3.541 1.724 -1.673 2.348
10 3.541 1.724 -1.572 2.982
11 3.305 1.5 -1.316 2.842
Here is an example of one of my if statements. The others are fairly similar, just looking at different locations around the box in question:
if (
  ((as.numeric(as.character(dimensions$px)) * 12) >= -3)
  &&
  ((as.numeric(as.character(dimensions$px)) * 12) <= 3)
  &&
  ((as.numeric(as.character(dimensions$pz)) * 12) <= ((as.numeric(as.character(dimensions$sz_top)) * 12 - as.numeric(as.character(dimensions$sz_bot)) * 12) / 2) + (as.numeric(as.character(dimensions$sz_bot)) * 12) + 3)
  &&
  ((as.numeric(as.character(dimensions$pz)) * 12) >= ((as.numeric(as.character(dimensions$sz_top)) * 12 - as.numeric(as.character(dimensions$sz_bot)) * 12) / 2) + (as.numeric(as.character(dimensions$sz_bot)) * 12) - 3)
) {
  return(1)
}
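For comparison, here is a rough sketch of the same center test written so that it evaluates every row in one pass. It relies on the fact that `&&` is meant for single values, whereas `&` compares element by element. The first three rows are taken from the sample above; the fourth row is invented purely so that one point lands in the center, and to_inches, mid_in and in_center are hypothetical names.

```r
# Rows 1-3 come from the sample of 'dimensions'; row 4 is a made-up point.
dimensions <- data.frame(
  sz_top = c(3.526, 3.29, 3.29, 3.5),
  sz_bot = c(1.615, 1.647, 1.647, 1.5),
  px     = c(-1.165, -0.412, -1.213, 0.1),
  pz     = c(3.748, 1.9, 1.352, 2.5)
)

# Hypothetical helper: do the conversion and the *12 scaling once per column.
to_inches <- function(x) as.numeric(as.character(x)) * 12

px_in  <- to_inches(dimensions$px)
pz_in  <- to_inches(dimensions$pz)
top_in <- to_inches(dimensions$sz_top)
bot_in <- to_inches(dimensions$sz_bot)

# Vertical midpoint between sz_bot and sz_top, as in the original condition.
mid_in <- (top_in - bot_in) / 2 + bot_in

# Elementwise '&' yields one TRUE/FALSE per row instead of a single value.
in_center <- px_in >= -3 & px_in <= 3 &
  pz_in <= mid_in + 3 & pz_in >= mid_in - 3

# One 0/1 value per row, in the same order as 'dimensions'.
center <- as.integer(in_center)
```

Since center has one entry per row of dimensions, it can be assigned straight into the matching column of the results data frame without tracking a row index.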