I have two data sets: one with only world grid coordinates, "Ggrid" (LON from -179.875 to 179.875 and LAT from -89.875 to 89.875, making a total of 1036800 cells), and the other, "1JAN", with world grid coordinates and oxygen data at different depths (LON from -79.5 to 179.5 and LAT from -89.5 to 89.85). I would like to merge these data by the world grid so that I have a total of 1036800 rows (720 by 1440), with NA in the cells that have no data.

I tried this:

> ENV1<-read.csv('1JAN.csv')
> Ggrid<-read.csv('Ggrid.csv')
> head(Ggrid)
       LON     LAT
1 -179.875 -89.875
2 -179.875 -89.625
3 -179.875 -89.375
4 -179.875 -89.125
5 -179.875 -88.875
6 -179.875 -88.625

> ENV1 <- ENV1[,1:7]
> head(ENV1)
    LAT    LON   X0    X5   X10
1 -77.5 -178.5 8.28    NA    NA
2 -77.5 -174.5   NA    NA    NA
3 -77.5 -170.5 7.96 7.991 8.000
4 -77.5 -167.5 8.08 8.090 8.100
5 -77.5 -165.5 8.09 8.154 8.180
6 -77.5 -163.5 8.93 8.923 8.905


> m2 <- merge(Ggrid, ENV1, by = c("LAT","LON"), all.x=T)


1   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
2   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
3   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
4   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
5   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
6   NA   NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA

The problem is that the coordinates do not match exactly, even though every point could be located on Ggrid. I asked this question earlier, and the answers given work when the coordinates match, but in this new case the coordinates are different.

ENV1 looks like this:

LAT     LON      X0     X5     X10
-77.5   -178.5   8.28   NA     NA
-77.5   -178     7.28   NA     NA
-77.5   -177.5   8.06   NA     NA
-77.5   -177     7.65   7.43   NA
-77.5   -176.5   7.54   7.32   NA
-77.5   -176     7.43   7.21   NA
-77.5   -175.5   7.32   7.1    7.28
-77.5   -175     7.21   6.99   8.06
-77.5   -174.5   7.1    6.88   7.65
-77.5   -174     6.99   7.43   7.54
-77.5   -173.5   6.88   7.32   6.88
-77.5   -173     6.77   7.21   7.28
-77.5   -172.5   6.66   7.28   7.28

After merging with COO, it should look like this:

LAT       LON        X0     X5     X10
-77.675   -178.875   8.28   NA     NA
-77.675   -178.625   7.28   NA     NA
-77.675   -177.375   8.06   NA     NA
-77.675   -177.125   7.65   7.43   NA
-77.675   -176.875   7.54   7.32   NA
-77.675   -176.625   7.43   7.21   NA
-77.675   -175.375   7.32   7.1    7.28
-77.675   -175.125   7.21   6.99   8.06
-77.675   -174.875   7.1    6.88   7.65
-77.675   -174.625   6.99   7.43   7.54
-77.675   -173.375   6.88   7.32   6.88
-77.675   -173.125   6.77   7.21   7.28
-77.675   -172.875   6.66   7.28   7.28

I hope this helps clarify the question. Thanks.

Jaap
  • In the future it may be better to ask a new question rather than change the question. Future readers who came across this may not see the context between answers and the question (which have changed). You also have a higher chance of people seeing the question and trying to help when it's a new one. – Ricky Dec 19 '15 at 23:23
  • I did ask another question, but people said it was a duplicate, so I was told I should edit this one to accommodate the new question. Thanks – user5545418 Dec 20 '15 at 00:02
  • @user5545418 I did not tell you that. I told you that the [linked question and answer](http://stackoverflow.com/questions/31668163/geographic-distance-between-2-lists-of-lat-lon-coordinates) give a solution on how to add values from one dataframe to another dataframe based on proximity of latitude-longitude combinations. Please study that answer and try to apply that to your own data. Duplicate means that a similar question had been asked before. If you need more clarification, let me know with a comment. – Jaap Dec 20 '15 at 09:20

2 Answers

I think you're looking for a left join. Try m2 <- merge(Ggrid, ENV1, by=c("LAT", "LON"), all.x=T)
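
As a minimal sketch of what this left join does (the small data frames below are made up for illustration and are not the asker's files): every row of Ggrid is kept, and any grid cell with no exact LAT/LON match in ENV1 gets NA in the data columns.

```r
# Toy stand-ins for Ggrid and ENV1 (values are illustrative only)
Ggrid <- data.frame(LON = c(-179.875, -179.875, -179.625),
                    LAT = c(-89.875, -89.625, -89.875))
ENV1  <- data.frame(LAT = c(-89.875, -77.5),
                    LON = c(-179.875, -178.5),
                    X0  = c(8.28, 7.96))

# all.x = TRUE keeps every row of the first data frame (a left join);
# rows of ENV1 with no matching grid cell are dropped
m2 <- merge(Ggrid, ENV1, by = c("LAT", "LON"), all.x = TRUE)
print(m2)
```

Note that the match is on exact equality of both key columns, so a point at -77.5/-178.5 will not be attached to a neighbouring grid cell.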

Ricky
  • @Ricky, you are the man. Thanks, it works – user5545418 Dec 19 '15 at 02:26
  • That works only when the coordinates are the same. Could you please help with this? Thanks – user5545418 Dec 19 '15 at 20:36
  • Not sure if I understand you correctly when you said the "coordinates do not match", do you mean all the `LAT` and `LON` in `ENV1` are not found in `Ggrid` ? – Ricky Dec 19 '15 at 23:26
  • Yes, the coordinates are not found because ENV1 is on a 1° grid and Ggrid is on a 1/4° grid. Thanks – user5545418 Dec 19 '15 at 23:49
  • If it's not found, then it's the correct behaviour to say that it's not found (i.e. `NA`) isn't it? Maybe you should give an example of what you expect the output to be given the input, because I would have expected all `NA` to be correct. – Ricky Dec 20 '15 at 00:01
  • Yes, NA is correct; however, what I really want is for each point to be placed in the closest grid cell in Ggrid. – user5545418 Dec 20 '15 at 00:03
  • As my earlier comment: Can you give at least one manually worked example of what happened that's not what you want, and what's the actual output, so people who want to help can know more clearly what you have in mind. You may understand what you are trying to say, but I personally am drawing blanks. You can edit the question to add the example, so people who don't read this comment thread will also see. – Ricky Dec 20 '15 at 00:29
  • I just did. Thanks !!! – user5545418 Dec 20 '15 at 01:07
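
One possible way to do what the comments above describe is to snap each ENV1 coordinate to a quarter-degree cell centre before merging. This is only a sketch under assumptions: `snap_to_grid` is a hypothetical helper (not from this thread), the toy data are invented, and Ggrid centres are assumed to lie at ..., -0.375, -0.125, 0.125, 0.375, ... Points such as -178.5 sit exactly on the edge between two quarter-degree cells, so "closest" is ambiguous there; the sketch uses `floor()`, which assigns them to the cell just above the edge.

```r
# Hypothetical helper: snap a coordinate to the centre of the
# quarter-degree cell that contains it (edge points go to the upper cell)
snap_to_grid <- function(x, res = 0.25) {
  floor(x / res) * res + res / 2
}

# Toy stand-ins for the real data (values are illustrative only)
Ggrid <- expand.grid(LON = seq(-179.875, -178.125, by = 0.25),
                     LAT = seq(-77.875, -77.125, by = 0.25))
ENV1  <- data.frame(LAT = c(-77.5, -77.5),
                    LON = c(-178.5, -178.2),
                    X0  = c(8.28, 7.28))

# Replace the original coordinates with the snapped ones, then left join
ENV1$LAT <- snap_to_grid(ENV1$LAT)
ENV1$LON <- snap_to_grid(ENV1$LON)
m2 <- merge(Ggrid, ENV1, by = c("LAT", "LON"), all.x = TRUE)
```

If several ENV1 points snap to the same cell, the merge would repeat that grid row, so the data would need to be aggregated (for example averaged per cell) first. For distance-based matching, the linked question on geographic distance between coordinate lists is the other route.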

Put more elegantly, you can use the dplyr package to achieve the same result:

library(dplyr)  # provides %>% and left_join()
Ggrid %>% left_join(ENV1, by=c("LAT", "LON"))
Steven_