Interesting use case, although the question took me a couple of readings to understand.
If you only want to see the "duplicated" rows, dplyr's n() inside summarise() gives the number of rows in each group, and you can then keep the groups with more than one row:
df %>%
  group_by(Longitude, Latitude) %>%
  summarise(N = n()) %>%
  filter(N > 1)
If you want the sum of the Prob column, then, as suggested by Merijn van Tilborg, simply use sum(Prob).
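As a small sketch of the count and the sum side by side: the column names below come from the question, but the sample values and the names df and totals are made up for illustration.

```r
library(dplyr)

# Hypothetical sample data; only the column names come from the question
df <- data.frame(
  Longitude = c(10.5, 10.5, 11.0, 12.2),
  Latitude  = c(50.1, 50.1, 49.7, 48.3),
  Prob      = c(1, 2, 0, 1)
)

# One row per coordinate pair, with the row count and the summed Prob
totals <- df %>%
  group_by(Longitude, Latitude) %>%
  summarise(N = n(), Prob = sum(Prob), .groups = "drop")
```

Note that this sums every group, so it does not yet distinguish a 1-and-2 pair from any other combination.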
But if you want what is described in the last line of your post, slightly more work is needed.
Your question asks for more than it first appears to (and is slightly ambiguous), because the merge should happen only where Prob is 1 and 2:
"If two rows have the same longitude and latitude values, and if they show Prob 1 and 2, I want to "merge" these rows and make Prob 3."
So we need to combine the two ideas.
Some data similar to yours:
library(data.table)
library(dplyr)
dx <- data.table(
  long = c('a','b','d1','c','a','d1','e','f','a'),
  lat  = c('a','b1','d1','c','a','d1','e','f','g'),
  Prob = c(1,0,2,1,0,1,0,1,0)
)
which gives 9 records:
long lat Prob
1: a a 1
2: b b1 0
3: d1 d1 2
4: c c 1
5: a a 0
6: d1 d1 1
7: e e 0
8: f f 1
9: a g 0
Of these 9 records, the 1st and 5th share the same long and lat but have Prob 1 and 0, not 1 AND 2.
The 3rd and 6th, however, have Prob 1 AND 2, and can be picked out by summing only the non-zero values (Prob != 0).
So you want to merge only the 3rd and 6th, NOT the 1st and 5th, and hence the output should have 8 records.
1st approach (gives the wrong result): if you merge every shared coordinate pair with a simple sum of Prob != 0, you get 7 records:
dx %>%
  group_by(long, lat) %>%
  summarise(
    N = n(),
    Probs = sum(Prob[Prob != 0])
  )
This yields 7 records, because the 1st and 5th rows (a, a) are merged as well:
long lat N Probs
<chr> <chr> <int> <dbl>
1 a a 2 1
2 a g 1 0
3 b b1 1 0
4 c c 1 1
5 d1 d1 2 3
6 e e 1 0
7 f f 1 1
If you don't want that, then grouping the whole table may not be the right tool. (It is generally the first thing that comes to mind for dplyr users like me.)
2nd idea (it works):
First get the groups whose Prob sums to 3, then remove the matching rows from the original data and append the merged rows.
p3 <- dx %>%
  group_by(long, lat) %>%
  summarise(Prob = sum(Prob[Prob != 0])) %>%
  filter(Prob == 3)
yields
long lat Prob
<chr> <chr> <dbl>
1 d1 d1 3
Now your desired output:
dx %>%
  arrange(long) %>%
  # long and lat are tested separately; that is safe here because p3 has one row
  filter(!(long %in% p3$long & lat %in% p3$lat)) %>%
  bind_rows(p3 %>% select(long, lat, Prob))
which gives:
long lat Prob
1: a a 1
2: a a 0
3: a g 0
4: b b1 0
5: c c 1
6: e e 0
7: f f 1
8: d1 d1 3
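One caveat with the 2nd idea: sum(Prob[Prob != 0]) == 3 would also match, say, three rows with Prob 1, and the separate %in% tests on long and lat can over-match once p3 has more than one row. A possible variant, offered only as a sketch, checks that a group holds exactly one 1 and one 2 and removes the merged rows with anti_join() so that long and lat are matched together. It assumes a plain data.frame instead of a data.table to keep the example dependency-free, and the name result is mine.

```r
library(dplyr)

# Same 9 records as above, as a plain data frame (assumption: no data.table)
dx <- data.frame(
  long = c("a", "b", "d1", "c", "a", "d1", "e", "f", "a"),
  lat  = c("a", "b1", "d1", "c", "a", "d1", "e", "f", "g"),
  Prob = c(1, 0, 2, 1, 0, 1, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Groups made of exactly two rows, one with Prob 1 and one with Prob 2
p3 <- dx %>%
  group_by(long, lat) %>%
  filter(n() == 2, all(c(1, 2) %in% Prob)) %>%
  summarise(Prob = 3, .groups = "drop")

# Drop the merged pairs (matching long and lat together), then append them
result <- dx %>%
  anti_join(p3, by = c("long", "lat")) %>%
  bind_rows(p3) %>%
  arrange(long, lat)
```

On the sample data this keeps both (a, a) rows untouched and replaces the two (d1, d1) rows with a single Prob 3 row. Whether groups with extra zero rows should also be merged depends on what you intend, so adjust the n() == 2 condition accordingly.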
Please try it and let me know if it works.