So I'm trying to find every node that has at least one neighbor in common with another node. This is the query I'm using to do this:

MATCH (source:Article)--(neighbor)--(target:Article)
WHERE NOT (source.unique_url) = (target.unique_url)
WITH DISTINCT [source.unique_url, target.unique_url] AS combo, 
     source, target, neighbor 
RETURN combo, 
       source.unique_url AS source_unique_url, 
       source.title AS source_title, 
       source.url AS source_url, 
       target.unique_url AS target_unique_url, target._id AS target_id,
       target.title AS target_title,  
       count(neighbor) AS common_neighbors
ORDER BY common_neighbors DESCENDING

But sadly the [source.unique_url, target.unique_url] pair always comes back duplicated. For one node that has a common neighbor with another one, I always get results like this:

[url1, url2]
[url2, url1]
[url1, url2]
[url2, url1]

I checked and the data is not duplicated in the DB, so the query itself seems to be producing the duplicates. Does anyone know what might be causing this? Thanks a lot!

Charlotte Skardon
Papotitu
  • I think you need to give an example of input data on which to verify your query. – stdob-- Oct 11 '18 at 20:58
  • @stdob-- How can I do this easily? With an export of the DB, or a plain-text example? – Papotitu Oct 11 '18 at 21:18
  • Since your MATCH pattern is symmetric, you're going to get at least 2 rows per matching pair, just with the places switched for `source` and `target`. While an inequality on the ids of the nodes should fix that (`WHERE id(source) < id(target)` replacing your second line), your return isn't symmetric. Is there any reason why you're returning `target_id` but not the id of the source? – InverseFalcon Oct 12 '18 at 07:28
  • @InverseFalcon not really, I could return whatever I want, the id is more for debugging purposes. I tried replacing line two as you said, and now I have: `[url1, url2] [url1, url2]` which is better, but still not perfect – Papotitu Oct 12 '18 at 16:30
  • You may want to double check that you don't have duplicate nodes with the same `unique_url` property in your db. Try creating a unique constraint on `:Article(unique_url)` and see if it completes without error. If not, the property isn't unique and you probably have some cleanup to do. – InverseFalcon Oct 13 '18 at 07:57
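
The duplicate check suggested in the last comment could look roughly like the sketch below. It assumes only the `:Article` label and `unique_url` property from the question; the aliases `a` and `copies` are made up for illustration. The constraint statement uses the syntax current when this thread was written (Neo4j 3.x); newer releases use a `CREATE CONSTRAINT ... FOR (a:Article) REQUIRE a.unique_url IS UNIQUE` form instead, so adjust to your version.

```cypher
// List every unique_url value that is carried by more than one :Article node.
// An empty result suggests the property really is unique.
MATCH (a:Article)
WITH a.unique_url AS unique_url, count(*) AS copies
WHERE copies > 1
RETURN unique_url, copies
ORDER BY copies DESC;

// Neo4j 3.x-era syntax (assumption: adapt for newer versions).
// This fails if duplicate unique_url values already exist.
CREATE CONSTRAINT ON (a:Article) ASSERT a.unique_url IS UNIQUE;
```

If the first query returns rows, those are the nodes to clean up before the constraint can be created.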

1 Answer

Try changing the start of your query like this:

  1. Add direction to the relationships
  2. Add id(source) > id(target)

MATCH (source:Article)-->(neighbor)<--(target:Article)
WHERE id(source) > id(target)
WITH ...
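
Put together with the RETURN clause from the question, the two changes above might look like the following. This is only a sketch under some assumptions: that relationships run from each `:Article` to the shared neighbor (flip both arrows if your data goes the other way), and that aggregating directly in `RETURN` is acceptable in place of the intermediate `WITH DISTINCT`. `count(DISTINCT neighbor)` is an addition here, so that a neighbor reached through several parallel relationships is counted only once.

```cypher
// Both articles point at the shared neighbor (assumed direction).
MATCH (source:Article)-->(neighbor)<--(target:Article)
// Keep only one ordering of each pair, so (a, b) and (b, a) collapse to one row.
WHERE id(source) > id(target)
// The non-aggregated columns act as the grouping key: one row per pair.
RETURN source.unique_url AS source_unique_url,
       source.title      AS source_title,
       source.url        AS source_url,
       target.unique_url AS target_unique_url,
       target.title      AS target_title,
       count(DISTINCT neighbor) AS common_neighbors
ORDER BY common_neighbors DESC
```

If identical pairs still appear after this, the rows differ in one of the grouped columns, which would point to several nodes sharing the same `unique_url`.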
Tomaž Bratanič