1

I have a pretty large network that contains privileged information. Because of this i can't share the exact graph but in its simplest form it can be abstracted as follows.

(Person{name:str, age:int})-[:KNOWS]-(Person{name:str, age:int})

I would like to know the closest person that is over an specific age.

I have the following query which works, but i dont think its running very efficient, because the queries take a long time even if there is a close connection that matches the requirements.

MATCH path=((p:Person{name:'Bob'})-[:KNOWS*BFS..]-(p2:Person))
WHERE p2.age > 80
RETURN p2 ,size(path) AS distance
ORDER BY distance
LIMIT 1

If my understanding is correct this will:

  1. Build a list with all the people over 80 that know (someone who knows, etc etc) Bob. Which is huge in a densly connected network and therefor inefficient.
  2. Find one of the shortest paths between Bob and this person.
  3. Repeat for all connected persons over 80.
  4. Find the shortest one of these paths.

Is there a way to stop the BFS search as soon as a person with age > 80 is found and not build the rest of all the paths?

I would also like to be able to expand this search to more than one Person over the age of 80.

The steps would be as follows:

  1. Match Bob
  2. Search all the people that Bob knows
  3. If one of these people is over 80 -> return this person
  4. If not -> search all the people that know someone that knows Bob.
  5. Repeat 3-5 recursivly

Thanks in advance!

Kelvin Lawrence
  • 14,674
  • 2
  • 16
  • 38
Jasper
  • 65
  • 8

1 Answers1

0

from what I understood, you want to find the closest person over the age of 80. Basically, the query you provided is correct, and what this query will do is it will first search for a person named "Bob" and it will start expanding the search from "Bob" to other persons. Concerning steps, I would like to clarify the following:

  1. It will find "Bob"
  2. Start finding paths in the form of expansion from "Bob"
  3. To order by those paths, it needs to collect all the paths it found
  4. And it will return only one path

So order by is a kind of aggregator. I am not sure what do you mean by "build a list of ..." so I don't know what you are thinking of that, but it from an efficiency point of view, it is only collecting all the paths because of order by. What you can do, is explore the All Shortest algorithm we implemented in Memgraph to find the shortest paths between Bob and other Persons. It is basically doing all shortest path search between Bob and other persons, you can check docs here: https://memgraph.com/docs/memgraph/reference-guide/built-in-graph-algorithms#all-shortest-paths. This will return all shortest paths.

In order to find the shortest distance, you can try to set the upper limit of how many hops BFS will do, or you can let Memgraph explore all paths.

For the second question, I am not sure I get your step 4. Because first you are searching from Bob to somebody else, and then from anybody else to Bob.

I would suggest you jump on our Discord and ask that question, it will be easier to help because Discord is more informal: https://discord.gg/memgraph. This way we can keep discussing and understand what you really need.

  • I dont know how to clarify it any better but i will try anyway: I dont need a list of all the connections between Bob and people over 80. I only want the closest x amount of people. In my dataset there are hundreds of people indirectly connected to Bob over 80. I feel like this query is wasting a lot of proccessing time trying to find all the connections, while i want the search to stop after the closest x amount of people over 80 have been found. Thanks for the discord link btw, im joining as we speak – Jasper Jun 02 '23 at 11:52