Given a directed possibly cyclic networkx graph. I would like to search for paths which are described as a regular expression. The regexes will not describe cycles. An example would be:
nodeA, [^nodeB]* [nodeC|nodeD]
In english this should correspond to: I am looking for a path starting with nodeA,followed by zero, one or many nodes which are not nodeB, and finally the path should end with either nodeC or nodeD.
Example paths I would like to find would be:
- nodeA->nodeX->nodeC
- nodeA->nodeX->nodeY->nodeD
- nodeA->nodeC
- ...
I found this related post on SO. Yet, it is about creating a regex from all possible paths between two nodes. In my case, I provide a regex and would like to get the corresponding paths.
My incomplete (and probably insufficient) approach is this:
- Split the regex-described path up into parts starting with a single fixed node and ending with a single fixed node. Single fixed node means
some_node
(i.e. no special regex characters such as [],^, *, + etc.). - Search for this path using regular networkx functionality
- If a path is found, I go through it node by node and look at the "current" regex part (starting at the beginning of the regex).
- If the current regex part is a
[^some_node]
then I check if the current node is equal to that. I break if it is equal and advance the node and the regex part if not. - If the current regex part is a some_node , i.e. the node
some_node
must be there then I check if the current node is equal to that. If it is, I advance the node and the regex. If not, I only advance the node.
Before writing more code for the other cases, e.g. [some_node]*
or [some_node]+
I wanted to check if there exists some code or algorithm for this task already.
Is there any read-to-use module/lib/sample code or the like for this task ?
If not, do you know of an existing algorithm "regex to path search in graph"?
If not, could you outline an algorithm ?