How do I add existing Node properties of a node in the graph projection to the ML pipeline?

As far as I know, the gds.beta.pipeline.linkPrediction.addNodeProperty procedure runs other procedures (for example, node embedding algorithms) and adds their results to the pipeline as new node properties. But how do I add properties that already exist? I get the error below even though the properties are projected into the in-memory graph.

Failed to invoke procedure gds.beta.pipeline.linkPrediction.train: Caused by: java.lang.IllegalArgumentException: Node properties [property1, property2, property3] defined in the feature steps do not exist in the graph or part of the pipeline

Meghana S

1 Answer

You can add an existing node property to the link prediction pipeline by including it in your graph projection:

CALL gds.graph.project('test', 'Node', 'Relationship', {nodeProperties: ['property1']})
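
To confirm that the property really ended up in the projection before wiring it into the pipeline, one option is to inspect the schema of the projected graph. This is a minimal sketch that assumes the graph name 'test' from the projection above; the exact shape of the returned schema map may differ between GDS versions.

```cypher
// List the projected graph and show its schema.
// The node properties that were projected should appear under each node label.
CALL gds.graph.list('test')
YIELD graphName, schema
RETURN graphName, schema;
```

If 'property1' is missing from the schema for a label, a feature step that references it is likely to fail with the "do not exist in the graph" error quoted in the question.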

Then you can use it in the link prediction pipeline by defining a link feature:

CALL gds.beta.pipeline.linkPrediction.addFeature('pipe', 'hadamard', {
  nodeProperties: ['property1']
}) YIELD featureSteps

You can have multiple link features. The addNodeProperty procedure is only needed when you want to run a graph algorithm to compute a new property, as in this example with FastRP embeddings:

CALL gds.beta.pipeline.linkPrediction.addNodeProperty('pipe', 'fastRP', {
  mutateProperty: 'embedding',
  embeddingDimension: 256,
  randomSeed: 42
})

If the node property is already present in the projected graph, you can skip the add node property step.
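
Putting the steps together, a pipeline that uses only existing properties could look roughly like the sketch below. It is an illustration under assumptions, not a tested recipe: the pipeline name 'pipe', the graph name 'test', and the model name 'lp-model' are placeholders, and property1 to property3 (the names from the error message) are assumed to be listed in nodeProperties of the projection. Link prediction training also expects the relationships to be projected as undirected, and newer GDS versions require additional training parameters such as targetRelationshipType, so check the documentation for your version.

```cypher
// 1. Create an empty link prediction pipeline (placeholder name 'pipe').
CALL gds.beta.pipeline.linkPrediction.create('pipe');

// 2. Define link features directly on properties that already exist
//    in the projection. No addNodeProperty step is involved.
CALL gds.beta.pipeline.linkPrediction.addFeature('pipe', 'hadamard', {
  nodeProperties: ['property1', 'property2']
}) YIELD featureSteps;

// A second link feature with a different combiner (assumed property name).
CALL gds.beta.pipeline.linkPrediction.addFeature('pipe', 'l2', {
  nodeProperties: ['property3']
}) YIELD featureSteps;

// 3. Add at least one model candidate with default settings.
CALL gds.beta.pipeline.linkPrediction.addLogisticRegression('pipe');

// 4. Train on the projected graph 'test' (placeholder model name).
CALL gds.beta.pipeline.linkPrediction.train('test', {
  pipeline: 'pipe',
  modelName: 'lp-model',
  randomSeed: 42
}) YIELD modelInfo
RETURN modelInfo;
```

Every property named in a feature step has to be present on the nodes of the projected graph, or be produced by an earlier addNodeProperty step, which is what the error message in the question is checking.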

Tomaž Bratanič
  • Is the addFeature step necessary for defining a link feature? The thing is, 'prop1' and 'prop2' are not shared properties. They are present only on specific nodes, say :Person nodes. Hence, if I want to define a link feature between a :Person node and some other node, I can't do this: ```CALL gds.beta.pipeline.linkPrediction.addFeature('pipe', 'hadamard', { nodeProperties: ['property1']})``` I cannot make 'property1' shared by giving default values to the nodes that don't have the property; that would not be meaningful. So overall, is the addFeature step required? If yes, why is it important? – Meghana S Aug 08 '22 at 07:00
  • Yes, the addFeature step is important, because only link features are considered when training the model, not node features. – Tomaž Bratanič Aug 08 '22 at 10:07
  • Then how do I deal with the above scenario? My properties are not shared. – Meghana S Aug 08 '22 at 11:31
  • The link prediction pipeline is not designed for heterogeneous graphs... perhaps send your example to the devs directly at: https://github.com/neo4j/graph-data-science – Tomaž Bratanič Aug 08 '22 at 12:43