55

Is there a rule of thumb for when to use a label vs. a node property vs. a relationship + node?

For example, say I have a store and I want to put my products in Neo4j. Each product is identified by its SKU, and I also want to categorize the products: clothes, food, electronics, and so on. The graph will support free-text search: the user can search for anything, and I'd return everything related to the search string.

Would it be better to:

  1. Have a node with sku 001 and give it the label Food?
  2. Have a node with sku 001 and a property on that node, category:"Food"?
  3. Have a node with sku 001, create a separate node for Food, and connect the two with a "category" relationship?
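The three options above could be written in Cypher roughly as follows. This is only a sketch: the `Product` and `Category` labels and the `IN_CATEGORY` relationship type are names assumed for illustration, and the three statements are alternatives, not meant to be run together.

```cypher
// Option 1: the category is a label on the product node.
CREATE (:Product:Food {sku: '001'});

// Option 2: the category is a property on the product node.
CREATE (:Product {sku: '001', category: 'Food'});

// Option 3: the category is its own node, linked by a relationship.
// MERGE reuses the Food category node if it already exists.
MERGE (c:Category {name: 'Food'})
CREATE (:Product {sku: '001'})-[:IN_CATEGORY]->(c);
```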

I have read that if you'll be looking things up by a property, it's better to model it as a relationship + node, since traversing is much faster than looking up node properties.

TIA

Guy Coder
lorraine batol

2 Answers

72

Whether you should use a property, a label or a node for the category depends on how you will be querying the data.

(I'll assume here that you have a fairly small, fairly fixed set of categories.)

Use a property if you won't be querying by category, but just need to return the category of a node that has been found by other means. (For example: what is the category of the item with sku 001?)
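As a sketch of the property case, assuming a hypothetical `Product` label with `sku` and `category` properties, the node is found by its SKU and the category is simply read off it:

```cypher
// Find the product by SKU, then return its category property.
MATCH (p:Product {sku: '001'})
RETURN p.category AS category;
```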

Use a label if you need to query by category. (For example: what are all the foods costing less than $10?)
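As a sketch of the label case, assuming each food item carries both a hypothetical `Product` label and a `Food` label, plus an assumed numeric `price` property:

```cypher
// The label narrows the match to food items before the price filter runs.
MATCH (p:Product:Food)
WHERE p.price < 10
RETURN p.sku AS sku, p.price AS price;
```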

Use a node if you need to traverse the category without knowing what it is. (For example: what are the ten most popular items in the same category as one that the user has chosen?)
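As a sketch of the node case, the query below hops from the chosen product to its category node and back out to the other products in it, without ever naming the category. The `Category` label, the `IN_CATEGORY` relationship type, the `$sku` parameter and the `popularity` property are all assumptions made for illustration.

```cypher
// Traverse product -> category -> sibling products; the category is never named.
MATCH (chosen:Product {sku: $sku})-[:IN_CATEGORY]->(:Category)<-[:IN_CATEGORY]-(other:Product)
WHERE other <> chosen  // defensive: keep the chosen product out of the results
RETURN other.sku AS sku, other.popularity AS popularity
ORDER BY popularity DESC
LIMIT 10;
```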

Ben Butler-Cole
  • Is it safe to assume that the third option is the most *robust*, in the sense that you could easily accomplish all 3 goals with such a structure with the lowest hit on performance compared with the other 2 methods? I'm imagining a structure for this question similar to: **1** `(:user)-[:LIKES]->(:node:food {sku: 1, cost: 10})`, **2** `(:user)-[:LIKES]->(:node {sku: 1, category: "food", cost: 10})`, and **3** `(:user)-[:LIKES]->(:node {sku: 1, cost: 10})-[:HAS_CATEGORY]->(:Category {name: "food"})`, where option 3 allows for the easiest querying of all 3 of your examples. – ctwheels Mar 11 '22 at 19:48
16

This blog post may also be helpful because of the benchmark it contains.

I modelled the ‘relationship’ in 4 different ways…

  • Using a specific relationship type (node)-[:HAS_ADDRESS]->(address)
  • Using a generic relationship type and then filtering by end node label (node)-[:HAS]->(address:Address)
  • Using a generic relationship type and then filtering by relationship property (node)-[:HAS {type:"address"}]->(address)
  • Using a generic relationship type and then filtering by end node property (node)-[:HAS]->(address {type: "address"})

<...>

So in summary…specific relationships #ftw!
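For reference, the four variants listed in the quoted post could be queried roughly as follows. This is only a sketch: the start node lookup by an `id` property and the `$id` parameter are assumptions, not taken from the blog post.

```cypher
// 1. Specific relationship type.
MATCH (n {id: $id})-[:HAS_ADDRESS]->(a) RETURN a;

// 2. Generic relationship type, filtered by end node label.
MATCH (n {id: $id})-[:HAS]->(a:Address) RETURN a;

// 3. Generic relationship type, filtered by relationship property.
MATCH (n {id: $id})-[:HAS {type: 'address'}]->(a) RETURN a;

// 4. Generic relationship type, filtered by end node property.
MATCH (n {id: $id})-[:HAS]->(a {type: 'address'}) RETURN a;
```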

Poliakoff
  • The benchmarks give a very good pointer - but they show that using specific relationships is faster than generic ones. The decision to use a property vs. a label vs. a node is still contingent on the queries themselves. – ahmedhosny Jul 16 '20 at 02:35