
If my nodes look like:

{id: 1, name: "John", last_name: "Doe", age: 40, city: "New York", credit_score: 5.5}
{id: 2, name: "Linda", last_name: "Lumbo", age: 32, city: "Washington", credit_score: 5.5}
{id: 3, name: "Greg", last_name: "Tanta", age: 28, city: "New York", credit_score: 5.5}
{id: 4, name: "Donald", last_name: "Greenboim", age: 64, city: "Tel Aviv", credit_score: 5.5}
{id: 5, name: "Leo", last_name: "Greenhouse", age: 98, city: "Paris", credit_score: 5.5}
{id: 6, name: "John", last_name: "Opelbaum", age: 80, city: "Moscow", credit_score: 1}
{id: 7, name: "John", last_name: "Vein", age: 21, city: "Los Angeles", credit_score: 0.35}
{id: 8, name: "Dino", last_name: "Lodz", age: 34, city: "New York", credit_score: 1.5}
{id: 9, name: "Kurt", last_name: "Kreston", age: 89, city: "New York", credit_score: 5.3}
{id: 10, name: "Alex", last_name: "Mulo", age: 22, city: "Moscow", credit_score: 2.5}
{id: 11, name: "John", last_name: "Tolo", age: 32, city: "Liverpool", credit_score: 0.5}
{id: 12, name: "Trent", last_name: "Benson", age: 57, city: "London", credit_score: 5.114}
{id: 13, name: "Tom", last_name: "Richardson", age: 23, city: "New York", credit_score: 0.986}
....

Assume all nodes are interconnected, and I want to run the GraphSAGE algorithm on their attributes. For some reason I can't get the embeddings when my attributes are strings. How can I apply GraphSAGE to nodes with string attributes, or with mixed types (float, int, string)? I get this error:

Failed to invoke procedure gds.graph.create: Caused by: java.lang.UnsupportedOperationException: Loading of values of type String is currently not supported
SteveS

1 Answer


If you want to run GraphSAGE on string attributes, you need to apply one-hot encoding or some other technique to transform them into a number or a list of numbers. The property types cannot be mixed: they have to be consistent across all the properties you use. AFAIK, this holds for any library that includes GraphSAGE, not just Neo4j GDS.

You can probably skip the id property, as it doesn't carry any additional information. For city, name, and last name, you can use either one-hot encoding or word embeddings to include those properties in GraphSAGE; the choice is yours.
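As a rough illustration of the one-hot approach, here is a minimal Python sketch that runs outside Neo4j. It uses three of the nodes from the question, drops `id`, and turns `age`, `credit_score`, and `city` into one list of floats per node. The helper names `one_hot` and `to_features` are made up for this example and are not part of any library.

```python
# Sample nodes taken from the question (a subset, for brevity).
nodes = [
    {"id": 1, "name": "John", "last_name": "Doe", "age": 40, "city": "New York", "credit_score": 5.5},
    {"id": 2, "name": "Linda", "last_name": "Lumbo", "age": 32, "city": "Washington", "credit_score": 5.5},
    {"id": 6, "name": "John", "last_name": "Opelbaum", "age": 80, "city": "Moscow", "credit_score": 1},
]


def one_hot(value, vocabulary):
    """Return a list with 1.0 at the position of `value` and 0.0 elsewhere."""
    return [1.0 if value == v else 0.0 for v in vocabulary]


# A fixed, sorted vocabulary so every node gets a vector of the same length.
cities = sorted({n["city"] for n in nodes})


def to_features(node):
    """Hypothetical helper: build one all-float feature list for a node."""
    # Numeric properties are cast to float so the list has a single type.
    numeric = [float(node["age"]), float(node["credit_score"])]
    return numeric + one_hot(node["city"], cities)


# Map node id -> feature vector; the id itself is not used as a feature.
features = {n["id"]: to_features(n) for n in nodes}

for node_id, vector in features.items():
    print(node_id, vector)
```

Each resulting list could then be written back as a single numeric list property on the node (the property name is up to you) instead of the original string properties. Check the documentation of your GDS version for the property types it can load.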

Tomaž Bratanič
  • The issue is that my nodes will have different numbers and types of attributes. I am trying to build a general solution where all string attributes get converted. How can I do this in Neo4j Cypher? Is there a procedure that can be applied to all string attributes? – SteveS Nov 29 '21 at 10:37
  • For now, you need to preprocess word embeddings outside of Neo4j – Tomaž Bratanič Nov 30 '21 at 16:24
  • Can you give me an example? @tomaz-bratanic – SteveS Dec 01 '21 at 14:29
  • https://www.shanelynn.ie/word-embeddings-in-python-with-spacy-and-gensim/ – Tomaž Bratanič Dec 01 '21 at 20:06
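For the general case raised in the comments (nodes with differing sets of attributes), the preprocessing outside Neo4j might look roughly like the sketch below. Note that it swaps in plain one-hot encoding rather than the word embeddings from the linked article, since embeddings need a pretrained model. The two sample nodes are trimmed, hypothetical variants of the question's data, the function names are invented for illustration, and filling missing numeric values with 0.0 is an assumption you may want to change.

```python
def build_vocabularies(nodes, skip=("id",)):
    """Collect the distinct values of every string-typed property."""
    vocab = {}
    for node in nodes:
        for key, value in node.items():
            if key not in skip and isinstance(value, str):
                vocab.setdefault(key, set()).add(value)
    # Sort keys and values so the vector layout is deterministic.
    return {key: sorted(values) for key, values in sorted(vocab.items())}


def numeric_keys(nodes, skip=("id",)):
    """Collect the names of every int or float property (bools excluded)."""
    keys = set()
    for node in nodes:
        for key, value in node.items():
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if key not in skip and is_number:
                keys.add(key)
    return sorted(keys)


def encode(node, vocab, num_keys, default=0.0):
    """Build one float vector: numeric values first, then one-hot blocks."""
    # Assumption: a missing numeric property is replaced by `default`.
    vector = [float(node.get(k, default)) for k in num_keys]
    # A missing string property yields an all-zero block.
    for key, values in vocab.items():
        vector.extend(1.0 if node.get(key) == v else 0.0 for v in values)
    return vector


# Hypothetical nodes with different attribute sets.
nodes = [
    {"id": 1, "name": "John", "age": 40, "city": "New York"},
    {"id": 2, "name": "Linda", "credit_score": 5.5},
]

vocab = build_vocabularies(nodes)
num_keys = numeric_keys(nodes)
vectors = [encode(n, vocab, num_keys) for n in nodes]
```

Every node ends up with a vector of the same length regardless of which attributes it has. One-hot encoding high-cardinality properties such as names can produce very wide vectors, which is one reason to consider word embeddings for those instead.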