I can't find much documentation on how to define a search index function so that I can run a full-text search over the information I need.

I've used the Alchemy API to add an "entities" JSON array to my documents. For instance, one document contains the following:

"_id": "redacted",
"_rev": "redacted",
"session": "20152016",
"entities": [


    {
      "relevance": "0.797773",
      "count": "3",
      "type": "Organization",
      "text": "California Constitution"
    },
    {
      "relevance": "0.690092",
      "count": "1",
      "type": "Organization",
      "text": "Governors Highway Safety Association"
    }
]

I haven't been able to find any code snippets showing how to write a search index function that looks inside nested JSON.

My stab at indexing the whole object appears to be incorrect. This is the full design document:

{
  "_id": "_design/entities",
  "_rev": "redacted",
  "views": {},
  "language": "javascript",
  "indexes": {
    "entities": {
      "analyzer": "standard",
      "index": "function (doc) {\n  if (doc.entities.relevance > 0.5){\n      index(\"default\", doc.entities.text, {\"store\":\"yes\"});\n  }\n\n}"
    }
  }
}

Here is the search index function, formatted more readably:

function (doc) {
  if (doc.entities.relevance > 0.5){
      index("default", doc.entities.text, {"store":"yes"});
  }

}

Adding the for loop as suggested below makes a lot of sense. However, I am still not getting any results. My query is "https://user.cloudant.com/calbills/_design/entities/_search/entities?q=Governors"

Server response is: {"total_rows":0,"bookmark":"g2o","rows":[]}

Jen Scott

3 Answers

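A "for..in" loop that treats the loop variable as the array element doesn't work, because for..in over an array yields the index keys, not the elements. However, I do get results using a standard indexed for loop.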

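function (doc) {
  // Skip documents that have no entities array
  if (doc.entities) {
    var arrayLength = doc.entities.length;
    for (var i = 0; i < arrayLength; i++) {
      // relevance is stored as a string, so convert it before comparing
      if (parseFloat(doc.entities[i].relevance) > 0.5) {
        index("default", doc.entities[i].text);
      }
    }
  }
}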
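
One way to check an index function like this before uploading it is to run it locally against a sample document. The sketch below is only an illustration in plain JavaScript: the `index` function here is a hypothetical stand-in that records its calls (it is not Cloudant's implementation), `indexFn` is just a name given to the function so it can be called, and the third entity is made-up data added to exercise the relevance filter.

```javascript
// Sample document shaped like the one in the question.
// The third entity is hypothetical, added to exercise the relevance filter.
var sampleDoc = {
  session: "20152016",
  entities: [
    { relevance: "0.797773", count: "3", type: "Organization", text: "California Constitution" },
    { relevance: "0.690092", count: "1", type: "Organization", text: "Governors Highway Safety Association" },
    { relevance: "0.25", count: "1", type: "Organization", text: "Hypothetical Low Relevance Entity" }
  ]
};

// Hypothetical stand-in for the index() call available inside a search
// index function; it only records what would have been indexed.
var indexed = [];
function index(field, value, options) {
  indexed.push({ field: field, value: value });
}

// The index function from this answer, named so it can be called locally.
function indexFn(doc) {
  if (doc.entities) {
    for (var i = 0; i < doc.entities.length; i++) {
      if (parseFloat(doc.entities[i].relevance) > 0.5) {
        index("default", doc.entities[i].text);
      }
    }
  }
}

indexFn(sampleDoc);
indexFn({ session: "20152016" }); // no entities array: nothing is indexed

console.log(indexed.map(function (entry) { return entry.value; }));
```

Only the two entities with relevance above 0.5 are recorded, and the document without an `entities` array is skipped without an error.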

Cheers!

Jen Scott

You need to loop over the elements in the doc.entities array.

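function (doc) {
  // for...in over an array yields index keys, so look each element up by key
  for (var i in doc.entities) {
    var entity = doc.entities[i];
    // relevance is stored as a string, so convert it before comparing
    if (parseFloat(entity.relevance) > 0.5) {
      index("default", entity.text, {"store": "yes"});
    }
  }
}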
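
One thing to watch with `for...in` on an array: the loop variable receives the index keys ("0", "1", ...) as strings rather than the elements, so each element has to be looked up by key. A minimal plain JavaScript sketch of that behavior, using a trimmed copy of the entity data from the question (the variable names are only for illustration):

```javascript
// Trimmed copy of the entities array from the question.
var entities = [
  { relevance: "0.797773", text: "California Constitution" },
  { relevance: "0.690092", text: "Governors Highway Safety Association" }
];

var keys = [];
var texts = [];
for (var key in entities) {
  keys.push(key);                  // key is an index string, not an element
  texts.push(entities[key].text);  // look the element up by its key
}

// Reading a property off the key itself gives undefined, and
// parseFloat(undefined) is NaN, so the relevance check can never pass.
var relevanceOnKey = keys[0].relevance;
var passesFilter = parseFloat(relevanceOnKey) > 0.5;

console.log(keys, texts, relevanceOnKey, passesFilter);
```

This is why a condition written against the loop variable directly silently indexes nothing, while one written against `entities[key]` behaves as expected.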
gadamcox
  • Ok this makes sense. Why does doc.entities.relevance exist though? – Jen Scott Dec 04 '15 at 20:40
  • I think I had a typo. Look again. – gadamcox Dec 04 '15 at 20:46
  • Ah... the value for relevance is a string. You need to convert it to a float. – gadamcox Dec 04 '15 at 20:47
  • was just about to ask if that mattered. Even after changing that, I still get no results. In fact, I get no results even without the relevancy condition. Should the above be sufficient as a standalone indexer? Do I need to index the ids? Do I need to ensure that every document has doc.entities first? Thanks for the ongoing help – Jen Scott Dec 04 '15 at 20:51
  • Also, check out @brobes answer for [indexing and searching arrays with Cloudant Query](http://stackoverflow.com/questions/33262573/cloudant-selector-query/33835521#33835521) – bradnoble Dec 04 '15 at 22:02

This is what I tried:

function (doc) {
  if (doc.entities) {
    for (var p in doc.entities) {
      if (doc.entities[p].relevance > 0.5) {
        index("entitiestext", doc.entities[p].text, {"store": "yes"});
      }
    }
  }
}

Query string used: q=entitiestext:California Constitution&include_docs=true

Result:

{
"total_rows": 1,
"bookmark": "xxxx",
"rows": [
    {
        "id": "redacted",
        "order": [
            0.03693288564682007,
            1
        ],
        "fields": {
            "entitiestext": [
                "Governors Highway Safety Association",
                "California Constitution"
            ]
        },
        "doc": {
            "_id": "redacted",
            "_rev": "4-7f6e6db246abcf2f884dc0b91451272a",
            "session": "20152016",
            "entities": [
                {
                    "relevance": "0.797773",
                    "count": "3",
                    "type": "Organization",
                    "text": "California Constitution"
                },
                {
                    "relevance": "0.690092",
                    "count": "1",
                    "type": "Organization",
                    "text": "Governors Highway Safety Association"
                }
            ]
        }
    }
]

}

Query String used: q=entitiestext:California Constitution

Result:

 {
"total_rows": 1,
"bookmark": "xxxx",
"rows": [
    {
        "id": "redacted",
        "order": [
            0.03693288564682007,
            1
        ],
        "fields": {
            "entitiestext": [
                "Governors Highway Safety Association",
                "California Constitution"
            ]
        }
    }
]

}
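
A note on the query string itself: in Lucene query syntax, which Cloudant Search follows, a field prefix applies only to the term right after it, so in `entitiestext:California Constitution` the word "Constitution" is searched in the default field. To match the whole phrase in `entitiestext`, the phrase can be quoted and the query URL-encoded. A minimal sketch, where `buildSearchUrl` is a hypothetical helper and the base URL is the one from the question (an assumption, since this index may live in a different design document):

```javascript
// Hypothetical helper: builds a _search URL for a quoted phrase query.
function buildSearchUrl(baseUrl, field, phrase, includeDocs) {
  // Quote the phrase so every word is matched against the named field,
  // escaping any double quotes inside the phrase itself.
  var q = field + ':"' + phrase.replace(/"/g, '\\"') + '"';
  var url = baseUrl + "?q=" + encodeURIComponent(q);
  if (includeDocs) {
    url += "&include_docs=true";
  }
  return url;
}

// Base URL taken from the question; adjust to the actual design document.
var searchUrl = buildSearchUrl(
  "https://user.cloudant.com/calbills/_design/entities/_search/entities",
  "entitiestext",
  "California Constitution",
  true
);

console.log(searchUrl);
```

Encoding the query also avoids problems with the space, colon, and quotes when the URL is pasted into a client that does not encode it automatically.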

Sora