I am trying to get an estimate of the documents whose "Key" property value starts with a string that contains the special character "/".

let query = 
    cts.andQuery([
        cts.jsonPropertyWordQuery("Key", "IBD/info/*", ["lang=en"],1), 
        cts.collectionQuery("documentCollection")
    ], [])

cts.estimate(query)

However, the word query internally tokenizes "IBD/info/" as (cts:word("IBD"), cts:punctuation("/"), cts:word("info"), ...).
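To make that split concrete, here is a minimal plain-JavaScript sketch of a word/punctuation tokenizer. It is only a simplified stand-in, not MarkLogic's tokenizer; the `tokenize` helper and its rules (letters and digits form words, every other non-space character is its own punctuation token) are assumptions made for illustration.

```javascript
// Simplified stand-in for a word tokenizer; NOT MarkLogic's implementation.
// Assumption: letters and digits form word tokens, whitespace forms space
// tokens, and every other character becomes its own punctuation token.
function tokenize(text) {
  const tokens = [];
  for (const m of text.matchAll(/[\p{L}\p{N}]+|\s+|[^\p{L}\p{N}\s]/gu)) {
    const value = m[0];
    let type = "punctuation";
    if (/^[\p{L}\p{N}]/u.test(value)) type = "word";
    else if (/^\s/u.test(value)) type = "space";
    tokens.push({ type, value });
  }
  return tokens;
}

const tokens = tokenize("IBD/info/");
console.log(tokens.map(t => `${t.type}(${JSON.stringify(t.value)})`).join(", "));
// prints: word("IBD"), punctuation("/"), word("info"), punctuation("/")
```

Under this model the "/" never ends up inside a word token, which is why a word query cannot treat "IBD/info/" as a single prefix.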

I created a field with the following configuration:

"field": [
        {
            "field-name": "key",
            "field-path": [
                {
                    "path": "/envelope/instance/Key",
                    "weight": 1
                }
            ],
            "stemmed-searches": "advanced",
            "field-value-searches": true,
            "field-value-positions": true,
            "trailing-wildcard-searches": true,
            "trailing-wildcard-word-positions": true,
            "tokenizer-override": [
                {
                    "character": "/",
                    "tokenizer-class": "word"
                }
            ]
        }
]

and tried the query below, but I am still getting false-positive results:

cts:search(
  fn:doc(),
  cts:and-query((
      cts:field-value-query("key","IBD/info/*"),
      cts:collection-query("documentCollection")
  )),
  "unfiltered"
)

How can I handle this situation?

mpuram
  • With `"trailing-wildcard-searches": true` and `"trailing-wildcard-word-positions": true`, my `cts.fieldValueQuery` is working. Can you give an example of a false-positive document? – Mads Hansen May 21 '21 at 14:59
  • Thank you @MadsHansen. Yes, with "trailing-wildcard-searches": true, cts.fieldValueQuery works perfectly. I will post a detailed answer below. The false-positive result I was getting was a document with **"Key": "IBD/External/GWAS_IBD/info/studyinfo.yaml"**, which was not expected. – mpuram May 22 '21 at 17:11

1 Answer

Create a field with the following configuration:

"field": [
    {
      "field-name": "key",
      "field-path": [
        {
          "path": "/envelope/instance/Key",
          "weight": 1
        }
      ],
      "field-value-searches": true,
      "trailing-wildcard-searches": true,
      "three-character-searches": false,
      "tokenizer-override": [
        {
          "character": "/",
          "tokenizer-class": "word"
        },
        {
          "character": "_",
          "tokenizer-class": "word"
        }
      ]
    }
  ],
  "range-field-index": [
    {
      "scalar-type": "string",
      "field-name": "key",
      "collation": "http://marklogic.com/collation/",
      "range-value-positions": false,
      "invalid-values": "reject"
    }
  ]

After reindexing completes, run the following query:

let query = 
    cts.andQuery([
        cts.fieldValueQuery("key", "IBD/info/*"),
        cts.collectionQuery("documentCollection")
    ], [])
cts.search(query, "unfiltered")

The query then returns only documents whose "Key" value starts with "IBD/info/".
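One plausible reading of why the extra "_" override matters, sketched in plain JavaScript. This is an assumption, not something confirmed above: it supposes that an unfiltered trailing-wildcard lookup can match any single word token in the value. The `wordTokens` and `anyWordStartsWith` helpers are hypothetical and only model the tokenizer overrides; they are not MarkLogic APIs.

```javascript
// Simplified model only: splits a value into word tokens, treating the
// characters in `wordChars` as part of a word (mimicking a tokenizer override).
function wordTokens(value, wordChars = "") {
  const words = [];
  let current = "";
  for (const ch of value) {
    if (/[\p{L}\p{N}]/u.test(ch) || wordChars.includes(ch)) {
      current += ch;
    } else if (current) {
      words.push(current);
      current = "";
    }
  }
  if (current) words.push(current);
  return words;
}

// True when any word token starts with the prefix (a word-level wildcard match).
function anyWordStartsWith(value, prefix, wordChars) {
  return wordTokens(value, wordChars).some(w => w.startsWith(prefix));
}

// The false-positive key reported in the comments on the question.
const falsePositive = "IBD/External/GWAS_IBD/info/studyinfo.yaml";
const slashOnly = anyWordStartsWith(falsePositive, "IBD/info/", "/");
const slashAndUnderscore = anyWordStartsWith(falsePositive, "IBD/info/", "/_");
console.log(slashOnly, slashAndUnderscore); // prints: true false
```

With only "/" overridden, the "_" still splits the key, leaving a word token that begins with "IBD/info/". With both "/" and "_" treated as word characters, no token in that key starts with the prefix.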

mpuram