
Lunr is doing a great job finding most results, but I can't figure out why it won't return multi-word strings contained in JSON arrays.

Here's a sample JSON file to get a sense of how my data is structured:

[{
    "title": "Rolling Loud",
    "date": "May 5–7",
    "location": "Miami, FL, USA",
    "rock-artists": [],
    "hh-artists": ["Kendrick Lamar", "Future"],
    "electronic-artists": [],
    "other-artists": []
}]

When I search for "Miami" or "Future", lunr returns the festival. However, when I search for "Kendrick" or "Kendrick Lamar", lunr doesn't return it.

Relevant code:

// initialize lunr
var idx = lunr(function () {
    this.field('id');
    this.field('title', { boost: 3 });
    this.field('date');
    this.field('location');
    this.field('rockArtists', { boost: 3 });
    this.field('hhArtists', { boost: 3 });
    this.field('electronicArtists', { boost: 3 });
    this.field('otherArtists', { boost: 3 });

    // add festivals to lunr
    for (var key in data) {
        this.add({
           'id': key,
           'title': data[key].title,
           'date': data[key].date,
           'location': data[key].location,
           'rockArtists': data[key]['rock-artists'],
           'hhArtists': data[key]['hh-artists'],
           'electronicArtists': data[key]['electronic-artists'],
           'otherArtists': data[key]['other-artists']
        });
    }
});

Thanks!

Corbin
  • What is `this` within `for..in` loop? – guest271314 Apr 20 '17 at 21:56
  • Should I not be calling `add()` within the function? I was having problems calling `idx.add` from outside the loop so I placed it inside the function, accessing the variable through `this` instead. – Corbin Apr 20 '17 at 22:02
  • What does `console.log(this)` within `for..in` log? – guest271314 Apr 20 '17 at 22:05
  • It returns `Builder {}` with a ton of children including `_fields: ["id", "title", etc]` and `averageDocumentLength: 98.125`. – Corbin Apr 20 '17 at 22:10
  • Have not tried `lunrjs`. Can you reproduce issue at plnkr http://plnkr.co? – guest271314 Apr 20 '17 at 22:12
  • Here you go: http://plnkr.co/edit/HoGiymF45V9CH12UCM9U – Corbin Apr 20 '17 at 22:29
  • The response does not actually contain result other than `.ref` property? – guest271314 Apr 20 '17 at 22:48
  • Yeah, I think so; it looks like you need `.ref` to access the object's variables. But even before that point, you can see that calling `console.log()` on the `results` object returns `[]` if no results are found. Basic documentation can be found [here](https://lunrjs.com/docs/index.html). – Corbin Apr 21 '17 at 00:31
  • After poking around some more, it looks like pretty much the only characters that are not returning results OR errors are ` ` and `-`. Could it have to do with some properties of strings inside of arrays? – Corbin Apr 21 '17 at 00:47

1 Answer


Lunr is indexing the hh-artists field. You should be able to confirm this by looking up one of the values in the index:

idx.invertedIndex['Kendrick Lamar']

When a document field is an array, lunr assumes that the elements of the array are already split into tokens for indexing. So instead of adding "Kendrick" and "Lamar" to the index as separate tokens, it adds "Kendrick Lamar" as a single token.

This causes issues when trying to search, because searching for "Kendrick Lamar" is actually searching for "Kendrick" OR "Lamar", since the search string is split on spaces to get tokens. Neither "Kendrick" nor "Lamar" is in the index, so there are no results. A single-word entry such as "Future" still matches because the indexed token and the search term are the same.
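To make the mismatch concrete, here is a minimal sketch of the behaviour described above. It is a simplified stand-in, not lunr's actual source: `toyTokenize` and `toyMatches` are hypothetical names, and the rules they follow (strings split on whitespace and hyphens, array elements kept whole, a query matching if any term matches) are assumptions taken from this answer.

```javascript
// Simplified model of the tokenization described above; NOT lunr's real code.
// Assumption: strings are split on whitespace/hyphens, array elements stay whole.
function toyTokenize(value) {
  if (Array.isArray(value)) {
    // Each array element becomes exactly one token, spaces included.
    return value.map(function (item) {
      return String(item).toLowerCase();
    });
  }
  return String(value).toLowerCase().split(/[\s-]+/).filter(Boolean);
}

// Hypothetical matcher: the query matches if ANY of its terms equals a token.
function toyMatches(fieldValue, query) {
  var indexed = toyTokenize(fieldValue);
  return toyTokenize(query).some(function (term) {
    return indexed.indexOf(term) !== -1;
  });
}

var artists = ['Kendrick Lamar', 'Future'];

console.log(toyTokenize(artists));                            // tokens: 'kendrick lamar', 'future'
console.log(toyMatches(artists, 'Future'));                   // true
console.log(toyMatches(artists, 'Kendrick Lamar'));           // false
console.log(toyMatches(artists.join(' '), 'Kendrick Lamar')); // true
```

In this model the array form only matches when a whole element is a single word, which lines up with "Future" being found and "Kendrick" not.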

To get the results you are hoping for you can convert the array into a string and let lunr handle splitting it into tokens:

this.add({
  // ...other fields as before
  'hhArtists': data[key]['hh-artists'].join(' ')
});
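Since the question has four artist arrays, the same join could be applied to all of them in one pass. The sketch below assumes records shaped like the sample JSON in the question; `flattenArrayFields` is a hypothetical helper, not part of lunr.

```javascript
// Hypothetical helper: returns a copy of a festival record in which every
// array-valued field is joined into one space-separated string.
function flattenArrayFields(festival) {
  var flat = {};
  Object.keys(festival).forEach(function (field) {
    var value = festival[field];
    flat[field] = Array.isArray(value) ? value.join(' ') : value;
  });
  return flat;
}

// Record copied from the sample data in the question.
var festival = {
  'title': 'Rolling Loud',
  'date': 'May 5–7',
  'location': 'Miami, FL, USA',
  'rock-artists': [],
  'hh-artists': ['Kendrick Lamar', 'Future'],
  'electronic-artists': [],
  'other-artists': []
};

var flat = flattenArrayFields(festival);
console.log(flat['hh-artists']);   // "Kendrick Lamar Future"
console.log(flat['rock-artists']); // "" (an empty array joins to an empty string)
```

Inside the question's loop, the flattened values could then be passed to `this.add` in place of the raw arrays, for example `'hhArtists': flat['hh-artists']`.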
Oliver Nightingale