0

I am using a >= query again a collection. To test the script, I just have 4 entries in my collection.

My query is:

 ...
.where("Workdesc", ">=", "imple") // no returning as expected
.get()
.then(querySnapshot => {
    querySnapshot.forEach(function(doc) {
        console.log("Result");
        console.log(doc.id, " ===> ", doc.data());
    });
  });

Workdesc of all 4 docs are -

  • "kj implementation"
  • "hb implementation urgent"
  • "sharu implementation quick response needed"
  • "cb implementation urgent job"

Result according to me, it should have returned all 4 docs but it is returning only 2. I am attaching screenshot of the console log and Firebase console:

enter image description here

enter image description here

How can I get the result back with partial letter anywhere in the string.

halfer
  • 19,824
  • 17
  • 99
  • 186
user2828442
  • 2,415
  • 7
  • 57
  • 105
  • Hi user2828442. I've edited a lot of chatty material from your questions in the past, and you will have received the notifications from the Stack Overflow platform. Some of your material has a needy, begging quality, and suggests you think that displays of helplessness will cause people to pity your situation and/or feel coerced into the social obligation of assisting the less fortunate. This is not appropriate behaviour for a volunteer platform. Please stick to technical writing. – halfer Mar 28 '20 at 09:43

3 Answers3

0

Your query is working as expected. When you perform comparisons with strings, they are sorted lexicographically, or in other words, alphabetically. Here's the actual sort order of each value, and where "impl" sorts among them:

  1. "cb implementation urgent job"
  2. "hb implementation urgent"
  3. "impl"
  4. "kj implementation"
  5. "sharu implementation quick response needed"

Alphabetically, you can see that "k" and "s" come after "i". So, those are the only documents you're going to get from a query where Workdesc values are greater than "impl".

If you're trying to do a substring search to find all the Workdesc strings that contain "impl", that's not possible with Firestore. Firestore doesn't offer substring searches. You'll have to find another way (probably mirroring data to another database that supports it).

Doug Stevenson
  • 297,357
  • 32
  • 422
  • 441
  • It's not the kind of query that can scale the way that Firestore needs to scale (and it intends to scales massively). Scanning every document for a substring just doesn't work out well when there are billions of documents in a collection. Firestore only allows queries the scale, and this is unfortunately not one of them. – Doug Stevenson Mar 28 '20 at 06:27
0

To build on Doug's answer, unfortunately Firestore does not support the type of string search you are looking to do. A potential solution that does away with text search is that you could create another field on your todo documents that stores whether you're dealing with an "implementation" or not.

For example, if you had a field isImplementation, which would be true for implementation todos and false for those that are not, you could add this field as part of your where clause to your query. This would ensure that you are fetching implementation todos only.

0

Once again building on @Doug's answer, Firestore is an indexed document database. To query for data, the query must be performed against an index in a single sweep to keep queries performant in the way the database is designed.

Firebase won't index fields that are strings by default because it isn't efficient and it is quite a taxing operation at scale. A different approach is often the best option.

Take for example the following function that splits an input string into searchable parts that can then be added to an index. As the length of the input string grows, the number of substrings contained within grows rapidly.

function shatter(str, minLength = 1) {
    let parts = [str]; // always have full string
    let i, subLength = minLength;
    let strLength = str.length;
    while (subLength < strLength) {
        for (i = 0; i < (strLength - subLength + 1); i++) {
            parts.push(str.substring(i, i + subLength));
        }
        subLength++;
    }
    return parts;
}

Here's an interactive snippet demonstrating this:

function shatter(str, minLength = 1) {
    let parts = [str]; // always have full string
    let i, subLength = minLength;
    let strLength = str.length;
    while (subLength < strLength) {
        for (i = 0; i < (strLength - subLength + 1); i++) {
            parts.push(str.substring(i, i + subLength));
        }
        subLength++;
    }
    return parts;
}

let str = prompt('Please type out a string to shatter:', 'This is a test string');
let partsOfMin1 = shatter(str, 1);
console.log('Shattering into pieces of minimum length 1 gives:', partsOfMin1);
let partsOfMin3 = shatter(str, 3);
console.log('Shattering into pieces of minimum length 3 gives:', partsOfMin3);
let partsOfMin5 = shatter(str, 5);
console.log('Shattering into pieces of minimum length 5 gives:', partsOfMin5);
alert('The string "' + str + '" can be shattered into as many as ' + partsOfMin1.length + ' pieces.\r\n\r\nThis can be reduced to only ' + partsOfMin3.length + ' with a minimum length of 3 or ' + partsOfMin5.length + ' with a minimum length of 5.');

However using that above function, we can repurpose it so that it saves the shattered pieces to Firestore at /substringIndex/todos/workDesc with a link back to the document containing the string.

const firebase = require('firebase');
firebase.initializeApp(/* config here */);

const arrayUnion = firebase.firestore.FieldValue.arrayUnion;
const TODOS_COL_REF = firebase.firestore().collection('todos');
const SUBSTRING_INDEX_COL_REF = firebase.firestore().collection('substringIndex');

// splits given string into segments ranging from the given minimum length up to the full length
function shatter(str, minLength = 1) {
    let parts = [str];
    let i, subLength = minLength;
    let strLength = str.length;
    while (subLength < strLength) {
        for (i = 0; i < (strLength - subLength + 1); i++) {
            parts.push(str.substring(i, i + subLength));
        }
        subLength++;
    }
    return parts;
}

// upload data
const testData = {
    workDesc: 'this is a prolonged string to break code',
    assignDate: firebase.firestore.Timestamp.fromDate(new Date()),
    assignTo: 'Ddy1QVOAO6SIvB8LfAE8Z0Adj4H3',
    followers: ['Ddy1QVOAO6SIvB8LfAE8Z0Adj4H3'],
    searchArray: ['v1', 'v2']
}

const todoDocRef = TODOS_COL_REF.doc();
const todoId = todoDocRef.id;
todoDocRef.set(testData)
    .then(() => console.log('Uploaded test data!'))
    .catch((err) => console.error('Failed to test data!', err));

// Note: in this example, I'm not waiting for the above promise to finish
// Normally, you would integrate it into the batched write operations below

// index each desired string field
const indexDocRef = SUBSTRING_INDEX_COL_REF.doc('todos');
const indexedFields = ["workDesc"];
const indexEntryMinLength = 3;

const indexUpdatePromises = indexedFields.map((fieldName) => {
    const indexColRef = indexDocRef.collection(fieldName);
    const fieldValue = testData[fieldName];
    if (typeof fieldValue !== 'string') return Promise.resolve(undefined); // skip non-string values

    const parts = shatter(fieldValue, indexEntryMinLength);

    console.log('INFO: Consuming ' + (parts.length * 2) + ' write operations to index ' + fieldName);

    // Each batched write can handle up to 500 operations, each arrayUnion counts as two
    const partsBatches = [];
    if (parts.length > 250) {
        for (let i = 0; i < parts.length; i += 250) {
            partsBatches.push(parts.slice(i, i + 250));
        }
    } else {
        partsBatches.push(parts);
    }

    const batchCommitPromises = partsBatches
        .map((partsInBatch) => {
            const batch = firebase.firestore().batch();
            partsInBatch.forEach((part) => {
                batch.set(indexColRef.doc(part), {ids: arrayUnion(todoId)}, { merge: true })
            })
            return batch.commit();
        });

    return Promise.all(batchCommitPromises);
})

Promise.all(indexUpdatePromises)
    .then(() => console.log('Uploaded substring index!'))
    .catch((err) => console.error('Failed to upload index!', err));

Then when you want to search for all documents containing "impl" you would use the following to get an array of matching document IDs:

firebase.firestore().doc('substringIndex/todos/workDesc/impl').get()
  .then(snap => snap.get('ids'))
  .then(console.log, console.error)

While the above code works, you will hit your read/write limits quite quickly as you update the index and you will also likely run into concurrency issues. I also consider it fragile in that non-English characters and punctuation will also trip it up - it is included as a demo only. These issues are why the relevant Firebase documentation recommends making use of a third-party search service like Algolia for full-text search.

TL:DR;

The best solution is to have a human-readable form of your data ("sharu implementation quick response needed") and a indexable form of your data ({implementation: true, urgent: true, pending: true}) as covered by @Luis in their answer.

samthecodingman
  • 23,122
  • 4
  • 30
  • 54