
I'm testing the performance of stored procedures on a project, and the results are surprisingly disappointing. I'm using the C# SDK to run the tests. The pricing tier is Standard, with a provisioned throughput of 400 RU/s.

On my current dataset of ~1500 items, the plain query SELECT * FROM Store takes ~4 seconds.

Using the following stored procedure with continuation takes ~19 seconds.

function getPackages(continuationToken, pageSize) {
    var context = getContext();
    var response = context.getResponse();
    var collection = context.getCollection();
    var collectionLink = collection.getSelfLink();
    var nodesBatch = [];
    var lastContinuationToken;
    var responseSize = 0;

    var query = {
        query: 'SELECT * FROM Store'
    };

    getItems(continuationToken);

    function getItems(continuationToken) {
        // Tune the pageSize to fit your dataset.
        var requestOptions = {
            continuation: continuationToken,
            pageSize: pageSize || 500
        };

        var accepted = collection.queryDocuments(collectionLink, query, requestOptions,
            function (err, documentsRead, responseOptions) {
                if (err) throw err;

                // The size of the current query response page.
                var queryPageSize = JSON.stringify(documentsRead).length;

                // DocumentDB has a response size limit of 1 MB.
                if (responseSize + queryPageSize < 1024 * 1024) {
                    // Append query results to nodesBatch.
                    nodesBatch = nodesBatch.concat(documentsRead);

                    // Keep track of the response size.
                    responseSize += queryPageSize;

                    if (responseOptions.continuation) {
                        // More pages remain: remember the token and fetch the next page.
                        lastContinuationToken = responseOptions.continuation;
                        getItems(responseOptions.continuation);
                    } else {
                        // If there is no continuation token, we are done. Return the response.
                        response.setBody({
                            "length": nodesBatch.length,
                            "message": "Query completed successfully.",
                            "queryResponse": nodesBatch,
                            "continuationToken": responseOptions.continuation
                        });
                    }
                } else {
                    // Response size limit reached; run the script again with lastContinuationToken as a script parameter.
                    response.setBody({
                        "length": nodesBatch.length,
                        "message": "Response size limit reached.",
                        "lastContinuationToken": lastContinuationToken,
                        "queryResponse": nodesBatch
                    });
                }
            });

        if (!accepted) {
            // Execution limit reached; run the script again with lastContinuationToken as a script parameter.
            response.setBody({
                "length": nodesBatch.length,
                "message": "Execution limit reached.",
                "lastContinuationToken": lastContinuationToken,
                "queryResponse": nodesBatch
            });
        }
    }
}

The code was adapted from this question to allow external continuation. I would like to load items in batches so the user can see that something is going on.
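External continuation implies a driver loop on the client: execute the stored procedure, and if the response carries lastContinuationToken, execute it again with that token until it is absent. In the C# SDK the actual call would be ExecuteStoredProcedureAsync; the following is only a minimal runnable sketch of that control flow, where executeSproc is a hypothetical stand-in that pages over an in-memory array instead of calling DocumentDB:

```javascript
// Hypothetical stand-in for the stored procedure: pages over an in-memory
// array and mimics the sproc's response shape. Not a real DocumentDB call.
var ITEMS = [];
for (var i = 0; i < 1500; i++) ITEMS.push({ id: i });

function executeSproc(continuationToken, pageSize) {
    var start = continuationToken || 0;
    var page = ITEMS.slice(start, start + pageSize);
    var next = start + page.length;
    return {
        length: page.length,
        queryResponse: page,
        // A real sproc returns an opaque token; here it is just an offset.
        lastContinuationToken: next < ITEMS.length ? next : undefined
    };
}

// The driver loop: keep calling with the returned token until it is absent.
function loadAll(pageSize) {
    var all = [];
    var token;
    do {
        var result = executeSproc(token, pageSize);
        all = all.concat(result.queryResponse);
        token = result.lastContinuationToken;
        // Here the UI could be updated with the batch loaded so far.
    } while (token !== undefined);
    return all;
}

var loaded = loadAll(500);
console.log(loaded.length); // 1500
```

Each iteration of the loop is one round trip, which is also where the per-call overhead of repeated stored-procedure execution adds up.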

I'm OK with some stored procedures taking more time to execute, but I'm quite surprised by the difference (almost 5×).

fra
  • Stored procedures are compiled only the first time, if they are not modified later, so they take time on the first execution. But you cannot say SPs are slower – Vivek Nuna Oct 16 '16 at 08:43
  • I'm not saying that they are slower... I'm asking if there is something (maybe even obvious) that I'm missing causing an issue with the SP. The SP is executed 6 times with that dataset and every round takes ~ the same time, so it's absolutely not a problem of compilation. – fra Oct 16 '16 at 08:46
  • Might be some locking issue then – Vivek Nuna Oct 16 '16 at 08:48
  • Not sure about the difference; but are you aware that you can load items in the .net SDK with continuation tokens as well, which means you may be able to achieve your goal w/o using SPs in the first place? – flytzen Oct 16 '16 at 10:12
  • @Frans I'm actually using that mechanism with the .AsDocumentQuery() extension method. Anyway I'd like to find the problem (is there is one) with the SP or with its usage through the DocumentClient. – fra Oct 16 '16 at 10:25
  • Makes sense. If you can't get to the bottom of it, I have found the dev to be extremely responsive if you raise a support issue with them. I'm curious as to the reason now :) – flytzen Oct 16 '16 at 10:26
  • " // If there is no continutation token" I don't understand this. What's checking for whether or not there is a continuation token? – claudekennilol Oct 19 '17 at 13:56

1 Answer


I've received an answer from the DocumentDB team. The reason is that a stored procedure is tied to the primary replica, while a query can reach out to other replicas to get the data faster.

Additional suggestion: one thing you will notice with stored procedures is varying RU consumption. Use stored procedures mostly for "write" scenarios that require transactions across documents.
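For illustration, a minimal sketch of the kind of "write" scenario that suggestion points at: a stored procedure that creates a batch of documents, so the writes succeed or fail as one transaction (an unhandled error aborts the script and rolls everything back). The stub at the top only simulates the server-side getContext/collection objects so the sketch can be run locally; inside DocumentDB the sproc body runs against the real ones.

```javascript
// --- Local stub of the server-side context, for illustration only ---
var _docs = [];
var _body;
function getContext() {
    return {
        getResponse: function () {
            return { setBody: function (b) { _body = b; } };
        },
        getCollection: function () {
            return {
                getSelfLink: function () { return 'dbs/db/colls/store'; },
                createDocument: function (link, doc, callback) {
                    _docs.push(doc);      // real server: queues an async write
                    callback(null, doc);
                    return true;          // false would mean "not accepted" (time is up)
                }
            };
        }
    };
}

// --- The stored procedure: insert all docs, or none ---
function bulkInsert(docs) {
    var collection = getContext().getCollection();
    var link = collection.getSelfLink();
    var created = 0;

    createNext();

    function createNext() {
        if (created === docs.length) {
            getContext().getResponse().setBody({ created: created });
            return;
        }
        var accepted = collection.createDocument(link, docs[created], function (err) {
            if (err) throw err;   // throwing aborts the sproc and rolls back the transaction
            created++;
            createNext();
        });
        if (!accepted) {
            // Out of time: report progress so the client can resume with the rest.
            getContext().getResponse().setBody({ created: created });
        }
    }
}

bulkInsert([{ id: 'a' }, { id: 'b' }, { id: 'c' }]);
console.log(_body.created); // 3
```

This is the case where the primary-replica affinity is a feature rather than a cost: all the writes happen server-side in one round trip, under one transaction.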

  • While the above answer is right, I'm surprised to see such a large difference in execution time. Can you show how you are executing the plain query? Are you using the continuation approach there as well, with the same page size (of 500) as you do with the stored procedure? – Rajesh Nagpal Oct 17 '16 at 00:15
  • C# code used for testing performance: https://gist.github.com/fleed/4e19a248b1b2ebce275df5f4d8d5a415. And this is once more the SP: https://gist.github.com/fleed/adea9452d4f1cdb24897321a5ff1542e. I've tested it with different values of pageSize (100, 200, 500). Anyway, I've noticed that the first query always returns a max of 261 values. According to the DocumentDB team, it is preemption to make sure that the result is returned within 5s. – fra Oct 17 '16 at 04:10