4

Does search latency increase as the data in a document type keeps growing? Since we don't directly manage shard configuration in Vespa, how does Vespa handle it?

Is creating multiple document types a good practice for handling scaling requirements?

Harsh Choudhary
  • 475
  • 5
  • 12

2 Answers

3

Vespa distributes documents evenly (using the CRUSH algorithm) over the available nodes in the content cluster. If you add (or remove) nodes in a cluster, Vespa automatically redistributes documents in the background.

Typically, latency is proportional to the number of documents per content node, so adding more content nodes reduces latency. You can do this at any point while in production.
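For example, scaling out is just a matter of adding `<node>` entries to the content cluster in `services.xml` and redeploying; a minimal sketch (the cluster id, document type, and host aliases here are placeholders, not taken from the question):

```xml
<content id="mycluster" version="1.0">
  <redundancy>2</redundancy>
  <documents>
    <document type="mydoctype" mode="index"/>
  </documents>
  <nodes>
    <node hostalias="node0" distribution-key="0"/>
    <node hostalias="node1" distribution-key="1"/>
    <!-- new node: after deploy, Vespa redistributes documents to it automatically -->
    <node hostalias="node2" distribution-key="2"/>
  </nodes>
</content>
```

After deployment, redistribution runs in the background with the cluster still serving queries.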

As you can see from this, you never need to add more search definitions (schemas) to scale.

Jon
  • 2,043
  • 11
  • 9
2

See https://docs.vespa.ai/documentation/performance/sizing-search.html. Yes, generally, if your queries are text queries, latency increases with growing document volume given a fixed number of nodes. Vespa allows live redistribution of data, so adding new nodes brings latency back down.

Jo Kristian Bergum
  • 2,984
  • 5
  • 8