4

Does search latency increase as the data in a document type keeps growing? Since we don't directly manage shard configuration in Vespa, how does Vespa handle it?

Is creating multiple document types a good practice for handling scaling requirements?

Harsh Choudhary
  • 475
  • 5
  • 12

2 Answers

3

Vespa distributes documents evenly (using the CRUSH algorithm) over the available nodes in the content cluster. If you add (or remove) nodes in a cluster, Vespa automatically redistributes documents in the background.

Typically, latency is proportional to the number of documents per content node, so adding more content nodes reduces latency. You can do this at any point while in production.
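For example, scaling out is just a matter of adding `<node>` entries to the content cluster in `services.xml` and redeploying; a minimal sketch (the cluster id, document type, and host aliases here are placeholders, not taken from the question):

```xml
<content id="mycluster" version="1.0">
  <redundancy>2</redundancy>
  <documents>
    <document type="mydoctype" mode="index"/>
  </documents>
  <nodes>
    <node hostalias="node0" distribution-key="0"/>
    <node hostalias="node1" distribution-key="1"/>
    <!-- new node: after deploy, Vespa redistributes documents to it automatically -->
    <node hostalias="node2" distribution-key="2"/>
  </nodes>
</content>
```

After deployment, redistribution runs in the background with the cluster still serving queries.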

As you can see from this, you never need to add more search definitions (schemas) to scale.

Jon
  • 2,043
  • 11
  • 9
2

See https://docs.vespa.ai/documentation/performance/sizing-search.html. Yes, generally, if your queries are text queries, latency increases with growing document volume given a fixed number of nodes. Vespa allows live redistribution of data, so adding new nodes brings latency back down.

Jo Kristian Bergum
  • 2,984
  • 5
  • 8