View all the documents loaded into Vespa

Question

Is there any way to fetch all the documents loaded into Vespa?

I tried querying with regular expressions, but it didn't work as expected.

select * from entity where ID matches "[.]+";

ID is not an attribute, but I also tried the same query with an attribute field. Neither query returned any results.

score 5 · Accepted Answer · answered Jan 25 '19 at 07:46

For dumping documents, visiting is usually preferable to search. You can visit either with the vespa-visit tool or through the document/v1 REST API.
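Visiting through document/v1 could look roughly like the sketch below. It assumes a container answering on an endpoint such as http://localhost:8080 and a document namespace you supply yourself (both are placeholders, not taken from the question). It also assumes the response carries a "documents" list plus a "continuation" token while more documents remain, and that the wantedDocumentCount parameter is available in your Vespa version. The helper names visit_url, fetch_json and visit_all are made up for illustration.

```python
import json
import urllib.parse
import urllib.request


def visit_url(endpoint, namespace, doctype, continuation=None, wanted=100):
    """Build a document/v1 visit URL for one page of documents."""
    path = "/document/v1/{}/{}/docid".format(
        urllib.parse.quote(namespace), urllib.parse.quote(doctype))
    params = {"wantedDocumentCount": wanted}
    if continuation:
        # Token returned by the previous page; tells Vespa where to resume.
        params["continuation"] = continuation
    return endpoint.rstrip("/") + path + "?" + urllib.parse.urlencode(params)


def fetch_json(url):
    """Fetch a URL and decode the JSON body (plain HTTP GET)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def visit_all(endpoint, namespace, doctype, fetch=fetch_json):
    """Yield every document of a type, following continuation tokens."""
    continuation = None
    while True:
        page = fetch(visit_url(endpoint, namespace, doctype, continuation))
        yield from page.get("documents", [])
        continuation = page.get("continuation")
        if not continuation:
            # No token in the response means the visit is complete.
            break
```

Iterating over visit_all("http://localhost:8080", "mynamespace", "entity") would then stream the documents one page at a time. The fetch argument exists only so the paging logic can be exercised without a running Vespa instance.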

If you want to use search, use this query to match all documents of a type:

select * from yourdocumenttype where sddocname contains 'yourdocumenttype';

To iterate over all documents this way, it is more efficient to use some field in your document to partition the document set into smaller chunks and query for one chunk at a time. For example, if you have a timestamp field, add a range condition to the query so that each query retrieves the documents for one slice of time.
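The chunking idea can be sketched as a small generator that emits one YQL query per time slice. This is only an illustration: the field name timestamp, the numeric bounds and the helper name sliced_queries are assumptions, and the field has to be a numeric field that supports range comparisons in your schema. Half-open slices (lower bound included, upper bound excluded) keep a document from showing up in two chunks.

```python
def sliced_queries(doctype, field, start, end, step):
    """Yield one YQL query per half-open slice [lower, upper) of a numeric field."""
    lower = start
    while lower < end:
        upper = min(lower + step, end)
        yield (
            "select * from {d} where sddocname contains '{d}' "
            "and {f} >= {lo} and {f} < {hi};"
        ).format(d=doctype, f=field, lo=lower, hi=upper)
        lower = upper


# Example: three slices covering timestamps 0 up to (not including) 250.
for query in sliced_queries("entity", "timestamp", 0, 250, 100):
    print(query)
```

Each slice should be small enough that a single query can return all of its hits, so the step size depends on how densely your documents are spread over the field.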

(Regular expressions are only supported in streaming mode.)

score 3 · Answer 2 · answered Jan 25 '19 at 07:33

To dump all documents from Vespa, use vespa-visit.

Visiting is a different interface from the search interface: it is built for large data transfers with high throughput, but not necessarily low latency.

Teams use visit to extract either a full dump or a subset chosen with a selection expression.
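As a rough sketch of driving this from a script, the snippet below builds the vespa-visit command line, optionally with a selection expression. The helper name visit_command and the example selection entity.timestamp > 1548400000 are made up for illustration; check vespa-visit --help on your installation for the options it actually supports.

```python
import shlex


def visit_command(selection=None):
    """Return the argv list for a vespa-visit run (full dump if no selection)."""
    argv = ["vespa-visit"]
    if selection:
        # Document selection expression, e.g. "<doctype>.<field> > <value>".
        argv += ["--selection", selection]
    return argv


# Hypothetical subset: documents of type "entity" newer than a given timestamp.
argv = visit_command("entity.timestamp > 1548400000")
shell_line = shlex.join(argv)  # quoted form, safe to paste into a shell
print(shell_line)
```

Passing the list to subprocess.run(argv, check=True) on a node where vespa-visit is installed would run the dump. Building the command as a list avoids shell quoting problems in the selection expression.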
