4

I'm trying to get all the documents in an index using the Python client, but the result shows me only the first document. This is my Python code:

res = es.search(index="92c603b3-8173-4d7a-9aca-f8c115ff5a18", doc_type="doc", body = {
'size' : 10000,
'query': {
    'match_all' : {}
}
})
print("%d documents found" % res['hits']['total'])
data = [doc for doc in res['hits']['hits']]
for doc in data:
    print(doc)
    return "%s %s %s" % (doc['_id'], doc['_source']['0'], doc['_source']['5'])
J.Ghassen

5 Answers

8

Try "_doc" instead of "doc" as the doc_type:

res = es.search(index="92c603b3-8173-4d7a-9aca-f8c115ff5a18", doc_type="_doc", body = {
'size' : 100,
'query': {
    'match_all' : {}
}
})
Suraj Rao
Habib Mezghani
4

By default, Elasticsearch retrieves only 10 documents. You can change this behaviour with the size parameter (see the documentation). The best practices for pagination are the search_after query and the scroll query; which one to use depends on your needs. Please read this answer: "Elastic search not giving data with big number for page size".

To show all the results:

for doc in res['hits']['hits']:
    # Python 3 print function; prints the id and source of every hit
    print(doc['_id'], doc['_source'])
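
Note that the loop in the question has a return inside the for body, so it exits on the first iteration regardless of how many hits came back. A minimal sketch of collecting every row before returning is shown here. The function name format_hits and the sample response are hypothetical; the field names '0' and '5' are taken from the question.

```python
def format_hits(res, fields=("0", "5")):
    """Return one formatted line per hit instead of stopping at the first."""
    lines = []
    for doc in res["hits"]["hits"]:
        source = doc["_source"]
        # Missing fields become empty strings rather than raising KeyError.
        values = [str(source.get(name, "")) for name in fields]
        lines.append(" ".join([doc["_id"]] + values))
    return lines


# Made-up response shaped like the dict returned by es.search().
sample = {
    "hits": {
        "hits": [
            {"_id": "a", "_source": {"0": "x", "5": "y"}},
            {"_id": "b", "_source": {"0": "p", "5": "q"}},
        ]
    }
}

for line in format_hits(sample):
    print(line)
```

With a real client, the same function would be called as format_hits(res) on the result of es.search(...).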
Lupanoide
0

You can try the following query. It matches every document in the index, but without a size parameter Elasticsearch returns only the first 10 hits.

result = es.search(index="index_name", body={"query":{"match_all":{}}})
Tushar Nitave
0

You can also use elasticsearch_dsl and its Search API which allows you to iterate over all your documents via the scan method.

import elasticsearch
from elasticsearch_dsl import Search

client = elasticsearch.Elasticsearch()
search = Search(using=client, index="92c603b3-8173-4d7a-9aca-f8c115ff5a18")

for hit in search.scan():
    print(hit)
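
If elasticsearch_dsl is not available, roughly the same iteration can be sketched with the plain client's scroll API. This is a sketch under assumptions, not a tuned implementation: it uses the 7.x-style body= argument seen in the question, and the function name iter_all_hits, the page size of 1000, and the "2m" keep-alive are illustrative choices.

```python
def iter_all_hits(es, index, page_size=1000, keep_alive="2m"):
    """Yield every hit in `index`, fetching one scroll page at a time.

    `es` is assumed to be an elasticsearch.Elasticsearch client.
    """
    res = es.search(
        index=index,
        scroll=keep_alive,  # keep the search context alive between pages
        body={"size": page_size, "query": {"match_all": {}}},
    )
    scroll_id = res.get("_scroll_id")
    try:
        # An empty page means the scroll is exhausted.
        while res["hits"]["hits"]:
            for hit in res["hits"]["hits"]:
                yield hit
            res = es.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = res.get("_scroll_id", scroll_id)
    finally:
        # Release the server-side search context when done.
        if scroll_id:
            es.clear_scroll(scroll_id=scroll_id)


# Usage (requires a running cluster):
#   import elasticsearch
#   es = elasticsearch.Elasticsearch()
#   for hit in iter_all_hits(es, "index_name"):
#       print(hit["_id"], hit["_source"])
```

Because this is a generator, documents are processed as each page arrives instead of being held in memory all at once.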
kluu
  • search.scan() can go through all docs but it is very slow. Is there a way to improve it? – H.C.Chen Nov 04 '21 at 07:12
  • Now I found the doc https://elasticsearch-dsl.readthedocs.io/en/latest/search_dsl.html#pagination seems like that's the best it can do. – H.C.Chen Nov 04 '21 at 08:00
  • It is unfortunate that this is as fast as it goes. I would be much interested if there was a way to speed things up? – j7skov Feb 08 '22 at 21:20
0

I don't see it mentioned that the index must be refreshed if you have just added data. Use this:

es.indices.refresh(index="index_name")
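
As a quick check that newly added documents have become visible, the refresh can be paired with a count. This is a small sketch; the helper name count_after_refresh is hypothetical, and es is assumed to be an elasticsearch.Elasticsearch client.

```python
def count_after_refresh(es, index):
    """Refresh `index`, then return how many documents are searchable."""
    # Make recently indexed documents visible to search.
    es.indices.refresh(index=index)
    # The count API reports the number of matching documents under "count".
    return es.count(index=index)["count"]


# Usage (requires a running cluster):
#   import elasticsearch
#   es = elasticsearch.Elasticsearch()
#   print(count_after_refresh(es, "index_name"))
```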
juniper25