Can someone point me to how to extract each result's _source from the generator when using the scan API in the Elasticsearch DSL Python client?

For example, I'm using the following (from this example, elasticsearch-dsl scan):

for hit in s.scan():
    print(hit)

I get the following:

<Hit(beacon/INDEX/_Mwt9mABoXXeYV0uwSC-): {'client_number': '3570', 'cl...}>

How do I extract the dictionary from the hit generator?

Ami Hollander
Arun Ramachandran

2 Answers


Every Hit has a to_dict() method, so you can just call hit.to_dict():

for hit in s.scan():
    print(hit.to_dict())

Note: hit.to_dict() doesn't include the meta info. You can get it from the meta object, i.e.:

hit_dict = hit.to_dict()
hit_dict['meta'] = hit.meta.to_dict()
Ami Hollander

In addition to @ami-hollander's answer: .to_dict() does not convert the meta info (the id, for example). If you need this info, you can do something like:

hit_dict = hit.to_dict()
hit_dict['meta'] = hit.meta.to_dict()
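
If you need this for every hit, the two lines above can be wrapped in a small helper. This is only a sketch: `hit_to_record` is a hypothetical name, and `fake_hit` is a stand-in object with made-up values so the snippet runs without a cluster. A real Hit from s.scan() would be passed in the same way. Note that a _source field that is itself called "meta" would be overwritten.

```python
from types import SimpleNamespace


def hit_to_record(hit):
    """Return the hit's fields as a dict, with its metadata under 'meta'."""
    record = hit.to_dict()               # _source fields only
    record["meta"] = hit.meta.to_dict()  # id, index, etc.
    return record


# Stand-in for a real Hit (assumption: it only mimics to_dict() and meta.to_dict()).
fake_hit = SimpleNamespace(
    to_dict=lambda: {"client_number": "3570"},
    meta=SimpleNamespace(to_dict=lambda: {"id": "abc", "index": "beacon"}),
)

record = hit_to_record(fake_hit)
print(record)

# With a real search object this would be:
# records = [hit_to_record(hit) for hit in s.scan()]
```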
Alexey Shrub

I have to give it to the meta part. Thanks a lot. FYI: `scan()` gives back `null` scores. You can, however, extract the score from `explanation` (given you have set the `extra` for `explain`) and use it to sort the results as well. Cheers – Asif Ali Jul 20 '20 at 14:32