I have to read a line-delimited JSON file and extract the key from each line. Eventually, these keys are to be deleted from an Elasticsearch index.

However, when I read the file, the extracted values look like b'74298dcbd08507175b94fbe5c2a6a87d' instead of 74298dcbd08507175b94fbe5c2a6a87d. The code that reads the lines from the file is:

from elasticsearch import Elasticsearch, helpers
import json

es = Elasticsearch("a.b.c.d:9200")
delete_patch_destination = "delete.json"
index_name = "some_index"

with open(delete_patch_destination) as delete_json_file:
    for line in delete_json_file:
        # Each line holds one JSON object; its key is the Elasticsearch id.
        line_content = json.loads(line)
        for es_key in line_content.keys():
            print(es_key)
            # es.delete(index=index_name, doc_type="latest", id=es_key)

The JSON file consists of lines like these:

{"b'af2f9719a205f0ce9ae27c951e5b7037'": "\"b'af2f9719a205f0ce9ae27c951e5b7037'\""}
{"b'2b2781de47c70b11576a0f67bc59050a'": "\"b'2b2781de47c70b11576a0f67bc59050a'\""}
{"b'6cf97818c6b5c5a94b7d8dbb4cfcfe60'": "\"b'6cf97818c6b5c5a94b7d8dbb4cfcfe60'\""}
{"b'ceaf66243d3eb226859ee5ae7eacf86a'": "\"b'ceaf66243d3eb226859ee5ae7eacf86a'\""}
{"b'164a12ea5947e1f51566ee6939e20a2e'": "\"b'164a12ea5947e1f51566ee6939e20a2e'\""}
{"b'42e9bb704c424b49fb5e6adb68157e6f'": "\"b'42e9bb704c424b49fb5e6adb68157e6f'\""}
aviral sanjay

2 Answers

Decode the bytes object to get a normal string (see: How to convert 'binary string' to normal string in Python3?):

b'a_string'.decode('utf-8')

This gives you 'a_string'.
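
For illustration, here is a small sketch of that call. One caveat: the keys returned by json.loads in the question are str objects that merely contain the text b'...', and a Python 3 str has no decode method, so an extra conversion step may be needed before decoding. The variable names raw and key are made up for this example.

```python
# A real bytes object can be decoded directly.
raw = b"a_string"
decoded = raw.decode("utf-8")
print(decoded)

# A str that only looks like a bytes literal is still a str,
# and str objects in Python 3 have no .decode() method.
key = "b'a_string'"
print(type(key).__name__, hasattr(key, "decode"))
```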

CY_

The input could be improved to avoid those convolutions, but to fix your immediate problem:

Your dictionary seems to consist of a key and the same data as its value (the value is even more "stringified"; we'll ignore that part).

First evaluate the key using ast.literal_eval, then decode the resulting bytes object to convert it to a string:

>>> import ast
>>> s = "b'af2f9719a205f0ce9ae27c951e5b7037'"
>>> ast.literal_eval(s).decode()
'af2f9719a205f0ce9ae27c951e5b7037'

(Unlike eval, ast.literal_eval only evaluates Python literals, so it doesn't have the same security issues. See: Using python's eval() vs. ast.literal_eval()?)
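
Applied to lines shaped like those in the question, this could look like the sketch below. The helper name clean_key and the in-memory sample_lines list are hypothetical stand-ins for the real file loop, and the sketch assumes every key is the text of a bytes literal.

```python
import ast
import json


def clean_key(raw_key):
    """Turn the text "b'abc'" into the plain string 'abc' (hypothetical helper)."""
    value = ast.literal_eval(raw_key)  # "b'abc'" -> b'abc'
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return str(value)


# Stand-in for the lines of delete.json; the values are ignored here.
sample_lines = [
    """{"b'af2f9719a205f0ce9ae27c951e5b7037'": "ignored"}""",
    """{"b'2b2781de47c70b11576a0f67bc59050a'": "ignored"}""",
]

# Iterating over the dict returned by json.loads yields its keys.
es_keys = [clean_key(k) for line in sample_lines for k in json.loads(line)]
print(es_keys)
```

In the original loop, the cleaned key would then be what gets passed as the id to es.delete.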

Jean-François Fabre