I have this file (MYFILE.txt) that is the output of a list of lists with dictionaries inside:
[{'entity_group': 'literal', 'score': 0.99999213, 'word': 'DNA', 'start': 0, 'end': 3}, {'entity_group': 'metaphoric', 'score': 0.9768174, 'word': 'loop', 'start': 4, 'end': 8}, {'entity_group': 'literal', 'score': 0.9039155, 'word': 'ing,', 'start': 8, 'end': 12}, {'entity_group': 'metaphoric', 'score': 0.99962616, 'word': 'in', 'start': 13, 'end': 15}, {'entity_group': 'literal', 'score': 0.9949911, 'word': 'which a protein or protein complex interacts simultaneously', 'start': 16, 'end': 75}, {'entity_group': 'metaphoric', 'score': 0.59057885, 'word': 'with', 'start': 76, 'end': 80}, {'entity_group': 'literal', 'score': 0.9983214, 'word': 'two separated sites on a DNA molecule, is a recurring theme', 'start': 81, 'end': 140}, {'entity_group': 'metaphoric', 'score': 0.9998679, 'word': 'in', 'start': 141, 'end': 143}, {'entity_group': 'literal', 'score': 0.9997542, 'word': 'transcription', 'start': 144, 'end': 157}, {'entity_group': 'metaphoric', 'score': 0.7964442, 'word': 'regula', 'start': 158, 'end': 164}, {'entity_group': 'literal', 'score': 0.99982435, 'word': 'tion [', 'start': 164, 'end': 170}]
I want to group the "literal" entries so that I get only their text, and leave the "metaphoric" entries as they are. I tried the code below, but it fails with "string indices must be integers". I also thought I could turn the result into HTML and color it to visualize it better, but I'm sure there's a quicker solution.
with open(r'MYFILE.txt', 'r') as res:
    texty = res.read()
for group in texty[::-1]:
    ent = group["entity_group"]
    if ent != 'literal':
        text2 = replace_at(ent, group['end'], group['end'], text)
        print(text2)
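For what it's worth, the error most likely comes from the fact that res.read() returns one long string, so the loop walks over single characters and group["entity_group"] ends up indexing a str. Below is a minimal sketch of one possible approach, assuming the file holds a Python-literal list of dicts like the sample above. The helper names (load_entities, simplify, rebuild_text), the SAMPLE string, and the [[...]] marker for metaphoric spans are assumptions made for illustration, not part of the original code.

```python
import ast

# Shortened sample in the same format as the file contents shown above.
# With the real file: SAMPLE = open('MYFILE.txt').read()
SAMPLE = (
    "[{'entity_group': 'literal', 'score': 0.99999213, 'word': 'DNA', 'start': 0, 'end': 3}, "
    "{'entity_group': 'metaphoric', 'score': 0.9768174, 'word': 'loop', 'start': 4, 'end': 8}, "
    "{'entity_group': 'literal', 'score': 0.9039155, 'word': 'ing,', 'start': 8, 'end': 12}]"
)


def load_entities(raw):
    """Turn the text of the file into real Python lists and dicts."""
    # literal_eval only accepts Python literals, so it is safer than eval().
    return ast.literal_eval(raw)


def simplify(entities):
    """Keep only the text of 'literal' entries; leave 'metaphoric' dicts as they are."""
    return [e["word"] if e["entity_group"] == "literal" else e for e in entities]


def rebuild_text(entities):
    """Rebuild the sentence from the start/end offsets, marking metaphoric spans."""
    parts = []
    pos = 0
    for e in sorted(entities, key=lambda e: e["start"]):
        # Restore the whitespace between the previous span and this one.
        parts.append(" " * (e["start"] - pos))
        word = e["word"]
        # The [[...]] marker is an arbitrary choice; swap in an HTML tag if preferred.
        parts.append(f"[[{word}]]" if e["entity_group"] == "metaphoric" else word)
        pos = e["end"]
    return "".join(parts)


entities = load_entities(SAMPLE)
print(simplify(entities))
print(rebuild_text(entities))
```

If the file really contains a list of lists, the same functions could be applied to each inner list in turn. Replacing the [[...]] marker with something like a colored span tag would give the HTML view mentioned in the question.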