How to get text between symbols in string Python

Question

I'm trying to make a program that answers questions on a site automatically. The answers are stored in a file, but I can't figure out how to read the text between the symbols in that file. I need to get the question and the answer, which are both in bold.

This is what each question looks like in the file. The comma at the end separates the first question from the second.

{" Complete the following statement: changing state from ---(1)--- to gas is known as ---(2)---.": {"['1: liquid; 2: evaporation', '1: liquid; 2: melting', '1: solid; 2: evaporation', '1: solid; 2: melting']": "1: liquid; 2: evaporation", "['1: liquid; 2: deposition', '1: liquid; 2: sublimation', '1: solid; 2: deposition', '1: solid; 2: sublimation']": "1: solid; 2: sublimation"},

Can you show us what you have tried already? What do you mean by "symbols"? – ChaddRobertson, Oct 17 '21 at 12:08
@ChaddRobertson The symbols are the commas and brackets between the text. I haven't tried anything yet, as I have no clue where to start. Thanks – Bigboyal, Oct 17 '21 at 12:29
@AndreyF I use compress_json to decompress it. Is this the same thing? – Bigboyal, Oct 17 '21 at 12:42
Please share some of your code. The sample of the data you shared looks like JSON. Try using the json module (https://docs.python.org/3/library/json.html) to parse the file. – AndreyF, Oct 17 '21 at 12:49

score 0 · Answer 1 · answered Oct 17 '21 at 12:23

You can try converting the string to a dict; after that you can loop over it with dict.items().
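A minimal sketch of that idea, assuming each entry is valid JSON once the trailing comma is removed (the sample from the question is). Here `raw` holds that sample, and the names `data`, `rows` and `options_text` are only illustrative. The inner keys are lists written out as text, so `ast.literal_eval` is used to turn them back into real lists.

```python
import ast
import json

# The sample entry from the question, without its trailing comma.
raw = """{" Complete the following statement: changing state from ---(1)--- to gas is known as ---(2)---.": {"['1: liquid; 2: evaporation', '1: liquid; 2: melting', '1: solid; 2: evaporation', '1: solid; 2: melting']": "1: liquid; 2: evaporation", "['1: liquid; 2: deposition', '1: liquid; 2: sublimation', '1: solid; 2: deposition', '1: solid; 2: sublimation']": "1: solid; 2: sublimation"}}"""

data = json.loads(raw)  # str -> dict

rows = []
for question, option_sets in data.items():
    for options_text, answer in option_sets.items():
        # Each inner key is the text form of a list; parse it safely.
        options = ast.literal_eval(options_text)
        rows.append((question.strip(), options, answer))

for question, options, answer in rows:
    print(question)
    print("  options:", options)
    print("  answer: ", answer)
```

Nothing here is typed by hand per question, so the same loop should scale to several thousand entries as long as they share this shape.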

Richard Römer

I did think about doing that, but there are several thousand questions in the file, so unless there is a way to convert it to a dict automatically it would take a very long time. Thanks – Bigboyal Oct 17 '21 at 12:33
You could iterate through the questions and convert each question string to a dict: https://stackoverflow.com/questions/988228/convert-a-string-representation-of-a-dictionary-to-a-dictionary – Richard Römer Oct 17 '21 at 12:35

score 0 · Answer 2 · answered Oct 17 '21 at 13:49

Assuming you have your data in text format (i.e. in a text file with a .txt extension):

# To read text from .txt file

with open("temp.txt", "r") as f:
    content = f.read()

arr = content.split("},")
# split() returns a list and removes every "}," from the text.
# We don't want the "," back, but each piece still needs its closing "}".

# The last piece is left alone: it either still ends with "}" or is the
# empty remainder after a trailing "},".
for i in range(len(arr) - 1):
    arr[i] += "}"

# Drop the empty remainder, if any, so that parsing does not fail on it.
arr = [piece for piece in arr if piece.strip()]

# Now we have a list of strings that can be converted into dicts.
from ast import literal_eval

# strip() removes stray newlines around each piece before parsing
arr = [literal_eval(piece.strip()) for piece in arr]


# Now you have your data as a list of dicts
# Sample code to get questions, options and answers

questions, options, answers = [], [], []

for entry in arr:
    for question, option_sets in entry.items():
        # A question can have several option sets, so the question is
        # repeated once per set to keep the three lists aligned by index.
        for option_list, answer in option_sets.items():
            questions.append(question)
            options.append(option_list)
            answers.append(answer)

# Now you have lists of questions, options and answers.
# Index i in each list refers to the same question, option set and answer.

# EXAMPLE
print("que1 = ", questions[0])
print("options = ", options[0])
print("ans = ", answers[0])
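Splitting on "}," can go wrong if that character pair ever appears inside a question or an option. One possible alternative, sketched here under the assumption that the file holds JSON objects separated by commas, is to let the standard json decoder find where each object ends with `JSONDecoder.raw_decode`. The helper name `iter_json_objects` and the `sample` string are made up for illustration.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value in text, skipping commas and whitespace between values."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over separators first.
        if text[pos] in " \t\r\n,":
            pos += 1
            continue
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Hypothetical miniature of the file: two entries, each followed by a comma.
sample = '{"q1": {"[\'a\', \'b\']": "a"}}, {"q2": {"[\'c\', \'d\']": "d"}},'
entries = list(iter_json_objects(sample))

for entry in entries:
    for question, option_sets in entry.items():
        print(question, option_sets)
```

With a real file you would pass the text read from it instead of `sample`; if the file turns out to be one large JSON object, a single json.loads call on the whole text is enough.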
