I have a file with the following structure and a varying number of newlines between the entries:

n Name1 MiddleName1 Surname1
multiline
string1

n Name2 MiddleName2 Surname2
multi
line
string2


n Name3 MiddleName3 Surname3
multiline
string3

How can I read this file into a dictionary which contains:

{"n Name1 MiddleName1 Surname1" : "multiline\nstring1", ...}

I attempted to extract the keys with a regular expression, like so:

import re

with open('file') as infile:
    content = infile.read()
    match = re.search(r'n .*', content)

But I don't know where to go from there. All similar questions I was able to find have some sort of delimiter (like '=') which can be used to separate the keys from the values.

  • Welcome to Stack Overflow. Please show your current attempt from your research on this problem. – roganjosh Sep 14 '18 at 11:16
  • Possible duplicate of [Read a text file and form a dictionary in python](https://stackoverflow.com/questions/44310764/read-a-text-file-and-form-a-dictionary-in-python) – SpghttCd Sep 14 '18 at 11:19

1 Answer

That shouldn't be a big problem if the format of the text file is consistently as stated above. Read the file line by line. If the current line is not equal to '\n' (which corresponds to an empty line), treat it as your key (you might want to strip off the trailing '\n', though) and concatenate the following lines, up to the next empty line, as the value for your dictionary. Then update your dictionary with these and repeat until line == "" (end of file). That should do it. See a possible solution below. There might be other, more elegant solutions, though.

filename = ".//users.db"
users = {}
with open(filename, "r") as fin:
    line = fin.readline()

    # read until end of file (readline() returns "" at EOF)
    while line != "":
        # skip empty lines; any other line starts a new entry and is its key
        if line != "\n":
            content = ""
            next_line = fin.readline()
            # to allow for multiline values, keep reading until
            # the next line is "\n" (empty line) or "" (end of file)
            while next_line != "\n" and next_line != "":
                # for the value part of the dict just concat the next lines
                content += next_line
                next_line = fin.readline()
            # update the dict with 'line' as key and 'content' as value
            users[line.rstrip()] = content
        # eat, sleep, repeat
        line = fin.readline()

print(users)

My output:

{'n Name1 MiddleName1 Surname1': 'multiline\nstring1\n', 'n Name2 MiddleName2 Surname2': 'multiline\nstring2\n', 'n Name3 MiddleName3 Surname3': 'multiline\nstring3'}
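
Note that the values above keep a trailing '\n' wherever an empty line followed the entry. If you want exactly the dictionary from the question, one possible variant is to read the whole file and split it on runs of empty lines. This is only a sketch: it assumes that empty lines never occur inside a value, and the name parse_entries and the sample string are made up for illustration.

```python
import re


def parse_entries(text):
    """Map the first line of each block to the remaining lines of that block."""
    entries = {}
    # One or more empty (or whitespace-only) lines separate the blocks.
    for block in re.split(r"\n\s*\n", text.strip()):
        if not block:
            continue  # happens only for empty input
        # The first line is the key, everything after it is the value.
        key, _, value = block.partition("\n")
        entries[key.strip()] = value
    return entries


# Hypothetical sample in the format shown in the question.
sample = (
    "n Name1 MiddleName1 Surname1\nmultiline\nstring1\n"
    "\n"
    "n Name2 MiddleName2 Surname2\nmulti\nline\nstring2\n"
    "\n\n"
    "n Name3 MiddleName3 Surname3\nmultiline\nstring3\n"
)
print(parse_entries(sample))
```

For a real file you would pass infile.read() instead of the sample string. Splitting on empty lines means the values come out without a trailing newline, at the cost of holding the whole file in memory.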
Zapho Oxx
  • Thanks for the answer! I forgot to mention that the number of lines in each object varies too. Sorry about that. I edited my question to reflect that. Is there a way to do this without fixing the number of lines in a variable? – user10363312 Sep 14 '18 at 11:48
  • Instead of the for loop you could do a while loop that checks whether you have reached an empty line, aka '\n', and keeps concatenating until then. – Zapho Oxx Sep 14 '18 at 11:50
  • Hi, I updated the code with the mentioned change. – Zapho Oxx Sep 14 '18 at 12:04
  • That worked, thank you very much! – user10363312 Sep 14 '18 at 12:13
  • Hi, just an adjustment: the last 'line = fin.readline()' should not be indented as it originally was in the listing above. I adjusted that just now. – Zapho Oxx Sep 14 '18 at 12:16