
I have the following text in a text file:

InfoType 0 :

string1

string2

string3

InfoType 1 :

string1

string2

string3

InfoType 3 :

string1

string2

string3

Is there a way to create a dictionary that would look like this:

{'InfoType 0':'string1,string2,string3', 'InfoType 1':'string1,string2,string3', 'InfoType 3':'string1,string2,string3'}
rdas
MGA
  • Is this a consistent format where 1st line is infotype and next three are the related strings, and this repeats? – fireball.1 Apr 22 '20 at 13:17
  • Could you share any attempt you've made, and what went wrong? – Adam.Er8 Apr 22 '20 at 13:17
  • I think this is what you're looking for: https://stackoverflow.com/a/6740968/12684122 – olenscki Apr 22 '20 at 13:18
  • @fireball1 At the moment it's consistent, but I want everything between `InfoType x :` and `InfoType x+1 :` to become the value for that key in the dictionary – MGA Apr 22 '20 at 13:19
  • I don't think there is an easy way. I could write a parser function, but first: are the dictionary values supposed to be one long string, rather than each string being its own value? I.e. `{'InfoType 0': 'string1,string2,string3'}` rather than `{'InfoType 0': ['string1', 'string2', 'string3']}` – Missilexent Apr 22 '20 at 13:20
  • @Adam every time I post what i have attempted to do I get downvoted, so I figured why not just give the information I have and what I wanna accomplish if its something text manipulation related – MGA Apr 22 '20 at 13:20
  • @MGA this is kinda funny because you might have a probable solution (parsing function or something) which needs small tweaks instead of us writing a whole new function – Missilexent Apr 22 '20 at 13:21
  • When you read the file you could save the rows in a list. Then get the indices of the rows that contain `InfoType` (which are your keys); the rows between two consecutive key indices are the value for the first key. Also, the link provided by @olenscki does not answer this question. – Ralvi Isufaj Apr 22 '20 at 13:23
  • @Ralvi, on it, im not sure how it will be done when the file ends and there is no infotype there for it to match it, I'll try – MGA Apr 22 '20 at 13:26
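
The index-based idea from the comments above can be sketched roughly as follows. This is only an illustration, not code posted in the thread: the function name `parse_by_indices` and the `key_prefix` parameter are made up here, and it assumes blank rows can be ignored and that the trailing ` :` should be stripped from keys. The last key simply runs to the end of the list, which covers the case where no further `InfoType` row follows.

```python
def parse_by_indices(rows, key_prefix="InfoType"):
    """Build {key: 'a,b,c'} from rows, using the positions of the key rows."""
    # Drop blank rows and surrounding whitespace first
    rows = [row.strip() for row in rows if row.strip()]

    # Positions of the rows that act as keys
    key_positions = [i for i, row in enumerate(rows) if row.startswith(key_prefix)]

    # Each block ends where the next key starts; the last block ends at len(rows)
    ends = key_positions[1:] + [len(rows)]

    result = {}
    for start, end in zip(key_positions, ends):
        key = rows[start].rstrip(" :")  # 'InfoType 0 :' -> 'InfoType 0'
        result[key] = ",".join(rows[start + 1:end])
    return result
```

Any iterable of lines works as `rows`, so an open file object or the result of `str.splitlines()` can be passed directly.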

2 Answers


Something like this should work:

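def my_parser(fh, key_pattern):
    d = {}
    name = None

    # Skip ahead to the first key line
    for line in fh:
        if line.startswith(key_pattern):
            name = line.strip()
            break

    # No key line found: nothing to parse
    if name is None:
        return d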

    # This list will hold the lines
    lines = []

    # Now iterate to find the lines
    for line in fh:
        line = line.strip()
        if not line:
            continue

        if line.startswith(key_pattern):
            # When in this block we have reached 
            #  the next record

            # Add to the dict
            d[name] = ",".join(lines)

            # Reset the lines and save the
            #  name of the next record
            lines = []
            name = line

            # skip to next line
            continue

        lines.append(line)

    d[name] = ",".join(lines)
    return d

Use like so:

with open("myfile.txt", "r") as fh:
    d = my_parser(fh, "InfoType")
# {'InfoType 0 :': 'string1,string2,string3',
#  'InfoType 1 :': 'string1,string2,string3',
#  'InfoType 3 :': 'string1,string2,string3'}

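There are limitations, such as:

  • Duplicate keys: a later record overwrites an earlier one with the same name
  • The key needs processing: it is stored as the whole line, e.g. 'InfoType 0 :'

You could get around these by making the function a generator that yields (name, str) pairs, and processing them as you read the file.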
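
A minimal sketch of that generator idea might look like the following. The name `iter_records` is hypothetical, and as an assumption it yields the list of lines for each record rather than an already-joined string, so the caller decides how to combine them. The sample input, with a repeated key, is made up for illustration.

```python
import io


def iter_records(fh, key_pattern):
    """Yield (name, lines) pairs, one per record, as the file is read."""
    name = None
    lines = []
    for line in fh:
        line = line.strip()
        if not line:
            continue
        if line.startswith(key_pattern):
            # A new record starts: hand back the previous one, if any
            if name is not None:
                yield name, lines
            name, lines = line, []
        elif name is not None:
            lines.append(line)
    # Hand back the final record
    if name is not None:
        yield name, lines


# Made-up input in which the same key appears twice
sample = io.StringIO("InfoType 0 :\nstring1\n\nInfoType 0 :\nstring2\nstring3\n")

d = {}
for name, lines in iter_records(sample, "InfoType"):
    key = name.rstrip(" :")              # key processing happens at the call site
    d.setdefault(key, []).extend(lines)  # merge duplicates instead of overwriting

d = {key: ",".join(values) for key, values in d.items()}
print(d)  # {'InfoType 0': 'string1,string2,string3'}
```

Merging duplicates is only one possible policy; the caller could just as well raise an error or keep the first occurrence.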

Alex
  • Can you describe the result? – Missilexent Apr 22 '20 at 13:39
  • The returned dict is `{'InfoType 0 :': ',string1,,string2,,string3,', 'InfoType 1 :': ',string1,,string2,,string3,'}` I need to find a way to remove extra spaces and :, if I manage i'll edit the post and add the solution and for the last infotype nothing is written to the dict ... not really working at its current state – MGA Apr 22 '20 at 13:44
  • I've handled blank lines and fixed the missed record. You can modify the function to do whatever you need with the keys – Alex Apr 22 '20 at 13:52

This will do:

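dictionary = {}
key = None

# Replace ``file.txt`` with the path of your text file.
with open('file.txt', 'r') as file:
    for line in file:
        line = line.strip()
        if not line:
            continue

        if line.startswith('InfoType'):
            # Drop the trailing ' :' so the key is e.g. 'InfoType 0'
            key = line.rstrip(' :')
            dictionary[key] = ''
        elif key is not None:
            # Put a comma only between values, not after the last one
            if dictionary[key]:
                dictionary[key] += ','
            dictionary[key] += line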