
I've got a file like this and I want to convert it into a Python dictionary.

Week1=>
    Monday=>
        Math=8:00
        English=9:00
    Tuesday=>
        Spanish=10:00
        Arts=3:00

The output should be:

{"Week1": {"Monday": {"Math": "8:00", "English": "9:00"}, "Tuesday": {"Spanish":"10:00", "Arts": "3:00"}}}

This is my actual code:

def ReadFile(filename, extension) -> dict:
        content = {} # Content will store the dictionary (from the file).

        prefsTXT = open(f"{filename}.{extension}", "r") # Open the file with read permissions.
        lines = prefsTXT.readlines() # Read lines.

        content = LinesToDict(lines) # Calls LinesToDict to get the dictionary.

        prefsTXT.close() # Closing file.

        return content # Return dictionary with file content.
        
def LinesToDict( textList: list) -> dict:
    result = {} # Result

    lines = [i.replace("\n", "") for i in textList] # Replace the line break with nothing.

    for e, line in enumerate(lines): # Iterate through each line (of the file).

        if line[0].strip() == "#": continue # If first character is # ignore the whole line.

        keyVal, lines = TextToDict(e, lines) # Interpret each line using TextToDict and store the result in keyVal (lines is returned too because TextToDict modifies it).
        keyVal = list( keyVal.items() )[0] # Take the single (key, value) tuple from the returned dictionary.
        
        result[keyVal[0]] = keyVal[1] # Add key:value to result

    return result # Returns result

def TextToDict(textIndx: int, textList: list) -> tuple:
    result = {} # Result

    text = textList[textIndx].strip() # Strip the passed line (because of the tabulations).

    if text[0].strip() == "#": return # If first character is # ignore the whole line.

    keyVal = text.split("=", 1) # Split line by = which is the separator between key:value.
    
    if keyVal[1] == ">": # If value == ">"
        indentVal, textList = TextToDict(textIndx + 1, textList) # Calls itself to see the content of the next line.
        textList.pop(textIndx + 1) # Pop the interpreted content.
    
    else: # If value isn't ">"
        indentVal = keyVal[1] # Set indentVal to the val of keyVal

    result[keyVal[0]] = indentVal # Setting keyVal[0] as key and indentVal as value of result dictionary

    return result, textList # Return result and modified lines list


file = ReadFile("trial", "txt")
print(file)

ReadFile reads the file and passes the lines to LinesToDict. LinesToDict iterates through the lines and passes each line to TextToDict.
TextToDict splits the line by = and checks whether the value (split[1]) is ">". If it is, it calls itself with the next line and stores the returned value in a dictionary to return.

But I get this:

{'Week1': {'Monday': {'Math': '8:00'}}, 'English': '9:00', 'Tuesday': {'Spanish': '10:00'}, 'Arts': '3:00'}

Instead of this:

{"Week1": {"Monday": {"Math": "8:00", "English": "9:00"}, "Tuesday": {"Spanish":"10:00", "Arts": "3:00"}}}
Patitotective

1 Answer


The format is close to YAML except for the separators, so you can rewrite the separators with a regex and let a YAML parser do the rest:

$ pip install pyyaml

import re, yaml, json

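with open("/tmp/txt.txt") as f: # read your content here
    mytext = f.read()
# Turn "key=>" into "key:" and "key=value" into 'key: "value"', then parse the result as YAML.
mydict = yaml.safe_load(re.sub(r"(\w+)=(.*)", r'\1: "\2"', re.sub(r"=>\n", ":\n", mytext)))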
print(json.dumps(mydict, indent=True))

output:

{
 "Week1": {
  "Monday": {
   "Math": "8:00",
   "English": "9:00"
  },
  "Tuesday": {
   "Spanish": "10:00",
   "Arts": "3:00"
  }
 }
}
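
If you would rather not depend on pyyaml, here is a rough sketch of a different technique: track indentation with a stack. It assumes every non-comment line contains a "=", that nesting is expressed purely by leading whitespace (spaces and tabs not mixed), and that lines starting with # are comments, as in the question's code. The name parse_indented is just a placeholder for illustration.

```python
def parse_indented(text: str) -> dict:
    """Parse 'key=>' / 'key=value' lines into nested dicts using indentation."""
    root: dict = {}
    # Stack of (indent, dict) pairs; the top entry is the dict being filled.
    # The root uses indent -1 so it is never popped.
    stack = [(-1, root)]
    for raw in text.splitlines():
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        indent = len(raw) - len(raw.lstrip())
        # Leave every block that is indented at least as far as this line.
        while stack[-1][0] >= indent:
            stack.pop()
        key, value = stripped.split("=", 1)  # assumption: every line has "="
        parent = stack[-1][1]
        if value == ">":
            parent[key] = {}  # "key=>" opens a nested block
            stack.append((indent, parent[key]))
        else:
            parent[key] = value
    return root


sample = """Week1=>
    Monday=>
        Math=8:00
        English=9:00
    Tuesday=>
        Spanish=10:00
        Arts=3:00
"""
print(parse_indented(sample))
```

The difference from the recursive attempt in the question is that a "=>" line here keeps collecting every deeper-indented line that follows, instead of consuming only the single next line.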
Ilya Kharlamov