3

I am wondering how you would extract a text file into a dictionary in Python. The text file is formatted as shown below, and I want to extract it so that each object (Earth, for example) is a key, with its radius, period, and other fields stored under that key.

RootObject: Sun

Object: Sun

Satellites: Mercury,Venus,Earth,Mars,Jupiter,Saturn,Uranus,Neptune,Ceres,Pluto,Haumea,Makemake,Eris

Radius: 20890260

Orbital Radius: 0

Object: Earth

Orbital Radius: 77098290

Period: 365.256363004

Radius: 6371000.0

Satellites: Moon

Object: Moon

Orbital Radius: 18128500

Radius: 1737000.10

Period: 27.321582
Chris Seymour
tom smith
  • 1
    What do you want for a result? A regular dictionary won't quite work, since some of your keys are duplicated. – acjay Nov 21 '12 at 04:28
  • looking to animate a solar system into quickdraw – tom smith Nov 21 '12 at 04:36
  • @tomsmith - can you update the question with the output in the format you'd want for the example input? It's a bit hard to tell just what you want from the question. – Blair Nov 21 '12 at 04:40
  • Are there supposed to be blank lines in the input? It looks like there would be, but that may just be a formatting issue...? – Petri Nov 21 '12 at 05:21
  • 2
    I keep seeing this assignment come up...what is your teacher's deal? Can't they give you JSON like a rational instructor? Why are they having students parse text files for `:` characters? – yurisich Nov 22 '12 at 21:53

3 Answers

3

Using a modification of one of the above you would get something like the following:

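def read_next_object(file):
    """Yield one dict per "Object:" block found in the file."""
    obj = {}
    for line in file:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, val = line.split(": ", 1)
        # A second "Object" key means a new block starts: emit the finished one.
        if key == "Object" and key in obj:
            yield obj
            obj = {}
        obj[key] = val

    yield obj  # emit the last block


planets = {}
with open("test.txt", "r") as f:
    for obj in read_next_object(f):
        planets[obj["Object"]] = obj

print(planets)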

You still need to handle the RootObject line yourself (here it simply ends up inside the Sun entry), but otherwise I believe this is the dictionary you are looking for, based on the example data you posted. It is a dictionary of planets in which each planet is a dictionary of its information.

print(planets["Sun"]["Radius"])

This should print the value 20890260.

The output from the above looks like the following:

{   'Earth': {   'Object': 'Earth',
             'Orbital Radius': '77098290',
             'Period': '365.256363004',
             'Radius': '6371000.0',
             'Satellites': 'Moon'},
     'Moon': {   'Object': 'Moon',
            'Orbital Radius': '18128500',
            'Period': '27.321582',
            'Radius': '1737000.10'},
     'Sun': {   'Object': 'Sun',
           'Orbital Radius': '0',
           'Radius': '20890260',
           'RootObject': 'Sun',
           'Satellites': 'Mercury,Venus,Earth,Mars,Jupiter,Saturn,Uranus,Neptune,Ceres,Pluto,Haumea,Makemake,Eris'}}
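One possible way to handle the RootObject case is to treat that line as a header and return it separately from the per-object dictionaries. This is only a sketch under stated assumptions: the function name `parse_objects` and the in-memory `SAMPLE` text (a shortened copy of the question's data) are made up for illustration and are not part of the original answer.

```python
import io

# Shortened copy of the question's data, used in place of a file on disk.
SAMPLE = """RootObject: Sun

Object: Sun
Satellites: Mercury,Venus,Earth
Radius: 20890260
Orbital Radius: 0

Object: Earth
Orbital Radius: 77098290
Period: 365.256363004
Radius: 6371000.0
Satellites: Moon

Object: Moon
Orbital Radius: 18128500
Radius: 1737000.10
Period: 27.321582
"""


def parse_objects(lines):
    """Return (root_name, {object_name: {field: value}}) from an iterable of lines."""
    root = None
    objects = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, val = line.split(": ", 1)
        if key == "RootObject":
            root = val  # header line, kept out of every object
        elif key == "Object":
            current = objects.setdefault(val, {})  # start a new block
        elif current is not None:
            current[key] = val
    return root, objects


root, planets = parse_objects(io.StringIO(SAMPLE))
print(root)                      # Sun
print(planets["Sun"]["Radius"])  # 20890260
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the `io.StringIO` wrapper.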
sean
  • Can you improve this so numbers are parsed into Python decimals and the satellites into tuples? – Petri Nov 21 '12 at 05:38
  • I would add that, but since the OP did not include their intention, or rather their original attempt, they can add these trivial additions to their own code. I feel that the solution should answer the question for the most part as well. – sean Nov 21 '12 at 05:50
  • Traceback (most recent call last): File "a4.py", line 14, in planets[obj]["Object"] = obj TypeError: unhashable type: 'dict' – tom smith Nov 21 '12 at 15:35
  • Thanks. How would you make "Root Object" a main key as well, like "Sun" and "Moon", and how would you split just the satellites (the entry with about 10 of them) into a list? – tom smith Nov 22 '12 at 17:03
  • You can use the `split` function on Python strings to split the satellites into a list. As for the other one, just look for it and return it like a planet, then modify the second part that puts the planets into the main dictionary so that it adds it correctly. – sean Nov 26 '12 at 00:01
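The conversions discussed in these comments (numbers as decimals, satellites split into a tuple) could be sketched roughly as follows. The helper name `convert` and the `raw` dictionary are assumptions for illustration; `raw` just mirrors the Earth entry from the output above.

```python
from decimal import Decimal, InvalidOperation


def convert(key, val):
    """Turn a raw string value into a tuple of names or a Decimal."""
    if key == "Satellites":
        # "Mercury,Venus" -> ("Mercury", "Venus")
        return tuple(name.strip() for name in val.split(","))
    try:
        return Decimal(val)
    except InvalidOperation:
        return val  # not numeric (e.g. the object's name): keep the string


# Hypothetical raw entry, copied from the Earth block of the sample output.
raw = {
    "Object": "Earth",
    "Orbital Radius": "77098290",
    "Period": "365.256363004",
    "Radius": "6371000.0",
    "Satellites": "Moon",
}

earth = {key: convert(key, val) for key, val in raw.items()}
print(earth["Satellites"])  # ('Moon',)
print(earth["Period"])      # 365.256363004
```

`Decimal` keeps the digits exactly as written in the file; `float(val)` would be a simpler choice if exact decimal representation does not matter for the animation.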
3
nk="""
RootObject: Sun

Object: Sun
Satellites: Mercury,Venus,Earth,Mars,Jupiter,Saturn,Uranus,Neptune,Ceres,Pluto,Haumea,Makemake,Eris
Radius: 20890260
Orbital Radius: 0

Object: Earth
Orbital Radius: 77098290
Period: 365.256363004
Radius: 6371000.0
Satellites: Moon

Object: Moon
Orbital Radius: 18128500
Radius: 1737000.10
Period: 27.321582

"""

my_test_dict = {}
for x in nk.splitlines():
    if ':' not in x:
        continue  # skip blank lines
    key, val = [part.strip() for part in x.split(':', 1)]
    if key == 'RootObject':
        root_obj = val
    elif key == 'Object':
        my_test_dict[val] = {}
        current_dict = val
        if val != root_obj:
            # Find the parent: the object that lists this one as a satellite.
            for x1 in my_test_dict:
                satellites = my_test_dict[x1].get('Satellites', '').split(',')
                if val in satellites:
                    my_test_dict[val]['RootObject'] = x1
    else:
        my_test_dict[current_dict][key] = val

print(my_test_dict)

output:

{
    'Sun':
        {
        'Satellites': 'Mercury,Venus,Earth,Mars,Jupiter,Saturn,Uranus,Neptune,Ceres,Pluto,Haumea,Makemake,Eris',
        'Orbital Radius': '0',
        'Radius': '20890260'
        },

    'Moon':
        {
        'Orbital Radius': '18128500',
        'Radius': '1737000.10',
        'Period': '27.321582',
        'RootObject': 'Earth'
         },

    'Earth':
        {
        'Satellites': 'Moon',
        'Orbital Radius': '77098290',
        'Radius': '6371000.0',
        'Period': '365.256363004',
        'RootObject': 'Sun'
        }
    }
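With each object pointing at its parent through 'RootObject', the chain from any body up to the root can be followed, which may help when positioning a moon relative to its planet for the animation. The sketch below is hypothetical: `path_to_root` and the trimmed `bodies` dictionary are illustrative names, and it assumes the parent links contain no cycles.

```python
def path_to_root(bodies, name):
    """Return the chain of names from `name` up to the object with no 'RootObject'."""
    chain = [name]
    # Assumes the parent links never form a cycle.
    while "RootObject" in bodies[chain[-1]]:
        chain.append(bodies[chain[-1]]["RootObject"])
    return chain


# Trimmed version of the output shown above.
bodies = {
    "Sun": {"Satellites": "Earth", "Orbital Radius": "0", "Radius": "20890260"},
    "Earth": {"Satellites": "Moon", "Orbital Radius": "77098290", "RootObject": "Sun"},
    "Moon": {"Orbital Radius": "18128500", "RootObject": "Earth"},
}

print(path_to_root(bodies, "Moon"))  # ['Moon', 'Earth', 'Sun']
```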
namit
  • 1
    OP: Be sure to tell your teacher to give their future students JSON, this is madness to force students to do this sort of tedious work to get to more interesting topics. – yurisich Nov 22 '12 at 21:55
0

Assuming you want elements with comma-separated values as lists, try:

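mydict = {}
with open(my_file, 'r') as the_file:  # my_file: path to the input file
    for line in the_file:
        line = line.strip()  # also drops the trailing newline from the value
        if not line:
            continue  # skip blank lines
        key, val = line.split(": ", 1)
        val = val.split(",")
        # Keys repeated across objects (e.g. "Radius") overwrite earlier values.
        mydict[key] = val if len(val) > 1 else val[0]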
acjay
IT Ninja
  • Actually, even my edits aren't sufficient, due to collisions. Need more info from OP – acjay Nov 21 '12 at 04:27
  • Yes. Unfortunately, while it does what it says, this answer does not really answer the question, and it also loses data in the process. – Petri Nov 21 '12 at 05:35
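The collisions mentioned in these comments are easy to reproduce. In the minimal sketch below, the `lines` list is a made-up excerpt of the question's data: storing every key in one flat dictionary means the Earth block overwrites the Sun block.

```python
# Hypothetical excerpt: two objects that share the "Object" and "Radius" keys.
lines = [
    "Object: Sun",
    "Radius: 20890260",
    "Object: Earth",
    "Radius: 6371000.0",
]

flat = {}
for line in lines:
    key, val = line.split(": ", 1)
    flat[key] = val  # a repeated key replaces the earlier value

print(flat)  # {'Object': 'Earth', 'Radius': '6371000.0'}
```

Only the last object's values survive, which is why the other answers nest one dictionary per object under the object's name.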