Importing a file as a dictionary in Python

Question

I have a file with data like this. A line starting with '>' serves as an identifier.

>test1
this is line 1
hi there
>test2
this is line 3
how are you
>test3
this is line 5 and
who are you

I'm trying to create a dictionary like this:

{'>test1':'this is line 1hi there','>test2':'this is line 3how are you','>test3':'this is line 5 andwho are you'}

I can read the file, but I can't build the dictionary in this form. I want to delete the newline character at the end of each line so that each value is a single line. No spaces are needed between the joined lines, as shown above. Any help would be appreciated.

This is what I've tried so far:

new_dict = {}
>>> db = open("/home/ak/Desktop/python_files/smalltext.txt")

for line in db:
    if '>' in line:
        new_dict[line]=''
    else:
        new_dict[line]=new_dict[line].append(line)

This question appears to be off-topic because it is about getting us to show you teh codez — Marcin, Jul 08 '14 at 16:44
Your code indicates that you want a list as the value for each key, but your example shows a string. Which is it? — dawg, Jul 08 '14 at 17:10
@dawg I think he is just assuming that strings also have an `append` method that is equivalent to `+=`. — chepner, Jul 08 '14 at 17:20
It's too bad there isn't a generalized version of the `locals` built-in function, i.e., a function you can call on any namespace (e.g., a module) that returns all of the variables from that namespace in a dict. — abcd, Mar 30 '15 at 21:08

user2963623 · Accepted Answer · 2014-07-08T17:28:37.240

3

Using your approach it would be:

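new_dict = {}
key = None    # most recent '>' header seen so far

with open("/home/ak/Desktop/python_files/smalltext.txt", 'r') as db:
    for line in db:
        if line.startswith('>'):
            key = line.strip()    # strip the trailing newline from the header
            new_dict[key] = ''
        elif key is not None:
            new_dict[key] += line.strip()    # append the line without its newline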
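
The same loop can be wrapped in a function so it can be tried without a file on disk. This is only a minimal sketch: the name `parse_records` and the `sample` list are made up for illustration, and it assumes that any lines before the first '>' header should simply be ignored.

```python
def parse_records(lines):
    """Map each '>' header to its following lines, concatenated without newlines."""
    records = {}
    key = None  # no header seen yet
    for line in lines:
        text = line.strip()
        if text.startswith('>'):
            key = text
            records[key] = ''
        elif key is not None:
            # Lines that appear before the first header are ignored.
            records[key] += text
    return records


# Hypothetical sample data mirroring the first two records in the question.
sample = [
    ">test1\n", "this is line 1\n", "hi there\n",
    ">test2\n", "this is line 3\n", "how are you\n",
]
print(parse_records(sample))
```

Because iterating over an open file yields its lines, the file object from `open(...)` can be passed in place of `sample`.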

edited Jul 08 '14 at 17:28

answered Jul 08 '14 at 17:21


the wolf · Answer 2 · 2014-07-09T03:07:37.150

Here is a solution using groupby:

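from itertools import groupby

f_name = "/home/ak/Desktop/python_files/smalltext.txt"    # path from the question

kvs = []
with open(f_name) as f:
    # groupby yields alternating runs of header lines (k is True) and body lines (k is False)
    for k, v in groupby((e.rstrip() for e in f), lambda s: s.startswith('>')):
        kvs.append(''.join(v) if k else '\n'.join(v))

# even positions hold the headers, odd positions hold the matching bodies
print({k: v for k, v in zip(kvs[0::2], kvs[1::2])})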

The dict, for the sample file shown in the question (no blank lines between records):

{'>test1': 'this is line 1\nhi there',
 '>test2': 'this is line 3\nhow are you',
 '>test3': 'this is line 5 and\nwho are you'}
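
If the values should have no newlines at all, as the question asks, a groupby variant could look roughly like this. It is a sketch rather than a drop-in replacement: `records_from_lines` is a hypothetical name, and it assumes that when several headers appear in a row, each becomes a key and only the last one receives the body that follows.

```python
from itertools import groupby


def records_from_lines(lines):
    """Build {header: body} where body lines are concatenated with no separator."""
    result = {}
    key = None  # last header seen; None until the first '>' line
    stripped = (line.rstrip() for line in lines)
    for is_header, group in groupby(stripped, key=lambda s: s.startswith('>')):
        if is_header:
            # A run of consecutive headers: register each one, remember the last.
            for key in group:
                result[key] = ''
        elif key is not None:
            # Body lines before any header are skipped.
            result[key] = ''.join(group)
    return result


# Hypothetical sample data mirroring the last record in the question.
sample = [">test3\n", "this is line 5 and\n", "who are you\n"]
print(records_from_lines(sample))
```

This avoids the intermediate `kvs` list and the even/odd slicing, at the cost of a little extra state in `key`.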

dawg · Answer 3 · 2014-07-08T17:14:40.640

0

You can use a regex:

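import re

fn = "/home/ak/Desktop/python_files/smalltext.txt"    # path from the question

di = {}
# group 1: a header line starting with '>'; group 2: everything up to the next header or the end of the text
pat = re.compile(r'^(>.*?)$(.*?)(?=^>|\Z)', re.S | re.M)
with open(fn) as f:
    txt = f.read()
    for k, v in ((m.group(1), m.group(2)) for m in pat.finditer(txt)):
        di[k] = v.strip()

print(di)

# {'>test1': 'this is line 1\nhi there', '>test2': 'this is line 3\nhow are you', '>test3': 'this is line 5 and\nwho are you'}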
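
The regex version keeps the newlines inside each value. To drop them, as the question asks, one option is to split each matched body into lines and join the pieces back together. A minimal sketch, assuming the whole file fits in memory; `parse_text` and `sample` are hypothetical names used only for this example:

```python
import re

# Same pattern as above: header line, then everything up to the next header or end of text.
RECORD = re.compile(r'^(>.*?)$(.*?)(?=^>|\Z)', re.S | re.M)


def parse_text(txt):
    """Return {header: body} with the body's line breaks removed."""
    return {
        m.group(1): ''.join(m.group(2).splitlines())
        for m in RECORD.finditer(txt)
    }


# Hypothetical sample text mirroring the first two records in the question.
sample = ">test1\nthis is line 1\nhi there\n>test2\nthis is line 3\nhow are you\n"
print(parse_text(sample))
```

`str.splitlines()` discards the line terminators, so joining the pieces with an empty string yields each value as one line.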

edited Jul 08 '14 at 17:14

answered Jul 08 '14 at 16:35

