4

I know how to do it if it's just two columns, but what if the file looks like this:

01 asd 023 green
01 dff 343 blue
02 fdf 342 yellow
02 fff 232 brown
02 kjf 092 green
03 kja 878 blue

Say I would like column 2 to hold the keys of my dictionary and column 4 to hold the value for each key. One workaround I thought of is to delete the other, unneeded columns so that only the two I need remain, and then use a script I also saw on this website to build the dictionary:

Python - file to dictionary?

Of course, this is only a way around the problem. Any tip is greatly appreciated.

Dergyll
  • How do you know your keys are in column 2 and the values in column 4? How are the columns delimited? Always by spaces like in your example? – Lukas Graf Dec 06 '12 at 19:16
  • Yes, sorry about that. They are separated by spaces, and this is just an example. Let's say that for this example my keys are always in column 2 and my values always in column 4. – Dergyll Dec 06 '12 at 19:18
  • Ok, then I would go with @NPE's solution. – Lukas Graf Dec 06 '12 at 19:21

4 Answers

5
d = {}
with open('data.txt') as f:
  for line in f:
    tok = line.split()
    d[tok[1]] = tok[3]
print(d)

This produces

{'kja': 'blue', 'kjf': 'green', 'fdf': 'yellow', 'asd': 'green', 'fff': 'brown', 'dff': 'blue'}

split() (without an argument) splits each line on runs of whitespace into a list of strings. tok[1] and tok[3] then use list indexing to pick the second and fourth values from that list, and the assignment stores them as a key and its value in the dictionary (d[key] = value).
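As a rough sketch of the same idea with the column positions made configurable: the function name `columns_to_dict` and its parameters are hypothetical, not part of the answer above, and the sample lines are copied from the question. It also assumes that lines with too few columns should simply be skipped, which may not be what you want.

```python
def columns_to_dict(lines, key_col=1, value_col=3):
    """Build a dict from whitespace-separated lines (0-based column indexes)."""
    result = {}
    needed = max(key_col, value_col)
    for line in lines:
        tok = line.split()          # split on any run of whitespace
        if len(tok) <= needed:
            continue                # assumption: skip blank or short lines
        result[tok[key_col]] = tok[value_col]
    return result


# Sample lines from the question; a file object opened with open() works too.
sample = [
    "01 asd 023 green\n",
    "01 dff 343 blue\n",
    "02 fdf 342 yellow\n",
    "02 fff 232 brown\n",
    "02 kjf 092 green\n",
    "03 kja 878 blue\n",
]
d = columns_to_dict(sample)
print(d["asd"], d["kja"])
```

If the same key appears on more than one line, the later line overwrites the earlier one, exactly as in the loop above.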

NPE
  • Thank you for the help. What you did was read the file in (line by line) and split the columns up. The d[tok[1]] = tok[3] part means you use the tok[1] as column 1, and tok[3] as column 4, right? Thanks a lot for the simple-to-understand code; I hope my beginner's vocabulary didn't annoy anyone. Thanks for the rapid response as well! – Dergyll Dec 06 '12 at 19:22
  • @Dergyll: You're welcome. Yes, `tok[1]` (column 2) is the key and `tok[3]` (column 4) is the value. – NPE Dec 06 '12 at 19:23
  • @NPE Just to note that in `line.strip().split()` the `strip()` is redundant because of the behaviour of `split()` / `split(None)` – Jon Clements Dec 06 '12 at 19:29
  • Hello NPE, I get a funky error: "AttributeError: 'list' object has no attribute 'split'". What does that mean? Is my original file in a weird format or something? It's a txt file... – Dergyll Dec 06 '12 at 19:50
  • @Dergyll: This exact code works for me. Without seeing the exact code that you're running and the exception stack trace, it's hard to say what's going wrong in your case. – NPE Dec 06 '12 at 19:52
2

Something like

from operator import itemgetter
keyval = itemgetter(1, 3)
with open('file') as fin:
    keyvals = (keyval(line.split()) for line in fin)
    my_dict = dict(keyvals)

Notes:

This differs from @NPE's answer in that it uses the builtin dict for initialisation, rather than declaring the dictionary outside the loop. It also uses itemgetter as a key retrieval function that takes the 2nd and 4th values from each line (when split on whitespace), and a generator expression to apply that to each line in the file.

There's also a slight advantage (although usually not that important): should my_dict = dict(keyvals) fail, the name never ends up being bound, whereas if something goes wrong while assigning key by key, a dict declared outside the with statement can end up "dirty".
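To make that last point concrete, here is a small sketch comparing the two styles on deliberately bad input. The sample lines and the names `partial` and `result` are made up for illustration; the second line is missing columns on purpose.

```python
from operator import itemgetter

keyval = itemgetter(1, 3)
# Hypothetical input: the second line has too few columns.
lines = ["01 asd 023 green\n", "02 short\n"]

# Key-by-key assignment: the dict keeps whatever was added before the failure.
partial = {}
try:
    for line in lines:
        tok = line.split()
        partial[tok[1]] = tok[3]
except IndexError:
    pass

# dict() over a generator: the assignment never happens if any line fails,
# so `result` keeps its previous value (None here).
result = None
try:
    result = dict(keyval(line.split()) for line in lines)
except IndexError:
    pass

print(partial)  # {'asd': 'green'}
print(result)   # None
```

Whether the all-or-nothing behaviour matters depends on what you do after an error; for a one-off script it rarely does.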

Jon Clements
0
d = {}
with open("file.txt") as f:
    for line in f:
        (key1, val1, key2, val2) = line.split()
        d[int(key1)] = val1
        d[int(key2)] = val2

This will get you all of them, treating columns 1 and 3 as keys and columns 2 and 4 as their values. Otherwise, you can do something along the lines of NPE's answer.

Emil Ivanov
0
my_dic = {}
with open('file.txt') as f:  # 'file.txt' is a placeholder file name
    for line in f:
        my_dic[line.split()[1]] = line.split()[3]
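
A side note on the separator, offered as a small illustration rather than as part of the answer: `split(' ')` and `split()` are not equivalent. With an explicit `' '`, the trailing newline stays attached to the last column and repeated spaces produce empty strings, while `split()` with no argument splits on any run of whitespace and drops leading and trailing whitespace. The sample strings below are made up for the demonstration.

```python
line = "01 asd 023 green\n"

with_space = line.split(' ')   # newline stays on the last field
no_arg = line.split()          # newline is treated as whitespace and dropped

print(with_space)  # ['01', 'asd', '023', 'green\n']
print(no_arg)      # ['01', 'asd', '023', 'green']

# Repeated or leading spaces yield empty strings with an explicit separator.
padded = "  01  asd "
print(padded.split(' '))  # ['', '', '01', '', 'asd', '']
print(padded.split())     # ['01', 'asd']
```

This is also why calling `strip()` before a bare `split()` is redundant.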
Jon Martin