105

I have a file comprising two columns, i.e.,

1 a 
2 b 
3 c

I wish to read this file to a dictionary such that column 1 is the key and column 2 is the value, i.e.,

d = {1:'a', 2:'b', 3:'c'}

The file is small, so efficiency is not an issue.

martineau
Darren J. Fitzpatrick

11 Answers

178
d = {}
with open("file.txt") as f:
    for line in f:
        (key, val) = line.split()
        d[int(key)] = val
Vlad H
  • 1
    Could you explain the with statement ? – VGE Jan 26 '11 at 11:34
  • 17
    `with` is used here to handle the file cleanup. When you leave the block (either by normal execution flow or by an exception), the file will be closed automatically. You can read more about context managers in Python here: http://effbot.org/zone/python-with-statement.htm – Vlad H Jan 26 '11 at 11:49
  • 1
    `for line in open("file.txt"):` does the cleanup the same way. And if `f` is a local variable, it is released when the scope is lost. The only case where this statement is useful is in a long function (not good for quality), or if you use a global variable. – VGE Jan 28 '11 at 08:41
  • 1
    @VGE, `for line in open('file.txt')` does *not* do cleanup the same way. Not all Python implementations are the same. `with` guarantees the file will be closed when the block is exited. When the `for` loop is complete, `close` *may* be called. In `CPython` it will, but implementations like `IronPython` have lazy garbage collectors. – Mark Tolonen Apr 02 '13 at 02:04
  • 2
    Is int really necessary here? Perhaps he wanted the numbers to be strings? – GL2014 Oct 08 '14 at 17:01
  • For those with a Java background, the `with` keyword is similar to Java 7's [try-with-resources](https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html) – mkobit Nov 25 '15 at 18:26
  • reading empty lines will throw a `ValueError: need more than 0 values to unpack` - ignore empty lines & comments after the `for` with: `if not line.startswith('#') and line.strip():` – Stuart Cardall Dec 13 '16 at 20:56
  • is there a faster way to set the dictionary such as JSON.stringify the file instead of reading each line? if it is a large file – Ridhwaan Shakeel Jun 26 '19 at 16:10
  • key, val = line.split()[0],line.split()[1] – Ridhi Mar 23 '20 at 18:00
  • @VladH This was a **Brilliant** solution! Thanks! Still works over here, in Corona 2021. //Wishes! – William Martens Apr 07 '21 at 16:47
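The comments above raise two practical points: `with` closes the file when the block exits, and blank or comment lines break the two-value unpacking. A minimal sketch combining both, assuming `#` marks a comment line (the helper name `read_int_keyed` and the sample file contents are made up for illustration):

```python
import os
import tempfile


def read_int_keyed(path):
    """Read 'key value' lines into a dict, skipping blank and '#' lines."""
    d = {}
    with open(path) as f:
        for line in f:
            stripped = line.strip()
            if not stripped or stripped.startswith("#"):
                continue  # ignore empty lines and comments
            key, val = stripped.split()
            d[int(key)] = val
    # The with block has exited, so the file is already closed here.
    assert f.closed
    return d


# Hypothetical sample data written to a temporary file for the demo.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("# id letter\n1 a \n\n2 b \n3 c\n")

result = read_int_keyed(tmp.name)
os.remove(tmp.name)
print(result)
```

Stripping each line first means trailing spaces and a blank line in the middle of the file do not raise `ValueError`.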
18

This will leave the key as a string:

with open('infile.txt') as f:
  d = dict(x.rstrip().split(None, 1) for x in f)
wim
Ignacio Vazquez-Abrams
  • 4
    A simple `dict([line.split() for line in f])` is sufficient, imo. – user225312 Jan 26 '11 at 12:02
  • @sukhbir: if you read question, you'll see that's not what op wants. – SilentGhost Jan 26 '11 at 12:04
  • @SilentGhost: I read that the OP wants keys as integers, but Ignacio's solution (as well as the one I deleted), has keys as a string (as pointed out by Ignacio himself). – user225312 Jan 26 '11 at 12:08
  • I was confused about why we don't need [] when passing in the dict argument, i.e. `dict([x.rstrip().split(None, 1) for x in f])` instead of `dict(x.rstrip().split(None, 1) for x in f)`. For those wondering the same thing, the latter is a generator expression rather than a list comprehension, as explained here: https://www.python.org/dev/peps/pep-0289 (PEP 289). Learnt something new! – peaxol Oct 07 '17 at 18:59
  • 1
    @peaxol: We use a generator expression instead of a list comprehension in order to not create an intermediate list. – Ignacio Vazquez-Abrams Oct 07 '17 at 19:00
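To illustrate the point in the last two comments: `dict()` accepts any iterable of two-item sequences, so a list comprehension and a generator expression build the same dictionary; the generator form simply avoids creating the intermediate list. A small sketch using `io.StringIO` as a stand-in for the open file (the sample text is assumed):

```python
import io

text = "1 a \n2 b \n3 c\n"

# List comprehension: builds the full list of pairs first.
from_list = dict([x.rstrip().split(None, 1) for x in io.StringIO(text)])

# Generator expression: pairs are fed to dict() one at a time.
from_gen = dict(x.rstrip().split(None, 1) for x in io.StringIO(text))

print(from_list == from_gen)
print(from_gen)
```

In both cases the keys stay strings, as the answer notes.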
11

You can also use a dict comprehension like:

with open("infile.txt") as f:
    d = {int(k): v for line in f for (k, v) in [line.strip().split(None, 1)]}
wim
  • Yeah, you can absolutely use a comprehension here. But I find myself doing it less and less, as it breaks a few of the Zen of Python rules ("Explicit is better than implicit" and "Readability counts") – Peter Kassenaar Oct 24 '22 at 07:44
5
def get_pair(line):
    key, sep, value = line.strip().partition(" ")
    return int(key), value

with open("file.txt") as fd:    
    d = dict(get_pair(line) for line in fd)
tokland
3

By dictionary comprehension

d = { line.split()[0] : line.split()[1] for line in open("file.txt") }

Or By pandas

import pandas as pd 
d = pd.read_csv("file.txt", delimiter=" ", header = None).to_dict()[0]
Samer Ayoub
  • By pandas only takes the first column – Maulik Madhavi Jan 28 '20 at 02:28
  • 1
    @Samer Ayoub The above solution (dictionary comprehension) works if both the keys and the values are one word long. If my text file has the following data, how do I make the year the key and the winning team the value? 1903 Boston Americans 1904 No World Series 1905 New York Giants 1906 Chicago White Sox 1907 Chicago Cubs 1908 Chicago Cubs – Ridhi Mar 23 '20 at 17:55
  • 1
    @Ridhi Sorry for belated reply. You can either split on the first space only https://stackoverflow.com/questions/30636248/split-a-string-only-by-first-space-in-python Or Use a regular expression as argument for split() – Samer Ayoub Mar 29 '20 at 17:08
  • @SamerAyoub- Thank you. – Ridhi Mar 29 '20 at 17:53
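For data like the example in the comment above, where the value contains spaces, splitting on the first run of whitespace only keeps the rest of the line intact. A sketch, assuming one record per line (the comment shows the data run together, so the line breaks here are an assumption):

```python
import io

# Stand-in for the file; the one-record-per-line layout is assumed.
data = io.StringIO(
    "1903 Boston Americans\n"
    "1904 No World Series\n"
    "1905 New York Giants\n"
)

winners = {}
for line in data:
    # maxsplit=1: split once, so the team name keeps its spaces
    year, team = line.strip().split(None, 1)
    winners[int(year)] = team

print(winners[1904])
```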
2

Simple Option

Most methods for storing a dictionary use JSON, Pickle, or line reading. Provided you're not editing the dictionary outside of Python, this simple method should suffice even for complex dictionaries, although Pickle will be better for larger ones.

x = {1:'a', 2:'b', 3:'c'}
f = 'file.txt'
with open(f, 'w') as out:
    print(x, file=out)        # file.txt now holds {1: 'a', 2: 'b', 3: 'c'}
with open(f) as src:
    y = eval(src.read())      # eval runs the file's contents: trusted files only
print(x==y)                   # >>> True
A. West
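If the file might come from anywhere other than your own program, `eval` will execute whatever expression it contains. One hedged alternative is `ast.literal_eval` from the standard library, which only accepts Python literals. A sketch that mirrors the round trip in this answer, using a temporary file as a stand-in for `file.txt`:

```python
import ast
import os
import tempfile

x = {1: 'a', 2: 'b', 3: 'c'}

# Write the dict's repr to a temporary file (stand-in for file.txt).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as out:
    print(x, file=out)

with open(out.name) as src:
    y = ast.literal_eval(src.read())  # parses literals only, runs no code

os.remove(out.name)
print(x == y)
```

This only works for dictionaries whose keys and values are themselves literals (numbers, strings, tuples, lists, dicts, and so on).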
0

IMHO a bit more pythonic to use generators (probably you need 2.7+ for this):

with open('infile.txt') as fd:
    pairs = (line.split(None) for line in fd)
    res   = {int(pair[0]):pair[1] for pair in pairs if len(pair) == 2 and pair[0].isdigit()}

This will also filter out lines that do not start with an integer or do not contain exactly two items.

Holger Bille
0

If you love one liners, try:

import re
d=eval('{'+re.sub(r'\'[\s]*?\'','\':\'',re.sub(r'([^'+input('SEP: ')+',]+)','\''+r'\1'+'\'',open(input('FILE: ')).read().rstrip('\n').replace('\n',',')))+'}')

Input FILE = Path to file, SEP = Key-Value separator character

Not the most elegant or efficient way of doing it, but quite interesting nonetheless :)

sarathrami
0

I had a requirement to take values from a text file and use them as key-value pairs. The content of my text file has the form key = value, so I used the split method with "=" as the separator and wrote the code below:

d = {}
with open("filename.txt") as file:        # close the file when done
    for x in file:
        key, value = x.split("=", 1)      # split on the first "=" only
        d[key.strip()] = value.strip()

By using the strip method, any spaces before or after the "=" separator are removed, and you get the expected data in dictionary format.

  • Hi there, welcome to Stack Overflow! Your approach is distinct from other users, but could you edit it to replace the `=` with a ` ` to answer the question? – Prunus Persica Sep 17 '20 at 08:32
-1

Here's another option...

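import csv
import os

events = {}
# path is assumed to be defined elsewhere; Python 3's csv wants text mode with newline=''
with open(os.path.join(path, 'events.txt'), newline='') as f:
    for line in csv.reader(f):
        if not line or line[0].startswith("#"):  # skip blank rows and comments
            continue
        events[line[0]] = line[1] if len(line) == 2 else line[1:]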
Robel Robel Lingstuyl
-1
import re

d = {}
with open('file.txt', 'r') as my_file:
    for i in my_file:
        g = re.search(r'(\d+)\s+(.*)', i)  # match a line containing an int and a string
        if g:                              # skip lines that do not match
            d[int(g.group(1))] = g.group(2)
VGE
  • I don't think this is the best approach. – Donovan Jan 26 '11 at 11:31
  • @Seafoid said "The file is small, so efficiency is not an issue." `split()` does not work almost silently if the file format is not sane. – VGE Jan 26 '11 at 12:49