5

I've searched pretty hard and can't find a question that exactly pertains to what I want to do.

I have a file called "words" that has about 1000 lines of random A-Z sorted words...

10th
1st
2nd
3rd
4th
5th
6th
7th
8th
9th
a
AAA
AAAS
Aarhus
Aaron
AAU
ABA
Ababa
aback
abacus
abalone
abandon
abase
abash
abate
abater
abbas
abbe
abbey
abbot
Abbott
abbreviate
abc
abdicate
abdomen
abdominal
abduct
Abe
abed
Abel
Abelian

I am trying to load this file into a dictionary, where the words are the values and the keys are auto-generated (auto-incremented) for each word,

e.g. {0: '10th', 1: '1st', 2: '2nd'} and so on.

Below is the code I've cobbled together so far. It sort of works, but it only shows me the last entry in the file as the only key/value pair in the dict.

f3data = open('words')

mydict = {}

for line in f3data:
    print line.strip()
    cmyline = line.split()
    key = +1
    mydict [key] = cmyline

print mydict
jed

7 Answers

3
key = +1

+1 is the same thing as 1. I assume you meant key += 1. I also can't see a reason why you'd split each line when there's only one item per line.
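
To make the difference concrete, here is a minimal sketch of the manual loop with the counter fixed. It uses Python 3 syntax and a small in-memory list (`sample_lines`, a made-up stand-in for the lines of the words file) so it runs without the file. Note that the counter also has to be initialised before the loop, which the original code did not do.

```python
# Hypothetical stand-in for iterating over the 'words' file.
sample_lines = ["10th\n", "1st\n", "2nd\n"]

# Unary plus: `key = +1` just assigns 1 every time round the loop.
key = +1
assert key == 1

mydict = {}
key = 0  # the counter must exist before `key += 1` can be used
for line in sample_lines:
    mydict[key] = line.strip()  # strip() drops the trailing newline
    key += 1                    # augmented assignment: key = key + 1

print(mydict)
```

With `key = +1` every word is written to key 1, so each one overwrites the previous entry, which is why only the last word survived.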

However, there's really no reason to do the looping yourself.

with open('words') as f3data:
    mydict = dict(enumerate(line.strip() for line in f3data))
Karl Knechtel
  • I was using split() to remove the "\n" char at the end of each word. I tried .strip() but it threw an error. – jed Nov 02 '11 at 02:00
  • You're going to have to be more specific than that. `.strip()` is tried, tested and true. – Karl Knechtel Nov 02 '11 at 02:01
  • Ahh, thank you. I was working this out on my own, and while others have a much more Pythonic way of doing it, I was trying to make the original code work. I think the key increment was the only line I didn't consider... – billinkc Nov 02 '11 at 02:07
  • @KarlKnechtel I think it was because I was doing the split and then the strip, which is what was throwing the error. Yes, you're absolutely correct, it's tried/true/tested and of course now works without error for me – jed Nov 02 '11 at 03:11
  • Of course; splitting creates a list, which can't be stripped because lists aren't strings. – Karl Knechtel Nov 02 '11 at 03:12
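
A quick sketch of what the comments above describe, in Python 3 syntax. The `line` value is just a made-up sample, and the try/except is only there so the snippet runs to completion:

```python
line = "abacus\n"

parts = line.split()      # split() returns a list of words
stripped = line.strip()   # strip() returns a string

# Lists have no strip() method, so split-then-strip fails.
try:
    parts.strip()
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__

print(parts, repr(stripped), error_name)
```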
1
dict(enumerate(x.rstrip() for x in f3data))

But your actual error is key = +1, where you meant key += 1.

Ignacio Vazquez-Abrams
1

The use of zero-based numeric keys in a dict is very suspicious. Consider whether a simple list would suffice.

Here is an example using a list comprehension:

>>> mylist = [word.strip() for word in open('/usr/share/dict/words')]
>>> mylist[1]
'A'
>>> mylist[10]
"Aaron's"
>>> mylist[100]
"Addie's"
>>> mylist[1000]
"Armand's"
>>> mylist[10000]
"Loyd's"

I use str.strip() to remove whitespace and newlines, which are present in /usr/share/dict/words. This may not be necessary with your data.
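
As a rough illustration of the choice between the stripping methods (Python 3 syntax; the `line` value is a made-up sample with extra spaces added to make the differences visible):

```python
line = "  Aaron's \n"   # hypothetical line as it might come from a file

both = line.strip()               # removes whitespace on both sides
right = line.rstrip()             # removes trailing whitespace only
newline_only = line.rstrip("\n")  # removes only the trailing newline

print(repr(both), repr(right), repr(newline_only))
```

If the words themselves never carry leading or trailing spaces, all three give the same result.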

However, if you really need a dictionary, Python's enumerate() built-in function is your friend here, and you can pass the output directly into the dict() function to create it:

>>> mydict = dict(enumerate(word.strip() for word in open('/usr/share/dict/words')))
>>> mydict[1]
'A'
>>> mydict[10]
"Aaron's"
>>> mydict[100]
"Addie's"
>>> mydict[1000]
"Armand's"
>>> mydict[10000]
"Loyd's"
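
The two sessions above give identical lookups, which is the argument for the list. A small sketch of where the two containers start to differ, using Python 3 syntax and an in-memory `fake_file` (a hypothetical stand-in for the real words file):

```python
import io

# Hypothetical stand-in for the words file, so the sketch runs anywhere.
fake_file = io.StringIO("10th\n1st\n2nd\n3rd\n")

mylist = [word.strip() for word in fake_file]
mydict = dict(enumerate(mylist))

# Lookups by position are the same for both containers.
same_lookups = all(mylist[i] == mydict[i] for i in range(len(mylist)))

# They differ once an entry is removed: the list closes the gap,
# while the dict simply no longer has that key.
del mylist[0]
del mydict[0]
print(same_lookups, mylist[0], 0 in mydict)
```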
johnsyweb
1
f3data = open('words')
print f3data.readlines()
Lachezar
0

With keys that dense, you don't want a dict, you want a list.

with open('words') as fp:
    data = list(map(str.strip, fp))  # list() so this is a real list on Python 3 as well

But if you really can't live without a dict:

with open('words') as fp:
    data = dict(enumerate(X.strip() for X in fp))
Cat Plus Plus
0
{index: x.strip() for index, x in enumerate(open('filename.txt'))}

This code uses a dictionary comprehension and the enumerate built-in, which takes an iterable (in this case, the file object, which yields each line when iterated over) and returns an index along with each item. The dictionary is then built from those index/text pairs.
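
To see what enumerate actually hands to the comprehension, here is a small sketch in Python 3 syntax. `fake_file` is a hypothetical in-memory stand-in for the real file, and the `start=1` variant is only an illustration of the optional start argument, not something the question asked for:

```python
import io

# Hypothetical stand-in for open('filename.txt').
fake_file = io.StringIO("10th\n1st\n2nd\n")

# enumerate yields (index, item) pairs; the lines still end in '\n'.
pairs = list(enumerate(fake_file))

# Rewind and build the dict, this time counting from 1 instead of 0.
fake_file.seek(0)
from_one = {index: line.strip() for index, line in enumerate(fake_file, start=1)}

print(pairs)
print(from_one)
```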

One question: why not just use a list if all of your keys are integers?

Finally, your original code should be

f3data = open('words')

mydict = {}

for index, line in enumerate(f3data):
    cmyline = line.strip()
    mydict[index] = cmyline

print mydict
li.davidm
0

Putting the words in a dict makes no sense. If you're using numbers as keys you should be using a list.

from __future__ import with_statement

with open('words.txt', 'r') as f:
    lines = f.readlines()


words = {}

for n, line in enumerate(lines):
    words[n] = line.strip()

print words
Acorn