1

I am making a flashcard program that takes a text file containing several columns, such as English word, French equivalent, gender, type of word, etc. My idea was to create a loop that reads each line of the text file, splits it on tabs, and makes an instance of a user-defined Word object for each line.

In the following code block I import the text file, process it into a list, then attempt to create an instance of a previously defined object, Word. I would like the object to have the second item on the list as its name so that it is easily searchable, but it's not letting me do this. Can somebody please help me with the code?

    file = (open('dictionary.txt', 'r')).readline()
    import re
    line_list = re.split(r'\t', file.rstrip('\n')) 

    line_list[1] = Word(line_list[0], line_list[1], line_list[2], line_list[3]) 
user2980115
  • What does "not letting me do this" mean? Are you getting an exception? What is the expected order of the arguments to your `Word` class's constructor? – Blckknght Nov 12 '13 at 00:08

5 Answers

3

Create a dict of instances and use the second item of each list as the key. Creating variable names dynamically is a bad idea.

import re
instance_dict = {}
with open('dictionary.txt') as f:
    for line in f:
        line_list = re.split(r'\t', line.rstrip('\n')) 
        instance_dict[line_list[1]] = Word(*line_list[:4]) 
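Once the dict is built, each object is retrieved by its key rather than by a variable name. The sketch below is only an illustration: the Word class, its field order, and the sample rows are assumptions, and an in-memory buffer stands in for dictionary.txt.

```python
import io


class Word:
    """Hypothetical stand-in for the asker's Word class (field order assumed)."""

    def __init__(self, english, french, gender, word_type):
        self.english = english
        self.french = french
        self.gender = gender
        self.word_type = word_type


# In-memory stand-in for dictionary.txt; the rows are made-up sample data.
sample = io.StringIO("dog\tchien\tm\tnoun\ncat\tchat\tm\tnoun\n")

instance_dict = {}
for line in sample:
    fields = line.rstrip('\n').split('\t')
    # Key each instance by the second column (the French word here).
    instance_dict[fields[1]] = Word(*fields[:4])

word = instance_dict['chien']      # look the object up by its key
print(word.english, word.gender)   # read attributes off the instance
print(vars(word))                  # all instance attributes as a dict
```

The object itself has no name; the key 'chien' plays that role, so any word read from the file can be found with instance_dict[some_word].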

Why the with statement?

It is good practice to use the with keyword when dealing with file objects. This has the advantage that the file is properly closed after its suite finishes, even if an exception is raised on the way.
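A small sketch of that guarantee, assuming a throwaway temporary file in place of dictionary.txt and a deliberately raised exception (both are illustrative, not part of the original code):

```python
import os
import tempfile

# Create a throwaway file so the example does not depend on dictionary.txt.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as tmp:
    tmp.write("dog\tchien\tm\tnoun\n")

try:
    with open(path) as f:
        first_line = f.readline()
        # Simulate a failure in the middle of processing the file.
        raise ValueError("simulated failure while processing")
except ValueError:
    pass

# The with block closed the file even though an exception was raised inside it.
closed_after_error = f.closed
print(closed_after_error)

os.remove(path)  # clean up the temporary file
```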

Ashwini Chaudhary
  • Hi, thanks for the comment. I can't quite work out what is going on in your script. Are you making a dictionary where the key is line_list[1] and the value is a Word object? If so, what is the object named? I can't seem to make it print its attributes. Thanks – user2980115 Nov 12 '13 at 13:49
  • Ok, I've had another look at this and it works perfectly - exactly what I was looking for, thanks – user2980115 Nov 14 '13 at 13:38
1

You can also use the csv module:

import csv

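# Python 3: open in text mode with newline=''. On Python 2, use open('dictionary.txt', 'rb').
with open('dictionary.txt', newline='') as f:
    reader = csv.reader(f, delimiter='\t')
    # Map the second column of each row to a Word built from all of its columns.
    instances = {line[1]: Word(*line) for line in reader}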
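One practical difference from a plain split on tabs is that csv.reader understands quoted fields. The following is a hedged sketch with made-up rows, an in-memory buffer instead of dictionary.txt, and a hypothetical namedtuple standing in for Word:

```python
import csv
import io
from collections import namedtuple

# Hypothetical stand-in for the asker's Word class.
Word = namedtuple('Word', ['english', 'french', 'gender', 'word_type'])

# Made-up rows: the first field of the second row is quoted and contains a tab,
# and there is a blank line in the middle.
raw = 'dog\tchien\tm\tnoun\n\n"ice\tcream"\tglace\tf\tnoun\n'

reader = csv.reader(io.StringIO(raw), delimiter='\t')
# Blank lines come back as empty lists, so "if row" skips them.
instances = {row[1]: Word(*row) for row in reader if row}

# A plain split would break the quoted field into two columns.
naive_fields = raw.splitlines()[2].split('\t')
print(len(naive_fields), instances['glace'].english)
```

If the file never contains quotes or embedded tabs, the split-based answers behave the same way; csv mainly helps when the data is less tidy.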
Maciej Gol
0

Here's a cleaner solution using a namedtuple. You'll end up with a dict called "words" that you can use to look up each word by name.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import pprint
from collections import namedtuple

Word = namedtuple('Word', ['name', 'french', 'gender', 'type_'])

words = {}
with open('dictionary.txt') as fin:  # default text mode; the 'rU' flag was removed in Python 3.11
    for word in (Word(*r.rstrip('\n').split('\t')) for r in fin):
        words[word.name] = word

pprint.pprint(words)
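A short usage sketch of what the namedtuple gives you, using one made-up line of data rather than the real file:

```python
from collections import namedtuple

Word = namedtuple('Word', ['name', 'french', 'gender', 'type_'])

row = "dog\tchien\tm\tnoun"        # made-up sample line
word = Word(*row.split('\t'))

print(word.french)                 # access a field by name
print(word[1])                     # positional access still works
print(word._asdict())              # mapping of field names to values

# namedtuples are immutable; _replace returns a modified copy instead.
updated = word._replace(gender='f')
```

Because the instances are immutable, this fits data that is loaded once and only looked up afterwards; a regular class is the better fit if the flashcards need to be edited in place.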
TkTech
0

Firstly, it's better to use with ... as statements to read input from files, since closing the file is taken care of automatically. Secondly, to read ALL of the lines from a file, you must use readlines() rather than readline(). Try something like this:

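import re

with open('dictionary.txt', 'r') as file:
    line_list = file.readlines()  # every line of the file, as a list

splitLineList = []
for line in line_list:
    # Drop the trailing newline, then split the line on tabs.
    splitLineList.append(re.split(r'\t', line.strip('\n')))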
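To make the readline() versus readlines() difference concrete, here is a small sketch that uses an in-memory buffer with made-up rows in place of dictionary.txt:

```python
import io

text = "dog\tchien\tm\tnoun\ncat\tchat\tm\tnoun\n"  # made-up sample data

first = io.StringIO(text).readline()        # a single line, returned as a str
all_lines = io.StringIO(text).readlines()   # every line, returned as a list of str

# Iterating over the file object yields the same lines one at a time,
# without building the intermediate list first.
split_rows = [line.rstrip('\n').split('\t') for line in io.StringIO(text)]

print(repr(first))
print(len(all_lines), split_rows[1])
```

This is why the code in the question only ever sees the first entry: readline() stops after one line.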
Monkeyanator
0

The appropriate solution depends on a few clarifications of your requirements.

"My idea was to create a loop that read each line of the text file, separating by tabs, and"

If the text file is already validated, or reliable enough that you can skip error handling (e.g. for fields not separated by exactly one tab), then

with open('dictionary.txt', 'r') as f:
    # Keep the result; a bare list comprehension would be thrown away.
    rows = [line.strip().split("\t")
            for line in f.read().split("\n")
            if line.strip()]

will get you the list (built with a list comprehension) needed to create Word object instances, without using re.
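If the file cannot be assumed clean, one possible approach is to check the number of fields per line and set aside the lines that do not match. This is only a sketch: the expected count of four fields and the sample rows are assumptions, and an in-memory buffer replaces dictionary.txt.

```python
import io

EXPECTED_FIELDS = 4  # assumption: english, french, gender, type of word

# Made-up data with a blank line and a malformed line.
sample = io.StringIO(
    "dog\tchien\tm\tnoun\n"
    "\n"
    "broken\tline\n"
    "cat\tchat\tm\tnoun\n"
)

rows, rejected = [], []
for lineno, line in enumerate(sample, start=1):
    if not line.strip():
        continue                          # skip blank lines
    fields = line.rstrip('\n').split('\t')
    if len(fields) == EXPECTED_FIELDS:
        rows.append(fields)
    else:
        rejected.append((lineno, fields))  # remember where the bad line was

print(len(rows), rejected)
```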

"then attempt to create an instance of a previously defined object: Word."

with open('dictionary.txt', 'r') as f:
    # Unpack each row with * so that every field becomes its own argument.
    words = [Word(*line.strip().split("\t"))
             for line in f.read().split("\n")
             if line.strip()]

"I would like the object to have the second item on the list for its name so that it is easily searchable,"

Can you rewrite this with an example?

"but it's not letting me do this,"

  line_list[1] = Word(line_list[0], line_list[1], line_list[2], line_list[3]) 

Sorry, I am losing you here. Why are you using line_list[1] to refer to the newly created Word instance when line_list[1] itself is one of the arguments?

With your clarification, I would do something like this (reworked code):

from pprint import pprint

My assumption about your class definition:

class Word():
    def __init__(self, **kwargs):
        self.set_attrs(**kwargs)

    def __call__(self):
        return self.get_attr("swedish_word")

    def set_attrs(self, **kwargs):
        for k, v in kwargs.items():  # items() works on both Python 2 and 3
            setattr(self, k, v)

    def get_attr(self, attr):
        return getattr(self, attr)

    def get_attrs(self):
        return ({attr.upper():getattr(self, attr) for attr in self.__dict__.keys()})

    def print_attrs(self):
        pprint(self.get_attrs())


if __name__ == '__main__':

# sample entries in dictionary.txt
#    swedish_word    english_word    article           word_type
#    hund            dog              ett                noun
#    katt            cat              ett                noun
#    sova            sleep            ett                verb

    with open('dictionary.txt', 'r') as f:
        header = f.readline().strip().split("\t")


        instances = [Word(**dict(zip(header, line.strip().split("\t"))))
                              for line in f.read().split("\n")
                                                  if line.strip()]

#        for line in f.read().split("\n"):
#             data = dict(zip(header, line.strip().split("\t")))
#             w = Word(**data)

You can get the instance properties for a given swedish_word like this:

def print_swedish_word_properties(swedish_word):
    for instance in instances:
       if instance() == swedish_word:
           print("Properties for Swedish Word:", swedish_word)
           instance.print_attrs()

print_swedish_word_properties("hund")

to have output like this

Properties for Swedish Word: hund
{'ARTICLE': 'ett',
 'ENGLISH_WORD': 'dog',
 'SWEDISH_WORD': 'hund',
 'WORD_TYPE': 'noun'}

or you can use any other class methods to search instances on various attributes
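As one possible way to search on other attributes, the sketch below uses a stripped-down, hypothetical version of the class above and a helper named find_words (an illustrative name, not part of the original code). The instances reuse the sample entries from the comments in the reworked code.

```python
class Word:
    """Hypothetical minimal version of the class assumed above."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)


instances = [
    Word(swedish_word='hund', english_word='dog', article='ett', word_type='noun'),
    Word(swedish_word='katt', english_word='cat', article='ett', word_type='noun'),
    Word(swedish_word='sova', english_word='sleep', article='ett', word_type='verb'),
]


def find_words(words, **criteria):
    """Return the words whose attributes match every keyword given."""
    return [w for w in words
            if all(getattr(w, name, None) == value
                   for name, value in criteria.items())]


nouns = find_words(instances, word_type='noun')
print([w.swedish_word for w in nouns])

# For repeated lookups on one attribute, a dict index avoids scanning the list.
by_swedish = {w.swedish_word: w for w in instances}
print(by_swedish['hund'].english_word)
```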

user2390183
  • Hi, I think the commenters above have already sorted this, although I haven't tried it yet. An example would be for the swedish word 'hund' - I would want to use the word 'hund' as the name of the object instance so as to be easily searchable, then within the object I would have the data self.englishword = 'dog', self.article = "ett", self.wordtype = 'noun', and also self.swedishword = 'hund', although that isn't necessary as it's already the name. According to earlier posters, if I understood, it's bad practice to use a variable to name an object. I will try their suggestions now. Thanks all – user2980115 Nov 12 '13 at 11:35
  • pls check Reworked Code above and let me know. thanks – user2390183 Nov 12 '13 at 16:00
  • Many thanks for the detailed response - it goes beyond my current knowledge of python however, so I'm not at all sure what a lot of your code means. What is the purpose of the double underscores? I'm familiar with it only in the context of the __init__ method when defining an object. – user2980115 Nov 14 '13 at 13:07
  • Names of the form `__name__` (double underscores on both sides) are reserved by Python. They are known as magic (or "dunder") attributes and methods. They provide handy behavior, as in the example above, without you writing a lot of code. As PEP 8 recommends, you should never invent new names of this form in your code; only use the documented ones. More details here: http://stackoverflow.com/questions/8689964/python-why-do-some-functions-have-underscores-before-and-after-the-functio – user2390183 Nov 15 '13 at 09:51
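To illustrate the last comment, here is a minimal sketch of three documented double-underscore methods and the syntax that triggers each one. The class is a simplified, hypothetical Word, not the asker's actual definition.

```python
class Word:
    """Simplified, hypothetical Word class used only to show dunder methods."""

    def __init__(self, swedish, english):
        # Called automatically by Word(...) to set up the new instance.
        self.swedish = swedish
        self.english = english

    def __repr__(self):
        # Called by repr() and when the object is shown in the interpreter.
        return "Word({!r}, {!r})".format(self.swedish, self.english)

    def __call__(self):
        # Called when the instance is used like a function: w().
        return self.swedish


w = Word('hund', 'dog')
print(repr(w))   # uses __repr__
print(w())       # uses __call__
```

You define these methods but rarely call them by name; Python invokes them when the matching syntax or built-in function is used.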