1

I'm very new to Python and not a programmer. I have this:

y1990=open('Documents/python/google-python-exercises/babynames/baby1990.html', 'r', encoding='utf8')
y1992=open('Documents/python/google-python-exercises/babynames/baby1992.html', 'r', encoding='utf8')
y1994=open('Documents/python/google-python-exercises/babynames/baby1994.html', 'r', encoding='utf8')
y1996=open('Documents/python/google-python-exercises/babynames/baby1996.html', 'r', encoding='utf8')
y1998=open('Documents/python/google-python-exercises/babynames/baby1998.html', 'r', encoding='utf8')
y2000=open('Documents/python/google-python-exercises/babynames/baby2000.html', 'r', encoding='utf8')
y2002=open('Documents/python/google-python-exercises/babynames/baby2002.html', 'r', encoding='utf8')
y2004=open('Documents/python/google-python-exercises/babynames/baby2004.html', 'r', encoding='utf8')
y2006=open('Documents/python/google-python-exercises/babynames/baby2006.html', 'r', encoding='utf8')
y2008=open('Documents/python/google-python-exercises/babynames/baby2008.html', 'r', encoding='utf8')

I want to write more succinct code, so I've thought of this:

path='Documents/python/google-python-exercises/babynames/baby'
years=[year for year in range(1990,2010,2)]
open(path+str(years[0])+'.html') # works

On the other hand

'y'+str(years[0]) #works fine and creates string 'y1990'

However when I try to

'y'+str(years[0])=open(path+str(years[0])+'.html')
  File "<stdin>", line 1
SyntaxError: can't assign to operator

As you can see, I'm trying to create the variable names and open the files dynamically. I've tried multiple ways along these lines and all produce similar errors. I've also found other posts dealing with what I think are similar issues, but I fail to see how the answers solve my situation (it might very well be my lack of experience with Python). People mention that lists or dictionaries are the way to go; does this apply to my problem too? How would I go about solving this? Is this even the right Python way?

xv70
  • Yes, that advice *always* applies whenever you find yourself wanting to dynamically create variables. – Daniel Roseman Jul 02 '14 at 18:11
  • Thank you all for the answers, really clarified my approach. I would upvote but I don't even have reputation to do that. You people rock. – xv70 Jul 02 '14 at 18:35

4 Answers

1

The problem you are seeing arises because you are trying to assign a value to an expression, whereas values can only be bound to names or container elements. A common beginner mistake is to try to create variable names dynamically. This is almost invariably a bad idea (for example, a variable created from your data might overwrite one that your program is already using).

Fortunately the dict, a convenient key-value store, comes to the rescue. You can create a dict with the simple statement

files = {}

and add to it using

files[years[0]] = open(path+str(years[0])+'.html')

You can then reference the files and read them using, for example

files[1990].readline()

In fact the dict values can be used in just the same way as any other file.
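
To fill the dict for every year rather than just the first, the same assignment can go inside a loop. The following is only a minimal sketch: the helper name `open_year_files` is made up for illustration, and it assumes the files follow the question's `baby<year>.html` naming.

```python
def open_year_files(path_prefix, years):
    """Return a dict mapping each year (an int) to an open file object.

    `open_year_files` is a hypothetical helper, not part of any library.
    """
    files = {}
    for year in years:
        # The key is the year itself; the value is the open file.
        files[year] = open(path_prefix + str(year) + '.html', 'r', encoding='utf8')
    return files


# Usage with the question's path (assumes those files exist on disk):
# files = open_year_files('Documents/python/google-python-exercises/babynames/baby',
#                         range(1990, 2010, 2))
# first_line = files[1990].readline()
# for f in files.values():
#     f.close()  # files opened this way should be closed when no longer needed
```

Because the keys are the integer years, `files[1990]` works but `files['1990']` would raise a `KeyError`.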

holdenweb
  • I see, so I end up with the dictionary files={'1990': 'text_in_file_1', '1992': 'text_in_file_2', ..., '2008': text_in_file_10}, then call each file by its key and read it or whatever needed, right? – xv70 Jul 02 '14 at 18:27
  • Yes, though the way your code is written the values aren't the _filenames_ but the open files themselves, so you can call all the usual file methods (`read()`, `readline()`, `readlines()`, etc.) – holdenweb Jul 02 '14 at 20:14
1

What you need is a dictionary:

years = {}
for year in range(1990, 2010,2):
    years[year] = open('Documents/python/google-python-exercises/babynames/baby{y}.html'.format(y=year), 'r', encoding='utf8')

That should work.

You can access the data like this:

years[1990] or
years[1992]
holdenweb
Tony Suffolk 66
1

This is tough to explain if you're not a programmer, but the issue here is that you can't build a variable name from an expression on the left-hand side of an assignment. The names from the top bit of code (e.g. y1992) have to be written explicitly in the code. This means that doing something like

y199 + 2 = ...
y199 + 4 = ...

is not legal in Python (or any other programming language I know of).

The good news is that there exist data structures that can store multiple things for easy access later. In this case you are trying to store a bunch of open files. In Python you can use a list or a dict. A list is an ordered collection that is accessible via the indices 0, 1, 2, etc., whereas a dict is a collection that lets you access items via a key.
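
The difference between the two kinds of access can be seen with plain strings before any files are involved. This is just an illustrative sketch; the names `names_list` and `names_dict` are made up here and the values are file names, not open files.

```python
years = list(range(1990, 2010, 2))  # 1990, 1992, ..., 2008

# A list is looked up by position: index 1 is the second item.
names_list = ['baby' + str(year) + '.html' for year in years]

# A dict is looked up by key: here the key is the year itself.
names_dict = {year: 'baby' + str(year) + '.html' for year in years}

print(names_list[1])     # second entry, i.e. the one for 1992
print(names_dict[1992])  # the same entry, found by its year
```

With the list you have to remember that 1992 sits at position 1; with the dict the year is the lookup key, which is usually the more readable choice for this problem.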

Using a list might look like

myfiles = []  #create an empty list
myfiles.append(open(path+str(years[0])+'.html'))
myfiles.append(open(path+str(years[1])+'.html'))
...
print(myfiles[1])

Using a dict might look like

myfiles = {} #create an empty dict
myfiles[years[0]] = open(path+str(years[0])+'.html')
myfiles[years[1]] = open(path+str(years[1])+'.html')
...
print(myfiles[1992])  # keys are the int years, e.g. years[1] == 1992

Both of these could be made more succinct by using a loop instead of the bunch of individual statements that I represent with the ...

Dict example with a loop:

myfiles = {} #create an empty dict
for year in years:
    myfiles[year] = open(path+str(year)+'.html')
print(myfiles[1992])  # keys are the int years, not strings like "y1992"
turbulencetoo
  • So actually the approach of creating variables is itself wrong huh? It requires a small but not trivial change in approach from variables to collection objects. Thanks for the clear answer. – xv70 Jul 02 '14 at 18:29
0

Here is the solution I came up with after reading the input of others in this thread:

path='/home/monorhesus/Documents/python/google-python-exercises/babynames/baby'
keys=[year for year in range(1990,2010,2)]
values=[open(path+str(year)+'.html').read() for year in keys]
files=dict(zip(keys, values))

For those who might have the same issue: the first line generates a string with the path name, the second line is a list comprehension that creates dictionary keys, third is a list comprehension that creates dictionary values (notice the .read, so it's the actual file dump), and the last one creates the dictionary from two lists.
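
One thing the snippet above leaves open is closing the files after reading them. A possible variation, sketched here with a hypothetical helper named `read_year_files`, uses a `with` block so each file is closed as soon as its text has been read:

```python
def read_year_files(path_prefix, years):
    """Return a dict mapping each year (an int) to that file's full text.

    `read_year_files` is a hypothetical helper, not part of any library.
    """
    contents = {}
    for year in years:
        # The with block closes the file automatically, even on errors.
        with open(path_prefix + str(year) + '.html', 'r', encoding='utf8') as f:
            contents[year] = f.read()
    return contents


# Usage with the path from this answer (assumes those files exist on disk):
# files = read_year_files(
#     '/home/monorhesus/Documents/python/google-python-exercises/babynames/baby',
#     range(1990, 2010, 2))
# print(len(files[1990]))
```

This builds the same kind of dictionary as `dict(zip(keys, values))`, just one entry at a time.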

xv70