2

I have a config file that contains a list of strings. I need to read these strings in order and store them in memory, and I'll be iterating over them many times when certain events take place. Since I don't need to add to or modify the list once it has been read from the file, a tuple seems like the most appropriate data structure.

However, I'm a little confused about the best way to construct the tuple in the first place, since it's immutable. Should I parse the strings into a list and then convert it to a tuple? Is that wasteful? Is there a way to get them into a tuple directly, without the overhead of copying and destroying the tuple every time I add a new element?

Falmarri

5 Answers

2

As you said, you're going to read the data gradually - so a tuple isn't a good idea after all, as it's immutable.

Is there a reason for not using a simple list for holding the strings?

adamk
  • Not at all. I was just curious what the most Pythonic way of doing it is, I guess. Is parsing it as a list and then converting to a tuple a reasonable approach? I know I'm prematurely optimizing, but at this point it's mostly for my own understanding – Falmarri Sep 14 '10 at 06:34
  • @Falmarri: Look at the other conversation on SO below. The access time is almost the same; only the creation time is slightly different. Since you will be modifying the elements, you will end up converting back and forth between tuple and list, and the faster creation time is mostly lost in the process. – pyfunc Sep 14 '10 at 06:38
1

I wouldn't worry about the overhead of first creating a list and then a tuple from that list. My guess is that the overhead will turn out to be negligible if you measure it.
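To make the "measure it" suggestion concrete, here is a minimal sketch using the standard `timeit` module. The helper names `load_as_list` and `load_as_tuple` and the sample data are hypothetical, made up for illustration. Timings depend on the machine and Python version, so none are quoted here.

```python
import timeit

# Hypothetical stand-in for the strings read from the config file.
sample_lines = ["option%d" % i for i in range(1000)]

def load_as_list(lines):
    """Build the list incrementally, as a parser typically would."""
    items = []
    for line in lines:
        items.append(line)
    return items

def load_as_tuple(lines):
    """Build the list first, then copy it once into a tuple."""
    return tuple(load_as_list(lines))

if __name__ == "__main__":
    # Time 1000 runs of each approach and print the total seconds.
    for fn in (load_as_list, load_as_tuple):
        seconds = timeit.timeit(lambda: fn(sample_lines), number=1000)
        print(fn.__name__, seconds)
```

The only extra work in `load_as_tuple` is a single copy of the finished list, which happens once at startup.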

On the other hand, I would stick with the list and iterate over that instead of creating a tuple. Tuples should be used for struct-like data and lists for lists of data, which is what your data sounds like to me.

Arlaharen
1

Since your data is changing, I am not sure you need a tuple. A list should do fine.

Look at the following, which should give you further information. Assigning a tuple is much faster than assigning a list. But if you need to modify elements every now and then, creating a tuple may not make sense.

pyfunc
  • Once the list/tuple is created in memory, it won't be changing until the program is rerun. Presumably it will be running for a long time before being restarted. – Falmarri Sep 14 '10 at 06:37
  • @Falmarri: Once the list is created, access time is mostly the same. So any advantage gained is negligible at best, and anyway you will need a list at creation time, as the elements are identified only during startup. – pyfunc Sep 14 '10 at 06:40
0
with open("config") as infile:
    config = tuple(infile)
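One thing to keep in mind: iterating over a file yields each line with its trailing newline, so `tuple(infile)` keeps those newlines (and any blank lines). If that is not wanted, a generator expression can clean the lines while the tuple is being built. The following is only a sketch; the function name `read_config` is hypothetical, and `io.StringIO` stands in for the real file so the example runs on its own.

```python
import io

def read_config(infile):
    """Return the non-blank lines of an open text file as a tuple, without trailing newlines."""
    return tuple(line.rstrip("\n") for line in infile if line.strip())

# io.StringIO stands in for open("config") so the sketch needs no file on disk.
sample = io.StringIO("alpha\nbeta\n\ngamma\n")
config = read_config(sample)
print(config)  # ('alpha', 'beta', 'gamma')
```

With a real file, the call would be `read_config(infile)` inside the same `with open("config") as infile:` block.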
John La Rooy
0

You may want to try using chained generators to create your tuple. You can use the generators to perform multiple filtering and transformation operations on your input without creating intermediate lists. All of the generator processing is delayed until iteration. In the example below the processing/iteration all happens on the last line.

Like so:

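with open('settings.cfg') as f:
    # Split each "key: value" line once on ':' and strip whitespace from both parts.
    step1 = (tuple(i.strip() for i in l.split(':', 1)) for l in f if len(l) > 2 and ':' in l)
    # Split the value on ',' only when the key contains 'Tag'; otherwise keep it as a string.
    step2 = ((k, v.split(',') if ',' in v and 'Tag' in k else v) for k, v in step1)
    # Nothing is read until tuple() consumes the chain, so this must run while the file is open.
    t = tuple(step2)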
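To see what the chained generators above produce, here is a self-contained sketch of the same two steps run against made-up input. The file contents and the names `sample`, `pairs`, `parsed`, and `settings` are hypothetical; `io.StringIO` stands in for `open('settings.cfg')`.

```python
import io

# Hypothetical file contents, one "key: value" pair per line.
sample = io.StringIO(
    "Name: demo\n"
    "Tags: red, green,blue\n"
    "\n"
    "no separator on this line\n"
    "Path: C:/tmp\n"
)

# Step 1: keep lines that contain ':' and split each one once into (key, value).
pairs = (tuple(part.strip() for part in line.split(":", 1))
         for line in sample if len(line) > 2 and ":" in line)

# Step 2: split the value on ',' only when the key contains 'Tag'.
parsed = ((key, value.split(",") if "," in value and "Tag" in key else value)
          for key, value in pairs)

# The input is only consumed here, when tuple() drives both generators.
settings = tuple(parsed)
print(settings)
# (('Name', 'demo'), ('Tags', ['red', ' green', 'blue']), ('Path', 'C:/tmp'))
```

Two details show up in the output: the items of a split value keep their surrounding whitespace (`' green'`), and the split value is a list, so the outer tuple is immutable but that inner list is not.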
freegnu