1

I am planning to break a long document in various strings, and save them as they are (so each string can have different length, i will not break them to strings with specific length).

I was planning to use an array, so as i scroll trough the document, i can just add each string to a position of the array and increment the index; and once that i am done, i can print the string one by one, keeping the original format of the document, and then nuke the array and start again with the next sequence.

Is this something that sounds correct or is there another data structure type that I could use, that would fit my needs and keep memory usage and processor cycles down to a reasonable level ( the documents that I have to process are pretty long and full of data, so I am concerned to keep too much busy the processor and use too much memory).

Thanks in advance!

user1006198
  • 4,681
  • 6
  • 21
  • 18

1 Answers1

3

All you've said you want to do with these strings is collect them in order, and then print them out in order. Based on that, a list sounds totally reasonable.

Laurence Gonsalves
  • 137,896
  • 35
  • 246
  • 299
  • I want to collect them, print them in the same order and with the same formatting that they have in the original doc and then just wipe out the data structure to make room for more strings; this will be done for each document. So a list would be more performing and a better solution for a task like this? – user1006198 Oct 27 '11 at 00:31
  • Of course, if nothing in the list needs to be changed, then a tuple is better since it's immutable. But both should work fine. The general consensus is, use a list if it needs to be changed, else use a immutable datatype (tuple in this case). Edit: Doesn't look like this applies though, since you want to change the data. And yes, a list should work just fine. :) – John Doe Oct 27 '11 at 00:33
  • 1
    According to [this discussion](http://stackoverflow.com/questions/176011/python-list-vs-array-when-to-use), you are only better off using arrays in some special cases, which do not include yours - a collection of strings which you are not going to change. Even outside Python, an array is better than list if you need random access, which once again isn't your case - you will just append strings to the collection and output them consecutively. – egor83 Oct 27 '11 at 00:37
  • @JohnDoe I agree that immutable types are preferable when they'll work, but I had the impression that they want to incrementally build up the sequence of strings. – Laurence Gonsalves Oct 27 '11 at 00:41
  • 4
    @egor83 I think there's some sort of confusion here... a "list" in Python has O(1) random access. The Python "list" type is equivalent to what Java (and I believe C#) calls an "ArrayList", or what C++'s STL calls a vector: it's a list backed by an array (a C array, in Python's case) of values, and it'll reallocate the array if it needs to grow. – Laurence Gonsalves Oct 27 '11 at 00:48
  • Thanks everyone; I will go for the list then! – user1006198 Oct 28 '11 at 20:56