3

I want to use Python to create a file that looks like

# empty in the first line
this is the second line
this is the third line

I tried to write this script

myParagraph = []
myParagraph[0] = ''
myParagraph[1] = 'this is the second line'
myParagraph[2] = 'this is the third line'

This throws an error: IndexError: list index out of range. Many answers to similar questions recommend using myParagraph.append('something'), which I know works. But I want to better understand how Python lists are initialized. How can I assign to a specific element of a list that isn't populated yet?

Matthias Fripp
F.S.
  • Use `myParagraph.append(line_add)` – dawg Aug 08 '16 at 21:42
  • @dawg please see the last part in my question. My question is how (or if it's possible) to directly access a specific element of a list. – F.S. Aug 08 '16 at 21:45
  • 2
    A list must be completely populated -- if `my_list[6]` exists, then `my_list[0]` - `my_list[5]` must also exist. In your case, if you assigned something to `myParagraph[5]`, what would you expect to be placed in the earlier positions? If you know how many lines you'll end up with, you could start with `myParagraph = [''] * num_lines`. Or if you want to store values for arbitrary indexes, and only those indexes, you could use a dictionary: `myParagraph = {}`. – Matthias Fripp Aug 08 '16 at 21:57
  • @mfripp thank you for the clarification. Coming from a Matlab background (which allows this action), there are many different rules in Python that I'm trying to get used to. – F.S. Aug 08 '16 at 22:07
  • 1
    @Chris I thought I remembered that syntax from somewhere! By the way, if you're working with Matlab-style vectors and matrices, you should check out the numpy package. But for ad hoc data types, Python's lists are great (and easier syntax than a Matlab cell array, at least for me). – Matthias Fripp Aug 08 '16 at 22:12
  • @mfripp Yeah I'm also learning Numpy. I guess the strength of Matlab is its easy matrix manipulations. Cells are indeed not as nice as "list" in other languages like Python and R. – F.S. Aug 08 '16 at 22:40
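
The pre-allocation idea from the comments above can be sketched as follows. This is a minimal illustration that assumes the number of lines is known in advance; `num_lines` and `my_paragraph` are names chosen for the example.

```python
# Assumption: the final number of lines is known up front.
num_lines = 3

# Build a list that already has num_lines elements, all empty strings.
my_paragraph = [''] * num_lines

# Every index from 0 to num_lines - 1 now exists, so direct assignment works.
my_paragraph[1] = 'this is the second line'
my_paragraph[2] = 'this is the third line'

# Join into the text that would be written to the file.
text = '\n'.join(my_paragraph) + '\n'
print(repr(text))
```

Assigning to `my_paragraph[3]` here would still raise IndexError, because only indexes 0 through 2 exist.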

6 Answers

2

Since you want to associate an index (whether it exists or not) with an element of data, just use a dict with integer indexes:

>>> myParagraph={}
>>> myParagraph[0] = ''
>>> myParagraph[1] = 'this is the second line'
>>> myParagraph[2] = 'this is the third line'
>>> myParagraph[99] = 'this is the 100th line'
>>> myParagraph
{0: '', 1: 'this is the second line', 2: 'this is the third line', 99: 'this is the 100th line'}

Just be aware that a dict is not ordered by key, so you will need to sort the keys to reassemble the lines in integer order.

You can reassemble the lines into a string (using an empty string for any missing line) like so:

>>> '\n'.join(myParagraph.get(i, '') for i in range(max(myParagraph)+1))
dawg
  • 1
    Also, if you start with `myParagraph = collections.defaultdict(str)`, you could automatically return an empty string for any unassigned rows (e.g., via `[myParagraph[i] for i in range(99)]` or `[myParagraph[i] for i in range(max(myParagraph.keys()) + 1)]`). Or with a standard dictionary you could use `[myParagraph.get(i, '') for i in range(max(myParagraph.keys()) + 1)]`. – Matthias Fripp Aug 08 '16 at 22:19
  • Gotcha. Yeah I can see it doesn't make much sense to use these tricks instead of just .append(). It's totally fine to write a=[]; a(2)=2; in Matlab. That's essentially where I got confused in Python. – F.S. Aug 08 '16 at 22:29
  • simpler comment, too late to edit: If you use the above solution and you know how many lines there will be (and you fill them all in), then you can retrieve the lines in order with something like `[myParagraph[i] for i in range(99)]`. If you don't know how long the paragraph will be, and you want to use blanks for any unassigned lines, you could retrieve the lines via `[myParagraph.get(i, '') for i in range(max(myParagraph.keys()) + 1)]`. – Matthias Fripp Aug 08 '16 at 22:29
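
As a rough sketch of the `defaultdict` approach discussed in the comments (the variable names are illustrative), the lines can be rebuilt in index order like this. Note that `range()` excludes its stop value, so the highest index needs a `+ 1` to be included.

```python
from collections import defaultdict

# A dict whose missing keys default to the empty string.
my_paragraph = defaultdict(str)
my_paragraph[1] = 'this is the second line'
my_paragraph[4] = 'this is the fifth line'

# Highest index that was assigned; max() over a dict looks at its keys.
last_index = max(my_paragraph)

# .get() reads a default without storing new keys in the dict,
# whereas my_paragraph[i] on a defaultdict would insert each missing key.
lines = [my_paragraph.get(i, '') for i in range(last_index + 1)]
print(lines)
```

Using `.get()` keeps the dict limited to the lines that were actually assigned, which matters if the indexes are sparse.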
1

A list doesn't have an unknown size: len(myParagraph) will give you its length.

alex314159
1

You can do a limited form of this by assigning to a range of indexes starting at the end of the list, instead of a single index beyond the end of the list:

myParagraph = []
myParagraph[0:] = ['']
myParagraph[1:] = ['this is the second line']
myParagraph[2:] = ['this is the third line']

Note: In Matlab, you can assign to arbitrary positions beyond the end of the array, and Matlab will fill in values up to that point. In Python, any assignment beyond the end of the list (using this syntax or list.insert()) just appends the value(s) at the first position beyond the end of the list, which may not be the index you specified.
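
A small sketch of the behaviour described in the note, using a made-up list named `lines`. It also shows a caveat of the open-ended slice form: when the slice starts inside the list, it replaces everything from that position to the end.

```python
lines = ['a', 'b']

# The slice starts past the end of the list, so the value is simply appended.
# It lands at index 2, not index 5, and no filler elements are created.
lines[5:] = ['c']
after_append = list(lines)
print(after_append)

# A slice that starts inside the list replaces the whole tail from that index.
lines[1:] = ['x']
after_replace = list(lines)
print(after_replace)
```

So the slice-assignment pattern only matches index-based assignment when the lines are added in order, one at a time.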

Matthias Fripp
1

You can define a function that will do this for you:

def set_at(xs, idx, x, default=None):
    if len(xs) <= idx:
        xs.extend([default] * (idx - len(xs) + 1))
    xs[idx] = x

Then to use it:

myParagraph = []
set_at(myParagraph, 1, 'this is the second line', default='')
set_at(myParagraph, 2, 'this is the third line')
set_at(myParagraph, 20, 'this is the twenty-first line', default='')
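
To make the padding behaviour concrete, here is a short sketch that repeats the `set_at` helper from this answer so the block runs on its own. The lists `my_paragraph` and `padded` are illustrative names.

```python
def set_at(xs, idx, x, default=None):
    # Grow the list with `default` until index `idx` exists, then assign.
    if len(xs) <= idx:
        xs.extend([default] * (idx - len(xs) + 1))
    xs[idx] = x

# Lines can be set out of order; gaps are filled with the given default.
my_paragraph = []
set_at(my_paragraph, 2, 'this is the third line', default='')
set_at(my_paragraph, 1, 'this is the second line')

# All elements are strings, so the list can be joined for writing to a file.
text = '\n'.join(my_paragraph) + '\n'

# Without an explicit default, gaps are filled with None,
# which '\n'.join() would reject with a TypeError.
padded = []
set_at(padded, 2, 'x')
print(my_paragraph, padded)
```

Passing `default=''` whenever the call may extend the list keeps every element a string.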
OmnipotentEntity
0
myParagraph = []
myParagraph.append('')
myParagraph.append('this is the second line')
myParagraph.append('this is the third line')

for i,item in enumerate(myParagraph):
    print("i:" + str(i) + ": item:" + item)

result:

i:0: item:
i:1: item:this is the second line
i:2: item:this is the third line
Volantines
  • please see the last part in my question. My question is that if it's possible to avoid using .append(). Say, directly defining the third line. – F.S. Aug 08 '16 at 21:48
  • 1
    Yes, but first initialize the list with myParagraph = [None]*3 – Volantines Aug 08 '16 at 21:53
0

append is the easiest way around this, but if you are more comfortable specifying the indices, consider using insert:

Insert an item at a given position. The first argument is the index of the element before which to insert, so a.insert(0, x) inserts at the front of the list, and a.insert(len(a), x) is equivalent to a.append(x).

myParagraph = []
myParagraph.insert(0, '\n')
myParagraph.insert(1, 'this is the second line\n')
myParagraph.insert(2, 'this is the third line\n')

And don't forget the new line character '\n' when writing to a file.

Moses Koledoye
  • cool! I knew there must be a function to do the job. thanks for pointing it out for me – F.S. Aug 08 '16 at 21:58
  • 3
    Note that if you use _any_ index beyond the end of your list, the value will just get appended to the end, not necessarily at the index number you expect. By the way you can also get the same effect with `myParagraph[1:1] = ['this is the second line']`. – Matthias Fripp Aug 08 '16 at 22:09
  • 1
    `.insert()` only works to the extent the previous elements exist. If you do `li=[]; li.insert(2, 'line')` the first two missing list elements are not created... – dawg Aug 08 '16 at 22:09
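
The caveat raised in these comments can be checked with a short sketch (`li` follows the name used in the comment above). An index past the end is treated as the end of the list, so lines inserted out of order do not land where their index suggests.

```python
li = []

# Index 2 does not exist yet; insert() treats it as the end of the list.
li.insert(2, 'third')
first_step = list(li)

# Index 1 is now the end of the one-element list, so this also appends.
li.insert(1, 'second')
second_step = list(li)

print(first_step, second_step)
```

The final list has 'third' before 'second', which is why insert() only matches index-based assignment when the earlier elements already exist.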