3

I'd like to write a simple wrapper for the python list type that forces it to start indexing at 1 instead of 0. I've got a a fairly complex program based on some discrete probability distributions of duration data, with integer-length buckets, but I don't have any durations of less than 1. Anyway, it would greatly simplify a some significant portions of my code to be able to seamlessly index starting at 1. I was using a dict at first but I found several properties of them to be too cumbersome.

I've never written a wrapper for a Python class before, much less a built-in type, but I feel like what I want to do is fairly simple. At the bare minimum, I should be able to do stuff like this:

>>> p = one_list([1,2,3,4,5])
>>> for i in range(1,6):
    print i, p[i]

1 1
2 2
3 3
4 4
5 5
>>> len(p)
5

However it would be good if I could overwrite some of the other relevant built-in methods of the list class as well, such as index.

>>> len(p)
5
>>> p.index(p[-1])
5

Please share your tips as to how I might go about doing something like this. I was considering whether or not to just do it with a custom class, but that seems like it might be overkill. I also welcome any recommendations as to useful methods to overwrite.

Edit: Afterword

I'd just like to note that it was not worth the trouble to do this, and the reason I've accepted the answer below is not because I tried to implement it in the way he describes, but because he helped me realize that lists are good enough on their own.

machine yearning
  • 9,889
  • 5
  • 38
  • 51
  • Don't. Your code will be impossible for anybody else to maintain. Why do you want to do this? – Jim Garrison Jul 15 '11 at 21:55
  • @Jim: I don't agree. It is a very intuitive way to look at and index my data, which I explained above. I'll probably keep the usage of this class restricted to 1 or 2 modules in my file space, and I will be very clear with comments on how to use it. I think the tradeoff for intuitive code readability/writability is worth it. It simply seems like a better fit. – machine yearning Jul 15 '11 at 22:01
  • 1
    Well, this is as simple as overriding [`__getitem__`](http://docs.python.org/reference/datamodel.html#object.__getitem__), [`__setitem__`](http://docs.python.org/reference/datamodel.html#object.__setitem__), and [`__delitem__`](http://docs.python.org/reference/datamodel.html#object.__delitem__) -- but I have to agree with Jim Garrison that there's probably a better and even simpler way. What would the problem be with (say) just allocating an extra item and leaving `lst[0]` empty? – senderle Jul 15 '11 at 22:05
  • @machine yearning, please justify _"it would greatly simplify some significant portions of my code to be able to seamlessly index starting at 1."_ List the difficulties, show us code snippets with indexing-from-1 versus -from-0. We can suggest solutions. – smci Jul 15 '11 at 22:17
  • @senderle: I guess it would be reasonable to do so, but then I've just got a bunch of lists with `lst[0]==None', which isn't conducive to maintainability either. Also, then `len(lst)` would be one-greater than the number of data points I actually have. Furthermore, I use a lot of generator expressions to create/manipulate lists in my code. IMO it's the difference between addressing this issue EVERY time I create/manipulate/access a data structure, or addressing it once (overriding) and then letting the code fly freely. – machine yearning Jul 15 '11 at 22:18
  • Actually, thinking about it more, it wouldn't be nearly as simple as I said just now. Consider my answer below. – senderle Jul 15 '11 at 22:54

3 Answers3

3

Here's a complete (I think) implementation of a 1-based list, correctly handling slicing (including extended slices), indexing, popping, etc. It's slightly more tricky to get this right than you might think, especially the slicing and the negative indexes. In fact I'm still not 100% sure it works exactly as it should, so caveat coder.

class list1(list):
    """One-based version of list."""

    def _zerobased(self, i):
        if type(i) is slice:
            return slice(self._zerobased(i.start),
                         self._zerobased(i.stop), i.step)
        else:
            if i is None or i < 0:
                return i
            elif not i:
                raise IndexError("element 0 does not exist in 1-based list")
            return i - 1

    def __getitem__(self, i):
        return list.__getitem__(self, self._zerobased(i))

    def __setitem__(self, i, value):
        list.__setitem__(self, self._zerobased(i), value)

    def __delitem__(self, i):
        list.__delitem__(self, self._zerobased(i))

    def __getslice__(self, i, j):
        print i,j
        return list.__getslice__(self, self._zerobased(i or 1),
                                 self._zerobased(j))

    def __setslice__(self, i, j, value):
        list.__setslice__(self, self._zerobased(i or 1),
                          self._zerobased(j), value)

    def index(self, value, start=1, stop=-1):
        return list.index(self, value, self._zerobased(start),
                          self._zerobased(stop)) + 1

    def pop(self, i):
        return list.pop(self, self._zerobased(i))

senderle's ExtraItemList is going to have better performance, though, because it doesn't need to adjust the indices constantly nor does it have an extra layer of (non-C!) method calls between you and the data. Wish I'd thought of that approach; maybe I could profitably combine it with mine...

kindall
  • 178,883
  • 35
  • 278
  • 309
  • Thank you for providing a thorough answer. You've sufficiently convinced me that this is too deep of a solution to implement, and in fact I should just use normal lists. – machine yearning Jul 16 '11 at 00:45
  • 1
    @machine yearning, if this is what convinced you, then you should accept it instead of mine. I was being lazy :). Btw, nice work kindall. You were being lazy too, but in a pretty smart way I think. – senderle Jul 16 '11 at 00:54
  • @machine yearning: I'm glad that you have repented. However I'm very curious as to how you are going to "just use normal lists". Your "problem" is just a special case of handling histograms. Even if the values are integers x such that lo <= x < hi, lo may be much larger than 1, or negative (e.g. histogram of husband's age - wife's age) ... your [None] kludge runs out of steam very rapidly. – John Machin Jul 16 '11 at 01:56
  • @John Machin: In the case of my data, the x-values should all range from 1 to n such that n >= 1. I will choose to ignore the fact that semantically I will be working with a one-off error most of the time, that is, I will be using lists without imagining a literal interpretation of the index values. If I needed to keep track of a special range of x-values I could do this separately. In not so many words, I will be using plain-old lists, full of data, with no `None` values anywhere. – machine yearning Jul 16 '11 at 07:28
  • I wrote a source-to-source compiler where the source language had 1-Indexed lists (yes, it is hell), and therefore I used that implementation. A problem with your code was that you return Python lists for slices, and then it destroys the whole program, since you mix list and list1 usage. – jcklie Apr 07 '14 at 18:58
1

Ok, well, you seem very determined. As I said above, you have to override __getitem__, __setitem__, and __delitem__. Actually you'll also have to override __getslice__, __setslice__, __delslice__ (otherwise slicing starts at 0 as usual). Then __add__, __mul__, __iadd__, __imul__ to return WonkyLists as well (otherwise concatenation won't work the way you'd want). You'll have to override index because it will return a wrong value (I tested it). Also insert, remove, and pop. Here's something to get you started:

class WonkyList(list):
    def __getitem__(self, index):
        return super(WonkyList, self).__getitem__(index - 1)
    def __setitem__(self, index, val):
        super(WonkyList, self).__setitem__(index - 1, val)
    def __delitem__(self, index):
        super(WonkyList, self).__delitem__(index - 1)

Tested:

>>> w = WonkyList(range(10))
>>> w[1]
0
>>> del w[5]
>>> w
[0, 1, 2, 3, 5, 6, 7, 8, 9]
>>> w[1] = 10
>>> w
[10, 1, 2, 3, 5, 6, 7, 8, 9]

Here are some ways it will fail until you override all the necessary methods:

>>> w
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> w[5]
4
>>> w.pop(5)
5
>>> w.insert(5, 5)
>>> w
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> del w[2:4]
>>> w
[0, 1, 4, 5, 6, 7, 8, 9]
>>> 
>>> w.index(1)
1

Or you could try something else. Have you considered, for example...

def ExtraItemList(list):
    def __init__(self, seq):
        self[0] = 0
        self[1:] = seq
    def n_items(self):
        return len(self) - 1

That seems to take care of the issues you list, and it still behaves in a basically list-like manner, even if the initializer is a little weird. Though to be honest, looking at the examples you gave, I'm starting to feel like even this is an ugly solution -- none of the functions you gave show the advantage of starting indexing from 1, they just show the bad effects of adding an extra item to the beginning of the list. Based on those examples, you should just use a regular list.

Perhaps the best approach would be to write custom get and set methods that use the desired offset. That way the lists would behave normally, but you'd have an alternative interface when needed.

senderle
  • 145,869
  • 36
  • 209
  • 233
  • I accept defeat... sorry for being so stubborn. I guess I didn't think through all of the details. It seemed strange to be consistently indexing my data one-off from its actual indices, but when I consider the even stranger alternative, I think I'll stick with simplicity. Forget padding the `lis[0]` index, I'll just use regular lists, and be careful with my output. Thank you for the thoughtful response. – machine yearning Jul 16 '11 at 00:49
  • @machine yearning, I too was being stubborn, if you didn't notice :). Good luck by the way. – senderle Jul 16 '11 at 01:02
1

Here's a basic implementation to get you started. I thought it would be nice to generalize it, so you can have lists with indices starting at any integer you like. This led me to discover __slots__, so thank you for asking this question!

class OffsetList(list):
    __slots__ = 'offset'
    def __init__(self, init=[], offset=-1):
        super(OffsetList, self).__init__(init)
        self.offset = offset
    def __getitem__(self, key):
        return super(OffsetList, self).__getitem__(key + self.offset)
    def __setitem__(self, key, value):
        return super(OffsetList, self).__setitem__(key + self.offset, value)
    def __delitem__(self, key):
        return super(OffsetList, self).__delitem__(key + self.offset)
    def index(self, *args):
        return super(OffsetList, self).index(*args) - self.offset

>>> a = OffsetList([1,2,3])
>>> a
[1, 2, 3]
>>> a[1]
1
>>> a.index(2)
2

You may find you want to implement __getslice__, __setslice__, __delslice__ too, not to mention pop and insert.

Gareth Rees
  • 64,967
  • 9
  • 133
  • 163