281

I have gone through most of the documentation of __getitem__() in the Python docs, but I am still unable to grasp the meaning of it.

So all I can understand is that __getitem__() is used to implement calls like self[key]. But what is the use of it?

Let's say I have a Python class defined in this way:

class Person:
    def __init__(self,name,age):
        self.name = name
        self.age = age

    def __getitem__(self,key):
        print ("Inside `__getitem__` method!")
        return getattr(self,key)

p = Person("Subhayan",32)
print (p["age"])

This returns the results as expected. But why use __getitem__() in the first place? I have also heard that Python calls __getitem__() internally. But why does it do it?

Can someone please explain this in more detail?

Super Kai - Kazuya Ito
Subhayan Bhattacharya
  • 1
    This may be of interest for one example use: [How to properly subclass dict and override __getitem__ & __setitem__](http://stackoverflow.com/questions/2390827/how-to-properly-subclass-dict-and-override-getitem-setitem) – roganjosh Apr 26 '17 at 07:20
  • 9
    The `__getitem__` use in your example doesn't make a lot of sense, but imagine that you need to write a custom list- or dictionary-like class, that has to work with existing code that uses `[]`. That's a situation where `__getitem__` is useful. – Pieter Witvoet Apr 26 '17 at 07:31
  • 1
    The primary use case, in my opinion, is when you are writing a custom class that represents a collection of things. This allows you to use the familiar list/array indexing like `planets[i]` to access a given item even though `planets` is not actually a list (and it could, under the covers, use any data structure it chooses, such as a linked list or graph, or implement any non-list functions that it chooses, which a list could not). – jarmod Oct 06 '20 at 11:15

11 Answers

313

Cong Ma does a good job of explaining what __getitem__ is used for - but I want to give you an example which might be useful. Imagine a class which models a building. The data for the building includes a number of attributes, including descriptions of the companies that occupy each floor.

Without using __getitem__ we would have a class like this:

class Building(object):
    def __init__(self, floors):
        self._floors = [None]*floors
    def occupy(self, floor_number, data):
        self._floors[floor_number] = data
    def get_floor_data(self, floor_number):
        return self._floors[floor_number]

building1 = Building(4) # Construct a building with 4 floors
building1.occupy(0, 'Reception')
building1.occupy(1, 'ABC Corp')
building1.occupy(2, 'DEF Inc')
print( building1.get_floor_data(2) )

We could however use __getitem__ (and its counterpart __setitem__) to make the usage of the Building class 'nicer'.

class Building(object):
    def __init__(self, floors):
        self._floors = [None]*floors
    def __setitem__(self, floor_number, data):
        self._floors[floor_number] = data
    def __getitem__(self, floor_number):
        return self._floors[floor_number]

building1 = Building(4) # Construct a building with 4 floors
building1[0] = 'Reception'
building1[1] = 'ABC Corp'
building1[2] = 'DEF Inc'
print( building1[2] )

Whether you use __setitem__ like this really depends on how you plan to abstract your data - in this case we have decided to treat a building as a container of floors. You could also implement an iterator for the Building, and maybe even the ability to slice (i.e. get more than one floor's data at a time) - it depends on what you need.
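
The iteration and slicing mentioned above could look roughly like the sketch below. It assumes the same list-backed Building; the added `__len__` and the choice to return a plain list for a slice are illustrative assumptions, not part of the original example.

```python
class Building(object):
    def __init__(self, floors):
        self._floors = [None] * floors

    def __setitem__(self, floor_number, data):
        self._floors[floor_number] = data

    def __getitem__(self, floor_number):
        # The backing list already understands ints and slice objects,
        # so building[1:3] returns a list of floor data.
        return self._floors[floor_number]

    def __len__(self):
        # Assumption: the length of a building is its number of floors.
        return len(self._floors)


building1 = Building(4)
building1[0] = 'Reception'
building1[1] = 'ABC Corp'
building1[2] = 'DEF Inc'

print(len(building1))      # number of floors
print(building1[1:3])      # data for floors 1 and 2
for floor in building1:    # iteration falls back to __getitem__ with 0, 1, 2, ...
    print(floor)
```

No `__iter__` is defined here: the `for` loop works because Python keeps calling `__getitem__` with increasing integers until the backing list raises `IndexError`.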

Messa
Tony Suffolk 66
  • 48
    Just to share something I learned only after reading the answer multiple times: once you have a __getitem__ you don't have to explicitly call that function. When he calls `building1[2]` that call itself internally calls the getitem. So the point @tony-suffolk-66 is making is that, any property/variable of the class can be retrieved during run time by simply calling objectname[variablename]. Just clarifying this since it wasn't clear for me initially and writing it here hoping it helps someone. Delete if redundant please – mithunpaul Jan 11 '19 at 03:52
  • 3
    @mithunpaul the object[index] notation isn't used to get a property/variable/attribute of a class - it is indexing on a container object - for instance retrieving a child object from a parent where the parent maintains a list of its children. In my example - the Building class is a container (in this case of Floor names), but it could be a container class for Floor classes. – Tony Suffolk 66 Jan 13 '19 at 03:02
  • Except it will not support `len()`, and you will get a `TypeError`: `TypeError: object of type 'Building' has no len()` – Ciasto piekarz May 03 '19 at 21:41
  • Supporting len (and other features such as iteration etc ) wasn’t the purpose of my example. Implementing a dunder_len method is trivial though. – Tony Suffolk 66 May 08 '19 at 17:52
  • @TonySuffolk66: is it correct that __len__ determines the iterable for index (floors) in your example on which __getitem__ loops? – Alex Aug 20 '19 at 15:12
  • 1
    If you implement __len__ on that class it would be iterable (since __getitem__ uses integer indexes). __len__ doesn't 'determine the iterable' - it identifies that it is a sequence of known length. – Tony Suffolk 66 Sep 09 '19 at 22:40
  • @TonySuffolk66 hi, could you elaborate on your above comment, "- the Building class is a container (in this case of Floor names), but it could be a container class for Floor classes"? I thought I understood the example until I read this. – amarykya_ishtmella Nov 03 '20 at 19:51
  • The Building class is just a wrapper around a list; it could contain other attributes too if you needed it to. There is also nothing stopping that list containing objects other than strings - so if you had a Floor class (which contains details of the floor, for example) then the Building class could contain instances of that Floor class. – Tony Suffolk 66 Nov 04 '20 at 01:19
  • And, your answer nicely cribbed here : https://www.pythonprogramming.in/example-of-getitem-and-setitem-in-python.html – RichieHH Apr 01 '21 at 16:02
  • Badly cribbed - who in their right mind would call it Counter - ah well. – Tony Suffolk 66 Apr 01 '21 at 23:14
138

The [] syntax for getting an item by key or index is just syntactic sugar.

When you evaluate a[i] Python calls a.__getitem__(i) (or type(a).__getitem__(a, i), but this distinction is about inheritance models and is not important here). Even if the class of a may not explicitly define this method, it is usually inherited from an ancestor class.

All the (Python 2.7) special method names and their semantics are listed here: https://docs.python.org/2.7/reference/datamodel.html#special-method-names
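
To make the a.__getitem__(i) versus type(a).__getitem__(a, i) distinction concrete, here is a small sketch for Python 3 (the class name `Squares` is made up for illustration). It suggests that the [] syntax looks the method up on the type, so a `__getitem__` attached to an instance is ignored by a[i].

```python
class Squares:
    def __getitem__(self, i):
        return i * i


a = Squares()

# These three spellings run the same method.
print(a[3])
print(a.__getitem__(3))
print(type(a).__getitem__(a, 3))

# Attaching a function to the instance does not change what a[3] does,
# because the [] syntax looks the method up on type(a), not on a itself.
a.__getitem__ = lambda i: "from the instance"
print(a[3])               # still uses Squares.__getitem__
print(a.__getitem__(3))   # explicit attribute access finds the instance attribute
```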

Cong Ma
10

The magic method __getitem__ is basically used for accessing list items, dictionary entries, array elements etc. It is very useful for a quick lookup of instance attributes.

Here I am showing this with an example class Person that can be instantiated by 'name', 'age', and 'dob' (date of birth). The __getitem__ method is written in a way that one can access the indexed instance attributes, such as first or last name, day, month or year of the dob, etc.

import copy

# Constants that can be used to index date of birth's Date-Month-Year
D = 0; M = 1; Y = -1

class Person(object):
    def __init__(self, name, age, dob):
        self.name = name
        self.age = age
        self.dob = dob

    def __getitem__(self, indx):
        print ("Calling __getitem__")
        p = copy.copy(self)

        p.name = p.name.split(" ")[indx]
        p.dob = p.dob[indx] # or, p.dob = p.dob.__getitem__(indx)
        return p

Suppose one user input is as follows:

p = Person(name = 'Jonab Gutu', age = 20, dob=(1, 1, 1999))

With the help of the __getitem__ method, the user can access the indexed attributes, e.g.:

print(p[0].name)  # print first (or last) name
print(p[Y].dob)   # print (Date or Month or) Year of the 'date of birth'
user3503692
  • Great example! I was searching all over about how to implement __getitem__ when there are multiple parameters in __init__ and I was struggling to find a proper implementation and finally saw this! Upvoted and thank you! – Rahul P Feb 06 '20 at 02:04
  • 9
    Using __getitem__ to access attributes like this is horrible (in my opinion) - far better to write a property and create a read-only virtual attribute. Think about readability: your p[Y].dob reads as if p is a container - not that p is an instance with attributes. A virtual attribute would read far more nicely to the code using your module. You could also - if you insist - use __getattr__ to implement a virtual attribute, but a property is a cleaner solution. – Tony Suffolk 66 Oct 10 '20 at 22:07
  • @TonySuffolk66 Can you provide an example of what you mean please? Perhaps rewrite the solution in this answer using your suggestions. Thanks. – Jose Quijada Aug 02 '21 at 16:36
  • 1
    @JoseQuijada Any solution would be too long for a comment - but it is clear to me that when you use the `p[0]` syntax the code is read as implying that p is a container, and therefore that there could well be a `p[1]`, `p[2]`, and that `len(p)` will return an integer. Your code though doesn't do that - for you `p[0]` is a modifier of `p`. Although it works it will make code difficult to read for others. A more readable system would be to write properties for 'firstname', 'lastname', 'dob_year' etc - there it will be obvious what each does - rather than `p[0].name` magically meaning the first name. – Tony Suffolk 66 Aug 02 '21 at 19:43
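
For reference, the property-based approach suggested in the comments could be sketched like this. The property names `first_name`, `last_name` and `dob_year` are only illustrative, loosely following the names in the comment above, and the (day, month, year) tuple layout is taken from the answer.

```python
class Person(object):
    def __init__(self, name, age, dob):
        self.name = name
        self.age = age
        self.dob = dob  # (day, month, year) tuple, as in the answer

    @property
    def first_name(self):
        # Read-only virtual attribute derived from the full name.
        return self.name.split(" ")[0]

    @property
    def last_name(self):
        return self.name.split(" ")[-1]

    @property
    def dob_year(self):
        return self.dob[-1]


p = Person(name='Jonab Gutu', age=20, dob=(1, 1, 1999))
print(p.first_name)
print(p.last_name)
print(p.dob_year)
```

Each access now says what it returns, and `p` no longer looks like a container.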
4

As a side note, the __getitem__ method also allows you to turn your object into an iterable.

Example: if used with iter(), it can generate as many int squared values as you want:

class MyIterable:
    def __getitem__(self, index):
        return index ** 2


obj = MyIterable()
obj_iter = iter(obj)

for i in range(1000):
    print(next(obj_iter))
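
A related sketch, using a made-up class `FirstThree`: when a class has no `__iter__`, iter() calls `__getitem__` with 0, 1, 2, ... and stops as soon as `IndexError` is raised, so a finite object can be looped over the same way.

```python
class FirstThree:
    """Hypothetical finite container used only for illustration."""

    _items = ("a", "b", "c")

    def __getitem__(self, index):
        # Indexing past the end raises IndexError, which ends the iteration.
        return self._items[index]


collected = [item for item in FirstThree()]
print(collected)
```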
neuling
2

For readability and consistency. That question is part of why operator overloading exists, since __getitem__ is one of the functions that implement that.

If you get an unknown class, written by an unknown author, and you want to add its 3rd element to its 5th element, you can very well assume that obj[3] + obj[5] will work.

What would that line look like in a language that does not support operator overloading?? Probably something like obj.get(3).add(obj.get(5))?? Or maybe obj.index(3).plus(obj.index(5))??

The problem with the second approach is that (1) it's much less readable and (2) you can't guess, you have to look up the documentation.

blue_note
1

A common library that uses this technique is the 'email' module. It uses the __getitem__ method in the email.message.Message class, which in turn is inherited by MIME-related classes.

Then in the end all you need to get a valid MIME-type message with sane defaults is to add your headers. There's a lot more going on under the hood but the usage is simple.

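from email.mime.text import MIMEText

# message_text, to, sender and subject are strings defined elsewhere
message = MIMEText(message_text)
message['to'] = to
message['from'] = sender
message['subject'] = subject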
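
Reading the headers back goes through `__getitem__` as well. A small sketch with placeholder values (the body text and address are invented for the example):

```python
from email.mime.text import MIMEText

# The body text and address are placeholder values.
message = MIMEText("Hello there")
message['to'] = 'alice@example.com'
message['subject'] = 'Greetings'

# __getitem__ reads a header back; header names are matched case-insensitively.
print(message['To'])
print(message['subject'])

# A header that was never set gives None rather than raising KeyError.
print(message['cc'])
```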
bbuck
1

Django core has several interesting and nifty usages for magic methods, including __getitem__. These were my recent finds:

  1. Django HTTP Request

    • When you submit GET/POST data in Django, it will be stored in Django's request object as request.GET/request.POST dict. This dict is of type QueryDict which inherits from MultiValueDict.

    • When you submit data, say user_id=42, QueryDict will be stored/represented as:

      <QueryDict: {'user_id': ['42']}>

      So, the passed data becomes

      'user_id': ['42']

      instead of the intuitive

      'user_id': '42'

      MultiValueDict's docstring explains though why it needs to auto-convert this to list format:

      This class exists to solve the irritating problem raised by cgi.parse_qs, which returns a list for every key..

    • Given that the QueryDict values are transformed into lists, they will need to be accessed then like this (same idea with request.GET):

      • request.POST['user_id'][0]

      • request.POST['user_id'][-1]

      • request.POST.get('user_id')[0]

      • request.POST.get('user_id')[-1]

        But these are horrible ways to access the data, so Django overrides __getitem__ and get in MultiValueDict. This is the simplified version:

        def __getitem__(self, key):
            """
            Accesses the list value automatically 
            using the `-1` list index.
            """
            list_ = super().__getitem__(key)
            return list_[-1]
        
        def get(self, key, default=None):
            """
            Calls the `__getitem__` above and falls back
            to `default` if the key is missing.
            """
            try:
                return self[key]
            except KeyError:
                return default
        

        With these, you now have more intuitive accessors:

        • request.POST['user_id']
        • request.POST.get('user_id')
  2. Django Forms

    • In Django, you could declare forms like this (includes ModelForm):

      class ArticleForm(...):
          title = ...
      
    • These forms inherit from BaseForm, and have these overridden magic methods (simplified version):

      def __iter__(self):
         for name in self.fields:
             yield self[name]
      
      def __getitem__(self, name):
          return self.fields[name]
      

      resulting in these convenient patterns:

      # Instead of `for field in form.fields`.
      # This is a common pattern in Django templates.
      for field in form:
          ...
      
      # Instead of `title = form.fields['title']`
      title = form['title']
      

In summary, magic methods (or their overrides) increase code readability and developer experience/convenience.
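
The MultiValueDict behaviour described in point 1 can be imitated without Django. The sketch below uses a made-up class name, `LastValueDict`, and only approximates the idea; Django's real implementation differs in its details.

```python
class LastValueDict(dict):
    """Hypothetical stand-in for the idea behind MultiValueDict."""

    def __getitem__(self, key):
        # Each stored value is a list; hand back its last element.
        values = super().__getitem__(key)
        return values[-1]

    def get(self, key, default=None):
        # dict.get does not go through __getitem__, so route it explicitly.
        try:
            return self[key]
        except KeyError:
            return default


post = LastValueDict({'user_id': ['42'], 'tag': ['a', 'b']})
print(post['user_id'])
print(post['tag'])
print(post.get('missing', 'n/a'))
```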

Ranel Padon
1

The use of __getitem__ includes implementing control flow measures that for some weird reason cannot be performed lower in the execution stack:

class HeavenlyList(list):
    """don't let caller get 666th element"""
    
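    def __getitem__(self, key):
        """return element"""
        # bind once: zero-argument super() fails inside a comprehension
        # on Python versions before 3.12
        getitem = super().__getitem__
        if isinstance(key, slice):
            # slice.indices() fills in None and clamps to the list length
            return [
                getitem(i)
                for i in range(*key.indices(len(self)))
                if i != 666
            ]
        return getitem(key) if key != 666 else None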

A similar, but more interesting reason is to allow slice-based access to elements in container/sequence types that ordinarily don't allow it:

class SliceDict(dict):
    """handles slices"""
    
    def __setitem__(self, key, value):
        """map key to value"""
        if not isinstance(key, int):
            raise TypeError("key must be an integer")
        super().__setitem__(key, value)
    
    def __getitem__(self, key):
        """return value(s)"""
        # bind once: zero-argument super() fails inside a comprehension
        # on Python versions before 3.12
        getitem = super().__getitem__
        if not isinstance(key, slice):
            return getitem(key)
        # start and step default to 0 and 1; stop must be given
        return [
            getitem(i)
            for i in range(key.start or 0, key.stop, key.step or 1)
        ]

Another interesting use is overriding str.__getitem__ to accept str objects as well as ints and slices, such that the str input is a regular expression, and the return value is the match object iterator returned by re.finditer:

from re import finditer

class REString(str):
    """handles regular expressions"""
    
    re_flags = 0
    
    def __getitem__(self, key):
        """return some/all of string or re.finditer"""
        if isinstance(key, str):
            return finditer(key, self, flags=self.re_flags)
        return super().__getitem__(key)
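
A short usage sketch for the idea above. The class is repeated so the snippet runs on its own, and the sample string is invented for the example.

```python
from re import finditer


class REString(str):
    """Same idea as the class above, repeated so the snippet runs on its own."""

    re_flags = 0

    def __getitem__(self, key):
        if isinstance(key, str):
            return finditer(key, self, flags=self.re_flags)
        return super().__getitem__(key)


s = REString("floor 1, floor 22, floor 333")

print(s[0])        # ordinary integer index
print(s[:5])       # ordinary slice
numbers = [match.group() for match in s[r"\d+"]]
print(numbers)     # every run of digits found by the pattern
```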

A real-world problem where overriding dict.__getitem__ in particular proves useful is when a program requires information that is distributed over the internet and available over HTTP. Because this information is remote, the process can employ some level of laziness, only retrieving data for items it doesn't have or that have changed. The specific example is having a dictionary instance lazily retrieve and store Python Enhancement Proposals. There are many of these documents, sometimes they are revised, and they all reside on hosts known by the domain name peps.python.org. Therefore the idea is to make an HTTP GET request for the PEP number passed into __getitem__, fetching it if the dictionary doesn't already contain it or the PEP's HTTP ETag changed.

from http import HTTPStatus, client


class PEPDict(dict):
    """lazy PEP container"""
    
    conn = client.HTTPSConnection("peps.python.org")
    
    def __getitem__(self, pep):
        """return pep pep"""
        
        # if lazy for too long
        if self.conn.sock is None:
            self.conn.connect()
        
        # build etag check in request header
        requestheaders = dict()
        if pep in self:
            requestheaders = {
                "if-none-match": super().__getitem__(pep)[0]
            }
        
        # make request and fetch response
        self.conn.request(
            "GET",
            "/%s/" % str(pep).zfill(4),
            headers=requestheaders
        )
        response = self.conn.getresponse()
        # always drain the body so the connection can be reused
        body = response.read()
        
        # (re)set the pep
        if response.status == HTTPStatus.OK:
            self.__setitem__(
                pep, (
                    response.getheader("etag"),
                    body
                )
            )
        
        # raise if status is neither ok nor not modified
        elif response.status != HTTPStatus.NOT_MODIFIED:
            raise Exception("something weird happened")
        
        return super().__getitem__(pep)[1]
        

A good way to understand its use further is to review its associated special/dunder methods in the emulating container types section of Python's data model document.

thebadgateway
0

OK I'll just leave this here. OP questions the very basics of software engineering.

This is about defining class interface. Consistency, readability or whatever else is secondary.

First of all this is about how different parts of the project can talk to your object.

Imagine a function which calls [] on some object. Now you are tasked with doing exactly what this function does with some new type of object that you have. But your object is not a list, dict, or tuple.

Now you don't need to implement anything other than a __getitem__ for the class of your object.

Interfaces create building blocks out of a bunch of internal implementations. Define them wisely.
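
A minimal sketch of that situation, with invented names (`third_item`, `Playlist`): the existing function only knows how to use [], and the new class plugs into it by defining `__getitem__`.

```python
def third_item(container):
    # Existing code that only knows how to use [].
    return container[2]


class Playlist:
    """Hypothetical class that is not a list, dict or tuple."""

    def __init__(self, *titles):
        self._titles = titles

    def __getitem__(self, index):
        return self._titles[index]


print(third_item([10, 20, 30]))
print(third_item("abc"))
print(third_item(Playlist("Intro", "Verse", "Chorus")))
```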

0

Further examples of more complex cases

The following example shows exactly what you get when calling []/__getitem__ with various inputs, which should help clarify how it works:

class C(object):
    def __getitem__(self, k):
        return k

# Single argument is passed directly.
assert C()[0] == 0

# Multiple indices generate a tuple.
assert C()[0, 1] == (0, 1)

# Slice notation generates a slice object.
assert C()[1:2:3] == slice(1, 2, 3)

# Empty slice entries become None.
assert C()[:2:] == slice(None, 2, None)

# Ellipsis notation generates the built-in Ellipsis object.
# Ellipsis is a singleton, so we can compare with `is`.
assert C()[...] is Ellipsis

# Everything mixed up.
assert C()[1, 2:3:4, ..., 6, :7:, ..., 8] == \
       (1, slice(2,3,4), Ellipsis, 6, slice(None,7,None), Ellipsis, 8)

What you do with the argument of __getitem__ is then arbitrary. Of course, anything besides array-like indexing would likely make for an insane API. But nothing prevents you from going wild!
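
As one array-like use of the tuple form, here is a sketch of a hypothetical `Grid` class (the name and the flat row-major storage are assumptions made for the example) where grid[row, col] receives the tuple (row, col):

```python
class Grid:
    """Hypothetical 2-D grid stored as a flat list in row-major order."""

    def __init__(self, rows, cols, fill=0):
        self.rows = rows
        self.cols = cols
        self._cells = [fill] * (rows * cols)

    def _offset(self, key):
        # grid[r, c] arrives as the tuple (r, c).
        row, col = key
        if not (0 <= row < self.rows and 0 <= col < self.cols):
            raise IndexError("grid index out of range")
        return row * self.cols + col

    def __getitem__(self, key):
        return self._cells[self._offset(key)]

    def __setitem__(self, key, value):
        self._cells[self._offset(key)] = value


grid = Grid(2, 3)
grid[1, 2] = 'x'
print(grid[1, 2])
print(grid[0, 0])
```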

I also covered Ellipsis at: What does the Ellipsis object do?

Tested in Python 3.5.2 and 2.7.12.

Ciro Santilli OurBigBook.com
0

__getitem__():

  • is used when some action is needed when getting an item.
  • is often used with __setitem__(), which is used when some action is needed when setting an item.

For example, the code below counts how many times items are set and retrieved. You can also see an actual example in Django's session code, which uses __getitem__() and __setitem__():

class Test:
    def __init__(self):
        self.item = {}
        self.get_count = 0
        self.set_count = 0

    def __getitem__(self, key):
        self.get_count += 1
        return self.item.get(key)

    def __setitem__(self, key, value):
        self.item[key] = value
        self.set_count += 1

test = Test()
print(f'set_count:{test.set_count}') # set_count:0
print(f'get_count:{test.get_count}') # get_count:0

# Set items 2 times
test['name'] = 'John'
test['age'] = 36

# Get items 3 times
print(test['name']) # John
print(test['name']) # John
print(test['age']) # 36

print(f'set_count:{test.set_count}') # set_count:2
print(f'get_count:{test.get_count}') # get_count:3
Super Kai - Kazuya Ito