3

I'm working on a Python script at work that is used to interact with XLS/XLSX/CSV spreadsheets. There are three main classes which are nested inside one another (not extending one another, the classes are literally inside the other class)

The three primary classes are explained below:

  1. The primary Workbook class, which is a factory method for the XLS/XLSX/CSV classes. This is accessible externally
  2. The private __Worksheet class within the Workbook class, which is used to open a specific spreadsheet or worksheet within the file itself. This is accessible only via the Workbook.worksheet() method
  3. The private __Cell class within the __Worksheet class, which interacts with the cells themselves. This shouldn't be accessible externally, rather only accessible via the __Worksheet class

Heres a simplified version of the class structures thus far:

class Workbook( object ):

    def __init__( self, file_name ):
        self.__file_name = file_name

    def worksheet( self, name ):
        return self.__Worksheet( self, name )

    class __Worksheet():
        def __init__( self, workbook, worksheet ):
            self.__workbook = workbook

        def cell( self, cell_id, data = None ):
            return self.__Cell( cell_id, data )

        class __Cell():
            def __init__( self, cell, data = None ):
                self.__cell = cell
                self.__data = data

            def setVal( self, data ):
                self.__data = data

            def __str__( self ):
                return self.__data

workbook = Workbook( 'test-file.csv' )
worksheet = workbook.worksheet( 'First Worksheet' )
cell_A1 = worksheet.cell('A1', 'Foo...')

print("Before - Cell A1: %s" % cell_A1) # => Before - Cell A1: Foo...
cell_A1.setVal('Bar...')
print("After - Cell A1: %s" % cell_A1)  # => Before - After - Cell A1: Bar...

So the question I have is - Is it considered bad practice to have a class within a class, within a class?

I'm somewhat new to Python, my experience is mostly PHP/JS/Perl. It doesn't seem too uncommon in Python to have a class within a class, but for some reason, nesting a class 3 levels deep just seems wrong. If it is, and there's a better way to do it, then that would be great.

I know the alternative is to not nest the classes, and just check if an instance of Workbook is given to Worksheet as a parameter. Then create a method in Workbook which just returns an instance if Worksheet, while handing self as one of the parameters used to initiate it.

Example:

class Workbook( object ):
    def __init__( self, file_name ):
        self.__file_name = file_name

    def worksheet( self, name ):
        return self.Worksheet( self, name )

class Worksheet( object ):
    def __init__( self, workbook, worksheet = 0 ):
        if not isinstance( workbook, Workbook ):
            raise Exception( 'Expected the workbook to be an instance of the Workbook class' )

        self.__workbook = workbook

    def cell( self, cell_id, data = None ):
        return self.Cell( cell_id, data )

class Cell( object ):
    def __init__( self, worksheet, cell, data = None ):
        if not isinstance( worksheet, Worksheet ):
            raise Exception( 'Expected the worksheet to be an instance of the Worksheet class' )

        self.__cell = cell
        self.__data = data

    def setVal( self, data ):
        self.__data = data

    def __str__( self ):
        return self.__data

# Example Usage One
workbook = Workbook( 'test-file.xls' )
worksheet = workbook.worksheet( 'First Worksheet' )
cell_A1 = worksheet.cell('A1', 'Foo...')

print("Before - Cell A1: %s" % cell_A1) # => Before - Cell A1: Foo...
cell_A1.setVal('Bar...')
print("After - Cell A1: %s" % cell_A1)  # => Before - After - Cell A1: Bar...

# Example Usage Two
workbook = Workbook( 'test-file.xlsx' )
worksheet = Worksheet( workbook, 'First Worksheet' )
cell_A1 = Cell( worksheet, 'A1', 'Foo...')

print("Before - Cell A1: %s" % cell_A1) # => Before - Cell A1: Foo...
cell_A1.setVal('Bar...')
print("After - Cell A1: %s" % cell_A1)  # => Before - After - Cell A1: Bar...

# Failed Example
worksheet = Worksheet( 'Not worksheet', 1 ) # => Exception, as expected

However, this alternative means that the Worksheet and Cell classes are accessible externally and can be initiated manually... but I guess thats not a terrible thing.

Let me know what you think the best route is! One of the comments on this post provides a link to another SO post, in which a user posts 3 advantages of nesting classes, the first of which is:

Logical grouping of classes: If a class is useful to only one other class then it is logical to embed it in that class and keep the two together. Nesting such "helper classes" makes their package more streamlined.

Which is exactly what I was thinking. I just thought it was somewhat awkward nesting them to 3 layers, as I've only done 2 before.

Community
  • 1
  • 1
Justin
  • 1,959
  • 5
  • 22
  • 40
  • " and there's a better way to do it, then that would be great. " - don't? – BartoszKP Aug 03 '16 at 16:15
  • 2
    Why would you do that? You're not getting any benefit compared to making them all module-level. – user2357112 Aug 03 '16 at 16:15
  • [I don't think it's bad practice](http://stackoverflow.com/questions/719705/what-is-the-purpose-of-pythons-inner-classes). – David Gomes Aug 03 '16 at 16:18
  • 1
    Is your code readable? In the Python shell type `import this` to see the Zen of Python. Note in particular "Flat is better than nested" and "Readability counts". On the other hand, there is no need to be dogmatic here. If your code *is* clearer the way it is, why not just leave it that way? – John Coleman Aug 03 '16 at 16:19
  • I changed the original post, added an alternative, in which all 3 classes are accessible externally, but the dependencies are checked in the classes `__init__` methods. I know this works just as well, I just personally dont like it when classes that are dependent on other classes, are accessible externally and able to be initiated manually. I guess that is kinda weird though – Justin Aug 03 '16 at 16:39
  • @DavidGomes - Thanks for the link, the first advantage he lists is exactly what I was trying to say – Justin Aug 03 '16 at 16:42
  • @Justin Please stop adding "Thanks" back again. The same concerns "Update" "PS" and other irrelevant labels. Only what's necessary to understand your question matters (which is in fact too long now and hard to comprehend. Big font bold markers won't help here.) – BartoszKP Aug 04 '16 at 08:14
  • @Justin As a side note, please stop using double underscore to make your variables "private", just use one underscore for that purpose. – Markus Meskanen Aug 04 '16 at 08:22
  • Ad rem: " and can be initiated manually" - even if they are nested they can be created manually. – BartoszKP Aug 04 '16 at 08:24
  • Also, do you really need all these classes? What does `Workbook` actually do? And what's a `Cell`, does it need a class or would a simple data structure be enough? – Markus Meskanen Aug 04 '16 at 08:24

2 Answers2

6

Turning your question around: is there any advantage in writing the classes as nested? Unlike functions, classes don't use "lexical scoping" (i.e. unlike functions, a name that can't be resolved from the current class's namespace will not be resolved from surrounding classes). This is why you have to refer to the __Worksheet class relative to an instance of Workbook.

This in turn means that there is no perceptible advantage to the nesting that you have employed, though it will work. Most experienced Python programmers would probably write your example without using nesting. This simplifies the code (because the class names are all global to the containing module).

Note that this is quite different from declaring a class inside a function body. In that case the code of the class can refer to variables from the function's namespace, and Python goes to great lengths to ensure that those references are still available to the class even after the function call has terminated and the local namespace associated with the call has been destroyed.

holdenweb
  • 33,305
  • 7
  • 57
  • 77
  • Would you suggest using the alternative method in last part of the updated post? – Justin Aug 03 '16 at 16:40
  • @Justin The answer you linked at the bottom of your question is terrible. I would stick to this one. – BartoszKP Aug 04 '16 at 08:24
  • The Zen of Python (obtainable with `python -m this` or (in the interpreter) `import this`. One of the lines reads "Flat is better than nested". I don't personally regard the Zen as a stone tablet, but it's full of good guidelines for the inexperienced. – holdenweb Jan 15 '21 at 16:53
2

Setting aside some of the questions raised elsewhere, like whether you need all these classes, I'm going to say no: it's better not to nest these classes. It's better to merely document how they are intended to be used. Consider alternate names that make the classes themselves less attractive, for example WorksheetReference and CellReference. Throw in a few docstrings, even if they just say things like:

WorksheetReference is not meant to be instantiated directly. You should access it through the worksheet() method of a Workbook.

The rest of this answer will examine why.

What are the costs and benefits of nesting?

Nesting the classes in your case really only provides additional nesting. This makes it harder to talk about the classes that are nested: it's not a Worksheet, it's a Workbook.__Worksheet (really it's a Workbook._Workbook__Worksheet, but that's even more cumbersome to talk about). Since it's harder to talk about, it's harder to document the interface so someone knows how to use what comes back from a call to myworkbook.worksheet. Would your docs say "returns an object with a cells method" or "returns a Workbook.__Worksheet object" or take a different approach entirely?

None of this nesting prevents someone from trying to instantiate the class by poking around. Thus if you are truly concerned with preventing "misuse", you would still need to check the values passed to the worksheet's __init__ function (more on this later).

Why do you care about enforcing your usage pattern?

One of python's principles is that of "consenting adults". Namely we're better off showing and documenting the right way to use something, and trusting that people won't subvert it without good reason. If there's no benefit to creating a Worksheet without a Workbook, or a Cell without a Worksheet, then nobody will bother.

Another principle is that of duck typing. This is largely the idea that it's beneficial to allow programming to loosely specified interfaces. If someone wants to make a class that implements enough of the methods of Workbook, it's often good enough to be used wherever a Workbook instance could be used. Even if it does something very different internally. Combined with consenting adults, this suggests it can be beneficial to let someone try to do this. Both nesting and type-based verification in the __init__ method fight against this.

Attempting to prevent misuse

As I just covered, I don't generally recommend doing this at all. However if you do, you should do it well. The proposed check in your alternate __init__ functions is extra troublesome, as it encourages a bad practice. Since it throws an exception of type Exception, the only way to catch it is to catch all exceptions. Instead, since you are verifying the type of an argument, you should consider throwing a more specific exception like TypeError, or perhaps (in a larger body of code) even a custom subclass thereof.

Additional tweaks to the approach

I agree with a comment that asked why Worksheet and Cells exist at all. As is, there's very little benefit to having the helper class; it could make sense for the helper classes to be implementation details, or possibly omitted entirely. To make it simpler to document such an implementation detail, you may want to consider implementing one or both of them via __getitem__ and/or __setitem__, enabling code that looks like this:

worksheet = workbook['First Worksheet']
worksheet['A1'] = 'Foo...'

I'm making the assumption that in your real code, they will have a lot more functionality than the methods you've already shown. You may still want to consider __getitem__ and/or __setitem__ accessors, but there will be sufficient other reason to have exposed and documented classes.

As part of the additional functionality, it may be helpful to cleanly separate the methods that access existing sheets and cells from those that can create them.

When might it be right to nest?

This particular example doesn't seem to match this case, but sometimes it makes sense to group things by similar interfaces. Say you might have multiple subclasses of Workbook that need to instantiate different subclasses of Worksheet that in turn might need to instantiate different subclasses of Cell. It can be useful to leverage the class inheritance machinery to store that in the class.

Note that even this scenario does not require the class itself be nested, however; it would be sufficient to include lines like this:

class Workbook:
    # : : :
    worksheet_type = Worksheet
    # : : :
    def worksheet(self, name):
        return self.worksheet_type(self, name)

class WorkbookVariant(Workbook):
    # : : :
    worksheet_type = WorkbookVariant
Community
  • 1
  • 1
Michael Urman
  • 15,737
  • 2
  • 28
  • 44