10

I'm seeking advice about design of my code.

Introduction

I have several classes, each represents one file type, eg: MediaImageFile, MediaAudioFile and generic (and also base class) MediaGenericFile.

Each file have two variants: Master and Version, so I created these classes to define theirs specific behaviour. EDIT: Version represents resized/cropped/trimmed/etc variant of Master file. It's used mainly for previews.

EDIT: The reason, why I want to do it dynamically is that this app should be reusable (it's Django-app) and therefore it should be easy to implement other MediaGenericFile subclass without modifying original code.

What I want to do

  1. First of all, user should be able to register own MediaGenericFile subclasses without affecting original code.

  2. Whether file is version or master is easily (one regexp) recognizable from filename.

    /path/to/master.jpg                   -- master
    /path/to/.versions/master_version.jpg -- version
    
  3. Master/Version classes use some methods/properties of MediaGenericFile, like filename (you need to know filename to generate new version).

  4. MediaGenericFile extends LazyFile, which is just lazy File object.

Now I need to put it together…

Used design

Before I start coding 'versions' feature, I had factory class MediaFile, which returns appropriate file type class according to extension:

>>> MediaFile('path/to/image.jpg')
<<< <MediaImageFile 'path/to/image.jpg'>

Classes Master and Version define new methods which use methods and attributes of MediaGenericFile and etc.

Approach 1

One approach is create dynamically new type, which inherits Master (or Version) and MediaGenericFile (or subclass).

class MediaFile(object):
    def __new__(cls, *args, **kwargs):
        ...  # decision about klass
        if version:
            bases = (Version, klass)
            class_name = '{0}Version'.format(klass.__name__)
        else:
            bases = (Master, klass)
            class_name = '{0}Master'.format(klass.__name__)

        new_class = type(class_name, bases, {})
        ...
        return new_class(*args, **kwargs)

Approach 2

Second approach is create method 'contribute_to_instance' in Master/Version and call it after creating new_class, but that's more tricky than I thought:

classs Master(object):
    @classmethod
    def contribute_to_instance(cls, instance):
        methods = (...)
        for m in methods:
            setattr(instance, m, types.MethodType(getattr(cls, m), instance))

class MediaFile(object):
    def __new__(*args, **kwargs):
        ...  # decision about new_class
        obj = new_class(*args, **kwargs)
        if version:
            version_class = Version
        else:
            version_class = Master

        version_class.contribute_to_instance(obj)
        ...
        return obj

However, this doesn't work. There are still problems with calling Master/Version's methods.

Questions

What would be good way to implement this multiple inheritance?

How is this problem called? :) I was trying to find some solutions, but I simply don't know how to name this problem.

Thanks in advance!

Note to answers

ad larsmans

Comparison and instance check wouldn't be problem for my case, because:

  1. Comparisons are redefined anyway

    class MediaGenericFile(object):
        def __eq__(self, other):
            return self.name == other.name
    
  2. I never need to check isinstance(MediaGenericFileVersion, instance). I'm using isinstance(MediaGenericFile, instance) and isinstance(Version, instance) and both works as expected.

Nevertheless, creating new type per instance sounds to me as considerable defect.

Well, I could create both variations dynamically in metaclass and then use them, something like:

>>> MediaGenericFile.version_class
<<< <class MediaGenericFileVersion>
>>> MediaGenericFile.master_class
<<< <class MediaGenericFileMaster>

And then:

class MediaFile(object):
    def __new__(cls, *args, **kwargs):
        ...  # decision about klass
        if version:
            attr_name = 'version_class'
        else:
            attr_name = 'master_class'

    new_class = getattr(klass, attr_name)
    ...
    return new_class(*args, **kwargs)

Final solution

Finally the design pattern is factory class. MediaGenericFile subclasses are statically typed, users can implement and register their own. Master/Version variants are created dynamically (glued together from several mixins) in metaclass and stored in 'cache' to avoid perils mentioned by larsmans.

Thanks everyone for their suggestions. Finally I understand the metaclass concept. Well, at least I think that I understand it. Push origin master…

Tomáš Ehrlich
  • 6,546
  • 3
  • 25
  • 31
  • 2
    Isn't this called `mixins` or something like that? – Jonas Schäfer Jul 21 '12 at 11:15
  • What's the difference between mixins and multiple inheritance? I've seen Mixins a lot in Django, I understand that you create many 'pieces' called mixins and then glue them together to create class with desired behaviour. I may need to do that 'dynamically'. – Tomáš Ehrlich Jul 21 '12 at 11:30
  • @elvard: a mixin is a kind of "secondary" abstract base class that offers additional functionality. E.g., a `MediaImageFileMaster` might be primarily considered a `MediaImageFile`, but with the functionality of a `Master` "mixed in". – Fred Foo Jul 21 '12 at 11:34
  • @larsmans: In that case I understand mixins well. My problem is probably just how to create them dynamically without wasting types. – Tomáš Ehrlich Jul 21 '12 at 11:44

5 Answers5

4

I'd certainly advise against the first approach of constructing classes in __new__. The problem with it is that you create a new type per instance, which causes overhead and worse, causes type comparisons to fail:

>>> Ham1 = type("Ham", (object,), {})
>>> Ham2 = type("Ham", (object,), {})
>>> Ham1 == Ham2
False
>>> isinstance(Ham1(), Ham2)
False
>>> isinstance(Ham2(), Ham1)
False

This violates the principle of least surprise because the classes may seem entirely identical:

>>> Ham1
<class '__main__.Ham'>
>>> Ham2
<class '__main__.Ham'>

You can get approach 1 to work properly, though, if you construct the classes at the module level, outside of MediaFile:

classes = {}
for klass in [MediaImageFile, MediaAudioFile]:
    for variant in [Master, Version]:
        # I'd actually do this the other way around,
        # making Master and Version mixins
        bases = (variant, klass)
        name = klass.__name__ + variant.__name__
        classes[name] = type(name, bases, {})

then, in MediaFile.__new__, look the required class up by name in classes. (Alternatively, set the newly constructed classes on the module instead of in a dict.)

Fred Foo
  • 355,277
  • 75
  • 744
  • 836
  • That's a good point, thank you. Creating new type per instance is definitely considerable effect. On the other hand, comparisons and instance check wouldn't be my problems, as they're redefined (see edited post for more info) – Tomáš Ehrlich Jul 21 '12 at 11:28
  • @elvard: the problem is not comparison of instances, it's comparison of *types*. And the fact that you don't need to do `isinstance` now doesn't mean you never will. I find that, whenever I write factory functions, I will at some point have to check exactly what type of data come out -- usually in the unit test. – Fred Foo Jul 21 '12 at 11:32
  • Yep, it definitely should be fixed somehow. Your suggestion to module level classes is good variant, but I would probably prefer mine -- create those classes in metaclass, as users should be able to extend this application by providing their own file type classes. – Tomáš Ehrlich Jul 21 '12 at 11:41
  • @elvard: if `Master` and `Version` are only two mixins that you'll ever use a class factory in the module might also work. Otherwise, your approach might indeed be better. – Fred Foo Jul 21 '12 at 12:03
4

I'm not sure how dynamic you want it to be, but using a "factory pattern" (here using a class factory), is fairly readable and understandable and may do what you want. This could serve as a base... MediaFactory could be smarter, and you could register multiple other classes, instead of hard-coding MediaFactoryMaster etc...

class MediaFactory(object):

    __items = {}

    @classmethod
    def make(cls, item):
        return cls.__items[item]

    @classmethod
    def register(cls, item):
        def func(kls):
            cls.__items[item] = kls
            return kls
        return func

class MediaFactoryMaster(MediaFactory, Master): pass
class MediaFactoryVersion(MediaFactory, Version): pass

class MediaFile(object):
    pass

@MediaFactoryMaster.register('jpg') # adapt to take ['jpg', 'gif', 'png'] ?
class MediaFileImage(MediaFile):
    pass

@MediaFactoryVersion.register('mp3') # adapt to take ['mp3', 'ogg', 'm4a'] ?
class MediaFileAudio(MediaFile):
    pass

other possible MediaFactory.make

@classmethod
def make(cls, fname):
    name, ext = somefunc(fname)
    kls = cls.__items[ext]
    other = Version if Version else Master
    return type('{}{}'.format(kls.__name__,other.__name__), (kls, other), {})
Jon Clements
  • 138,671
  • 33
  • 247
  • 280
  • 1
    +1 for suggesting decorators. I'm not sure, why to create Master/Version variations of MediaFactory. I want to final interface as simple as possible, and calling MediaFactory(filename), which returns MediaFileMaster or MediaFileVersion according to filename sounds easy enogh for me. Could you please explain your suggestion more in detail? – Tomáš Ehrlich Jul 21 '12 at 12:02
  • @elvard I've added another `MediaFactory.make` to the bottom of the answer. It's similar to your first example, but in essence is used to determine which of Master/Version to mixin with a registered class - while the registered classes can maintain a normal inheritance chain as defined in code. – Jon Clements Jul 21 '12 at 12:28
  • Heureka, I see now! :) I'll will fix my original post. The idea behind master/version is that version is just resized/cropped/trimmed/etc version of master, used mainly for thumbnails. Therefore, there's no reason for two factories. Even Audio files have master/version, where version would be short preview. – Tomáš Ehrlich Jul 21 '12 at 12:38
2

How come you're not using inheritance but are playing around with __new__?

class GenericFile(File):
    """Base class"""

class Master(object):
    """Master Mixin"""

class Versioned(object):
    """Versioning mixin"""

class ImageFile(GenericFile):
    """Image Files"""

class MasterImage(ImageFile, Master):
    """Whatever"""

class VersionedImage(ImageFile, Versioned):
    """Blah blah blah"""

...

It's not clear why you're doing this though. I think there's a weird code smell here. I'd recommend fewer classes with a consistent interface (duck-typing) rather than a dozen classes and isinstance checks throughout the code to make it all work.

Perhaps you can update your question with what you'd like to do in your code and folks can help either identify the real pattern or a suggest a more idiomatic solution.

stderr
  • 8,567
  • 1
  • 34
  • 50
  • Indeed, I should mention that one should be able to implement own MediaGenericFile subclasses without modifying original code (it's Django app, see edited post). It seems to me easier to create these Master/Version variants dynamically, just to save typing those 4 lines. – Tomáš Ehrlich Jul 21 '12 at 11:55
  • 4 lines of code are much better than a dynamically generated maintenance nightmare any day. :-) – stderr Jul 21 '12 at 12:03
  • Well, 4 lines of code for each filetype. I'm still not convinced that static solution is nether more readable or more maintanable. Probably due to lack of experience. – Tomáš Ehrlich Jul 21 '12 at 12:17
  • Eh, I don't take it personally. I'd simply be explicit about my object hierarchy. If you're looking for a more powerful dynamic solution look at class decorators and metaclasses. Be careful, they can be trouble and you might poke your eye out. – stderr Jul 21 '12 at 12:19
  • I'm not offended :) I mean it seriously. Practical programming is far beyond my skills. – Tomáš Ehrlich Jul 21 '12 at 12:22
1

You do not have to create a new class for each instance. Don't create the new classes in __new__ create them in __metaclass__. define a metaclass in the base or in the base_module. The two "variant" subclasses are easily saved as as class attributes of their genaric parent and then __new__ just looks at the filename according to it's own rules and decides which subclass to return.

Watch out for __new__ that returns a class other than the one "nominated" during the constructor call. You may have to take steps to invoke __init__ from withing __new__

Subclasses will either have to:

  1. "register" themselves with a factory or parent to be found
  2. be imported and then have the parent or factory find them through a recursive search of cls.__subclasses (might have to happen once per creation but that's probably not a problem for file handeling)
  3. found through the use of "setuptools" entry_points type tools but that requires more effort and coordination by the user
Community
  • 1
  • 1
Phil Cooper
  • 5,747
  • 1
  • 25
  • 41
0

The OOD question you should be asking is "do the various classes of my proposed inheritance share any properties at all?"

The purpose of inheritance is to share common data or methods that the instances naturally have in common. Aside from both being files, what do Image files and Audio files have in common? If you really want to stretch your metaphors, you could conceivably have AudioFile.view() which could present — for example — a visualization of the power spectra of the audio data, but ImageFile.listen() makes even less sense.

I think your question side-steps this language independent conceptual issue in favor of the Python dependent mechanics of an object factory. I don't think you have a proper case of inheritance here, or you've failed to explain what common features your Media objects need to share.

msw
  • 42,753
  • 9
  • 87
  • 112
  • As I mentioned: "Classes Master and Version define new methods which use methods and attributes of MediaGenericFile and etc." So they really share some attributes (like filename, path, etc). – Tomáš Ehrlich Jul 21 '12 at 11:58
  • So you agree that which MediaFiles share are that they are Files. Python already has a file type, and writing your own adds complexity without adding capability. That's probably a Bad Idea. – msw Jul 21 '12 at 12:03
  • @msw: I believe the OP means that they instances "refer to" files by their path; Python's `file` doesn't know about paths at all. – Fred Foo Jul 21 '12 at 12:04
  • @msw: Indeed, MediaGenericFile extends File. I'll fix it in post. – Tomáš Ehrlich Jul 21 '12 at 12:12