Object oriented DesignProblem in python, has-a vs. is-a

Question

I'm still finding my feet with OOP and I am in a conundrum on how to procede with a design. I have some files. They all have headers and they all have data. There are several (25 or thereabouts) file types. Each file type has: a name, a header type, a data format, and a data reader (i.e. a method I will write to read the data of a particular file type). A particular "data reader" might read data from more than one file type.

Ultimately, I need data sets. A data set has all of the above attributes plus: all the data read in from the source file, and a couple of other bits of info such as the file name.

Like I said, I will have 25 or so file types, and a run of the code might deal with a few hundred data sets. file types will be very stable. More might be added as time goes on, but the attributes of existing file types will hardly ever change, and certainly not within the course of a run. In a data set, the actual data might change with processing, but the attributes of its associated file type will not.

The first step in dealing with a file will be to read its header and figure out what file type it belongs to. Next, the data set will be constructed by calling the data reader for the appropriate file type.

Now, I am stuck on the best way to get the file type attributes into the data set. Will it be more workable to say that a data set is-a "file type" and let data set inherit from (or just simply instantiate) file type, or better to say that a data set has-a file type and let file type be a data set attribute? I have to do this in python. Are there any python-specific considerations in answering this question. Thanks for your help.

Please **do not** use backtick code formatting to emphasize non-code text. I'd say most (possibly all) of the emphasized words in your question do not need any emphasis. — ThiefMaster, Jan 24 '13 at 22:56
Just write a [class factory](http://stackoverflow.com/questions/2526879/what-exactly-is-a-class-factory/2949205#2949205) which determines what class is required or constructs an appropriate one and returns it. classes are object's (instances of their metaclass) in Python. — martineau, Jan 24 '13 at 23:19

score 1 · Accepted Answer · answered Jan 24 '13 at 23:02

If I understand you correctly, I would do something like this:

class DataReader: #Your base for all readers
    @classmethod
    def read (_, file): pass #magic happens here

class DataReaders:
    def __init__ (self):
        self.__readers = # something like {FileType1: reader1, FileType2: reader1, FileType3: reader3}
    def __getitem__ (self, fileType): return self.__readers [fileType]


class DataSet:
    def __init__ (self, file, readers):
        self.__file = file
        self.readFileType ()
        self.data = readers [self.__fileType].read (file)

    def readFileType (self):
        self.__fileType = #parse the header of the file or whatever

readers = DataReaders ()
ds1 = DataSet (file1, readers)
ds2 = DataSet (file2, readers)

Thanks very much. Did something rather like this. – bob.sacamento Jan 25 '13 at 21:06 — bob.sacamento, Jan 25 '13 at 21:06

score 1 · Answer 2 · answered Jan 24 '13 at 23:05

If I understand correctly, you have several types of files or different file formats. These files have data sets and the format of these data sets are are not specific to a given type.

In this case, I would have a class to infer the file type and also infer the data format. Once the file type and file formats are inferred you can create the appropriate file type and the data format object.

class Detector(object):
    @classmethod
    def detect(cls, filename):
       """Return a file object"""
       pass

    @classmethod
    def infer_filetype(cls, header):
       pass

    @classmethod
    def infer_data_format(cls, contents):
       pass

class File(object):

    def __init__(self, reader)
        pass

    def data(self):
       pass
    ...
    # Other attributes.


class Reader(object)
    """An iterable to read the contents"""

Thanks very much for your detailed answer. Will accept Hyperboreus's -- looks like he got here first -- but am much obliged for your help. — bob.sacamento, Jan 25 '13 at 21:06

Object oriented DesignProblem in python, has-a vs. is-a

2 Answers2