I have to extract patient information from a bunch of XML files for further data analysis.

I have multiple Patients, each of which can have multiple Diseases. For each Disease there may be no Treatment, one Treatment, or several. Each Treatment may or may not have TreatmentDetails. The TreatmentDetails are often duplicated across files (i.e. files with different names contain the same TreatmentDetails for a Disease, or differ only by a small change).

I think that a data structure of the type Patient[i].Disease[j].Treatment[k] might be useful for this problem. Unfortunately, I am not very good with Classes or OOP.

How can I achieve this type of data structure Patient[i].Disease[j].Treatment[k]?

Below is my code:

class PatientClass(object):

    def __init__(self):
        self.patient = []

    def addPatient(self, PatientID, casVersion):
        self.patient.append((False, PatientID, casVersion))

    def patient_count(self):
        return(len(self.patient))

    def printPatient(self):
        print(self.patient)

    def printpatientN(self,n):
        print(self.patient[n])

    def __str__(self,n):
        return(self.patient[n])

class Disease(PatientClass):

    def __init__(self):
        PatientClass.__init__(self)
        self.disease = []

    def addDisease(self, datetime_Start, datetime_End):
        self.disease.append((False, datetime_Start, datetime_End))


    def printDiseaseN(self,n):
        print(self.disease[n])

    def __str__(self,n):
        return "%s has disease %s" % (self.patient[n], self.disease[n])
Ignacio Vergara Kausel
RMS

2 Answers

You can start with this simple implementation of the 'has-a' paradigm, where each Patient holds a list of Disease objects instead of Disease inheriting from Patient:

    class Patient:

        def __init__(self, patientID, casVersion):
            self.patientID = patientID
            self.casVersion = casVersion
            self.disease = []

        def addDisease(self, disease):
            self.disease.append(disease)
            return self

        def __str__(self):
            return "<Patient: %s, casVer: %s, disease: %s>" % (self.patientID, self.casVersion, " ".join([str(d) for d in self.disease]))

    class Disease:

        def __init__(self, datetimeStart, datetimeEnd):
            self.datetimeStart = datetimeStart
            self.datetimeEnd = datetimeEnd
            self.treatment = []

        def __str__(self):
            return "<Disease: start: %s, end: %s, treatment: %s>" % (self.datetimeStart, self.datetimeEnd, " ".join([str(t) for t in self.treatment]))


    # usage (print() with parentheses works in Python 2 and 3)
    print(Patient("TheFirst", "Cas").addDisease(Disease("yesterday", "today")))
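The classes above stop at Disease, so here is a minimal sketch of how the same pattern could be pushed one level further to get the Patient[i].Disease[j].Treatment[k] access from the question. The Treatment class, its name and details fields, and the addTreatment method are assumptions made for illustration; they are not part of the answer above.

```python
class Treatment:
    # Hypothetical class: 'name' and 'details' are illustrative fields.
    def __init__(self, name, details=None):
        self.name = name
        self.details = details  # None when there are no TreatmentDetails


class Disease:
    def __init__(self, datetimeStart, datetimeEnd):
        self.datetimeStart = datetimeStart
        self.datetimeEnd = datetimeEnd
        self.treatment = []  # a Disease "has" zero or more Treatments

    def addTreatment(self, treatment):
        # Hypothetical helper mirroring Patient.addDisease
        self.treatment.append(treatment)
        return self


class Patient:
    def __init__(self, patientID, casVersion):
        self.patientID = patientID
        self.casVersion = casVersion
        self.disease = []  # a Patient "has" zero or more Diseases

    def addDisease(self, disease):
        self.disease.append(disease)
        return self


# A plain list of patients gives the indexed access from the question.
patients = [
    Patient("TheFirst", "Cas").addDisease(
        Disease("yesterday", "today").addTreatment(Treatment("rest"))
    )
]

first_treatment = patients[0].disease[0].treatment[0]
print(first_treatment.name)
```

Because every level is an ordinary list, a Disease without any Treatment is simply one whose treatment list stays empty.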
TomB

You can take advantage of Python's builtin dict and/or list machinery and use very simple classes to structure your data. Below is an example of storing Patient objects. They are extended dicts: I chose dicts because they are a good fit for records where you are free to add the fields you want at any time, as new keys. I also use a standard dict, patients, as the container, with each record id as the key for storing and retrieving Patient objects.

The simple classes shown extend UserDict (the proper way to inherit from dict), with the "has-a list of" concept for the patient diseases and treatments.
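As a small illustration of why UserDict is usually the safer base class, the sketch below compares a direct dict subclass with a UserDict subclass that both override __setitem__. The two class names and the key upper-casing rule are made up for the demonstration.

```python
from collections import UserDict


class UpperKeysDict(dict):
    # Hypothetical subclass: tries to upper-case every key on assignment.
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)


class UpperKeysUserDict(UserDict):
    # Same override, but on top of UserDict.
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)


plain = UpperKeysDict(name='Eric')        # dict's constructor bypasses the override
wrapped = UpperKeysUserDict(name='Eric')  # UserDict's constructor goes through it

print(list(plain))    # ['name']
print(list(wrapped))  # ['NAME']
```

The builtin dict fills itself in without calling an overridden __setitem__, whereas UserDict routes its initialization and update() through the methods you define, so customizations behave consistently.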

An example of usage is shown below; you can see that a lot of useful builtin behaviour is available via inheritance.

You can use this structure for your data or adapt it to your taste and needs. For example, you may want even simpler access to some attributes by using properties, so that you can write .name instead of ['name'] (see here for a brief explanation), or you might extend the classes in other ways.

super().__init__(...) runs the base class's default initialization before the extra attributes are added; without it, the overriding __init__ would skip that setup and the objects would not work properly. Treatment does not need it because it does not redefine __init__ (yet).

from collections import UserDict
# see also UserList, UserString

class Treatment(UserDict):
    pass

class Disease(UserDict):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.treatments = []


class Patient(UserDict):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.diseases = []

patients = dict()

p1 = patients['0001'] = Patient(name='Eric', surname='Idle')

d1 = Disease(name='Eating too much')
t1 = Treatment(title='Radical Therapy', description='Mint pill after lunch.')
d1.treatments.append(t1)

d2 = Disease(name='Cannot Control the Legs Syndrome')
t2 = Treatment(title='Conservative Approach', description='Have a seat.')
d2.treatments.append(t2)

p1.diseases.extend([d1,d2])

for pid in patients:
    print()
    print('PID:{} {} {}'.format(pid, patients[pid]['name'],
                                patients[pid]['surname']))
    print(50*'-')
    for disease in patients[pid].diseases:
        print()
        print('Disease: {}'.format(disease['name']))
        for treatment in disease.treatments:
            print('Treatment: {}'.format(treatment['title']))
            print('\t\t{}'.format(treatment['description']))
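The property idea mentioned earlier (writing .name instead of ['name']) could look roughly like the sketch below. It reuses the Patient class from this answer and adds a single property; which fields to expose this way is an assumption and depends on your data.

```python
from collections import UserDict


class Patient(UserDict):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.diseases = []

    @property
    def name(self):
        # Read-through to the underlying dict entry
        return self['name']

    @name.setter
    def name(self, value):
        # Writing the attribute updates the same dict entry
        self['name'] = value


p = Patient(name='Eric', surname='Idle')
print(p.name)        # same value as p['name']
p.name = 'John'
print(p['name'])     # the dict entry reflects the attribute assignment
```

Both access styles stay in sync because the property only forwards to the dict entry and stores nothing of its own.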
progmatico