
I am dealing with monthly data for a number of different data sets (e.g. air temperature, ocean temperature, wind speeds etc), whereby each month and each data set will share similar attributes. I want to be able to initialise and populate these attributes as efficiently as possible. So far I can only think of trying to somehow append new attributes to pre-existing attributes. Is this possible in Python?

For example, let's say I initialise all my monthly variables like the following:

class Data_Read:
    def __init__(self, monthly=np.zeros((200,200,40))): #3D data
        self.jan = monthly 
        self.feb = monthly
        self.march = monthly
        self.april = monthly
        self.may = monthly
        self.june = monthly
        self.july = monthly
        self.august = monthly
        self.sept = monthly
        self.oct = monthly
        self.nov = monthly
        self.dec = monthly

Then I can create a new data set for each month which will become something like air_temp.jan or wind_speed.june by doing the following:

air_temp = Data_Read()
wind_speed = Data_Read()

These would be the raw data attributes. However, I would then like to do some processing on each of these data sets (such as de-trending). Is there a way to write a class (possibly a new one) that will generate new attributes like self.jan.detrend? Basically, I want to avoid writing 12 lines of code for every attribute I create, and then be able to easily access any "attribute of the attribute" in my data processing functions afterwards.

Many thanks.

petezurich
  • In your code above every month will get the *same* object and if you do not provide `monthly` even the instances will have the same object (the one from the default). – Klaus D. Mar 22 '19 at 16:35
  • Perhaps take a shape as input instead of an array, then create a new array with that shape for each month? Regarding your main question, you could just write methods like `detrend` on your `Data_Read` class. `detrend` can take a month name as input, so you could do `air_temp.detrend("jan")` to get that result. If you want to cache these results so that you don't recompute them each time, you could keep a dictionary or such of computed results internally. – Nathan Mar 22 '19 at 16:38
  • You shouldn't have 12 separate variables in the first place. You want *one* attribute, a `dict` with months as keys. – chepner Mar 22 '19 at 16:41
  • Thanks all. I had completely overlooked Dictionaries. I think these are the right way to go. – Will Gregory Mar 22 '19 at 16:47
  • Though if you still want to be able to access the arrays as `air_temp.jan` when you internally have a dictionary, you can override `__getattr__` to look up month names in your dictionary. – Nathan Mar 22 '19 at 16:49

1 Answer

Here's an example of how you could store everything internally in a dictionary and still reference the arrays as attributes as well as call functions on arrays by name:

import numpy as np


class Data_Read:
    def __init__(self, monthly_shape=(200, 200, 40)): #3D data
        months = [
            "jan",
            "feb",
            "march",
            "april",
            "may",
            "june",
            "july",
            "august",
            "sept",
            "oct",
            "nov",
            "dec"
        ]
        self._months = {month: np.zeros(monthly_shape) for month in months}

    def detrend(self, month):
        # this is a dummy function that just increments
        return self._months[month] + 1

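    def __getattr__(self, name):
        # only called when normal attribute lookup fails;
        # go through __dict__ so a missing _months cannot recurse
        months = self.__dict__.get("_months", {})
        if name in months:
            return months[name]
        # object defines no __getattr__, so raise the usual error directly
        raise AttributeError(f"{type(self).__name__!r} object has no attribute {name!r}")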

air_temp = Data_Read()
print(air_temp.jan.shape)  # (200, 200, 40)
print((air_temp.detrend("jan") == 1).all())  # True
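If the processing is expensive, the results could also be cached in a second dictionary, as suggested in the comments. The following is only a minimal sketch: the class name `CachedDataRead`, the `_detrended` dictionary and the mean-subtraction placeholder are assumptions made for illustration, not a real de-trending method, and the small default shape is just to keep the example light.

```python
import numpy as np


class CachedDataRead:
    """Hypothetical variant of Data_Read that caches processed arrays."""

    MONTHS = ["jan", "feb", "march", "april", "may", "june",
              "july", "august", "sept", "oct", "nov", "dec"]

    def __init__(self, monthly_shape=(4, 4, 2)):
        # a fresh array per month, so no two months share data
        self._months = {m: np.zeros(monthly_shape) for m in self.MONTHS}
        self._detrended = {}  # filled lazily, one entry per month

    def detrend(self, month):
        # placeholder processing: subtract the mean (not a real de-trend)
        if month not in self._detrended:
            raw = self._months[month]
            self._detrended[month] = raw - raw.mean()
        return self._detrended[month]


air_temp = CachedDataRead()
first = air_temp.detrend("jan")
second = air_temp.detrend("jan")
print(first is second)  # True: the second call reuses the cached array
```

Note that a cache like this goes stale if the raw array for a month is modified afterwards, so the cached entry would need to be dropped in that case.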

You can also get attribute access without an internal dictionary or `__getattr__` by using `setattr` and `getattr`, because instance attributes are stored in a dictionary on the object (its `__dict__`) anyway:

import numpy as np


class Data_Read:
    def __init__(self, monthly_shape=(200, 200, 40)): #3D data
        months = [
            "jan",
            "feb",
            "march",
            "april",
            "may",
            "june",
            "july",
            "august",
            "sept",
            "oct",
            "nov",
            "dec"
        ]
        for month in months:
            setattr(self, month, np.zeros(monthly_shape))

    def detrend(self, month):
        # this is a dummy function that just increments
        return getattr(self, month) + 1
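
If the `air_temp.jan.detrend` spelling from the question is preferred, one option might be to store a small per-month object instead of a bare array. This is a rough sketch under assumptions: `MonthData`, its `raw` attribute and the `DataRead` name are hypothetical, and `detrend` is the same dummy increment used above rather than real processing.

```python
import numpy as np

MONTHS = ["jan", "feb", "march", "april", "may", "june",
          "july", "august", "sept", "oct", "nov", "dec"]


class MonthData:
    """Hypothetical wrapper holding one month's raw array."""

    def __init__(self, raw):
        self.raw = raw

    @property
    def detrend(self):
        # dummy processing, as in the answer above: just increments
        return self.raw + 1


class DataRead:
    def __init__(self, monthly_shape=(4, 4, 2)):
        # one MonthData attribute per month, set by name
        for month in MONTHS:
            setattr(self, month, MonthData(np.zeros(monthly_shape)))


air_temp = DataRead()
print(air_temp.jan.raw.shape)                    # (4, 4, 2)
print((air_temp.jan.detrend == 1).all())         # True
print(sorted(vars(air_temp)) == sorted(MONTHS))  # True
```

Each new kind of processing then becomes one property or method on `MonthData` rather than 12 lines on the outer class, and `vars(air_temp)` shows that the month attributes really do live in the instance dictionary.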
Nathan