I have a data structure consisting of thousands of medium-sized objects (hundreds of bytes each), each of which represents a subset of a larger dataset. This isn't optimal for several reasons (complexity when analyzing larger scopes, strain on the garbage collector, etc.).

Conceptually, you can imagine each object representing, for example, the meteorological data for one day, while the dataset as a whole is the data for a year. Trivial example:

class YearData
{
   private readonly DayData[] days = new DayData[365];
   public DayData GetDayData(int dayNumber)
   {
      return days[dayNumber];
   }
}

class DayData
{
   private readonly double[] temperatures = new double[24];
   public double GetTemperature(int hour)
   {
      return temperatures[hour];
   }     
   public void SetTemperature(int hour, double temperature)
   {
      temperatures[hour] = temperature;
   }
}       

In a refactoring effort I have tried to move the data to a single object representing the whole dataset, but to keep the rest of the code unchanged (and simple), I still need objects representing a subset/segment of the data. Example:

class YearData
{
   private readonly double[] temperatures = new double[365*24];   

   public DayData GetDayData(int day)
   {
      return new DayData(this, day);
   } 

   internal double GetTemperature(int day, int hour)
   {
      return temperatures[day*24 + hour];
   }

   internal void SetTemperature(int day, int hour, double temperature)
   {
      temperatures[day*24 + hour] = temperature;
   }
}

class DayData // or struct?
{
    private readonly YearData yearData;
    private readonly int dayNumber;
    public DayData(YearData yearData, int dayNumber)
    {
       this.yearData = yearData;
       this.dayNumber = dayNumber;
    }
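    public double GetTemperature(int hour)
    {
       // Delegate to the shared YearData storage.
       return yearData.GetTemperature(dayNumber, hour);
    }
    public void SetTemperature(int hour, double temperature)
    {
       // The write goes to YearData; this object's own fields never change.
       yearData.SetTemperature(dayNumber, hour, temperature);
    }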
}

This way I can have a single huge, long-lived object and many small, short-lived objects for the analysis of the data. The GC is happier, and doing analysis directly on the whole dataset is now less complicated.

My questions are, first: does this pattern have a name? It seems like it should be a pretty common pattern.

Second (.NET specific): The segment object is very lightweight and immutable. Does that make it a good candidate for being a struct? Does it matter that one of the struct fields is a reference? Is it bad form to use a struct for a type that appears mutable but in fact isn't?

RMalke
Anders Forsgren

2 Answers


Very interesting problem and approach!

I am not sure, but I think these patterns may be considered a Flyweight and an Adapter.

A Flyweight, from sourcemaking.com:

  • Use sharing to support large numbers of fine-grained objects efficiently.
  • The Motif GUI strategy of replacing heavy-weight widgets with light-weight gadgets.

The temperatures array is stored in YearData, which can be seen as a DayData factory.

And also an Adapter:

  • Convert the interface of a class into another interface clients expect. Adapter lets classes work together that couldn’t otherwise because of incompatible interfaces.
  • Wrap an existing class with a new interface.
  • Impedance match an old component to a new system.

So, by exposing DayData that way, you are providing the interface the client wants.

Also, I am not sure about DayData being a struct; you can check a good explanation of when and how to use structs in this answer.
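
For reference, here is a minimal sketch of what the struct variant asked about could look like. It is not a recommendation either way. It assumes the flat-array YearData from the question with the accessor names made consistent, uses `readonly struct` (which needs C# 7.2 or later), and adds a hypothetical `Demo` helper purely for illustration:

```csharp
using System;

// Same flat storage as in the question; bounds checks are omitted for brevity.
class YearData
{
    private readonly double[] temperatures = new double[365 * 24];

    public DayData GetDayData(int day) => new DayData(this, day);

    internal double GetTemperature(int day, int hour) => temperatures[day * 24 + hour];

    internal void SetTemperature(int day, int hour, double temperature)
    {
        temperatures[day * 24 + hour] = temperature;
    }
}

// Hypothetical struct version of DayData: two fields, both assigned once.
readonly struct DayData
{
    private readonly YearData yearData;
    private readonly int dayNumber;

    public DayData(YearData yearData, int dayNumber)
    {
        this.yearData = yearData;
        this.dayNumber = dayNumber;
    }

    public double GetTemperature(int hour) => yearData.GetTemperature(dayNumber, hour);

    // The struct itself does not change; the write goes to the shared YearData.
    public void SetTemperature(int hour, double temperature) =>
        yearData.SetTemperature(dayNumber, hour, temperature);
}

// Illustration only: not part of the question's code.
static class Demo
{
    public static double Run()
    {
        var year = new YearData();
        DayData a = year.GetDayData(10);
        DayData b = a;               // copies the reference and the int, not the data
        b.SetTemperature(5, 21.5);   // writes through to the shared array
        return a.GetTemperature(5);  // reads the same slot through the other copy
    }
}
```

A reference field inside a struct is allowed; copying the struct copies the reference, so every copy points at the same YearData. One caveat with this sketch: `default(DayData)` has a null `yearData`, so calling its methods would throw a NullReferenceException.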

RMalke

Flyweight, definitely.

It allows you to pack the large data in an optimal way but still pretend that you have a separate object for each single piece of data.

Wiktor Zychla
  • But there is no re-use (so there isn't LESS data around) at any point. It's the exact same amount of data, and in fact the same number of objects too; the revised version even uses a few *more* bytes. Sure, the day objects are lightweight, but there is no sharing, which I thought was the central theme of flyweights? – Anders Forsgren Feb 28 '13 at 22:47
  • The point of Flyweight is that you can reuse your code that uses class APIs, but at some point you have to completely revise your data structure because the objects do not fit in memory. Thus, you pack the object data in an optimal structure but expose it as small, lightweight objects, allowing the client code to remain untouched. – Wiktor Zychla Feb 28 '13 at 22:52
  • You are right: The more I think about it the more it fits a classic flyweight. The larger set object *is* the factory, and the re-use occurs when two lightweight day objects are created representing the same day. – Anders Forsgren Mar 02 '13 at 22:20