My day-to-day workflow looks something like this:
- acquire raw data (~50GB)
- parse the timing information in the raw data and build a data structure (struct / object) from it (what event occurred when, in which order, in which file, which other events occurred at the same time, etc.)
- load only the necessary parts of the raw data into the struct / object, as selected from the timing information above (basically a way to sub-select data)
- for each raw data chunk, calculate / extract certain metrics like RMS of the signal, events where data > threshold, d' / z-score, and save them with the struct / object
- given the previously calculated metrics, load raw data for the same time episodes from a different data channel and compare certain things, etc.
- visualize results x, y, z
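To make the steps above concrete, here is a minimal sketch of the struct-based variant. It uses synthetic data, and the field names (`name`, `onset`, `duration`, `rms`, `nAbove`), the sampling rate, and the threshold are made up for illustration; my real data is of course much larger and loaded from files.

```matlab
% Synthetic stand-in for one raw data channel (assumption: 1 kHz sampling).
fs  = 1000;                        % sampling rate in Hz
rng(0);                            % reproducible noise
raw = randn(1, 10 * fs);           % 10 s of fake signal

% Timing information: one struct element per event (hypothetical fields).
events = struct('name',     {'stimA', 'stimB', 'stimA'}, ...
                'onset',    {1.0, 4.0, 7.5}, ...   % seconds
                'duration', {0.5, 0.5, 0.5});      % seconds

% Sub-select: keep only the events of interest.
selected = events(strcmp({events.name}, 'stimA'));

% Per-chunk metrics, stored back into the struct array.
threshold = 2;                     % arbitrary threshold for this sketch
for k = 1:numel(selected)
    first = round(selected(k).onset * fs) + 1;
    last  = first + round(selected(k).duration * fs) - 1;
    chunk = raw(first:last);
    selected(k).rms    = sqrt(mean(chunk .^ 2));   % RMS of the chunk
    selected(k).nAbove = nnz(chunk > threshold);   % samples above threshold
end
```

Note that `rms` and `nAbove` only exist after the loop has run, which is exactly the kind of "state" I keep having to check for later.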
I have two ways of dealing with this kind of data / workflow:
- use struct()
- use objects
There are certain advantages / disadvantages to both cases:
struct:
- can add properties / fields on the fly
- have to check the state of the struct every single time I pass it to a function
- keep rewriting certain functions, because every time I change the struct slightly I either a) forget that a function already exists for it, or b) write a new version that handles a special case of the struct state.
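The state check I mean is roughly the following. This is only a sketch: `checkEpisode` is a hypothetical helper name, and the required field names depend on whichever function is being called.

```matlab
function checkEpisode(s, requiredFields)
% checkEpisode  Error out if struct s lacks any of the required fields.
%   Hypothetical helper; requiredFields is a cell array of field names.
    if ~isstruct(s)
        error('checkEpisode:notStruct', 'Input must be a struct.');
    end
    present = isfield(s, requiredFields);   % logical array, one per name
    if ~all(present)
        missing = strjoin(requiredFields(~present), ', ');
        error('checkEpisode:missingField', 'Missing field(s): %s', missing);
    end
end
```

Every function that takes the struct then has to start with a call such as `checkEpisode(ep, {'signal', 'fs'})`, which centralizes the check but does not remove the need for it.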
objects:
- using 'get.property()' methods, I can check the state of a property before it gets accessed inside a function / method, which allows me to do data consistency checks.
- I always know which methods work with my object, since they are part of the object definition.
- need to run 'clear classes' every time I add a new property or method - very annoying!
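For reference, the kind of property check I mean looks roughly like this. It is a minimal sketch: the class name `Episode` and its properties are hypothetical, and the class has to live in its own file, Episode.m.

```matlab
classdef Episode
    % Hypothetical container for one raw data chunk.
    properties
        signal = []    % raw samples; empty until loaded
        fs     = NaN   % sampling rate in Hz
    end
    properties (Dependent)
        rmsValue       % computed on access from signal
    end
    methods
        function value = get.rmsValue(obj)
            % Consistency check before the property is used anywhere.
            if isempty(obj.signal)
                error('Episode:noSignal', ...
                      'Load the signal before requesting rmsValue.');
            end
            value = sqrt(mean(obj.signal .^ 2));
        end
    end
end
```

With this, `e = Episode; e.rmsValue` raises an error until `e.signal` has been set, so the check lives in one place instead of in every function that uses the data.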
Now my question is: how do other people deal with this kind of situation? How do you organize your data: in structs or in objects? How do you handle state checks? Is there a way to do 'stateless' programming in MATLAB?