0

I have found lots of hack answers, but no 'standardized' answer to this question. I am looking for an implementation of Matlab's struct in Python, specifically with the two following capabilities:

  1. in struct 's', access field value 'a' using dot notation (i.e. s.a)
  2. create fields on the fly, without initialization of dtype, format (i.e. s.b = np.array([1,2,3,4]) )

Is there no way to do this in Python? To date, the only solution I have found is here, using a dummy class structtype(). This works but feels a little hackish. I also thought maybe scipy would expose its mat_struct, used in loadmat(), but I couldn't find a public interface to it. What do other people do? I'm not too worried about performance for this struct; it's more of a convenience.

Michael
  • Why do you need to use `.` notation? Why can't you just use a `dict`? – TheBlackCat Dec 03 '15 at 20:20
  • Not a need, more a preference. Maybe I just have to get over it. Without requirement (1), a `dict` works fine – Michael Dec 03 '15 at 20:49
  • A dict is going to be much, much easier to work with, thanks to all the methods it provides. And it will be shorter to write and will do most, if not all, operations faster. Plus most built-in and third-party functions and classes you are going to find are designed to work with dicts. – TheBlackCat Dec 03 '15 at 20:55
  • Matlab's struct is basically a Python dict, (key,value) pairs... – Nick Dec 04 '15 at 00:00

3 Answers

4

If you're on Python 3.3 or later, there's types.SimpleNamespace. Other than that, an empty class is probably your best option.
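As a rough sketch of how types.SimpleNamespace covers both requirements from the question (dot access and fields created on the fly). The field names a and b are taken from the question; a plain list stands in for the NumPy array so the snippet has no third-party dependencies.

```python
from types import SimpleNamespace

# Fields can be supplied up front as keyword arguments...
s = SimpleNamespace(a=4)

# ...or created on the fly, with no declaration of type or shape.
s.b = [1, 2, 3, 4]
s.b.append(5)

# vars() exposes the underlying dict when dict-style access is handy.
fields = vars(s)
print(s.a, fields['b'])
```

Because vars(s) returns the instance's own dictionary, it is easy to move between attribute-style and key-style access when other code expects a dict.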

user2357112
  • And if you aren't on 3.3 yet, copying that example implementation into your own code is a perfectly reasonable option. I use something similar a fair bit. Sometimes, it's called a `Bunch`, e.g. in `scikit-learn`. – Robert Kern Dec 03 '15 at 21:13
2

The simplest and most intuitively similar Python implementation is to use type to create a throwaway class and instantiate it. In practice this is the same as defining a dummy class, but I think it expresses the intent of a struct-like object more clearly.

>>> s = type('', (), {})()
>>> s.a = 4
>>> s.a
4

Here, type is used to create a nameless class (hence the '') with no bases (or parent classes, indicated by the empty tuple) and no default class attributes (the empty dictionary); the final () instantiates the class/struct. Bear in mind that any values passed in the dictionary become class attributes, so they do not show up in the instance's __dict__ attribute, though this may not matter to you. This method also works in older versions (< 3.x) of Python.
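To illustrate the caveat about __dict__, here is a small sketch. The class name S and the field names are made up for the example; the point is only that defaults given to type() live on the class, while fields assigned later live on the instance.

```python
# Defaults passed in the third argument to type() become class attributes.
S = type('S', (), {'a': 1})
s = S()

# A field assigned afterwards is stored on the instance itself.
s.b = 2

class_default = s.a          # resolved through the class
instance_fields = vars(s)    # only what was set on the instance
print(class_default, instance_fields)
```

If the defaults need to appear in the instance's own dictionary, assigning them after instantiation (as with s.b above) is one way to get that.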

Sari
0

In Octave I did:

octave:2>      x.a = 1;
octave:3>      x.b = [1, 2; 3, 4];
octave:4>      x.c = "string";
octave:7> save -7 test.mat x

In ipython (2.7):

In [27]: from scipy.io import loadmat    
In [28]: A=loadmat('test.mat')

In [29]: A
Out[29]: 
{'__globals__': [],
 '__header__': 'MATLAB 5.0 MAT-file, written by Octave 3.8.2, 2015-12-04 02:57:47 UTC',
 '__version__': '1.0',
 'x': array([[([[1.0]], [[1.0, 2.0], [3.0, 4.0]], [u'string'])]], 
      dtype=[('a', 'O'), ('b', 'O'), ('c', 'O')])}

In this case A['x'] is a numpy structured array, with 3 dtype=object fields.

In [33]: A['x']['b'][0,0]
Out[33]: 
array([[ 1.,  2.],
       [ 3.,  4.]])

In [34]: A['x'][0,0]
Out[34]: ([[1.0]], [[1.0, 2.0], [3.0, 4.0]], [u'string'])

In [35]: A['x'][0,0]['b']
Out[35]: 
array([[ 1.,  2.],
       [ 3.,  4.]])

Since x comes from MATLAB, where everything is at least 2-D, it arrives as a 1x1 array and I have to index it with [0,0].

octave:9> size(x)
ans =
   1   1

I can load the file with a different switch, and then access the fields as attributes (the .b format):

In [62]: A=loadmat('test.mat',struct_as_record=False)

In [63]: A['x'][0,0].b
Out[63]: 
array([[ 1.,  2.],
       [ 3.,  4.]])

In this case the elements of A['x'] are of type <scipy.io.matlab.mio5_params.mat_struct at 0x9bed76c>

Some history might help. MATLAB originally only had 2d matrices. Then they expanded it to allow higher dimensions. Cells were added, with the same 2d character, but allowing diverse content. Structures were added, allowing 'named' attributes. The original MATLAB class system was built on structures (just link certain functions to a particular class structure). MATLAB is now in its 2nd generation class system.

Python started off with classes, dictionaries, and lists. Object attributes are accessed with the same . syntax as MATLAB structures. Dictionaries are indexed with keys (often, but not always, strings). Lists are indexed with integers, and have always allowed diverse content (like cells). And with a mature object class system, it is possible to construct much more elaborate data structures in Python, though access is still governed by basic Python syntax.

numpy adds n-dimensional arrays. A subclass np.matrix is always 2d, modeled on the old style MATLAB matrix. An array always has the same kind of elements. But dtype=object arrays contain pointers to Python objects. In many ways they are just Python lists with an array wrapper. They are close to MATLAB cells.

numpy also has structured arrays, with a compound dtype composed of fields. Fields are accessed by name. np.recarray is a structured array with the added ability to access fields with the . syntax. That makes them look a lot like MATLAB arrays of structures.
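A minimal sketch of the structured array and recarray access described above. The field names, dtypes and values are made up for illustration and do not come from the test.mat file used earlier.

```python
import numpy as np

# A structured array: every element has the same named fields.
dt = np.dtype([('a', 'f8'), ('b', 'i4', (2,))])
arr = np.zeros(3, dtype=dt)
arr['a'] = [1.0, 2.0, 3.0]        # field access by name

# Viewing it as a recarray adds attribute-style access to the same data.
rec = arr.view(np.recarray)
rec.b[0] = [10, 20]               # writes through to arr; no copy is made

first_a = float(rec.a[0])
first_b = arr['b'][0].tolist()
print(first_a, first_b)
```

Note that the set of fields is fixed by the dtype, so this covers dot access but not the creation of new fields on the fly that the question asks for.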

hpaulj