Matlab array of struct : Fast assignment

Question

Is there any way to "vector"-assign to an array of structs?

Currently I can

edges(1000000) = struct('weight',1.0); % This does not actually assign the value; I checked on 2009a.
for i=1:1000000; edges(i).weight=1.0; end;

But that is slow; I want to do something more like

edges(:).weight=[rand(1000000,1)]; % with or without the square brackets.

Any ideas or suggestions for vectorizing this assignment so that it runs faster?

Thanks in advance.

this post might be of help: http://stackoverflow.com/questions/4166438/how-do-i-define-a-structure-in-matlab/4169216#4169216 — Amro, Oct 28 '11 at 15:46

score 13 · Answer 1 · answered Aug 11 '12 at 20:39


This is much faster than deal or a loop (at least on my system):

N=10000;
edge(N) = struct('weight',1.0); % initialize the array
values = rand(1,N);  % set the values as a vector

W = mat2cell(values, 1,ones(1,N)); % convert values to a cell
[edge(:).weight] = W{:};

Using curly braces on the right gives a comma-separated list of all the values in W (i.e. N outputs), and using square brackets on the left assigns those N outputs to the N values in edge(:).weight.
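The same pattern can be sketched on a small array, using num2cell (mentioned in the comments on this answer) in place of mat2cell. The variable names s, values, W and back are illustrative only, and s is assumed not to exist beforehand.

```matlab
N = 5;
clear s                          % make sure s starts out undefined
s(N) = struct('weight', 1.0);    % preallocate; only s(N).weight is set, the rest are []
values = (1:N) * 10;             % example data: 10 20 30 40 50
W = num2cell(values);            % 1xN cell array, one scalar per cell
[s.weight] = W{:};               % N right-hand values go to N left-hand targets
back = [s.weight];               % gather the field back into a 1xN vector
disp(back)
```

Here [s.weight] on the left is equivalent to [s(:).weight], and the same bracket syntax on the right collects the field values back into an ordinary vector.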

answered Aug 11 '12 at 20:39

Alistair


Nice! Syntactically and pragmatically elegant! It'd be nice if Matlab syntax allowed expanding arrays into an argument sequence, something like '{values}{:}'. I tried making a function to take a cell value list, but apparently it does not like assigning to `varargout` in the exact same way that `deal()` does, haha. – eacousineau Mar 22 '14 at 00:23
Whoops, found I was using `mat2cell()` instead of `num2cell()`. Here's the function: [`cellexpand()`](https://gist.github.com/eacousineau/9699289#file-cellexpand-m). – eacousineau Mar 22 '14 at 00:42
You can use anonymous handles as well: `cellexpand = @(x) x{:}; numexpand = @(x) cellexpand(num2cell(x));`. An example: `[a, b] = numexpand([1, 2]);`. More specific example: `[edge.weight] = numexpand([edge.weight] + 50);` – eacousineau Mar 22 '14 at 01:45

score 8 · Accepted Answer · edited May 23 '17 at 12:33


You can try using the Matlab function deal, but I found that the input needs a little tweaking (based on this question: In Matlab, for a multiple input function, how to use a single input as multiple inputs?). Maybe there is something simpler.

n=100000;
edges(n)=struct('weight',1.0);
m=mat2cell(rand(n,1),ones(n,1),1);
[edges(:).weight]=deal(m{:});

Also I found that this is not nearly as fast as the for loop on my computer (~0.35s for deal versus ~0.05s for the loop) presumably because of the call to mat2cell. The difference in speed is reduced if you use this more than once but it stays in favor of the for loop.
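Timings like these depend heavily on the MATLAB or Octave version and the machine, so it is worth measuring locally. A rough sketch using tic/toc follows; the names a, b, vals, tLoop and tList are made up for illustration, and no particular result is implied.

```matlab
n = 100000;
vals = rand(n, 1);

% Plain for loop over a preallocated struct array
clear a
a(n) = struct('weight', 1.0);
tic
for i = 1:n
    a(i).weight = vals(i);
end
tLoop = toc;

% Cell conversion plus comma-separated list assignment
clear b
b(n) = struct('weight', 1.0);
tic
c = num2cell(vals);
[b.weight] = c{:};
tList = toc;

fprintf('loop: %.3f s, num2cell + list: %.3f s\n', tLoop, tList);
```

Both variants should leave identical data in the struct array, so isequal([a.weight], [b.weight]) can serve as a sanity check before comparing the times.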

edited May 23 '17 at 12:33

Community


answered Oct 28 '11 at 08:21

Aabaz



These are my times. On Octave: 0.17 s for 100K and 1.57 s for 1 million with this method, whereas the for loop takes forever (about 230 s for 100K). MATLAB 2009B (different machine/OS): 5 s/49 s using the above and 0.22 s/2.2 s using the for loop. – sumodds Oct 28 '11 at 08:43

score 7 · Answer 3 · answered Oct 28 '11 at 15:51


You could simply write:

edges = struct('weight', num2cell(rand(1000000,1)));
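This works because struct, when given a cell array as a field value, returns a struct array with the same dimensions as the cell array, one element per cell. A small sketch of that behavior, with hypothetical names w, e and single:

```matlab
w = num2cell([0.5; 1.5; 2.5]);    % 3x1 cell array of scalars
e = struct('weight', w);          % 3x1 struct array, one element per cell
sz = size(e);
second = e(2).weight;

% To store a cell array itself as the value of a single struct's field,
% wrap it in another pair of braces.
single = struct('weight', {{0.5, 1.5, 2.5}});
singleSize = size(single);
```

The result has the shape of the cell array, so a column of values gives a column struct array; pass a row vector to num2cell if a 1-by-N array is wanted.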

answered Oct 28 '11 at 15:51

Amro


score 2 · Answer 4 · answered Oct 28 '11 at 07:59


Is there something requiring you to particularly use a struct in this way?

Consider replacing your array of structs with simply a separate array for each member of the struct.

weights = rand(1, 1000);

If you have a struct member which is an array, you can make an extra dimension:

matrices = rand(3, 3, 1000);

If you just want to keep things neat, you could put these arrays into a struct:

edges.weights = weights;
edges.matrices = matrices;

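But if you need to keep an array of structs, note that a plain [edges.weight] = rand(1, 1000); raises an error, because rand returns a single output while the left-hand side expects one value per element. Convert the values to a cell array first and expand it into a comma-separated list:

w = num2cell(rand(1, 1000));    % one cell per struct element
[edges(1:1000).weight] = w{:};  % one value per element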

answered Oct 28 '11 at 07:59

Brian L


Both of them do the same thing. But I think I need it to be an array of structs (meaning an array of objects) and not a struct of arrays (a single big struct holding large arrays). What is the difference between the two in MATLAB, if any, in terms of memory allocation, and what are the implications? – sumodds Oct 28 '11 at 08:18

The difference is that in Matlab, an array of structs ("struct-organized") is grossly inefficient because each struct stores each of its fields in a separate array, so you can't do vectorized operations on them. A struct of arrays ("planar-organized") like Brian's stores each of its fields in primitive arrays that are contiguous in memory, which vectorized (fast) Matlab functions can work on directly. It is a much better structure for Matlab, and more idiomatic. – Andrew Janke Mar 17 '14 at 16:41

Andrew Janke · Answer 5 · 2014-03-17T16:59:12.587

The reason that the structs in your example don't get initialized properly is that the syntax you're using only addresses the very last element in the struct array. For a nonexistent array, the rest of them get implicitly filled in with structs that have the default value [] in all their fields.

To make this behavior clear, try it on a short array with clear edges; edges(3) = struct('weight',1.0) and look at each of edges(1), edges(2), and edges(3). The edges(3) element has 1.0 in its weight like you want; the others have [].
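The two indexing forms can be compared side by side in a short sketch. The names e1 and e2 are hypothetical and are cleared first so that both start out undefined.

```matlab
clear e1 e2

% Single-index form: only the last element receives the value
e1(3) = struct('weight', 1.0);
firstIsEmpty = isempty(e1(1).weight);    % elements 1 and 2 get [] in every field
lastValue = e1(3).weight;

% Range form: the scalar struct is copied into every indexed element
e2(1:3) = struct('weight', 1.0);
allWeights = [e2.weight];
```

Note that [e1.weight] would return only a single value here, because the empty fields contribute nothing to the concatenation.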

The syntax for efficiently initializing an array of structs is one of these.

% Using repmat and full assignment
edges = repmat(struct('weight', 1.0), [1 1000]);

% Using indexing
% NOTE: Only correct if variable is uninitialized!!!
edges(1:1000) = struct('weight', 1.0);  % QUESTIONABLE

Note the 1:1000 instead of just 1000 when indexing into the uninitialized edges array.

There's a problem with the edges(1:1000) form: if edges is already initialized, this syntax will just update the values of selected elements. If edges has more than 1000 elements, the others will be left unchanged, and your code will be buggy. Or if edges is a different type, you could get an error or weird behavior depending on its existing datatype. To be safe, you need to do clear edges before initializing using the indexing syntax. So it's better to just do full assignment with the repmat form.
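The stale-element problem can be reproduced with a small sketch. The variable old stands in for a struct array left over from earlier code; both it and fresh are made-up names.

```matlab
% Pretend this array is left over from earlier code
old = repmat(struct('weight', 9.0), [1 5]);

% Indexing form: only elements 1..3 are overwritten, 4 and 5 keep their old values
old(1:3) = struct('weight', 1.0);
staleWeights = [old.weight];

% Full assignment with repmat replaces the variable entirely
fresh = repmat(struct('weight', 1.0), [1 3]);
freshSize = size(fresh);
```

After the indexed assignment, old still has five elements, while the repmat form always yields exactly the requested size.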

BUT: Regardless of how you initialize it, an array-of-structs like this is always going to be inherently slow to work with for larger data sets. You can't do real "vectorized" operations on it because your primitive arrays are all broken up into separate mxArrays inside each struct element. That includes the field assignment in your question – it is not possible to vectorize that. Instead, you should switch to a struct-of-arrays like Brian L's answer suggests.
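To illustrate the difference between the two layouts, here is a sketch that applies the same update to each. The names soa, aos, w and c are illustrative, and the update itself (multiply by 2, add 1) is arbitrary.

```matlab
n = 1000;

% Struct of arrays: the field is one contiguous numeric vector
soa.weight = zeros(1, n);
soa.weight = soa.weight * 2 + 1;             % one vectorized operation

% Array of structs: gather the field, operate, then scatter back
aos = repmat(struct('weight', 0), [1 n]);
w = [aos.weight] * 2 + 1;                    % gather into a vector and update
c = num2cell(w);                             % back to one cell per element
[aos.weight] = c{:};                         % scatter via comma-separated list
```

The array-of-structs version has to copy the data out and back in around the arithmetic, which is the overhead a struct-of-arrays avoids.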

score 0 · Answer 6 · edited Dec 21 '15 at 17:14

You can reverse the layout and use a single struct whose field is an array (a struct of arrays), and then do all operations without any errors, like this

x.E(1)=1;
x.E(2)=3;
x.E(2)=8;
x.E(3)=5;

and then the operation like the following

x.E

ans =

    1     8     5

or like this

x.E(1:2)=2

x = 

    E: [2 2 5]

or maybe this

x.E(1:3)=[2,3,4]*5

x = 

    E: [10 15 20]

It is much faster than a for loop, and you do not need other heavyweight functions that slow your program down.
