There are already several similar questions on SO, for example How do I create a variable number of variables?. But I think my question is a bit different.

I want to create dynamic array names. MATLAB (and Octave) have a built-in data structure, the cell array, that can do the job. For example, the code below creates A{1} to A{10} dynamically; var1 is a 'dynamic input' to the program.

var1=10;
for j=1:1:var1
  A{j} = {}; %creating empty cell
end
%resizing empty cells using a loop, arrays can be of different size

As answered in the first question, Python has dictionaries for this task. However, I am working with a huge amount of data stored as arrays, and I need to use numpy arrays because they are faster than Python's built-in lists. Any suggestions?

Update: I want to create variable size arrays afterwards.

bner341
  • 2
    Why not just use a two-dimensional array? – chepner Jul 08 '17 at 01:35
  • Your code looks like it could be translated to python by creating a list of numpy-arrays. Can you further explain your problem? If I remember correctly you use cells in matlab if you need non-rectangular arrays. However the cell in your example is rectangular (var1 x 1 x var1) – Johannes Jul 08 '17 at 01:35
  • @chepner, I want to create different size arrays. – bner341 Jul 08 '17 at 13:08
  • @Johannes Can you explain what a list of numpy arrays is? Maybe that is what I am looking for. But each array needs to have a different size. – bner341 Jul 08 '17 at 13:09

3 Answers

1

You can use numpy.memmap to create that kind of array.

import numpy as np

data = np.arange(12, dtype='float32')

data.resize((3,4))

  • numpy.memmap documentation here.
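The snippet above resizes an ordinary in-memory array. As a rough sketch of what an actual numpy.memmap looks like, assuming a scratch file (the temporary path and the (3, 4) shape are illustrative choices, not part of the original answer):

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "data.dat")

# mode="w+" creates (or overwrites) the file and maps it for reading and writing.
mm = np.memmap(path, dtype="float32", mode="w+", shape=(3, 4))
mm[:] = np.arange(12, dtype="float32").reshape(3, 4)
mm.flush()  # push pending changes to disk

# Re-open read-only; the shape is passed again because the file holds only raw bytes.
ro = np.memmap(path, dtype="float32", mode="r", shape=(3, 4))
total = float(ro.sum())
```

Each memmap has a fixed shape, so arrays of different sizes would probably each need their own file or their own region of a shared file.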
shashank
1

So your Octave code produces:

A =
{
  [1,1] =
  {
    [1,1] = [](0x0)
    [1,2] = [](0x0)
    [1,3] = [](0x0)
    [1,4] = [](0x0)
    [1,5] = [](0x0)
    [1,6] = [](0x0)
    [1,7] = [](0x0)
    [1,8] = [](0x0)
    [1,9] = [](0x0)
    [1,10] = [](0x0)
  }
  [1,2] =
  {
    [1,1] = [](0x0)
    [1,2] = [](0x0)
    [1,3] = [](0x0)
    [1,4] = [](0x0)
    [1,5] = [](0x0)
    [1,6] = [](0x0)
    [1,7] = [](0x0)
    [1,8] = [](0x0)
    [1,9] = [](0x0)
    [1,10] = [](0x0)
  }
  ...
  [1,10] =
  {
    [1,1] = [](0x0)
    [1,2] = [](0x0)
    ...
    [1,9] = [](0x0)
    [1,10] = [](0x0)
  }
}

So A is a cell that contains 10 cells, each with 10 0x0 matrices.

Cells are 2d (or higher) and can contain any kind of object. Python lists are 1d and can contain any kind of object. numpy arrays are more like matrices, except they can also be 0d or 1d. Another difference: MATLAB/Octave lets you 'grow' arrays and cells by just assigning to a higher index. In Python you grow lists with .append, and 'grow' arrays by concatenating several together to make a new one (there is no grow-in-place).

In [559]: A = []
In [560]: for i in range(10):
     ...:     A.append([[] for _ in range(10)])
     ...:     
In [561]: A
Out[561]: 
[[[], [], [], [], [], [], [], [], [], []],
 [[], [], [], [], [], [], [], [], [], []],
 ...
 [[], [], [], [], [], [], [], [], [], []],
 [[], [], [], [], [], [], [], [], [], []],
 [[], [], [], [], [], [], [], [], [], []]]

Or if you want a list of lists of (0,0) arrays:

In [562]: A =[]    
In [563]: for i in range(10):
     ...:     A.append([np.zeros((0,0)) for _ in range(10)])

Or you could initialize a multidimensional array:

In [565]: np.zeros((10,10,0,0),int)
Out[565]: array([], shape=(10, 10, 0, 0), dtype=int32)
In [566]: np.zeros((1,10,1,10,0,0),int)
Out[566]: array([], shape=(1, 10, 1, 10, 0, 0), dtype=int32)

Octave equivalent:

>> zeros(1,10,1,10,0,0)
ans = [](1x10x1x10x0x0)

numpy also has object dtype arrays, which can contain 'anything'.

If I save the Octave A to a mat file, and load it in Python with scipy.io.loadmat, I get:
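An object-dtype array can hold arrays of different lengths, much like a cell. A minimal sketch, with the sizes chosen only for illustration:

```python
import numpy as np

# Create the object array first, then fill it slot by slot; this keeps numpy
# from trying to merge the inner arrays into one rectangular block.
cells = np.empty(3, dtype=object)
for j, n in enumerate([2, 5, 3]):  # illustrative sizes
    cells[j] = np.zeros(n)

sizes = [c.shape[0] for c in cells]
```

Each slot only stores a reference to a separate array, so this behaves much like a list of arrays with array-style indexing on the outer container.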

In [569]: data = loadmat('test.mat')
In [570]: data
Out[570]: 
{'A': array([[ array([[array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64)]], dtype=object),
         array([[array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         ...
         array([[array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64),
         array([], shape=(0, 0), dtype=float64)]], dtype=object)]], dtype=object),
 '__globals__': [],
 '__header__': b'MATLAB 5.0 MAT-file, written by Octave 4.0.0, 2017-07-08 02:13:25 UTC',
 '__version__': '1.0'}

A is an object dtype array containing object dtype arrays containing (0,0) float arrays:

In [572]: data['A'].shape
Out[572]: (1, 10)
In [573]: data['A'][0,0].shape
Out[573]: (1, 10)
In [574]: data['A'][0,0][0,0].shape
Out[574]: (0, 0)

Without the automatic growth behavior of MATLAB, it doesn't make much sense to initialize a numpy array to shape (0,0).
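Instead of starting from an empty array and growing it, a common pattern is to collect pieces in a list and join them once at the end. A small sketch (the loop contents are made up for illustration):

```python
import numpy as np

chunks = []  # a plain list grows cheaply with .append
for i in range(4):
    chunks.append(np.arange(i + 1))

# A single concatenate at the end builds one new array from all the pieces.
joined = np.concatenate(chunks)
```

Concatenating inside the loop would also work, but it copies the accumulated data on every iteration.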

Simpler example

A simple example of creating arrays of differing size - in a list:

In [590]: A = [np.arange(i+3) for i in range(5)]
In [591]: A
Out[591]: 
[array([0, 1, 2]),
 array([0, 1, 2, 3]),
 array([0, 1, 2, 3, 4]),
 array([0, 1, 2, 3, 4, 5]),
 array([0, 1, 2, 3, 4, 5, 6])]

I can save and reload it:

In [600]: savemat('test.mat', {'A':A})
In [601]: loadmat('test.mat')
Out[601]: 
{'A': array([[array([[0, 1, 2]]), array([[0, 1, 2, 3]]),
         array([[0, 1, 2, 3, 4]]), array([[0, 1, 2, 3, 4, 5]]),
         array([[0, 1, 2, 3, 4, 5, 6]])]], dtype=object),
 '__globals__': [],
 '__header__': b'MATLAB 5.0 MAT-file Platform: posix, Created on: Sun Jul  9 09:44:45 2017',
 '__version__': '1.0'}

Note that this has converted the list of arrays into an object array of arrays.

Octave loads this as:

>> A
A =
{
  [1,1] =
    0  1  2
  [1,2] =
    0  1  2  3
  [1,3] =
    0  1  2  3  4
  [1,4] =
    0  1  2  3  4  5
  [1,5] =
    0  1  2  3  4  5  6
}
hpaulj
  • 1
    As I said in the post, I do not want to use Python lists for performance reasons. Multidimensional arrays will not work because each array in my application will have a different size. I did not use the "automatic growth" feature of MATLAB/Octave cell arrays, since the size is available at runtime. – bner341 Jul 08 '17 at 13:13
  • Besides creating these arrays what are you doing with them? What's this performance that you worry about? If the arrays differ in size, you have to store them individually, and reference them by pointer - in a cell like structure. – hpaulj Jul 08 '17 at 13:38
  • @hpaulj Yes, but it's not memory friendly. For example, if you have 1M data and 4GB RAM, you might get memory dump errors. – Sanjay SP Jul 05 '18 at 15:16
0

OK, so from what I understand you want to create several arrays of different sizes, so the final data structure is not rectangular. Numpy arrays have to be rectangular, so you cannot use a single one. What you can do is create a list of numpy arrays. A list is just a data structure that holds a sequence of arbitrary objects, so it does not care whether its elements are arrays of the same size.

If you want to translate your code 1:1, do something like this:

import numpy as np

var1 = 10
a = []
for j in range(var1):
    # you can't use a[j] in Python because the j-th element does not exist yet,
    # so use .append instead to add an element to the end
    a.append(np.array([]))  # np.array([]) is an empty 1d array; np.array(0) would be a 0d array holding 0

Now you have a list of 10 empty numpy arrays and you can loop through them to resize them or add data etc.

An easier way to get the same result is a list comprehension; it's just different syntax:

import numpy as np

var1 = 10
a = [np.array([]) for i in range(var1)]

To change, resize, or add data, just loop through the list:

for arr in a:
    # arr is the numpy array
    arr.resize(...)
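If the sizes are known at runtime, an alternative to resizing in place is to rebind each list slot to a freshly allocated array. A minimal sketch, where the sizes list is a hypothetical stand-in for whatever your program computes:

```python
import numpy as np

sizes = [3, 5, 2]                  # hypothetical per-array sizes
a = [np.empty(0) for _ in sizes]   # start with empty placeholders

for j, n in enumerate(sizes):
    a[j] = np.zeros(n)             # rebind slot j to a new array of length n

shapes = [arr.shape for arr in a]
```

Assigning through the index (a[j] = ...) replaces the list element itself, whereas rebinding the loop variable in "for arr in a" would leave the list unchanged.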
Johannes