Recommended use of Numpy's structured arrays

Question

I want to store a 1D array in each entry of another 1D array. Essentially, I have a list of grid points (a 1D array), and for each point I would like to store a 1D array containing information specific to that point. (Imagine storing, for each city in a country, the demographic age distribution from age 1 to 100.)

While I can think of many ways to theoretically implement this, I am not sure which is the most efficient and in general recommended.

I could store the data in a matrix where each row is a city and each column is an age group. That would work, and I can implement it.

Alternatively, I could store the information as a structured array, where the list of cities is an array and each entry is itself an array holding the age distribution. Intuitively I prefer this option, but I don't know whether it's actually recommended. My program runs many thousands of iterations, so speed matters more than how pretty the code is. The other problem is that I need to initialise the array dynamically (I hope I'm using that word correctly). I have a for-loop that iterates through the list of points and saves the result for each point. However, since the number of points depends on the init values, I can't hard-code the initialisation like `data = np.array([(0, 0, 0, 0, 0), (0, 0, 0, 0, 0), (0, 0, 0, 0, 0), (0, 0, 0, 0, 0)])`, because it won't always be a 4x5 array.

Any ideas on what would be the best way to do this?

What about creating arrays for each datapoint, and creating a dict as an index for the arrays? I'd be happy to hear about what others think too! :) — David, Dec 31 '20 at 16:00
Also, would this help maybe? [49384823](https://stackoverflow.com/questions/49384682/how-to-iterate-1d-numpy-array-with-index-and-value/49384823) — David, Dec 31 '20 at 16:06
I don't see how structured arrays help. We see them most often when loading `csv` files, where the columns contain different kinds of data - string labels, integer values and float values. Each column becomes a field of the structured array. While it's a convenient way of keeping mixed data types together, computationally it's no better than separate arrays for each column. — hpaulj, Dec 31 '20 at 17:07
When creating an array 'dynamically/iteratively', common practice is to collect the values in a list of lists and do one array construction at the end. List append is relatively fast. — hpaulj, Dec 31 '20 at 17:08
@Hollossy That's possible, I just don't know how efficient it is. I see dicts as more of a tool for less structured forms of data, like strings. A matrix should be more efficient as an array than a dict, right? — LondonLiliput, Jan 02 '21 at 11:53
@hpaulj thanks for your insights. I think I'll go with a 2D array then since that's computationally efficient and the simplest way. — LondonLiliput, Jan 02 '21 at 11:57
@LondonLiliput honestly I'm not sure about how to deal with it either, it's one of the many problems I'm hoping to find a solution for which is why I followed your question! :) Maybe someone smarter than me will drop us a solution! (fingers crossed!) — David, Jan 02 '21 at 13:04
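
To make the options from the comments concrete, here is a minimal sketch rather than a definitive answer. It assumes every grid point yields a result of the same length. The function `age_distribution`, the names `points` and `n_bins`, and the random data are all hypothetical stand-ins for the real computation. The sketch builds the 2D array in two ways (preallocating with a shape computed at run time, and collecting rows in a list with one array construction at the end), then packs the same values into a structured array with a subarray field for comparison.

```python
import numpy as np


def age_distribution(point, n_bins):
    """Hypothetical stand-in for the real per-point computation."""
    rng = np.random.default_rng(point)
    weights = rng.random(n_bins)
    return weights / weights.sum()  # normalise so each row sums to 1


n_bins = 100                 # e.g. ages 1 to 100
points = list(range(7))      # hypothetical; length is only known at run time

# Option 1: the number of rows is known before the loop starts,
# so allocate once with a computed shape and fill row by row.
data = np.zeros((len(points), n_bins))
for i, p in enumerate(points):
    data[i] = age_distribution(p, n_bins)

# Option 2: collect the rows in a plain list and build the array once
# at the end (useful when the row count is not known in advance).
rows = []
for p in points:
    rows.append(age_distribution(p, n_bins))
data_from_list = np.array(rows)

# For comparison: a structured array with one subarray field per point.
dt = np.dtype([("point", np.int64), ("ages", np.float64, (n_bins,))])
records = np.zeros(len(points), dtype=dt)
records["point"] = points
records["ages"] = data

print(data.shape, data_from_list.shape, records["ages"].shape)
```

Both options give the same `(len(points), n_bins)` array, and `records["ages"]` has that shape too, so the structured version mostly adds a label field on top of the plain 2D array. That fits hpaulj's point that structured arrays are a convenience for mixed data types rather than a speed-up.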
