Behavior of components when structures are in an array

Question

I am currently working on the simulation of a physical system in Fortran 90 with something like 50 million particles. Each has a position x (to simplify).

For now, I am using a 1D vector that contains the position of each particle. When I have to iterate over every particle, I just go through that vector (I took care to sort the particles to limit cache misses).

I am now considering creating a particle class. But what about access to its position as I iterate? Will it be as fast as in the previous case?

So, what does the compiler do to store the attributes of an object? And, a fortiori, what about the case with more than one attribute?

Thank you for your time.

*a fortiori*? from the stronger? Welcome to Stack Overflow, be sure to take the Welcome [tour] and show your code. — Vladimir F Героям слава, Jun 13 '18 at 13:06
The reason why I ask for the code is because it is much easier to understand code than its description in written words. I don't really get the details of what you are doing. Perhaps just https://en.wikipedia.org/wiki/AOS_and_SOA ? That is a very broad topic and probably has been asked before. — Vladimir F Героям слава, Jun 13 '18 at 14:16
Related https://stackoverflow.com/questions/38461099/how-to-implement-structures-of-arrays-instead-of-arrays-of-structures-in-fortran https://stackoverflow.com/questions/18187312/using-2d-array-vs-array-of-derived-type-in-fortran-90 https://stackoverflow.com/questions/1125626/array-of-structures-or-structure-of-arrays https://stackoverflow.com/questions/7274268/which-is-faster-vector-of-structs-or-a-number-of-vectors — Vladimir F Героям слава, Jun 13 '18 at 14:19
You meant `Components` instead of `Attributes` - "Attribute" has a different meaning in Fortran. And you meant `Array` instead of Vector. — Rodrigo Rodrigues, Jun 13 '18 at 19:08
Thank you for all the links. I realize now that I did not have the right vocabulary to precisely search for the answers I wanted. Sorry, I am relatively new to Fortran and its vocabulary. — torvalds19, Jun 14 '18 at 13:29
I also realize that, wanting to make a short post, I glossed over the fact that I am currently using OpenMP, but as parallelization is killed on more than one CPU, I am considering adding some MPI (I have two processors), and exchanging the particles using derived types seems a lot easier than reordering the arrays each time. — torvalds19, Jun 14 '18 at 13:40
There are not many languages that have built-in parallelisation. Modern Fortran is one, with coarrays, so OpenMP and MPI may not even be needed... There may be others? — Holmz, Jun 15 '18 at 13:42

Rodrigo Rodrigues · Accepted Answer · 2018-06-14T07:06:56.203

On "how are derived types stored":

The Fortran Standard requires the components of a sequence type to be stored in memory as a contiguous sequence, in the order in which the components are declared. Sequence types are those declared with a SEQUENCE statement, which implies that the type shall have at least one component, that each component shall be of an intrinsic or sequence type, and that the type shall not be parameterized or extensible and cannot have type-bound procedures. If you want this behavior and your type is suitable, make it a sequence type (you may need to take data alignment into consideration).
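As a minimal sketch of the above, a sequence type could be declared as follows. The type name `particle_t` and its components are hypothetical, chosen only to match the particle example in the question, and `storage_size` is a Fortran 2008 intrinsic, so this assumes a reasonably modern compiler.

```fortran
program sequence_type_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)

  ! Hypothetical particle type; the names are illustrative only.
  type :: particle_t
    sequence              ! components are laid out in declaration order
    real(dp) :: x         ! position
    real(dp) :: v         ! velocity
  end type particle_t

  type(particle_t) :: p(3)
  integer :: i

  do i = 1, size(p)
    p(i)%x = real(i, dp)
    p(i)%v = 0.0_dp
  end do

  ! storage_size (Fortran 2008) returns the size in bits of one element.
  print *, 'bits per particle:', storage_size(p)
end program sequence_type_demo
```

The printed size depends on the compiler and on any padding it inserts, so it is worth checking on the target platform rather than assuming a value.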

On the other hand, the Fortran Standard does not state how compilers must organize storage for non-sequence derived types. That is not a bad thing, as it leaves compilers free to optimize storage. Most of the time, you can expect almost the same layout as for sequence types: components stored contiguously whenever possible (padding may apply). Arrays and strings are always contiguous. Pointer and allocatable components are only a reference, for obvious reasons, and their targets lie somewhere else.
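The difference between an inline array component and an allocatable component can be illustrated with a small sketch. The type and component names are hypothetical, allocatable components require Fortran 2003 or later, and the exact numbers printed are compiler dependent.

```fortran
program nonsequence_type_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)

  ! Hypothetical non-sequence type, for illustration only.
  type :: particle_t
    real(dp) :: pos(3)                ! fixed-size array: part of the object itself
    real(dp), allocatable :: hist(:)  ! allocatable: the object typically holds only
                                      ! a descriptor, the data is allocated elsewhere
  end type particle_t

  type(particle_t) :: p

  p%pos = [0.0_dp, 1.0_dp, 2.0_dp]
  allocate(p%hist(1000))
  p%hist = 0.0_dp

  ! storage_size (Fortran 2008) reports the size of the object itself, in bits.
  print *, 'bits in one particle_t object:', storage_size(p)
  ! Size of the data that hist refers to, for comparison.
  print *, 'bits in the hist data        :', storage_size(p%hist) * size(p%hist)
end program nonsequence_type_demo
```

With common compilers the first number is expected to be much smaller than the second, which is consistent with the allocatable data living outside the object, but the Standard does not mandate any particular layout here.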

From the Standard:

A structure resolves into a sequence of components. Unless the structure includes a SEQUENCE statement, the use of this terminology in no way implies that these components are stored in this, or any other, order. Nor is there any requirement that contiguous storage be used. The sequence merely refers to the fact that in writing the definitions there will necessarily be an order in which the components appear, and this will define a sequence of components. This order is of limited significance because a component of an object of derived type will always be accessed by a component name except in the following contexts: the sequence of expressions in a derived-type value constructor, intrinsic assignment, the data values in namelist input data, and the inclusion of the structure in an input/output list of a formatted data transfer, where it is expanded to this sequence of components. Provided the processor adheres to the defined order in these cases, it is otherwise free to organize the storage of the components for any nonsequence structure in memory as best suited to the particular architecture.

On "Is it faster to have a derived type than independent arrays":

As @VladimirF said in the comments, it's a broad topic that depends heavily on how you access and operate on your data, and it has been asked and answered before (check the links in his comments). You can find a lot about it around (link1, link2), and I'll add this one on "cache blocking" that may interest you.
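To make the two layouts being compared concrete, here is a hedged sketch of an array of derived types next to a derived type holding independent arrays. All names (`particle_t`, `particles_t`, `dt`, the size `n`) are assumptions made for illustration, and the code uses Fortran 2003/2008 features (allocatable components, `error stop`). It says nothing about which layout is faster; that has to be measured for the actual access pattern.

```fortran
program aos_vs_soa
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 1000          ! small illustrative size

  ! Array of structures: one object per particle.
  type :: particle_t
    real(dp) :: x
    real(dp) :: v
  end type particle_t

  ! Structure of arrays: one array per component.
  type :: particles_t
    real(dp), allocatable :: x(:)
    real(dp), allocatable :: v(:)
  end type particles_t

  real(dp), parameter :: dt = 1.0e-3_dp
  type(particle_t), allocatable :: aos(:)
  type(particles_t) :: soa
  integer :: i

  allocate(aos(n), soa%x(n), soa%v(n))
  do i = 1, n
    aos(i)%x = 0.0_dp
    aos(i)%v = real(i, dp)
    soa%x(i) = 0.0_dp
    soa%v(i) = real(i, dp)
  end do

  ! AoS update: x and v of the same particle sit next to each other,
  ! so a loop touching only x strides over v as well.
  do i = 1, n
    aos(i)%x = aos(i)%x + dt * aos(i)%v
  end do

  ! SoA update: each component is a contiguous array of its own.
  soa%x = soa%x + dt * soa%v

  ! Both layouts must give the same positions.
  if (any(abs(aos%x - soa%x) > 1.0e-12_dp)) error stop 'layouts disagree'
  print *, 'largest position after one step:', maxval(soa%x)
end program aos_vs_soa
```

A loop that only reads positions tends to favor the second layout, while code that always uses every component of a particle together (or that sends whole particles over MPI) may be simpler with the first.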
