
I am currently "optimizing" a scientific modelling program written in Fortran 95. The program performs heavy 3D computations to solve a set of equations. In addition, many variables have to be stored and reused: about 50 arrays (tables) with sizes like (50, 50, 10000), and even some 5D arrays with sizes like (6, 6, 15, 15, 10000), which I keep in order to reduce the computation time.

I developed a fully working version of this code that uses a python3 interface to control my runs. Basically, Python calls a Fortran module containing my code to obtain all the results of the modelling. The problem with this method is that I cannot parallelize some time-consuming regions of the code. Moreover, I would benefit from the speed of Fortran for the post-processing of the models, which is currently done partly in Python because of the interface.

As the first step of my optimization campaign, I want to add a way to control the runs from Fortran. A program would call the module containing my code to obtain all the necessary, heavy variables. The Python interface would still be available; the switch between the Fortran and Python run control is made at compile time, directly in the Makefile. This Makefile is already written, everything compiles fine, and the Python interface still works perfectly.

My trouble is with the Fortran control program and, I assume, its management of allocated memory. Because the sizes of my arrays are not known in advance and have to be read from files, I have to declare all my variables as ALLOCATABLE. I then allocate them with the correct sizes before calling the module containing my code. When my code is called, a memory-related error appears, with the message "Program received signal SIGSEGV: Segmentation fault - invalid memory reference". The error occurs when I set an array to 0d0. If I reduce the size/precision of my modelling, the program proceeds a bit further before crashing, which is why I suspect a memory problem. I think I am doing something wrong in the way variables are passed between my control program and my modelling module; maybe some variables are stored in the wrong memory space. I am using gfortran on Ubuntu 22.04.1.

I have several options to try to solve this issue: using derived types and pointers, or simply breaking up my modelling module. Before going into these heavy structural modifications, I wanted to know whether someone has experienced a similar problem and what the solutions were. Here is an outline of the structure of my code:

Run program:

program run_model

use coordinates
use file
use mathematical
use modelling_module

implicit none

integer :: n_x, n_y, n_z


real(8),dimension(:), ALLOCATABLE:: x,y,z
+ all other output variables in 3D
.
.
.
Some operations and file opening 

ALLOCATE(x(n_x),y(n_y),z(n_z))
+ all other variables

CALL modelling(n_x, n_y, n_z, output variables)

end program run_model

Modelling module in a separate file:

module modelling_module

use coordinates
use file
use mathematical

implicit none
private
public :: modelling

contains

subroutine modelling(n_x, n_y, n_z, output variables)

integer, intent(in):: n_x, n_y, n_z
real(8),dimension(n_x), intent(out):: x
real(8),dimension(n_y), intent(out):: y
real(8),dimension(n_z), intent(out):: z

+ all output variables

Computation of the model
.
.
.

end subroutine modelling
end module modelling_module
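
For illustration, a minimal self-contained sketch of the same caller/callee layout that compiles on its own. The names `demo_module`, `demo_model`, `field`, `work`, and `ierr` are hypothetical and do not come from the real code, and the sizes are placeholders. It differs from the outline above in two ways that are assumptions, not a confirmed fix: the dummy arguments are assumed-shape (`x(:)`) instead of explicit-shape (`dimension(n_x)`), and the large local work array is an `allocatable` with `stat=` checked. Depending on the compiler and its options, local arrays sized at run time may be placed on the limited stack; an explicit `allocate` avoids relying on that.

```fortran
module demo_module
  implicit none
  private
  public :: dp, demo_model

  ! Double precision kind, matching literals such as 0d0.
  integer, parameter :: dp = kind(1.0d0)

contains

  subroutine demo_model(x, y, z, field, ierr)
    ! Assumed-shape dummies: the sizes travel with the arrays.
    real(dp), intent(out) :: x(:), y(:), z(:)
    real(dp), intent(out) :: field(:, :, :)
    integer,  intent(out) :: ierr
    ! Hypothetical stand-in for a big local table, explicitly allocated.
    real(dp), allocatable :: work(:, :, :)
    integer :: i

    allocate(work(size(x), size(y), size(z)), stat=ierr)
    if (ierr /= 0) return          ! report the failure to the caller

    do i = 1, size(x)
      x(i) = real(i, dp)
    end do
    y = 0.0_dp
    z = 0.0_dp
    work = 0.0_dp
    field = work + 1.0_dp          ! placeholder for the real computation

    deallocate(work)
  end subroutine demo_model

end module demo_module

program run_demo
  use demo_module, only: dp, demo_model
  implicit none

  integer :: n_x, n_y, n_z, ierr
  real(dp), allocatable :: x(:), y(:), z(:), field(:, :, :)

  ! Placeholder sizes; in the real program they would be read from files.
  n_x = 50
  n_y = 50
  n_z = 100

  allocate(x(n_x), y(n_y), z(n_z), field(n_x, n_y, n_z), stat=ierr)
  if (ierr /= 0) stop 'allocation failed in run_demo'

  call demo_model(x, y, z, field, ierr)
  if (ierr /= 0) stop 'allocation failed in demo_model'

  print *, 'x(n_x) =', x(n_x), ' sum(field) =', sum(field)
end program run_demo
```

Checking `stat=` does not catch every out-of-memory case, but it separates a failed allocation from an invalid memory reference elsewhere in the code.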

Thank you in advance for your answers!

  • Hi, can you provide a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) so that people can help you? As it is, we can but speculate on the myriad different errors that can cause a segmentation fault. – janneb Feb 08 '23 at 18:12
  • You're likely running out of memory. Just one of those 15x15x50x50x10000 arrays will take ~42 Gbyte, and you say you have "some" of these. Unfortunately, due to [lazy allocation](https://stackoverflow.com/questions/712683/what-is-lazy-allocation), it is perfectly possible for the allocation to appear to succeed and for the program to then fail with a SIGSEGV when you actually try to use that memory. – Ian Bush Feb 08 '23 at 19:22
  • Thank you for all your comments! I will try to provide a minimal reproducible example with an error caused by the right problem and not just memory exhaustion. @HighPerformanceMark I looked at the link that you sent; my problem is probably caused by error number #6 or a Stack Exhaustion Due to Heap. I will look into it. – StoneWall06 Feb 08 '23 at 19:53
  • @IanBush the table (15x15x50x50x10000) is the maximum table size I can reach without memory exhaustion when using an unreasonable modelling precision. In my normal use of the code I am not running out of memory, since the code works when I run it through the Python interface. The problem is caused by my call of the modelling module from the Fortran program. – StoneWall06 Feb 08 '23 at 19:59
  • @HighPerformanceMark My main question is: how would you declare your variables and interface your modules in such cases? – StoneWall06 Feb 08 '23 at 20:06
  • How much memory do you have? The 15x15x50x50x10000 is really 42 GB. – Vladimir F Героям слава Feb 09 '23 at 07:10
  • In the run I'm currently making, these variables are only (6x6x15x15x10000), with only 1/3 of the 6x6 part used. I am absolutely sure that they are not the problem: the program runs with the Python interface and with these variables in use. – StoneWall06 Feb 09 '23 at 08:45
  • @StoneWall06 How much memory does your machine have? – Ian Bush Feb 09 '23 at 10:05
  • I have 32 GB. Again, the exact same program runs when I use the Python interface, with the exact same modules and the 5-index arrays. I only have two of them and only 1/3 of each is filled, so there is enough space; this is not the problem. These tables are declared as local variables and are not passed to the main program. – StoneWall06 Feb 09 '23 at 10:25
  • @StoneWall06 Also, it is unclear to me which calculations are working and which are not, and under what conditions. Could you **edit the question** to make a list of what does work and what doesn't, including array sizes and the "interface" being used? – Ian Bush Feb 09 '23 at 10:28
  • @StoneWall06 Also, are you compiling the Fortran with all run-time error checks enabled? `-fcheck=all -g` as a minimum if using gfortran – Ian Bush Feb 09 '23 at 10:29
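
The array sizes discussed in the comments above can be checked with a few lines of Fortran. This is a rough sketch that assumes 8-byte reals and counts elements with 64-bit integers so the product does not overflow a default integer; `estimate_memory` and `array_gib` are hypothetical names used only for illustration.

```fortran
program estimate_memory
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)   ! at least 64-bit integers
  integer, parameter :: dp = kind(1.0d0)

  print '(a, f8.2, a)', '15x15x50x50x10000: ', &
        array_gib([15, 15, 50, 50, 10000]), ' GiB'
  print '(a, f8.2, a)', '6x6x15x15x10000:   ', &
        array_gib([6, 6, 15, 15, 10000]), ' GiB'

contains

  ! Size in GiB of an array of 8-byte reals with the given extents.
  function array_gib(dims) result(gib)
    integer, intent(in) :: dims(:)
    real(dp) :: gib
    integer(i8) :: n_elem
    integer :: k

    n_elem = 1_i8
    do k = 1, size(dims)
      n_elem = n_elem * int(dims(k), i8)
    end do
    gib = real(n_elem, dp) * 8.0_dp / 1024.0_dp**3
  end function array_gib

end program estimate_memory
```

The first extent list gives 5,625,000,000 elements, which at 8 bytes each is the roughly 42 GB quoted in the comments. The sketch can be built with the flags suggested above, for example `gfortran -g -fcheck=all` followed by the source file name.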
