
I am currently writing a Fortran subroutine that processes a (possibly large) data array. I am using multiple subroutines from multiple modules and would like to have the input data array as well as the output data array available as global variables, so that I can read and write them in every subroutine. However, I would like to avoid copying the data arrays unnecessarily, because I fear it would slow down the whole program (as said before, the data arrays are potentially very big, roughly 10,000 x 5 entries or so).

At the moment, I use a variable module which contains global variables for all subroutines. I read the input and output arrays into my subroutine and then copy the input values onto the global array, perform the calculations and then copy the global output array onto the output array I have within my subroutine. The code for the subroutine looks as follows:

subroutine flexible_clustering(data_array_in, limits_in, results_array_out)
    use globalVariables_mod
    use clusterCreation_mod
    use clusterEvaluation_mod
    implicit none

    real*8, dimension(:,:) :: data_array_in
    real*8, dimension(:) :: limits_in

    real*8, dimension(:,:) :: results_array_out

    ! determine dimensions
    data_entries_number = size(data_array_in(:,1))
    data_input_dimension = size(data_array_in(1,:))
    data_output_dimension = size(results_array_out(1,:))

    ! allocate and fill arrays
    call allocate_global_variable_arrays()
    data_array = data_array_in
    limits = limits_in

    ! clustering
    call cluster_creation()

    call cluster_evaluation()

    results_array_out = results_array

    call reset_global_variables()
end subroutine flexible_clustering

The global variables used here are defined as follows in globalVariables_mod (with appropriate allocate / deallocate subroutines):

  integer :: data_entries_number = 0, data_input_dimension = 0, data_output_dimension = 0
  integer :: total_cluster_number = 0
  real*8, allocatable, dimension(:) :: limits
  real*8, allocatable, dimension(:,:) :: data_array
  real*8, allocatable, dimension(:,:) :: results_array

Summed up, I take data_array_in, limits_in and results_array_out and copy them to data_array, limits and results_array to make them global variables in all subroutines.

Is there a way to omit this copying? Maybe using pointers? Can I optimize this another way?

Thanks in advance!

Ian Bush
Ruhldieb
  • This [other question](https://stackoverflow.com/q/32386146/3157076) asks about approaches other than common blocks. Some answers relate to "global variables in modules", but others are discussed. Using procedure arguments doesn't mean there will be a copy. – francescalus Jul 30 '20 at 09:31
  • To build on francescalus' comment I would urge you to use procedure arguments rather than "global variables in modules" - in the long run it is usually a much more maintainable solution. Also please don't use the non-standard, potentially non-portable real*8 - see https://stackoverflow.com/questions/838310/fortran-90-kind-parameter – Ian Bush Jul 30 '20 at 10:11
  • Thank you both for your answers! From what I have understood from the links, by procedure arguments you mean simply passing them through the subroutines as arguments? My problem with this is that it renders the code less readable and requires for some subroutines to have (in my opinion) weird arguments. I will however consider it if there is no other/better solution. @Ian Bush, do you mean I should use real (kind=8) or rather even real (kind=kind(0.d0))? – Ruhldieb Jul 30 '20 at 10:46
  • @Ruhldieb Don't use real( kind=8 ) - that is also non-portable. Real( Kind = Kind( 0.0d0 ) ) is much better, but please read the link provided – Ian Bush Jul 30 '20 at 11:11
  • For clarifying code with long lists of arguments (I assume this is what you mean by less readable) derived types are your friend – Ian Bush Jul 30 '20 at 11:13
  • *My problem with this is that it renders the code less readable ...* I beg to differ, and I offer a subroutine called `flexible_clustering` (code above) in evidence. To understand that code I have to look at its argument declarations (it would help if they had `intent` declared) and then (maybe, it's not obvious) I also have to look at the sources of the modules (and here some `only` clauses would assist). – High Performance Mark Jul 30 '20 at 12:18
  • From having to work with a code that uses loads of global (module) variables and almost no arguments, I can tell you that it is really horrible to find out where a variable that suddenly appears in the code comes from. – Vladimir F Героям слава Jul 30 '20 at 12:55
  • Apart from globals are good/bad, I think it is possible to change "data_array" and "limits" to array pointers and let them point to "data_array_in" and "limits_in" (i.e., "data_array => data_array_in" etc). But this means that if cluster_creation() and cluster_evaluation() modify "data_array", for example, the modification changes the value of "data_array_in" also, so we need to be careful... – roygvib Jul 30 '20 at 13:00
  • I will probably just pass the arguments to the subroutines. Within those subroutines I will use `contains` so that the scope of the passed variables extends to the sub-subroutines. This way, derived types should not be necessary. @roygvib pointed to pointers, which seems like another good idea, though I would need to do some reading on that part to fully understand them. Anyhow, thanks everyone for your input, it has helped me a lot! – Ruhldieb Jul 30 '20 at 13:19
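
For anyone reading along, here is a minimal sketch of the pointer idea from roygvib's comment, not a drop-in replacement for the code in the question. It assumes the module-level arrays can be changed from `allocatable` to `pointer`, drops `limits` and the cluster-creation step for brevity, and uses a placeholder `cluster_evaluation` that simply doubles the input. The module names, the `wp` kind parameter and the `demo` program are made up for illustration.

```fortran
module global_variables_mod
  implicit none
  ! Portable double-precision kind, in place of the non-standard real*8.
  integer, parameter :: wp = kind(0.0d0)
  ! Pointers rather than allocatables: they alias the caller's arrays,
  ! so associating them does not copy any data.
  real(wp), pointer :: data_array(:,:)    => null()
  real(wp), pointer :: results_array(:,:) => null()
end module global_variables_mod

module clustering_mod
  use global_variables_mod, only: wp, data_array, results_array
  implicit none
  private
  public :: flexible_clustering
contains
  subroutine flexible_clustering(data_array_in, results_array_out)
    ! The target attribute allows the module pointers to be associated
    ! with the dummy arguments for the duration of this call.
    real(wp), intent(in),  target :: data_array_in(:,:)
    real(wp), intent(out), target :: results_array_out(:,:)

    data_array    => data_array_in
    results_array => results_array_out

    call cluster_evaluation()

    ! Do not let the pointers outlive the call.
    nullify(data_array, results_array)
  end subroutine flexible_clustering

  subroutine cluster_evaluation()
    ! Placeholder work (an assumption, not the real clustering):
    ! writes straight into the caller's output array via the pointer.
    results_array = 2.0_wp * data_array
  end subroutine cluster_evaluation
end module clustering_mod

program demo
  use global_variables_mod, only: wp
  use clustering_mod, only: flexible_clustering
  implicit none
  real(wp), allocatable :: input(:,:), output(:,:)

  allocate(input(4, 2), output(4, 2))
  input = 1.5_wp
  call flexible_clustering(input, output)

  ! Stop with an error if the placeholder computation did not reach output.
  if (any(output /= 2.0_wp * input)) error stop "unexpected result"
  print *, "output(1,1) =", output(1, 1)
end program demo
```

As roygvib notes, anything written through `data_array` would change the caller's array as well, so declaring the input `intent(in)` only protects it if the inner routines never assign to that pointer. Passing the arrays as ordinary assumed-shape arguments, as francescalus and Ian Bush suggest, also avoids the copy and keeps the data flow visible in the interfaces.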

0 Answers