Non-tabular data structure in Fortran

Question

I would like to build a data structure for non-tabular data, but I am not sure what the right way to do that is in (modern) Fortran.

I have a data set of houses that includes their location (lat, lon) and price. I have another data set of factories that includes their location (lat, lon) and the amount of pollution they produce. For each house I need to create a list of the factories that are within a 5 km radius of the house: not just the number of these factories, but their full (lat, lon, pollution) vectors. Each house has a different number of factories close to it, ranging from zero to about eighty.

MODULE someDefinitions
IMPLICIT NONE
INTEGER, PARAMETER :: N_houses=82390, N_factories=4215

TYPE house
  REAL :: lat,lon,price
  ! a few more fields which are not important here
END TYPE
TYPE factory
  REAL :: lat,lon,pollution
  ! a few more fields which are not important here
END TYPE

Contains

PURE FUNCTION haversine(deglat1,deglon1,deglat2,deglon2) RESULT (dist)
  ! Some code for computing haversine distance in meters
END FUNCTION haversine

END MODULE someDefinitions


PROGRAM createStructure
USE someDefinitions
IMPLICIT NONE

TYPE(factory), DIMENSION(N_factories) :: factories
TYPE(house), DIMENSION(N_houses) :: houses
INTEGER :: i,j
! more variables definitions as needed

! code to read houses data from the disk
! code to read factories data from the disk

DO i=1,N_houses
  DO j=1,N_factories
     !here I compute the distance between houses(i) and factories(j)
     ! If this distance<=5000 I want to add the index j to the list of indices
     ! associated with house i. How? What is the right data structure to do
     ! that? some houses have zero factories within 5000 meters from them.
     ! Some houses have about 80 factories around them. It's unbalanced.
  END DO !j
END DO !i

END PROGRAM createStructure

The resulting structure will then be used in further calculations. A full matrix of N_houses x N_factories (about 347 million entries) is far too large to hold in memory. Note: I know Fortran 2008, if that is helpful in any way.

How about dictionaries? See https://libatoms.github.io/QUIP/dictionary.html. — Thomas Ludewig, Sep 03 '19 at 15:05
There are likely better algorithmic ways to handle your problem, but at first glance it looks like you could be interested in the approach of [this other question](https://stackoverflow.com/q/18316592/3157076). — francescalus, Sep 03 '19 at 15:08

score 0 · Answer 1 · answered Sep 04 '19 at 20:25

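Using too many nested derived types can become tedious. Here is an example that uses plain 2D arrays for all data except the required list, which is an array of a derived type with a single allocatable component, so each house gets a block sized to its own number of factories. The search itself is a brute-force scan over all factories, similar to a naive implementation of the K-Nearest Neighbors (KNN) algorithm. Note that the example runs on dummy planar coordinates with a Euclidean distance, and that a house with no factory in range gets a single row of -1 as a marker. There may be better algorithms, of course, but the following can be a good start.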

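program NoStructures
  implicit none

  ! One list element per house: an (n,3) block holding the n nearby factory rows
  type listi
    real, allocatable :: item(:,:)
  end type listi

  integer, parameter :: N_houses=82390, N_factories=4215
  real :: houses(N_houses,3)        ! one row per house, three values each
  real :: factories(N_factories,3)  ! one row per factory, three values each
  real :: distance(N_factories)
  type(listi) :: list(N_houses)
  integer :: i, j, k, within5k

  ! Generating dummy data: planar coordinates in metres, not real lat/lon
  call random_number(houses)
  call random_number(factories)
  houses = houses * 500000
  factories = factories * 500000

  do i = 1, N_houses

      ! Euclidean distance from house i to every factory
      ! (replace with the haversine function for real lat/lon data)
      distance = sqrt((houses(i,1)-factories(:,1))**2 + (houses(i,2)-factories(:,2))**2)
      within5k = count( distance <= 5000 )

      if (within5k > 0) then
        ! Size the block exactly, then copy the matching factory rows into it
        allocate(list(i)%item(within5k,3))
        k = 0
        do j = 1, N_factories
          if (distance(j) <= 5000) then
            k = k + 1
            list(i)%item(k,:) = factories(j,:)
          end if
        end do
      else
        ! No factory in range: store one marker row of -1
        ! (relies on automatic allocation on assignment, Fortran 2003)
        list(i)%item = reshape([-1.0, -1.0, -1.0],[1,3])
      end if

  end do

  ! Print the lists of the first 10 houses (values appear in column-major order)
  do i=1,10
    print *, list(i)%item
  end do

end program NoStructures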
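
If you would rather store only the factory indices, as the comments in the question's loop suggest, the same idea works with an allocatable integer component. The following is a minimal sketch, not a tuned implementation. The names neighbour_lists, index_list and build_lists are hypothetical, the haversine body is the standard formula filled in as an assumption (the question leaves it out), and the Earth radius is an assumed mean value. A house with no factory in range simply gets a zero-size array, so no marker row is needed.

```fortran
module neighbour_lists
  implicit none
  private
  public :: index_list, haversine, build_lists

  real, parameter :: earth_radius_m = 6371000.0        ! assumed mean Earth radius
  real, parameter :: deg2rad = 3.14159265 / 180.0

  ! One ragged row: indices of the factories close to one house
  type index_list
    integer, allocatable :: idx(:)
  end type index_list

contains

  ! Great-circle distance in metres between two points given in degrees.
  ! Elemental, so it can take one house and a whole array of factories.
  elemental function haversine(deglat1, deglon1, deglat2, deglon2) result(dist)
    real, intent(in) :: deglat1, deglon1, deglat2, deglon2
    real :: dist, a, dlat, dlon
    dlat = (deglat2 - deglat1) * deg2rad
    dlon = (deglon2 - deglon1) * deg2rad
    a = sin(dlat/2)**2 &
        + cos(deglat1*deg2rad) * cos(deglat2*deg2rad) * sin(dlon/2)**2
    dist = 2 * earth_radius_m * asin(sqrt(min(1.0, a)))
  end function haversine

  ! For every house, collect the indices of all factories within radius_m.
  subroutine build_lists(hlat, hlon, flat, flon, radius_m, lists)
    real, intent(in) :: hlat(:), hlon(:), flat(:), flon(:), radius_m
    type(index_list), allocatable, intent(out) :: lists(:)
    integer :: i, j
    allocate(lists(size(hlat)))
    do i = 1, size(hlat)
      ! pack keeps the indices j where the mask is true; the result may be
      ! zero-size. Assignment allocates idx automatically (Fortran 2003).
      lists(i)%idx = pack([(j, j = 1, size(flat))], &
                          haversine(hlat(i), hlon(i), flat, flon) <= radius_m)
    end do
  end subroutine build_lists

end module neighbour_lists

program demo
  use neighbour_lists
  implicit none
  ! Tiny made-up data set: two houses, three factories
  real :: hlat(2) = [52.00, 10.00]
  real :: hlon(2) = [ 4.00, 10.00]
  real :: flat(3) = [52.01, 52.00, 53.00]
  real :: flon(3) = [ 4.00,  4.02,  4.00]
  type(index_list), allocatable :: lists(:)
  integer :: i

  call build_lists(hlat, hlon, flat, flon, 5000.0, lists)
  do i = 1, size(lists)
    print *, 'house', i, 'has', size(lists(i)%idx), 'factories:', lists(i)%idx
  end do
end program demo
```

In this made-up data set the first house should end up with factories 1 and 2, and the second house with an empty list. With the derived types from the question, the indices can be used as a vector subscript, for example sum(factories(lists(i)%idx)%pollution), which gives zero for a house with no factories nearby. Default real is kept to match the question; double precision may be the safer choice for real coordinates.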
