Attached below is a small piece of code I wrote in Fortran. It reads files with specially formatted filenames and yields information about amino acids. If I compile it with the compiler flag -O0, it all works fine. But with -O1 it gives weird output, and from -O2 onwards it gives a segmentation fault.

Even stranger, if I uncomment the blank print statement on line #127 (the commented-out `print *,""` in getSelected6), it works fine at all optimization levels. I understand that optimizations can result in such faults, but I was curious to know:

1) Why? What is going on here, and how can I prevent it?

2) How do I analyze such errors? I tried getting the assembly output and diffing it, but it is too overwhelming to pinpoint the error. Is there a simpler method?

!< This module contains basic arrays of length 45, which hold the
!! first 3 characters of 45 amino acids and their chiralities. Each amino acid has been assigned a number in the range 1-45
!! and the position of the amide nitrogen in that amino acid. Two functions, getAminoNumberFromString and getAminoStringFromNumber,
!! can be used to get the serial number from the amino acid string and vice versa. The position of the amide bond
!! can be retrieved using getAmidePosition.
module aminoacid_symbol_table
implicit none

character(len=9), dimension(45), parameter         :: aminoSymbol = &
&(/ 'Ala_2S   ', 'Cys_2R   ', 'Asp_2S   ', 'Glu_2S   ', 'Phe_2S   ',&
&   'Gly      ', 'His_2S   ', 'Ile_2S_3S', 'Lys_2S   ', 'Leu_2S   ',&
&   'Met_2S   ', 'Asn_2S   ', 'Pro_2S   ', 'Gln_2S   ', 'Arg_2S   ',&
&   'Ser_2S   ', 'Thr_2S_3R', 'Sec_2R   ', 'Val_2S   ', 'Trp_2S   ',&
&   'Tyr_2S   ', 'Ala_2R   ', 'Cys_2S   ', 'Asp_2R   ', 'Glu_2R   ',&
&   'Phe_2R   ', 'His_2R   ', 'Ile_2R_3S', 'Lys_2R   ', 'Leu_2R   ',&
&   'Met_2R   ', 'Asn_2R   ', 'Pro_2R   ', 'Gln_2R   ', 'Arg_2R   ',&
&   'Ser_2R   ', 'Thr_2R_3R', 'Sec_2S   ', 'Val_2R   ', 'Trp_2R   ',&
&   'Tyr_2R   ', 'Ile_2R_3R', 'Ile_2S_3R', 'Thr_2R_3S', 'Thr_2S_3S'  /)

integer(kind=8),  dimension(45), parameter         :: aminoNumber = &
&(/ 1 ,   2 ,   3 ,   4 ,   5 ,   6 ,   7 ,   8 ,   9 ,&
&   10,   11,   12,   13,   14,   15,   16,   17,   18,&
&   19,   20,   21,   22,   23,   24,   25,   26,   27,&
&   28,   29,   30,   31,   32,   33,   34,   35,   36,&
&   37,   38,   39,   40,   41,   42,   43,   44,   45/)

integer(kind=8),  dimension(45), parameter         :: amidePosition = &
&(/  7,    8,   10,   11,   13,    5,   12,   11,   11,&
&   10,   10,   10,    9,   11,   14,    8,   10,    9,&
&    9,   16,   14,    7,    8,   10,   11,   13,   12,&
&   11,   11,   10,   10,   10,    9,   11,   14,    8,&
&   10,    9,    9,   16,   14,   11,   11,   10,   10 /)

integer(kind=8),  dimension(45), parameter         :: aminoSize = &
&(/13,    14,   16,   19,   23,   10,   20,   22,   24,&
&  22,    20,   17,   17,   20,   26,   14,   17,   14,&
&  19,    27,   24,   13,   14,   16,   19,   23,   20,&
&  22,    24,   22,   20,   17,   17,   20,   26,   14,&
&  17,    14,   19,   27,   24,   22,   22,   17,   17/)

contains

function getAminoNumberFromString(aminoStringCode) result(aminoNumericCode)
    implicit none
    character(len=9), intent(in)    :: aminoStringCode
    integer(kind=8)                 :: aminoNumericCode
    integer(kind=4)                 :: i
    ! print *, aminoStringCode
    string_match: do i = 1,45
        if(aminoStringCode .eq. aminoSymbol(i)) then
            aminoNumericCode = i
        end if
    end do string_match
end function getAminoNumberFromString

function getAminoStringFromNumber(aminoNumericCode) result(aminoStringCode)
    implicit none
    integer(kind=8), intent(in)     :: aminoNumericCode
    character(len=9)                :: aminoStringCode
    integer(kind=4)                 :: i

    string_match: do i = 1,45
        if(aminoNumericCode .eq. aminoNumber(i)) then
            aminoStringCode = aminoSymbol(i)
        end if
    end do string_match
end function getAminoStringFromNumber

function getAmidePosition(aminoNumericCode) result(amideBondPosition)
    implicit none
    integer(kind=8), intent(in)     :: aminoNumericCode
    integer(kind=8)                 :: amideBondPosition

    amideBondPosition = amidePosition(aminoNumericCode)
end function getAmidePosition

function getAminoSize(aminoNumericCode) result(aminoTotalSize)
    implicit none
    integer(kind=8), intent(in)     :: aminoNumericCode
    integer(kind=8)                 :: aminoTotalSize

    aminoTotalSize = aminoSize(aminoNumericCode)
end function getAminoSize

function get3AminoFromFilename(fileName) result(aminoList)
    implicit none
    character(len=80), intent(in)  :: fileName
    ! character, dimension(9)        :: firstAmino
    character(len=9)               :: firstAmino, aminoList(3)
    integer(kind=4)                :: i, aminoCounter, scnt
    ! Thr_2S_3S__Leu_2S__Ile_2S_3R.smi
    firstAmino = ' '
    aminoCounter = 1
    scnt = 1
    i=1
    do while (i<35)
        if((fileName(i:i+1)/='__') .and. (fileName(i:i+1)/='.x')) then
            ! print *,fileName(i:i)
            firstAmino(scnt:scnt) = fileName(i:i)
            scnt=scnt+1
            i=i+1
        else if(aminoCounter<=3) then
            ! print *,firstAmino
            aminoList(aminoCounter) = firstAmino
            firstAmino =''
            aminoCounter=aminoCounter+1
            scnt=1
            i=i+2
        else
            return
        end if
    end do
end function get3AminoFromFilename


subroutine getSelected6(fileName, neighboursList)
    implicit none
    character(len=80), intent(in)              :: fileName
    character(len=9)                           :: aminoList(3)
    integer(kind=8)                            :: sizeAmino1, sizeAmino2, sizeAmino3
    integer(kind=8)                            :: amidePosition1, amidePosition2, amidePosition3
    integer(kind=8)                            :: aminoNumber1, aminoNumber2, aminoNumber3
    integer(kind=8), intent(out)               :: neighboursList(2,6)


    aminoList = get3AminoFromFilename(fileName)
    ! print *,""
    aminoNumber1   = getAminoNumberFromString(aminoList(1))
    aminoNumber2   = getAminoNumberFromString(aminoList(2))
    aminoNumber3   = getAminoNumberFromString(aminoList(3))
    sizeAmino1     = getAminoSize(aminoNumber1)
    sizeAmino2     = getAminoSize(aminoNumber2)
    sizeAmino3     = getAminoSize(aminoNumber3)
    amidePosition1 = getAmidePosition(aminoNumber1)
    amidePosition2 = getAmidePosition(aminoNumber2)
    amidePosition3 = getAmidePosition(aminoNumber3)
    ! for R1CR2NH2COCR3R4, list is organized as [N, H_N/C_N, C1, C, O, C_C]
    neighboursList(1,1) = amidePosition1
    neighboursList(1,3) = 4
    neighboursList(1,4) = amidePosition1+1
    neighboursList(1,5) = amidePosition1+1+1
    neighboursList(1,6) = amidePosition1+1+2
    if ((aminoNumber1 .eq. 13) .or. (aminoNumber1 .eq. 33)) then
        neighboursList(1,2) =  amidePosition1 - 1
    else
        neighboursList(1,2) = sizeAmino1 - 1 + amidePosition2 + amidePosition3 - 2
    endif

    neighboursList(2,1) = amidePosition2+amidePosition1-1
    neighboursList(2,3) = amidePosition1+3
    neighboursList(2,4) = amidePosition2+amidePosition1
    neighboursList(2,5) = amidePosition2+amidePosition1+1
    neighboursList(2,6) = amidePosition2+amidePosition1+2
    if ((aminoNumber2 .eq. 13) .or. (aminoNumber2 .eq. 33)) then
        neighboursList(2,2) =  amidePosition1+amidePosition2 - 2
    else
        neighboursList(2,2) =  sizeAmino1 -1 + sizeAmino2 -3 + amidePosition3 - 1
    endif
end subroutine getSelected6

end module aminoacid_symbol_table

program main
use aminoacid_symbol_table
implicit none
character(len=80)  :: fileName 
integer(kind=8)              :: neighboursList(2,6)
fileName = 'Gly__Gly__Gly.xyz'
call getSelected6(fileName,neighboursList)
print *, neighboursList
end program main

Compiled using GNU Fortran (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0

Expected output (-O0 flag):

$ ./a.out

                5                    9                   17                   20                    4                    8                    6                   10                    7                   11                    8                   12 

I think it's related to this question: Execution of printf() and Segmentation Fault

Edit: output with the -O1 and -O2 flags:

$ gfortran -O1 -g -C aminotrial.f90 

$ ./a.out  

                0            268435456                    0                    0                    0                    0                    0                    0                    0                    0                    0                    0


$ gfortran -O2 -g -C aminotrial.f90 
$ ./a.out 

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x7f68a90812da in ???
#1  0x7f68a9080503 in ???
#2  0x7f68a8cb3f1f in ???
#3  0x561db7603a62 in __aminoacid_symbol_table_MOD_getaminosize
at /aminotrial.f90:82
#4  0x561db7603a62 in __aminoacid_symbol_table_MOD_getselected6
at /aminotrial.f90:131
#5  0x561db7603b91 in MAIN__
at /aminotrial.f90:169
#6  0x561db760374e in main
at /aminotrial.f90:164
Segmentation fault (core dumped)

-O3 gives exactly the same result as -O2.

I am not sure how this is a duplicate of the generic question about gfortran flags.

ipcamit
  • Please also provide the output of the `-O1` and `-O2` versions. It is best to also compile with debug options (from memory: -g) and boundary checking options (from memory: -C) to see where the problem occurs. Note that `kind=8` and `kind=4` are system dependent and don't guarantee 4- or 8-byte integers (have a look at `SELECTED_INT_KIND`) – albert Feb 24 '19 at 12:02
  • Yes. But I couldn't figure out why it would matter with -O3 but not with -O0. – ipcamit Feb 24 '19 at 12:25
  • I have put this as a duplicate because you should use _all_ of the suggested flags, possibly with a debugger, in an attempt to find the bug in the code. (Note that `-C` possibly doesn't do what you think it does: it isn't relevant here.) Once you have found and fixed the bug, recompile with higher optimization flags. In particular, that question answers "How to analyze such errors?" – francescalus Feb 24 '19 at 12:40
  • "-O0 -fcheck=all" already gives the same error (out of bounds), so I guess that the result with "-O0" only was simply lucky (by not overwriting a critical area of some memory). But with -O1, -O2, ... I guess the code may have overwritten some critical area (<-- additional options "-fsanitize=address -g" or valgrind may also be useful). – roygvib Feb 24 '19 at 12:56
  • If your code is wrong (not conforming), it triggers undefined behavior (https://en.m.wikipedia.org/wiki/Undefined_behavior). Anything can happen then; different results with different optimizations are completely normal, and the behavior of the program can be completely unpredictable. – Vladimir F Героям слава Feb 24 '19 at 15:33
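
To make the out-of-bounds point in the comments concrete: reading the loop in get3AminoFromFilename, it appears that after the third name is stored the loop keeps copying characters into firstAmino until i reaches 35, so scnt runs past the declared length of 9. The sketch below is one possible bounds-safe way to split the filename, not the original author's method. The subroutine name splitAminoNames, the nFound counter, and the choice to cut at the last '.' rather than at '.x' are assumptions made for illustration.

```fortran
program parse_demo
    implicit none
    character(len=80) :: fileName
    character(len=9)  :: aminoList(3)
    integer           :: nFound

    fileName = 'Gly__Gly__Gly.xyz'
    call splitAminoNames(fileName, aminoList, nFound)
    print *, nFound
    print '(3(a,1x))', aminoList

contains

    ! Hypothetical helper: splits "A__B__C.ext" into at most three names.
    ! It only ever assigns whole substrings to character(len=9) variables,
    ! so nothing can be written past the end of a string.
    subroutine splitAminoNames(fileName, aminoList, nFound)
        character(len=*), intent(in)  :: fileName
        character(len=9), intent(out) :: aminoList(3)
        integer,          intent(out) :: nFound
        integer :: first, last, sep

        aminoList = ' '
        nFound = 0

        ! Assumption: everything after the last '.' is the extension.
        last = index(fileName, '.', back=.true.) - 1
        if (last < 1) last = len_trim(fileName)

        first = 1
        do while (first <= last .and. nFound < 3)
            ! Position of the next '__' relative to the remaining text.
            sep = index(fileName(first:last), '__')
            if (sep == 0) then
                sep = last + 1          ! no separator left: take the rest
            else
                sep = first + sep - 1   ! convert to an absolute position
            end if
            nFound = nFound + 1
            ! Assignment pads or truncates to len=9; it never overflows.
            aminoList(nFound) = fileName(first:sep-1)
            first = sep + 2             ! skip over the '__'
        end do
    end subroutine splitAminoNames

end program parse_demo
```

For the filename in the question this should find three 'Gly' entries. Note that a name longer than 9 characters would be silently truncated by the assignment, and that getAminoNumberFromString still leaves its result undefined when no symbol matches, so a caller would probably want to check for that case before indexing aminoSize or amidePosition.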