Program is not any faster with OpenMP

Question

My goal is to parallelize a section in my Fortran program. The flow of the program is:

1. Read data from a file
2. Make some computations
3. Write the results to 2 different files

Here I want to parallelize the writing process since I’m writing into different files.

    module foo
        use omp_lib
        implicit none
        type element
            integer, dimension(:), allocatable        :: v1, v2
            real(kind=8), dimension(:,:), allocatable :: M
        end type element

    contains

        subroutine test()
            implicit none
            type(element) :: e

            do
                e = read_data_from_file()

                call compute_data(e)

                !$OMP SECTIONS
                !$OMP SECTION
                !$ call write_to_file1(e)
                !$OMP SECTION
                !$ call write_to_file2(e)
                !$OMP END SECTIONS
            end do
        end subroutine test

        ...

    end module foo

But this program does not run any faster. Am I missing something?

Don't waste your time parallelising I/O unless you have the hardware to support it. (In my experience people who have such hardware don't ask this kind of question, so I'm assuming that you don't.) If you have two threads trying to use one write head at (sort of) the same time, you are just going to slow both write operations down while the OS plays nice and gives each an equal share, and you pay the overhead for all those switches from one to the other. – High Performance Mark, Oct 24 '16 at 16:15
It's also not clear from what you've posted whether you have any parallelism in your code at all; see http://stackoverflow.com/questions/2770911/how-does-the-sections-directive-in-openmp-distribute-work for an explanation. And when you've sorted that out, `single` is likely to be more performant than `section` for the file writing. – High Performance Mark, Oct 24 '16 at 16:21
You are missing the `OMP PARALLEL` directive. Is it hidden somewhere? Perhaps you wanted `OMP PARALLEL SECTIONS` instead? But Mark is right, it will not make it faster anyway because disk operations are hard to parallelize. – Vladimir F Героям слава, Oct 24 '16 at 17:40
And please, 1. use the tag [tag:fortran] and 2. use titles which describe your problem; your original title just repeated your 2 tags. – Vladimir F Героям слава, Oct 24 '16 at 17:41
Thank you for your replies. My purpose here is/was, since I'm writing into 2 different output files, to split the work and do it in parallel rather than writing the data sequentially to the first file and then the second file. – ridi, Oct 24 '16 at 21:31
1. That is wrong, don't make it parallel. It's not worth it. 2. Do you have `omp parallel` anywhere in your code at all? – Vladimir F Героям слава, Oct 25 '16 at 09:26
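
To illustrate the point about the missing parallel region: a bare `SECTIONS` construct outside any `PARALLEL` region is run by a single thread, one section after the other. The following is only a minimal sketch of the combined `PARALLEL SECTIONS` form, not the asker's actual code. The subroutine `write_matrix` and the file names `out1.bin` and `out2.bin` are made up for the example, and it assumes the compiler's runtime allows concurrent I/O on different units (compile with something like `gfortran -fopenmp`).

    program sections_demo
        implicit none
        real(kind=8), allocatable :: m(:,:)

        allocate(m(1000, 100))
        call random_number(m)                  ! stand-in for the computed data

        ! PARALLEL creates the thread team; SECTIONS then hands each
        ! section to one thread of that team.
        !$omp parallel sections default(shared)
        !$omp section
        call write_matrix('out1.bin', m)       ! hypothetical file name
        !$omp section
        call write_matrix('out2.bin', m)       ! hypothetical file name
        !$omp end parallel sections

    contains

        ! Hypothetical helper: dumps a matrix to a binary stream file.
        subroutine write_matrix(fname, a)
            character(len=*), intent(in) :: fname
            real(kind=8), intent(in)     :: a(:,:)
            integer :: u

            ! newunit gives each call its own unit number, so the two
            ! threads never write through the same unit
            open(newunit=u, file=fname, form='unformatted', access='stream', &
                 status='replace', action='write')
            write(u) a
            close(u)
        end subroutine write_matrix

    end program sections_demo

As the comments above point out, this only makes the two writes concurrent; whether it is any faster depends on the storage hardware, and on a single disk it may well be slower.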

score 0 · Answer 1 · answered Oct 24 '16 at 20:48

In general, scientific computing codes can be divided into bandwidth-bound and compute-bound algorithms. Bandwidth-bound algorithms are those that perform only a few operations on each piece of data they touch, for example O(n) flops on O(n) data. Given the speed of a hard disk or a network connection, I/O is a bandwidth-bound operation as well, and therefore parallelizes poorly or not at all.
If you really want to gain performance from parallelization, split the code into bandwidth-bound and compute-bound parts and spend your time parallelizing the latter.
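
A rough sketch of that split, under assumptions of mine: the `sin`/`cos` loop is only a stand-in for the real computation (the question does not show `compute_data`), and the file names `result1.bin` and `result2.bin` are invented. The compute-bound loop is shared among threads, while the two writes stay serial.

    program compute_then_write
        implicit none
        integer, parameter :: n = 1000
        real(kind=8), allocatable :: m(:,:)
        integer :: i, j, u

        allocate(m(n, n))

        ! Compute-bound part: every column is independent, so the
        ! outer loop can be distributed over the threads.
        !$omp parallel do private(i) default(shared)
        do j = 1, n
            do i = 1, n
                m(i, j) = sin(real(i, 8)) * cos(real(j, 8))   ! stand-in workload
            end do
        end do
        !$omp end parallel do

        ! Bandwidth-bound part: left serial, one file after the other.
        open(newunit=u, file='result1.bin', form='unformatted', &
             access='stream', status='replace', action='write')
        write(u) m
        close(u)

        open(newunit=u, file='result2.bin', form='unformatted', &
             access='stream', status='replace', action='write')
        write(u) m
        close(u)
    end program compute_then_write

Built without OpenMP enabled, the directives are treated as plain comments and the program runs serially with the same result.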

M.K. aka Grisu

In my case, writing the data into the binary files takes 80% of the bandwidth, that's why I'm trying to write to the different files at the same time (in parallel). – ridi Oct 24 '16 at 21:34
Assuming that writing to files is limited by the bandwidth of the hard disk interface, parallelizing it just splits that bandwidth between the writers. The filesystem's task management and the hardware-related slowdown of classical hard disks under concurrent access slow the program down even more. One possibility to accelerate the write process is to use memory-mapped I/O, which is normally much faster. But in that case you have to write your I/O in C and then interface it from Fortran, if you really need it there. – M.K. aka Grisu Oct 25 '16 at 08:57

score 0 · Answer 2 · edited May 23 '17 at 10:27

If you specify your problem more precisely, there are hundreds of experts eager to solve it. From your comment on the answer above, I see that you are using binary output but still have bandwidth left to write faster. That means your disk speed is fine and you are not limited by parsing; rather, your actual program is not producing data as fast as it could be written.

So optimize your code so that it catches up with your write speed, instead of increasing the write speed while the code feeding it stays equally slow.

Writing the 2 files sequentially at the maximum of your bandwidth is just as fast as, and much easier than, writing them in parallel (at the same maximum speed).

If I am mistaken and you are indeed limited by I/O, maybe this other question/answer can help you: How to avoid programs in status D.
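
One way to check which of the two cases applies is to time the phases separately. This is just a sketch with invented pieces: `random_number` plus `sqrt` stands in for the real computation, and `timing_test.bin` is a made-up file name. Keep in mind that the measured write time may mostly reflect the operating system's cache rather than the disk itself.

    program time_phases
        implicit none
        integer, parameter :: n = 1000
        integer(kind=8) :: t0, t1, t2, rate
        real(kind=8), allocatable :: m(:,:)
        integer :: u

        allocate(m(n, n))
        call system_clock(count_rate=rate)     ! clock ticks per second

        call system_clock(t0)
        call random_number(m)                  ! stand-in for the computation
        m = sqrt(m)
        call system_clock(t1)

        open(newunit=u, file='timing_test.bin', form='unformatted', &
             access='stream', status='replace', action='write')
        write(u) m
        close(u)
        call system_clock(t2)

        print '(a, f10.4, a)', 'compute: ', real(t1 - t0, 8) / real(rate, 8), ' s'
        print '(a, f10.4, a)', 'write:   ', real(t2 - t1, 8) / real(rate, 8), ' s'
    end program time_phases

If the compute phase dominates, that is the part worth parallelizing; if the write phase dominates, adding threads is unlikely to help on a single disk.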
