Suppose I have a multi-dimensional array in Fortran, A(:,:,:,:), where every dimension has the same size n.
I would like to swap the second and third dimensions. I can do this with reshape:
A2 = reshape(A, (/ n, n, n, n/), order = (/1,3,2,4/) )
Is there a faster way, given that I only need a partial swap?
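To make the intended permutation concrete, here is a minimal sketch that checks, on a tiny array, that the reshape call above places A(i,j,k,l) at position (i,k,j,l) of the result, and that a slice assignment and an explicit element loop produce the same array. The program name swap_check, the size n = 4, and the arrays B1, B2, B3 are assumptions made only for this illustration.

```fortran
program swap_check
  implicit none
  integer, parameter :: n = 4   ! small size, assumed only for this check
  real :: A(n,n,n,n), B1(n,n,n,n), B2(n,n,n,n), B3(n,n,n,n)
  integer :: i, j, k, l

  ! Fill A so that every element has a distinct, exactly representable value
  do l = 1, n
    do k = 1, n
      do j = 1, n
        do i = 1, n
          A(i,j,k,l) = real(i + n*(j-1) + n*n*(k-1) + n*n*n*(l-1))
        end do
      end do
    end do
  end do

  ! Variant 1: reshape with a permuted dimension order
  B1 = reshape(A, (/ n, n, n, n /), order = (/ 1, 3, 2, 4 /))

  ! Variant 2: slice assignment over the two swapped dimensions
  do j = 1, n
    do i = 1, n
      B2(:,i,j,:) = A(:,j,i,:)
    end do
  end do

  ! Variant 3: explicit element-by-element copy
  do l = 1, n
    do k = 1, n
      do j = 1, n
        do i = 1, n
          B3(i,k,j,l) = A(i,j,k,l)
        end do
      end do
    end do
  end do

  ! Stop with an error if any variant differs from the reshape result
  if (any(B1 /= B2) .or. any(B1 /= B3)) error stop 'variants disagree'
  print *, 'all three variants agree'
end program swap_check
```

If the three variants really are equivalent, the program prints the confirmation message; otherwise it stops with an error. This only checks correctness on a small case and says nothing about speed.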
Updated Example code
Program reshape_test
  Use, Intrinsic :: iso_fortran_env, Only : wp => real64, li => int64
  real :: A(32,32,32,32), A2(32,32,32,32)
  integer :: i,j,k,l,n,m, m_iter
  Integer( li ) :: start, finish, rate
  m_iter = 3200
  n = 32
  do l = 1, n
    do k = 1, n
      do j = 1, n
        do i = 1, n
          A(i,j,k,l) = i + j * k**2 - l
        end do
      end do
    end do
  end do
  Call System_clock( start, rate )
  do m=1,m_iter
    A2 = reshape(A, (/n, n, n, n/), order = (/1,3,2,4/) )
  end do
  ! write (*,*) A(1,2,3,4), A(1,3,2,4), A2(1,2,3,4), A2(1,3,2,4)
  Call System_clock( finish, rate )
  Write(*,*) 'Time for reshape-1', Real( finish - start, wp ) / rate
  Call System_clock( start, rate )
  do m=1,m_iter
    do j = 1, n
      do i = 1, n
        A2(:,i,j,:) = A(:,j,i,:)
      end do
    end do
  end do
  !write (*,*) A(1,2,3,4), A(1,3,2,4), A2(1,2,3,4), A2(1,3,2,4)
  Call System_clock( finish, rate )
  Write(*,*) 'Time for reshape-2', Real( finish - start, wp ) / rate
  Call System_clock( start, rate )
  do m=1,m_iter
    do l = 1, n
      do k = 1, n
        do j = 1, n
          do i = 1, n
            A2(i,k,j,l) = A(i,j,k,l)
          end do
        end do
      end do
    end do
  end do
  ! write (*,*) A(1,2,3,4), A(1,3,2,4), A2(1,2,3,4), A2(1,3,2,4)
  Call System_clock( finish, rate )
  Write(*,*) 'Time for reshape-3', Real( finish - start, wp ) / rate
end program reshape_test
Compiling this example with gfortran -O3, I got the following times in seconds (three runs):
Time for reshape-1 13.500307800000000
Time for reshape-2 10.146418400000000
Time for reshape-3 20.489294800000000
Time for reshape-1 11.421597100000000
Time for reshape-2 9.3823936999999997
Time for reshape-3 19.856221900000001
Time for reshape-1 14.376207500000000
Time for reshape-2 10.756465400000000
Time for reshape-3 21.301044500000000
Adding -march=native gives (again three runs):
Time for reshape-1 12.517529200000000
Time for reshape-2 10.240939200000000
Time for reshape-3 26.637998799999998
Time for reshape-1 12.436479700000000
Time for reshape-2 9.9803838000000002
Time for reshape-3 21.760061700000001
Time for reshape-1 11.419214300000000
Time for reshape-2 10.908747200000001
Time for reshape-3 21.445951399999998
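Overall, approach 2 (the slice assignment) is the fastest, roughly twice as fast as the explicit loops of approach 3. Adding -march=native does not seem to make much difference.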
(The timing code follows the style of an answer to "How to speed up reshape in higher rank tensor contraction by BLAS in Fortran?")