• Overloading functions with assumed-size array arguments?

    From Neil@21:1/5 to All on Fri Mar 11 20:53:25 2022
    I was wondering if it is possible to use overloaded functions with
    assumed-size array arguments? Please see the example below, in which
    I try to overload the BLAS ddot and sdot functions. Calls to ddot
    with the first element of the array as an argument rather than an
    array segment (the commented out lines) fail at compile time.


    PROGRAM test
    IMPLICIT NONE
    INTEGER,PARAMETER :: dp=KIND(1.d0),sp=KIND(1.0)
    REAL(dp) dx(3),dy(3)
    REAL(sp) sx(3),sy(3)
    INTERFACE ddot
    REAL(dp) FUNCTION ddot(n,x,inx,y,iny)
    IMPORT dp
    INTEGER,INTENT(in) :: n,inx,iny
    REAL(dp),INTENT(in) :: x(*),y(*)
    END FUNCTION ddot
    REAL(sp) FUNCTION sdot(n,x,inx,y,iny)
    IMPORT sp
    INTEGER,INTENT(in) :: n,inx,iny
    REAL(sp),INTENT(in) :: x(*),y(*)
    END FUNCTION sdot
    END INTERFACE ddot

    dx=(/1.d0,3.d0,5.6d0/) ; dy=(/-1.2d0,3.5d0,5.9d0/)
    sx=REAL(dx,sp) ; sy=REAL(dy,sp)

    ! This works:
    WRITE(*,*)dot_PRODUCT(dx,dy),ddot(3,dx(:),1,dy(:),1)
    WRITE(*,*)dot_PRODUCT(sx,sy),ddot(3,sx(:),1,sy(:),1)

    ! This doesn't work (fails to compile):
    ! WRITE(*,*)dot_PRODUCT(dx,dy),ddot(3,dx(1),1,dy(1),1)
    ! WRITE(*,*)dot_PRODUCT(sx,sy),ddot(3,sx(1),1,sy(1),1)

    END PROGRAM test


    There is a question on StackOverflow that suggests the answer may be
    that it is not possible to do what I want (https://stackoverflow.com/questions/24287436/how-to-have-generic-subroutine-to-work-in-fortran-with-assumed-size-array),
    although the answer doesn't quite say this explicitly. Is there any
    other trick one could use to allow assumed-size arguments to work with overloading?


    (Background: I'm trying to modify a large Fortran code that features
    lots of assumed-size BLAS/LAPACK calls so that I can easily switch
    from double precision arithmetic to single precision arithmetic; I
    would rather avoid having to replace things such as
    "ddot(n,x(l),1,y(m),1)" with "ddot(n,x(l:l+n-1),1,y(m:m+n-1),1)"
    everywhere in the code if I can.)


    Thanks for any help,

    Neil.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to nddtwe...@gmail.com on Fri Mar 11 14:11:55 2022
    On Friday, March 11, 2022 at 12:53:29 PM UTC-8, nddtwe...@gmail.com wrote:
    I was wondering if it is possible to use overloaded functions with assumed-size array arguments? Please see the example below, in which
    I try to overload the BLAS ddot and sdot functions. Calls to ddot
    with the first element of the array as an argument rather than an
    array segment (the commented out lines) fail at compile time.

    (snip)

    ! This doesn't work (fails to compile):
    ! WRITE(*,*)dot_PRODUCT(dx,dy),ddot(3,dx(1),1,dy(1),1)
    ! WRITE(*,*)dot_PRODUCT(sx,sy),ddot(3,sx(1),1,sy(1),1)

    (snip)

    Is there any
    other trick one could use to allow assumed-size arguments to work with overloading?

    You are allowed to call routines with assumed size dummy arrays,
    with an array element actual argument. I suspect that the overload
    function resolver doesn't know that. Can you also add overload
    interfaces for scalar arguments?

    Since you ask for a trick, I am suggesting it, otherwise some will probably suggest that you don't do that.

    Also, what are the inx and iny arguments for?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Neil@21:1/5 to gah4@u.washington.edu on Fri Mar 11 22:44:09 2022
    gah4 <gah4@u.washington.edu> wrote:
    On Friday, March 11, 2022 at 12:53:29 PM UTC-8, nddtwe...@gmail.com wrote:
    I was wondering if it is possible to use overloaded functions with
    assumed-size array arguments? Please see the example below, in which
    I try to overload the BLAS ddot and sdot functions. Calls to ddot
    with the first element of the array as an argument rather than an
    array segment (the commented out lines) fail at compile time.

    (snip)

    ! This doesn't work (fails to compile):
    ! WRITE(*,*)dot_PRODUCT(dx,dy),ddot(3,dx(1),1,dy(1),1)
    ! WRITE(*,*)dot_PRODUCT(sx,sy),ddot(3,sx(1),1,sy(1),1)

    (snip)

    Is there any
    other trick one could use to allow assumed-size arguments to work with
    overloading?

    You are allowed to call routines with assumed size dummy arrays,
    with an array element actual argument. I suspect that the overload
    function resolver doesn't know that. Can you also add overload
    interfaces for scalar arguments?

    Thanks very much for the suggestion. It partially works. If I
    replace the test program with...

    PROGRAM test
    IMPLICIT NONE
    INTEGER,PARAMETER :: dp=KIND(1.d0),sp=KIND(1.0)
    REAL(dp) dx(3),dy(3)
    REAL(sp) sx(3),sy(3)
    INTERFACE ddot
    REAL(dp) FUNCTION ddot(n,x,inx,y,iny)
    IMPORT dp
    INTEGER,INTENT(in) :: n,inx,iny
    REAL(dp),INTENT(in) :: x,y
    END FUNCTION ddot
    REAL(sp) FUNCTION sdot(n,x,inx,y,iny)
    IMPORT sp
    INTEGER,INTENT(in) :: n,inx,iny
    REAL(sp),INTENT(in) :: x,y
    END FUNCTION sdot
    END INTERFACE ddot
    dx=(/1.d0,3.d0,5.6d0/) ; dy=(/-1.2d0,3.5d0,5.9d0/)
    sx=REAL(dx,sp) ; sy=REAL(dy,sp)

    ! Now this doesn't work!:
    ! WRITE(*,*)dot_PRODUCT(dx,dy),ddot(3,dx(:),1,dy(:),1)
    ! WRITE(*,*)dot_PRODUCT(sx,sy),ddot(3,sx(:),1,sy(:),1)

    ! Now this does work!:
    WRITE(*,*)dot_PRODUCT(dx,dy),ddot(3,dx(1),1,dy(1),1)
    WRITE(*,*)dot_PRODUCT(sx,sy),ddot(3,sx(1),1,sy(1),1)

    END PROGRAM test


    then I can pass assumed-size arrays to the overloaded ddot (at least
    with gfortran):

    pyb075000006:~> gfortran test_overloading_blas.f90 -lblas
    pyb075000006:~> ./a.out
    42.340000000000003 42.340000000000003
    42.3400002 42.3400002

    However, I now can't pass array segments to the overloaded ddot.


    Since you ask for a trick, I am suggesting it, otherwise some will probably suggest that you don't do that.

    Also, what are the inx and iny arguments for?

    Storage spacing between the elements of dx & dy (arguments of ddot).


    Thanks,

    Neil.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to nddtwe...@gmail.com on Fri Mar 11 15:40:14 2022
    On Friday, March 11, 2022 at 2:44:12 PM UTC-8, nddtwe...@gmail.com wrote:


    (snip)

    However, I now can't pass array segments to the overloaded ddot.

    Actually, I was thinking you could put in both interfaces, but maybe the compilers are too good to allow that.

    If they don't allow it, you could put in two copies with different names,
    that it would call as appropriate, even though the code was the same.
    (Or an ENTRY with a different name, but into the same routine.)

    For extra challenge, an assumed shape ENTRY and assumed size to the
    same routine!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Neil@21:1/5 to gah4@u.washington.edu on Sat Mar 12 10:34:24 2022
    gah4 <gah4@u.washington.edu> wrote:
    On Friday, March 11, 2022 at 2:44:12 PM UTC-8, nddtwe...@gmail.com wrote:


    (snip)

    However, I now can't pass array segments to the overloaded ddot.

    Actually, I was thinking you could put in both interfaces, but maybe the compilers are too good to allow that.

    Unfortunately gfortran at least doesn't seem to allow this.

    If they don't allow it, you could put in two copies with different names, that it would call as appropriate, even though the code was the same.
    (Or an ENTRY with a different name, but into the same routine.)

    But that would prevent one using external BLAS and LAPACK libraries
    such as OpenBLAS in a straightforward manner (if you mean making a
    copy of the BLAS and LAPACK routines and renaming them for
    overloading).

    I think I probably need to roll up my sleeves and replace the array
    elements with arrays in calls to BLAS & LAPACK routines. But I don't
    see why in principle compilers can't cope with overloading of
    subprograms with assumed-size array arguments, allowing either array
    segments or initial array elements to be given in the calls to the
    overloaded subprograms.

    Thanks,

    Neil.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ron Shepard@21:1/5 to Neil on Sat Mar 12 10:29:13 2022
    On 3/12/22 4:34 AM, Neil wrote:
    I think I probably need to roll up my sleeves and replace the array
    elements with arrays in calls to BLAS & LAPACK routines. But I don't
    see why in principle compilers can't cope with overloading of
    subprograms with assumed-size array arguments, allowing either array
    segments or initial array elements to be given in the calls to the
    overloaded subprograms.

    I did not follow exactly what your code examples do, but in general you
    DO NOT want to do what you are attempting to do when performance is
    important. The reason is that when you pass an array section actual
    argument to an assumed size dummy argument (or to a subprogram with an
    implicit interface), the compiler will invoke copy-in/copy-out
    semantics. That is because the assumed size array must be contiguous.
    The compiler might look ahead and skip making that copy in some cases
    when it knows the actual argument is also contiguous, but in other cases
    it is not possible for the compiler to know, so it will add the overhead anyway. Or at least it will add overhead to do the checking. For a
    vector operation (level-1 BLAS, like ddot()), there are N memory fetches
    for each argument and a comparable number of floating point operations,
    so the performance is already memory-limited. If you add an extra N
    memory fetches and an extra N memory stores to do the copy-in for each argument, then you have tripled the memory overhead. Then if it needs to
    do the copy-out (some level-1 BLAS will need this, like daxpy()), there
    will be an extra 2*N memory overhead for each such vector, multiplying
    the memory overhead further.

    Furthermore, there will be the odd looking feature that if you pass in noncontiguous array slices, the incx and incy values will be 1, while in
    the original ddot() call they would have been >1. You should certainly
    leave comments in the code explaining this quirk to other programmers
    (or to future-you a month later).

    This confusing feature of actual and dummy argument association is not
    limited to BLAS and LAPACK, it appears everywhere with legacy f77
    libraries. MPI calls are another place where the programmer must be
    careful, for example. The answer to this problem is that all of these
    libraries must be rewritten to use f90+ calling conventions, from top to bottom, in order to eliminate these redundant copy-in/copy-out
    operations. It cannot be done simply with interface blocks. The reason
    is that somewhere down the calling tree there will be the possibility of
    a noncontiguous array actual argument that is associated with a
    contiguous assumed size dummy argument, and the compiler must make the temporary copy for that to work.

    You cannot use tricks like defining the interface block dummy arguments
    to be scalars rather than vectors. That might work in some cases,
    despite violating the standard, but eventually you will find a compiler,
    or an optimization level of a compiler, where that fails.

    It does not look like that major rewrite is ever going to happen. At
    least for BLAS, LAPACK, and MPI, it has been three decades and the code
    is still written and maintained using the old conventions. So I think
    the answer is to learn how storage association works (yes, it is
    complicated), and to continue to use the old f77 conventions for those
    types of libraries when performance is important.

    $.02 -Ron Shepard

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to Ron Shepard on Sat Mar 12 10:03:05 2022
    On Saturday, March 12, 2022 at 8:29:18 AM UTC-8, Ron Shepard wrote:
    On 3/12/22 4:34 AM, Neil wrote:
    I think I probably need to roll up my sleeves and replace the array elements with arrays in calls to BLAS & LAPACK routines. But I don't
    see why in principle compilers can't cope with overloading of
    subprograms with assumed-size array arguments, allowing either array segments or initial array elements to be given in the calls to the overloaded subprograms.

    I did not follow exactly what your code examples do, but in general you
    DO NOT want to do what you are attempting to do when performance is important. The reason is that when you pass an array section actual
    argument to an assumed size dummy argument (or to a subprogram with an implicit interface), the compiler will invoke copy-in/copy-out
    semantics. That is because the assumed size array must be contiguous.
    The compiler might look ahead and skip making that copy in some cases
    when it knows the actual argument is also contiguous, but in other cases
    it is not possible for the compiler to know, so it will add the overhead anyway. Or at least it will add overhead to do the checking.

    The one given is only an example, so we can't be sure, but I suspect
    that in the cases meant the compiler can figure it out.

    The easiest case where it fails, that is where, as you note,
    a non-contiguous array is used (or possible) is with an assumed
    shape dummy array. In that case, one should rewrite the BLAS
    routines to use assumed shape.

    With the pass an array element to assumed size, one can change
    the starting element, but the rest is assumed contiguous.
    If you change

    X(1,1,1,10) to X(:,:,:,10:)

    then it is just as contiguous as before.

    But yes, more specific examples would help be sure that this
    wasn't a problem.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From jfh@21:1/5 to All on Sat Mar 12 17:04:28 2022
    On Saturday, March 12, 2022 at 11:11:57 AM UTC+13, gah4 wrote:
    On Friday, March 11, 2022 at 12:53:29 PM UTC-8, nddtwe...@gmail.com wrote:
    I was wondering if it is possible to use overloaded functions with assumed-size array arguments? Please see the example below, in which
    I try to overload the BLAS ddot and sdot functions. Calls to ddot
    with the first element of the array as an argument rather than an
    array segment (the commented out lines) fail at compile time.
    (snip)
    ! This doesn't work (fails to compile):
    ! WRITE(*,*)dot_PRODUCT(dx,dy),ddot(3,dx(1),1,dy(1),1)
    ! WRITE(*,*)dot_PRODUCT(sx,sy),ddot(3,sx(1),1,sy(1),1)
    (snip)
    Is there any
    other trick one could use to allow assumed-size arguments to work with overloading?
    You are allowed to call routines with assumed size dummy arrays,
    with an array element actual argument. I suspect that the overload
    function resolver doesn't know that. Can you also add overload
    interfaces for scalar arguments?

    I have sometimes done just that, overloading 3 specific cases for an argument being absent, scalar or rank-1 array
    though the array was assumed shape not assumed size. The absent-argument case was there to avoid the problem of disambiguation with an optional argument. The case with a scalar actual argument x merely called the array case with actual argument the
    size 1 array [x].

    The above assumes the question was about cases with both scalar and array arguments. If it was really about two or more cases, each with a scalar argument, that is also OK if the argument has a different type or kind type parameter in every case. Note:
    1e0 and 1d0 have different kind type parameters but the character strings 'foo' and 'foobar' do not

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Neil@21:1/5 to Ron Shepard on Sun Mar 13 15:05:35 2022
    Thanks very much for the detailed answer.

    Ron Shepard <nospam@nowhere.org> wrote:
    I did not follow exactly what your code examples do, but in general you

    My goal is (I think) straightforward: I would like to be able to
    switch easily between double precision and single precision arithmetic
    in a Fortran program that uses BLAS and LAPACK (and also MPI). The
    naive way of doing this would be to create a separate single-precision
    version of the code in which ddot is replaced with sdot and dgemm with
    sgemm and dgels with sgels and... However, I was hoping it would be
    possible to avoid creating separate single- and double-precision
    versions of the code by overloading "ddot" so that it actually uses
    sdot if the arguments are single precision (and likewise for all other
    BLAS & LAPACK routines).

    In the great majority of cases in my Fortran code the F77-style syntax (providing the first element in an assumed-size array argument) is
    used in BLAS/LAPACK calls. Strides are 1 wherever possible, but there
    are nevertheless several places where the stride is greater than 1.

    The test code I posted here was simply to investigate whether it is
    actually possible to overload ddot to choose between ddot and sdot
    depending on whether the argument is double or single precision.

    With gfortran at least, gah4's suggestion of using a scalar rather
    than an assumed-size array in the interface templates for ddot and
    sdot allowed overloading to work with F77-style calling syntax
    (although I agree with your comment about it potentially not working
    with all compilers).

    Performance is an issue, and if compilers are unable to avoid copying
    array segments then I agree it would be best to stick to the F77
    syntax. Thanks for pointing this out.

    So perhaps the best thing for me to do is just to write a script to
    edit the source files to switch between double and single-precision
    versions of the code.

    Thanks again,

    Neil.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From FortranFan@21:1/5 to nddtwe...@gmail.com on Sun Mar 13 10:12:02 2022
    On Sunday, March 13, 2022 at 11:05:39 AM UTC-4, nddtwe...@gmail.com wrote:

    ..
    My goal is (I think) straightforward: I would like to be able to
    switch easily between double precision and single precision arithmetic
    in a Fortran program that uses BLAS and LAPACK (and also MPI). ..

    @nddtwe...@gmail.com,

    Please note your goal is *not* at all straightforward with respect to current Fortran standard:
    - you are trying to combine facilities introduced starting Fortran 90 whose fundamental premise also includes type-kind-rank safety with FORTRAN 77 style usage with instructions such as a "ddot(n,x(l),1,y(m),1)" that overlooked such considerations.

    As you show in your original post, your problem really is the code you are working with that has "ddot(n,x(l),1,y(m),1)" and which you would rather not replace. It is your reluctance here that constrains you severely. Otherwise, there are some nice
    modern options with current Fortran standard which help both with performance and expressiveness in code, though I state this notwithstanding certain limitations in modern Fortran when it comes to writing truly generic code.

    Alternately, with what you're pursuing - "I'm trying to modify a large Fortran code that features lots of assumed-size BLAS/LAPACK calls" - you could consider full-blown refactoring and modernization that will pay rich dividends while you much more "
    easily switch from double precision arithmetic to single precision arithmetic."

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to FortranFan on Sun Mar 13 12:07:16 2022
    On Sunday, March 13, 2022 at 10:12:05 AM UTC-7, FortranFan wrote:

    (snip)

    Alternately, with what you're pursuing - "I'm trying to modify a large Fortran code that
    features lots of assumed-size BLAS/LAPACK calls" - you could consider full-blown
    refactoring and modernization that will pay rich dividends while you much more
    "easily switch from double precision arithmetic to single precision arithmetic."

    It would be nice to have BLAS and LAPACK versions up to modern Fortran, but
    I suspect that the OP doesn't want to be the one to write them.

    Many routines written in the Fortran 66 and Fortran 77 days, before dynamic allocation, are designed to allow for larger arrays (and especially larger 2D arrays
    for matrix work), where the routine operates on only part of the array.

    Even with dynamic allocation, it is sometimes nice to be able to operate on only part of a larger array.

    One could write wrapper routines, with assumed shape arguments, that then
    call the assumed size actual BLAS and LAPACK routines. As someone noted,
    that has to be done carefully to avoid copying for contiguous arrays.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to nddtwe...@gmail.com on Sun Mar 13 12:23:15 2022
    On Saturday, March 12, 2022 at 2:34:28 AM UTC-8, nddtwe...@gmail.com wrote:
    gah4 <ga...@u.washington.edu> wrote:
    On Friday, March 11, 2022 at 2:44:12 PM UTC-8, nddtwe...@gmail.com wrote:

    (snip)

    However, I now can't pass array segments to the overloaded ddot.

    Actually, I was thinking you could put in both interfaces, but maybe the compilers are too good to allow that.

    Unfortunately gfortran at least doesn't seem to allow this.

    If they don't allow it, you could put in two copies with different names, that it would call as appropriate, even though the code was the same.
    (Or an ENTRY with a different name, but into the same routine.)

    But that would prevent one using external BLAS and LAPACK libraries
    such as OpenBLAS in a straightforward manner (if you mean making a
    copy of the BLAS and LAPACK routines and renaming them for
    overloading).

    If you use BIND with NAME=, and different names with the same
    BIND(NAME=...) name, it might not believe that they are the same.
    (That is, without the C.)

    It does seem that it is trying to protect you from calling a routine
    two different ways, which is exactly what you want to do.

    Note that NAME= names are case sensitive, so need to match
    the case that the linker sees.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)