I am wondering how compiler authors deal with possible code
bloat in connection with elemental procedures.
Bastiaan Braams <bjbr...@gmail.com> schrieb:
I am wondering how compiler authors deal with possible code
bloat in connection with elemental procedures.
By implementing the procedure as scalar and looping.
On Tuesday, December 14, 2021 at 9:23:18 AM UTC-8, Thomas Koenig wrote:Thank you Thomas and Glen, this is very informative. So I think that the following code structure is the proper way to deal with the situation mentioned in my original post. Function Foo(n,x) can be written as an elemental function, but this will not
Bastiaan Braams <bjbr...@gmail.com> schrieb:
I am wondering how compiler authors deal with possible code
bloat in connection with elemental procedures.
By implementing the procedure as scalar and looping.In some cases, it would be an optimization to pass an array and
have it processed inside the routine.
I do remember about 1973, when the IBM OS/360 compilers
changed to doing implied-DO loops from outside to inside
the I/O routines. The new library was backwards compatible,
but old libraries were not forward compatible.
In the case of compilers doing global optimization,
I suppose it is possible to do it in the routine. Only code
for the actual used combination would be generated.
Thank you Thomas and Glen, this is very informative.
So I think that the following code structure is the proper way to deal
with the situation mentioned in my original post. Function Foo(n,x)
can be written as an elemental function, but this will not lead to
efficient code for the case that argument (n) is scalar and argument (x)
is vector. (Let's say that this case is of special interest.)
So then we still write Foo as an elemental function, but we also supply a specific function for the special case and compose the two versions
into a generic interface.
On Tuesday, December 14, 2021 at 11:21:29 PM UTC-8, bjbr...@gmail.com wrote:
(snip)
Thank you Thomas and Glen, this is very informative.
So I think that the following code structure is the proper way to deal
with the situation mentioned in my original post. Function Foo(n,x)
can be written as an elemental function, but this will not lead to
efficient code for the case that argument (n) is scalar and argument (x)
is vector. (Let's say that this case is of special interest.)
So then we still write Foo as an elemental function, but we also supply a
specific function for the special case and compose the two versions
into a generic interface.
I am sometimes surprised how much work, over the years, has been done
to make function calls faster, even as computers got faster, so there was less need for the speed.
In languages like C++, almost everything is a function callNot necessarily, it depends on your programming style.
to some virtual function.
gah4 <ga...@u.washington.edu> schrieb:Just to be sure... I was not concerned about the overhead of the function call. Once again the structure of that function Foo, here written as a scalar elemental function. This is the three-term recurrence for Laguerre polynomial r = L[n](x).
On Tuesday, December 14, 2021 at 11:21:29 PM UTC-8, bjbr...@gmail.com wrote:
(snip)
Thank you Thomas and Glen, this is very informative.
So I think that the following code structure is the proper way to deal
with the situation mentioned in my original post. Function Foo(n,x)
can be written as an elemental function, but this will not lead to
efficient code for the case that argument (n) is scalar and argument (x) >> is vector. (Let's say that this case is of special interest.)
So then we still write Foo as an elemental function, but we also supply a >> specific function for the special case and compose the two versions
into a generic interface.
I am sometimes surprised how much work, over the years, has been doneIn languages like C++, almost everything is a function call
to make function calls faster, even as computers got faster, so there was less need for the speed.
to some virtual function. Reducing that overhead can go a
long way towards making things faster.
Also, if anything, bloat has been increasing faster than the
increase in processor speed in the last few years...
gah4 <ga...@u.washington.edu> schrieb:
On Tuesday, December 14, 2021 at 11:21:29 PM UTC-8, bjbr...@gmail.com wrote:
(snip)
Thank you Thomas and Glen, this is very informative.
So I think that the following code structure is the proper way to deal
with the situation mentioned in my original post. Function Foo(n,x)
can be written as an elemental function, but this will not lead to
efficient code for the case that argument (n) is scalar and argument (x) >> is vector. (Let's say that this case is of special interest.)
So then we still write Foo as an elemental function, but we also supply a >> specific function for the special case and compose the two versions
into a generic interface.
I am sometimes surprised how much work, over the years, has been doneIn languages like C++, almost everything is a function call
to make function calls faster, even as computers got faster, so there was less need for the speed.
to some virtual function. Reducing that overhead can go a
long way towards making things faster.
Also, if anything, bloat has been increasing faster than the
increase in processor speed in the last few years...
I am sometimes surprised how much work, over the years, has been done
to make function calls faster, even as computers got faster, so there was less need for the speed.
Just to be sure... I was not concerned about the overhead of the function call.
Once again the structure of that function Foo, here written as a scalar elemental function. This is the three-term recurrence for
Laguerre polynomial r = L[n](x).
Suppose now that this function is called with scalar argument (n) and
array argument (x). Then I would think that we really want a loop within
each branch of the if-then-else construct and maybe within the (do k = 2, n) as well; we don't want one single loop over the entire code, with the test
on (n) inside the loop. And if Foo is declared elemental and there is not
a special version for the case of scalar (n) and array (x), then I expect the compiler to do just that, place the loop outside the if-then-else.
However, to add to my confusion I just did a timing test.
One instance where Foo is just an elemental function, no further
overloading, and another version where the elemental function is
overloaded with a function for arguments scalar (n) and array (x).
Makes no difference in the timing when using gfortran;
I tried it with -O0 and with -O3.
On 12/15/21 4:24 PM, Thomas Koenig wrote:
In languages like C++, almost everything is a function call
to some virtual function.
Not necessarily, it depends on your programming style.
"gah4" wrote in message news:0ab6410a-af0e-4f5d-a32d-520912de99e3n@googlegroups.com...
On Wednesday, December 15, 2021 at 11:06:07 PM UTC-8, bjbr...@gmail.com wrote:
(snip, I wrote)
Just to be sure... I was not concerned about the overhead of the function call.
Once again the structure of that function Foo, here written as a scalar elemental function. This is the three-term recurrence for
Laguerre polynomial r = L[n](x).
Oh, that one. Many years ago (Fortran 66 days) I was working on a program written by someone else that way too slow. It computed a bunch of
integrals with Gaussian quadrature, the first time I knew about that.
Now, Gaussian quadrature is pretty fast, but it was doing a lot of them.
As well as remember now, there was a subroutine that would compute
two of them, then divide, with the second one being a normalization.
In any case, for reasons similar to what you say, it was recomputing
the same thing many times. I think to fix it, I separately computed the integrals, moved some things out of a loop, and so eliminated much duplication.
Suppose now that this function is called with scalar argument (n) and
array argument (x). Then I would think that we really want a loop within each branch of the if-then-else construct and maybe within the (do k =
2, n)
as well; we don't want one single loop over the entire code, with the
test
on (n) inside the loop. And if Foo is declared elemental and there is
not
a special version for the case of scalar (n) and array (x), then I
expect the
compiler to do just that, place the loop outside the if-then-else.
However, to add to my confusion I just did a timing test.
One instance where Foo is just an elemental function, no further overloading, and another version where the elemental function is
overloaded with a function for arguments scalar (n) and array (x).
Makes no difference in the timing when using gfortran;
I tried it with -O0 and with -O3.
I think that sounds right. The statements outside the inner loop
are fairly fast, and no so many, so doing them mutliple times
make a small difference. Branch prediction will mean almost
no delay to doing the if, that you might have expected.
If there was a large part of the calculation depending on n,
but not on x, then the difference would be larger.
I think your idea is good, but doesn't apply to this example.
"gah4" wrote in message news:0ab6410a-af0e-4f5d-a32d-520912de99e3n@googlegroups.com...
"gah4" wrote in message news:0ab6410a-af0e-4f5d-a32d-520912de99e3n@googlegroups.com...
On Wednesday, December 15, 2021 at 11:06:07 PM UTC-8, bjbr...@gmail.com wrote:
(snip, I wrote)
Just to be sure... I was not concerned about the overhead of the function call.
Once again the structure of that function Foo, here written as a scalar elemental function. This is the three-term recurrence for
Laguerre polynomial r = L[n](x).
Oh, that one. Many years ago (Fortran 66 days) I was working on a program written by someone else that way too slow. It computed a bunch of
integrals with Gaussian quadrature, the first time I knew about that.
Now, Gaussian quadrature is pretty fast, but it was doing a lot of them.
As well as remember now, there was a subroutine that would compute
two of them, then divide, with the second one being a normalization.
In any case, for reasons similar to what you say, it was recomputing
the same thing many times. I think to fix it, I separately computed the integrals, moved some things out of a loop, and so eliminated much duplication.
Suppose now that this function is called with scalar argument (n) and
array argument (x). Then I would think that we really want a loop within each branch of the if-then-else construct and maybe within the (do k =
2, n)
as well; we don't want one single loop over the entire code, with the
test
on (n) inside the loop. And if Foo is declared elemental and there is
not
a special version for the case of scalar (n) and array (x), then I
expect the
compiler to do just that, place the loop outside the if-then-else.
However, to add to my confusion I just did a timing test.
One instance where Foo is just an elemental function, no further overloading, and another version where the elemental function is
overloaded with a function for arguments scalar (n) and array (x).
Makes no difference in the timing when using gfortran;
I tried it with -O0 and with -O3.
I think that sounds right. The statements outside the inner loop
are fairly fast, and no so many, so doing them mutliple times
make a small difference. Branch prediction will mean almost
no delay to doing the if, that you might have expected.
If there was a large part of the calculation depending on n,
but not on x, then the difference would be larger.
I think your idea is good, but doesn't apply to this example.
Sysop: | Keyop |
---|---|
Location: | Huddersfield, West Yorkshire, UK |
Users: | 159 |
Nodes: | 16 (0 / 16) |
Uptime: | 99:35:26 |
Calls: | 3,209 |
Files: | 10,563 |
Messages: | 3,009,786 |