On 10/28/21 5:38 AM, Vincent Lefevre wrote:
In article <sl9bqb$hf5$2@dont-email.me>,
James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
On 10/26/21 6:01 AM, Vincent Lefevre wrote:
OK, but I was asking "where is the result of an overflow defined by
the standard?" I don't see the word "overflow" in the above spec.
Overflow occurs when a floating constant is created whose value is
greater than DBL_MAX or less than -DBL_MAX. Despite the fact that the
above description does not explicitly mention the word "overflow", it's
perfectly clear what that description means when overflow occurs.
Why "perfectly clear"??? This is even inconsistent with 7.12.1p5
of N2596, which says:
7.12.1p5 describes the math library, not the handling of floating point
constants. While the C standard does recommend that "The translation-time
conversion of floating constants should match the execution-time
conversion of character strings by library functions, such as strtod,
given matching inputs suitable for both conversions, the same result
format, and default execution-time rounding." (6.4.4.2p11), it does not
actually require such a match. Therefore, if there is any inconsistency,
it would not be problematic.
A floating result overflows if the magnitude (absolute value)
of the mathematical result is finite but so large that the
mathematical result cannot be represented without extraordinary
roundoff error in an object of the specified type.
7.12.1p5 goes on to say that "If a floating result overflows and default
rounding is in effect, then the function returns the value of the macro
HUGE_VAL ...".
As cited above, the standard recommends, but does not require, the use
of default execution-time rounding mode for floating point constants.
HUGE_VAL is only required to be positive (7.12p6) - it could be as small
as DBL_MIN. However, on implementations that support infinities, it is
allowed to be a positive infinity (footnote 245), and when
__STDC_IEC_559__ is predefined by the implementation, it's required to
be positive infinity (F.10p2). Even if it isn't positive infinity, it is
allowed to be DBL_MAX. DBL_MAX and positive infinity are two of the
three options allowed by 6.4.4.2p4 for constants larger than DBL_MAX, in
which case there's no conflict.
If HUGE_VAL is not one of those three values, then 6.4.4.2p4 still
applies, but 7.12.1p5 need not apply, since a match to the behavior of
strtod() is only recommended, not required.
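
To see which case a particular implementation falls into, something like
the following sketch could be used. The helper names huge_val_kind() and
strtod_overflows_to_huge_val() are mine, purely for illustration. The
second one relies on the strtod() specification, which returns plus or
minus HUGE_VAL and stores ERANGE in errno when the correct value
overflows under default rounding, and it assumes that 1e999999 is beyond
the range of double.

```c
#include <errno.h>
#include <float.h>
#include <math.h>
#include <stdlib.h>

/* Hypothetical helper: classify HUGE_VAL against the values discussed
   above. Returns 2 for positive infinity, 1 for DBL_MAX, and 0 for
   any other positive value. */
int huge_val_kind(void)
{
    double h = HUGE_VAL;
    if (isinf(h) && h > 0)
        return 2;
    if (h == DBL_MAX)
        return 1;
    return 0;
}

/* Hypothetical helper: returns 1 if strtod() reports overflow for an
   input far beyond DBL_MAX by returning HUGE_VAL and setting errno to
   ERANGE, 0 otherwise. Assumes 1e999999 overflows double. */
int strtod_overflows_to_huge_val(void)
{
    errno = 0;
    double r = strtod("1e999999", NULL);
    return r == HUGE_VAL && errno == ERANGE;
}
```

If huge_val_kind() returns 1 or 2, the strtod() result for an overflowing
input is also one of the values 6.4.4.2p4 permits for the corresponding
constant.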
If you have a mathematical value (exact value) much larger than
DBL_MAX and that rounds to DBL_MAX (e.g. with round-toward-zero),
there should be an overflow, despite the fact that the FP result
is not greater than DBL_MAX (since it is equal to DBL_MAX).
Agreed. As a result, the overflow exception should be signaled. However,
the C standard mandates that "Floating constants are converted to
internal format as if at translation-time. The conversion of a floating
constant shall not raise an exceptional condition or a floating-point
exception at execution time." (6.4.4.2p8). If an implementation chooses
to do the conversion at translation-time, the exception would be raised
only within the compiler, which has no obligation to do anything with
it. The implementation could generate a diagnostic, but such a constant
is not, in itself, justification for rejecting the program.
Therefore, if an implementation chooses to defer actual conversion until
run-time, it's required to produce the same results, which means it must
clear that overflow exception before turning control over to the user
code.
Moreover, with the above definition, it is DBL_NORM_MAX that is
more likely taken into account, not DBL_MAX.
According to 5.2.4.2.2p19, DBL_MAX is the maximum representable finite
floating point value, while DBL_NORM_MAX is the maximum normalized
number. 6.4.4.2p4 refers only to representable values, saying nothing
about normalization. Neither 7.12.1p5 nor 7.12p6 says anything to require
that the value be normalized. Therefore, as far as I can see, DBL_MAX is
the relevant value.
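
Whether the two macros even differ on a given implementation can be
checked with a sketch like this one. The function name is hypothetical,
and since DBL_NORM_MAX is new in the C2x drafts, the code allows for a
<float.h> that does not define it.

```c
#include <float.h>

/* Hypothetical helper: returns 1 if DBL_NORM_MAX equals DBL_MAX,
   0 if they differ, and -1 if this <float.h> does not define
   DBL_NORM_MAX at all. */
int norm_max_equals_max(void)
{
#ifdef DBL_NORM_MAX
    return DBL_NORM_MAX == DBL_MAX;
#else
    return -1;
#endif
}
```

Where this returns 1 or -1, the distinction makes no practical difference
for double.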
Note also that in case of overflow, "the nearest representable value"
is not defined.
No definition by the standard is needed; the conventional mathematical
definitions of "nearest" are sufficient. If infinity is representable,
DBL_MAX is always nearer to any finite value than infinity is.
Regardless of whether infinity is representable, any finite value
greater than DBL_MAX is closer to DBL_MAX than it is to any other
representable value.
The issue is that this may easily be confused with the result
obtained in the FE_TONEAREST rounding mode with the IEEE 754 rules
(where, for instance, 2*DBL_MAX rounds to +Inf, not to DBL_MAX,
despite the fact that 2*DBL_MAX is closer to DBL_MAX than to +Inf).
Yes, and DBL_MAX and +Inf are two of the three values permitted by
6.4.4.2p4, so I don't see any conflict there. As far as I can see, the
value required by IEEE 754 is always one of the three values permitted
by 6.4.4.2p4, so there's never a conflict. Are you aware of any?
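
To make the comparison concrete, here is a sketch that tests a result
against the three values as I read 6.4.4.2p4: the nearest representable
value, or the representable value immediately above or below it. For a
decimal constant larger than DBL_MAX that means DBL_MAX, the next value
below DBL_MAX, or +Inf where infinities are supported. The function names
are hypothetical.

```c
#include <float.h>
#include <math.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if r is one of the three results
   permitted for a decimal constant whose value exceeds DBL_MAX,
   0 otherwise. */
int permitted_for_overflowing_constant(double r)
{
    if (r == DBL_MAX)
        return 1;                       /* nearest representable value */
    if (r == nextafter(DBL_MAX, 0.0))
        return 1;                       /* neighbour just below DBL_MAX */
    if (isinf(r) && r > 0)
        return 1;                       /* neighbour above, if supported */
    return 0;
}

/* Hypothetical helper: 2*DBL_MAX computed at execution time; volatile
   keeps the compiler from folding the multiplication. */
double twice_dbl_max(void)
{
    volatile double big = DBL_MAX;
    return big * 2.0;
}
```

On an IEEE 754 implementation in the default rounding mode,
twice_dbl_max() gives +Inf, which the first function accepts; the same
check can be applied to strtod("3.6e308", NULL).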
For hexadecimal floating point constants on systems with FLT_RADIX a
power of 2, 6.4.4.2p4 only allows one value - the one that is correctly
rounded - but that's precisely the same value that IEEE 754 requires.