• Re: contradiction about the INFINITY macro

    From James Kuyper@21:1/5 to Vincent Lefevre on Thu Oct 28 11:22:09 2021
    On 10/28/21 5:38 AM, Vincent Lefevre wrote:
    In article <sl9bqb$hf5$2@dont-email.me>,
    James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

    On 10/26/21 6:01 AM, Vincent Lefevre wrote:
    OK, but I was asking "where is the result of an overflow defined by
    the standard?" I don't see the word "overflow" in the above spec.

    Overflow occurs when a floating constant is created whose value is
    greater than DBL_MAX or less than -DBL_MAX. Despite the fact that the
    above description does not explicitly mention the word "overflow", it's
    perfectly clear what that description means when overflow occurs.

    Why "perfectly clear"??? This is even inconsistent with 7.12.1p5
    of N2596, which says:

    7.12.1p5 describes the math library, not the handling of floating point
    constants. While the C standard does recommend that "The
    translation-time conversion of floating constants should match the
    execution-time conversion of character strings by library functions,
    such as strtod, given matching inputs suitable for both conversions,
    the same result format, and default execution-time rounding."
    (6.4.4.2p11), it does not actually require such a match. Therefore, if
    there is any inconsistency it would not be problematic.
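
A minimal sketch of what that recommendation covers, assuming an ordinary in-range constant: the helper below (a hypothetical name, not from the standard) compares the compiler's conversion of 0.1 with strtod()'s conversion of the same digits. Since 6.4.4.2p11 is only a recommendation, a conforming implementation could return false here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: true when the translation-time conversion of the
 * constant 0.1 equals strtod()'s execution-time conversion of "0.1".
 * The standard recommends a match but does not require one. */
bool constant_matches_strtod(void)
{
    const char *text = "0.1";
    char *end = NULL;
    double from_constant = 0.1;              /* converted by the compiler */
    double from_library = strtod(text, &end); /* converted by the library  */

    assert(end != text);  /* strtod consumed at least one character */
    return from_constant == from_library;
}
```

On mainstream implementations that round both conversions correctly, the two values are expected to compare equal.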

    A floating result overflows if the magnitude (absolute value)
    of the mathematical result is finite but so large that the
    mathematical result cannot be represented without extraordinary
    roundoff error in an object of the specified type.

    7.12.1p5 goes on to say that "If a floating result overflows and default
    rounding is in effect, then the function returns the value of the macro
    HUGE_VAL ...".
    As cited above, the standard recommends, but does not require, the use
    of default execution-time rounding mode for floating point constants.
    HUGE_VAL is only required to be positive (7.12p6) - it could be as small
    as DBL_MIN. However, on implementations that support infinities, it is
    allowed to be a positive infinity (footnote 245), and when
    __STDC_IEC_559__ is predefined by the implementation, it's required to
    be positive infinity (F.10p2). Even if it isn't positive infinity, it is
    allowed to be DBL_MAX. DBL_MAX and positive infinity are two of the
    three options allowed by 6.4.4.2p4 for constants larger than DBL_MAX, in
    which case there's no conflict.
    If HUGE_VAL is not one of those three values, then 6.4.4.2p4 still
    applies, but 7.12.1p5 need not apply, since a match to the behavior of
    strtod() is only recommended, not required.
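
For the library side of that comparison, here is a rough sketch under the assumption that DBL_MAX is below 1e999 (true for IEEE binary64) and that default rounding is in effect. The helper names are hypothetical. The first checks that strtod() reports overflow by returning HUGE_VAL and setting errno to ERANGE; the second checks whether HUGE_VAL happens to be an infinity on this implementation.

```c
#include <assert.h>
#include <errno.h>
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: converts a string whose value exceeds DBL_MAX and
 * reports whether strtod() returned HUGE_VAL with errno set to ERANGE. */
bool strtod_overflow_gives_huge_val(void)
{
    const char *text = "1e999";  /* assumed to be far above DBL_MAX */
    char *end = NULL;

    errno = 0;
    double value = strtod(text, &end);
    assert(*end == '\0');        /* the whole string was consumed */
    return value == HUGE_VAL && errno == ERANGE;
}

/* Hypothetical helper: true when HUGE_VAL is an infinity here, which is
 * required only when __STDC_IEC_559__ is defined. */
bool huge_val_is_infinite(void)
{
    return isinf(HUGE_VAL) != 0;
}
```

Nothing here says what a floating constant such as 1e999 becomes; that is governed by 6.4.4.2p4 alone.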

    If you have a mathematical value (exact value) much larger than
    DBL_MAX and that rounds to DBL_MAX (e.g. with round-toward-zero),
    there should be an overflow, despite the fact that the FP result
    is not greater than DBL_MAX (since it is equal to DBL_MAX).

    Agreed. As a result, the overflow exception should be signaled. However,
    the C standard mandates that "Floating constants are converted to
    internal format as if at translation-time. The conversion of a
    floating constant shall not raise an exceptional condition or a
    floating-point exception at execution time." (6.4.4.2p8). If an
    implementation chooses
    to do the conversion at translation-time, the exception would be raised
    only within the compiler, which has no obligation to do anything with
    it. The implementation could generate a diagnostic, but such a constant
    is not, in itself, justification for rejecting the program.

    Therefore, if an implementation chooses to defer actual conversion until
    run-time, it's required to produce the same results, which means it must
    clear that overflow exception before turning control over to the user
    code.

    Moreover, with the above definition, it is DBL_NORM_MAX that is
    more likely taken into account, not DBL_MAX.

    According to 5.2.4.2.2p19, DBL_MAX is the maximum representable finite
    floating point value, while DBL_NORM_MAX is the maximum normalized
    number. 6.4.4.2p4 refers only to representable values, saying nothing
    about normalization. Neither 7.12.1p5 nor 7.12p6 says anything to require
    that the value be normalized. Therefore, as far as I can see, DBL_MAX is
    the relevant value.
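
A small sketch for checking how the two macros relate on a given implementation. DBL_NORM_MAX only exists in newer <float.h> headers, so the hypothetical helper below guards it with #ifdef and has nothing to compare on older ones.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* Hypothetical helper: true when DBL_NORM_MAX, if the header provides it,
 * does not exceed DBL_MAX. Without DBL_NORM_MAX there is nothing to
 * compare, so the helper simply returns true. */
bool norm_max_not_above_max(void)
{
    assert(DBL_MAX >= 1e37);  /* minimum magnitude the standard guarantees */
#ifdef DBL_NORM_MAX
    return DBL_NORM_MAX <= DBL_MAX;
#else
    return true;
#endif
}
```

For IEEE binary64 the two macros have the same value; they can differ on formats such as double-double, where the largest finite value is not normalized.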

    Note also that in case of overflow, "the nearest representable value"
    is not defined.

    No definition by the standard is needed; the conventional mathematical
    definitions of "nearest" are sufficient. If infinity is representable,
    DBL_MAX is always nearer to any finite value than infinity is.
    Regardless of whether infinity is representable, any finite value
    greater than DBL_MAX is closer to DBL_MAX than it is to any other
    representable value.

    The issue is that this may easily be confused with the result
    obtained in the FE_TONEAREST rounding mode with the IEEE 754 rules
    (where, for instance, 2*DBL_MAX rounds to +Inf, not to DBL_MAX,
    despite the fact that 2*DBL_MAX is closer to DBL_MAX than to +Inf).

    Yes, and DBL_MAX and +Inf are two of the three values permitted by
    6.4.4.2p4, so I don't see any conflict there. As far as I can see, the
    value required by IEEE 754 is always one of the three values permitted
    by 6.4.4.2p4, so there's never a conflict. Are you aware of any?
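
To see the IEEE 754 behavior being described, here is a minimal sketch that doubles DBL_MAX at execution time; the helper name is hypothetical and it assumes the default rounding mode has not been changed. The volatile objects are there to discourage the compiler from folding the product at translation time.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical helper: computes 2*DBL_MAX at execution time. Under IEEE
 * 754 round-to-nearest the result is +Inf; under round-toward-zero it
 * would be DBL_MAX. Either way it is not below DBL_MAX. */
double twice_dbl_max(void)
{
    volatile double max = DBL_MAX;
    volatile double product = max * 2.0;

    assert(product >= DBL_MAX);  /* holds for both DBL_MAX and +Inf */
    return product;
}
```

On an IEEE 754 implementation in its default mode, isinf(twice_dbl_max()) is expected to be nonzero, even though 2*DBL_MAX is mathematically closer to DBL_MAX than to infinity.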

    For hexadecimal floating point constants on systems with FLT_RADIX a
    power of 2, 6.4.4.2p4 only allows one value - the one that is correctly
    rounded - but that's precisely the same value that IEEE 754 requires.
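
As a sketch of the hexadecimal case, assuming double is IEEE binary64 (the preprocessor guard checks this): the constant below is exactly DBL_MAX, so the compiler and strtod() each have only one correctly rounded value to produce. The helper name is hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: true when the hexadecimal spelling of the largest
 * binary64 value converts to DBL_MAX both as a constant and via strtod().
 * On other formats the check is skipped and the helper returns true. */
bool hex_max_is_exact(void)
{
#if FLT_RADIX == 2 && DBL_MANT_DIG == 53 && DBL_MAX_EXP == 1024
    const char *text = "0x1.fffffffffffffp+1023";
    char *end = NULL;
    double from_constant = 0x1.fffffffffffffp+1023;
    double from_library = strtod(text, &end);

    assert(*end == '\0');  /* the whole string was consumed */
    return from_constant == DBL_MAX && from_library == DBL_MAX;
#else
    return true;
#endif
}
```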
