• Re: what is defined, was for or against equality

    From Thomas Koenig@21:1/5 to David Brown on Thu Jan 6 16:43:05 2022
    David Brown <david.brown@hesbynett.no> schrieb:

    There is no need to memorize undefined behaviours for a language -
    indeed, such a thing is impossible since everything not defined by a
    language standard is, by definition, undefined behaviour. (C and C++
    are not special here - the unusual thing is just that their standards
    say this explicitly.)

    This is a rather C-centric view of things. The Fortran standard
    uses a different model.

    There are constraints, which are numbered. Any violation of such
    a constraint needs to be reported by the compiler ("processor",
    in Fortran parlance). If it fails to do so, this is a bug in
    the compiler.

    There are also phrases which have "shall" or "shall not". If this
    is violated, this is an error in the program. Catching such a
    violation is a good thing from quality of implementation standpoint,
    but is not required. Many run-time errors such as array overruns
    fall into this category.

    [...]

    The real challenge from big languages and big standard libraries is not /writing/ code, it is /reading/ it. It doesn't really matter if a C programmer, when writing some code, does not know what the syntax "void foo(int a[static 10]);" means. (Most C programmers don't know it, and
    never miss it.) But it can be a problem if they have to read and
    understand code that uses something they don't know.

    Agreed.
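    For reference, a minimal sketch of what that declaration promises (the
    function name is only for illustration):

    void foo(int a[static 10]);
    /* The caller promises that 'a' points to the first element of an
       array of at least 10 ints and is not a null pointer; a compiler
       may warn about calls that visibly break the promise, and may
       optimise on the assumption that it holds. */

    int buf[10];
    /* foo(buf);    fine: at least 10 elements       */
    /* foo(NULL);   breaks the promise - undefined   */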

  • From Martin Ward@21:1/5 to David Brown on Fri Jan 7 14:02:50 2022
    On 06/01/2022 08:11, David Brown wrote:
    The trick is to memorize the /defined/ behaviours, and stick to them.

    Isn't the set of defined behaviours bigger than the set
    of undefined behaviours? How do you know what is defined
    if you don't know what is undefined?

    For example, a = b + c is precisely defined in C and C++ for
    floating point variables, but the result can be "undefined behaviour"
    for ordinary 32 bit signed integer values.

    If you want to stick to defined behaviours then you need
    to add extra code. For example, CERT recommends:

      if (((si_b > 0) && (si_a > (INT_MAX - si_b))) ||
          ((si_b < 0) && (si_a < (INT_MIN - si_b)))) {
        /* Handle error */
      } else {
        sum = si_a + si_b;
      }

    --
    Martin

    Dr Martin Ward | Email: martin@gkc.org.uk | http://www.gkc.org.uk
    G.K.Chesterton site: http://www.gkc.org.uk/gkc | Erdos number: 4

  • From Spiros Bousbouras@21:1/5 to Thomas Koenig on Fri Jan 7 13:21:29 2022
    On Thu, 6 Jan 2022 16:43:05 -0000 (UTC)
    Thomas Koenig <tkoenig@netcologne.de> wrote:
    David Brown <david.brown@hesbynett.no> schrieb:

    There is no need to memorize undefined behaviours for a language -
    indeed, such a thing is impossible since everything not defined by a language standard is, by definition, undefined behaviour. (C and C++
    are not special here - the unusual thing is just that their standards
    say this explicitly.)

    This is a rather C-centric view of things. The Fortran standard
    uses a different model.

    There are constraints, which are numbered. Any violation of such
    a constraint needs to be reported by the compiler ("processor",
    in Fortran parlance). If it fails to do so, this is a bug in
    the compiler.

    There are also phrases which have "shall" or "shall not". If this
    is violated, this is an error in the program. Catching such a
    violation is a good thing from quality of implementation standpoint,
    but is not required. Many run-time errors such as array overruns
    fall into this category.

    This seems to me exactly like the C model. What difference do you see ?

    Regarding the more general issue, it seems to me that undefined behaviour is
    a red herring (which I think is the point David was making). Every time one writes code in any language , one must have an expectation on how the code is supposed to behave and some reasoning on why the code they wrote will behave according to their expectations. The reasoning will be based (apart from general rules from logic and mathematics) on what the standard of the programming language specifies (if the language has a standard) , what the translator/compiler documentation specifies , what the documentation of any libraries they use specifies and so forth.

    For example, let's say that I write in C

    int a = INT_MAX + 1 ;

    with the expectation that a will get the value INT_MIN. The onus is on me
    to provide a reasoning why the code above will meet my expectation. If I
    cannot provide such a reasoning then from my point of view the code is
    already undefined. The fact that the C standard also says that the code is undefined is irrelevant. Even if the C standard specified for example that signed integer arithmetic uses wraparound, unless I could point to the place
    in the standard where it said so, the code is still undefined from my point
    of view so I should not use it.

    But let's say that I have the above code and I intend to compile it with
    GCC using the -fwrapv flag. Then my expectation is actually justified
    based on the GCC documentation for what -fwrapv means and the parts
    of the C standard which define what the various symbols in

    int a = INT_MAX + 1 ;

    mean. I'm not going to provide a proof because it should be obvious. But
    any such proof would not need to cite any part of the C standard which explicitly mentions undefined behaviour.
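    (A minimal sketch of that expectation, assuming a GCC-compatible
    compiler; -fwrapv is GCC's documented flag that defines signed
    overflow as two's-complement wrap-around:

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        int a = INT_MAX + 1 ;  /* defined only because of -fwrapv */
        printf("%d\n", a) ;    /* expected to print INT_MIN       */
        return 0 ;
    }

    compiled as "gcc -fwrapv wrap.c".)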


    The only occasion where an explicit mention of undefined behaviour would be relevant would be if the C standard (or any standard) were contradictory i.e. it said in some place that some construct has a certain defined behaviour and it said in some other place that the same construct has undefined behaviour. But with a popular language like C , if such contradictions existed , they would be caught early and corrected.

  • From David Brown@21:1/5 to Thomas Koenig on Fri Jan 7 12:06:12 2022
    On 06/01/2022 17:43, Thomas Koenig wrote:
    David Brown <david.brown@hesbynett.no> schrieb:

    There is no need to memorize undefined behaviours for a language -
    indeed, such a thing is impossible since everything not defined by a
    language standard is, by definition, undefined behaviour. (C and C++
    are not special here - the unusual thing is just that their standards
    say this explicitly.)

    This is a rather C-centric view of things. The Fortran standard
    uses a different model.

    There are constraints, which are numbered. Any violation of such
    a constraint needs to be reported by the compiler ("processor",
    in Fortran parlance). If it fails to do so, this is a bug in
    the compiler.

    C has basically the same concept.

    (IIRC, C++ has a few constraints, such as the "one definition rule",
    where the standard says no diagnostics are necessary, because
    identifying the error would mean the compiler has to see multiple
    translation units at once. Compilers often diagnose these if they have
    some kind of link-time optimisation or program-at-once mode.)


    There are also phrases which have "shall" or "shall not". If this
    is violated, this is an error in the program. Catching such a
    violation is a good thing from quality of implementation standpoint,
    but is not required. Many run-time errors such as array overruns
    fall into this category.

    That is the same in C. From 4.2 "Conformance" :

    """
    If a “shall” or “shall not” requirement that appears outside of a constraint or runtime-constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard
    by the words “undefined behavior” or by the omission of any explicit definition of behavior. There is no difference in emphasis among these
    three; they all describe “behavior that is undefined”.
    """

    The only difference I see from what you describe of Fortran (I have not
    read any Fortran standards) is that the C standards also note that
    behaviour that is not defined in the standards is undefined behaviour as
    far as the standards are concerned. That is a tautology, of course, and applies equally to Fortran and any other language.


    It is quite possible that the details of which behaviours are defined or
    not varies between the languages - things like division by 0,
    out-of-bounds array access, etc., may be different. As I understand it, passing aliased pointers or array references as different parameters to
    the same function can lead to undefined behaviour in Fortran, whereas it
    is defined in C (unless you use "restrict").


    [...]

    The real challenge from big languages and big standard libraries is not
    /writing/ code, it is /reading/ it. It doesn't really matter if a C
    programmer, when writing some code, does not know what the syntax "void
    foo(int a[static 10]);" means. (Most C programmers don't know it, and
    never miss it.) But it can be a problem if they have to read and
    understand code that uses something they don't know.

    Agreed.

  • From David Brown@21:1/5 to Martin Ward on Fri Jan 7 15:56:22 2022
    On 07/01/2022 15:02, Martin Ward wrote:
    On 06/01/2022 08:11, David Brown wrote:
    The trick is to memorize the /defined/ behaviours, and stick to them.

    Isn't the set of defined behaviours bigger than the set
    of undefined behaviours? How do you know what is defined
    if you don't know what is undefined?

    You know what is "defined" because you can find the definition for it - everything else is undefined. You could enumerate all defined
    behaviours for a language - after all, the documentation (language
    standards, compiler manual, library documentation, etc.) is finite. It
    doesn't really make sense to try to find how many undefined behaviours
    there are - it's like asking how many things there are that are not apples.

    Language standards tell you the defined behaviour for a language.
    Anything that is not there, is undefined - that's simply what the word "undefined" means.

    Note that there are many other things besides language standards that
    define behaviour of code in practice - compilers or interpreters can add
    their own definitions to things that are not defined by the language
    standards, as can additional standards such as POSIX.

    If you write a function "foo" - perhaps written in the same language
    (such as C), perhaps in a completely different language - then its
    behaviour is not defined by the language standards. It is not mentioned anywhere in those documents, so it is undefined. (That is different
    from functions whose behaviour is specified in the standard, such as
    "memcpy".)

    Undefined behaviour, as far as language standards are concerned, is omnipresent in programming - for all languages. The problem only comes
    when you attempt to execute something that does not have its behaviour
    defined /anywhere/. Then it is incorrect code - a bug.


    When I learned to program (i.e., during my university education rather
    than from books, magazines and trial and error previous to that), we
    were very clear about how a function is specified. You have a
    pre-condition and a post-condition. The function can assume the
    pre-condition is logically "true", and it will guarantee that the post-condition is true at the exit. (Typically you also have an
    "invariant" that is a clause in both parts, but that is just for
    convenience.) If the function is called when the pre-condition is
    false, the function has no obligation to do anything - it can give an
    error, launch nasal daemons, give the answer it thinks the programmer
    hoped for, or anything else. The behaviour is undefined.

    This concept has existed since the dawn of programming:

    """
    On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into
    the machine wrong figures, will the right answers come out?' I am not
    able rightly to apprehend the kind of confusion of ideas that could
    provoke such a question.

    Charles Babbage
    """


    The C standards contain a fair number of explicit undefined behaviours.
    They do that for convenience and clarity, and often to encourage
    compiler developers towards greater efficiency rather than run-time
    checks, and to encourage programmers towards not assuming particular
    behaviours even if one compiler happens to define the behaviour. So a
    compiler writer knows that they can assume "a + b" never overflows (for
    integer arithmetic), and a programmer knows that they can't assume
    signed arithmetic is wrapping even if the compiler they are using at the
    time /guarantees/ wrapping behaviour. (I have never seen a C compiler
    that guarantees this without explicit flags.)
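    A classic illustration of that assumption in action (the function name
    is only for illustration): because signed overflow is undefined, a
    compiler is free to fold the comparison below to a constant 1, whereas
    with an unsigned parameter, or with gcc's -fwrapv, it must keep the
    real test.

    int always_greater(int x)
    {
        return x + 1 > x;   /* commonly compiled as 'return 1;' */
    }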

    C is a language that expects the programmer to take responsibility for
    his or her code, and ensure that it is correct. Fortunately, good
    compiler developers know this is difficult and provide tools to help
    people find their bugs. Thus you have a language that can give
    efficient results, /and/ provide good debugging and run-time checking,
    as long as you get good tools and understand how to use them.



    For example, a = b + c is precisely defined in C and C++ for
    floating point variables, but the result can be "undefined behaviour"
    for ordinary 32 bit signed integer values.


    Actually, it is not precisely defined for floating point operations - if
    there is an "exceptional condition" during the evaluation (the result is
    not mathematically defined or not in the range of representable values
    for its type), the behaviour is undefined. That applies to all
    expressions - integer and floating point.

    Now, it is very common (but certainly not universal) for C
    implementations to use IEEE floating point formats and rules. These
    provide the "mathematical definitions" for floating point operations,
    including handling of calculations outside the normal ranges. But if
    you are not using these, such calculations could result in undefined
    behaviour. (For example, if you use "gcc -ffast-math", the compiler
    will assume that all expressions are normal finite numbers - that's
    perfectly valid for C, and can be very much more efficient on a lot of targets.)

    Signed integer overflow is undefined behaviour on most compilers (the
    size is not necessarily 32-bit). The only one I know that defines the behaviour is gcc (and compatibles, such as clang and icc) with the
    "-fwrapv" flag enabled.

    And of course that makes perfect sense. It is logical to assume that if
    you add two positive numbers, you get a positive number - it is
    illogical to suppose that sometimes the "correct" answer will be
    negative. Some programming languages (such as Java) specifically define
    signed integer arithmetic to be wrapping - the result is that sometimes
    you get the wrong answer in Java, while in C you would get undefined
    behaviour. Wrong answers are less helpful - leaving the behaviour
    undefined means you get more efficient code and that you can use
    debugging tools (such as gcc's -fsanitize=undefined) to help find the
    errors in your code.
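    For instance, a build along these lines (assuming gcc or clang; the
    file name is hypothetical) reports the overflow at run time instead of
    letting it pass silently:

    /* overflow.c - compile with: gcc -fsanitize=undefined overflow.c */
    #include <limits.h>

    int main(void)
    {
        volatile int a = INT_MAX;  /* volatile keeps the compiler from folding  */
        int b = a + 1;             /* the sanitizer reports signed overflow here */
        return b != 0;
    }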


    If you want to stick to defined behaviours then you need
    to add extra code. For example, CERT recommends:

      if (((si_b > 0) && (si_a > (INT_MAX - si_b))) ||
          ((si_b < 0) && (si_a < (INT_MIN - si_b)))) {
        /* Handle error */
      } else {
        sum = si_a + si_b;
      }


    That is /not/ code to "stick to defined behaviours". It is code to
    identify problems and perhaps find some way to handle them (depending on
    what the "handle error" code does).

    You can "stick to defined behaviour" much more simply:

    int sum = (unsigned int) si_a + (unsigned int) si_b;

    The behaviour is fully defined, and the result will be wrong if there is
    an overflow - just like when you use a language that has fully defined
    signed integer arithmetic by wrapping.
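    Spelled out as a small helper (a sketch only, with a hypothetical
    name): the unsigned addition itself is fully defined, and converting
    the out-of-range result back to int is implementation-defined in the C
    standards but wraps on mainstream two's-complement compilers.

    static int wrap_add(int a, int b)
    {
        return (int) ((unsigned int) a + (unsigned int) b);
    }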


    The answer here is /not/ to worry about what happens when your
    expressions overflow and you get undefined behaviour. The answer is to
    think about the code you are writing, and make sure that the types and expressions you write are appropriate for the values you have. Check
    your values for validity when you get them in (from files, user input,
    etc.), then write code that is correct for the full range of values.
    Simple. (Well, as simple as any programming!)

  • From Spiros Bousbouras@21:1/5 to Martin Ward on Sat Jan 8 03:41:55 2022
    On Fri, 7 Jan 2022 14:02:50 +0000
    Martin Ward <martin@gkc.org.uk> wrote:
    On 06/01/2022 08:11, David Brown wrote:
    The trick is to memorize the /defined/ behaviours, and stick to them.

    Isn't the set of defined behaviours bigger than the set
    of undefined behaviours?

    That depends on how you define those sets. For example, any finite string is
    a potential C source code and, of strings of length N (for any value of N), only a very small percentage have defined behaviour. But regardless, you
    need to know at least some defined behaviours to be able to programme at all and, as long as you stick to those, you are not using any undefined
    behaviours.

    How do you know what is defined
    if you don't know what is undefined?

    As David has already said, you know by reading the definitions. And this is
    the only way to know. Trying to guess what you're getting at, perhaps you
    are thinking of someone who learns some C, then makes some unwarranted assumptions from what they have learned and then has those assumptions scaled back by coming across explicit mentions of "undefined behaviour" in the C standard. Perhaps some people do behave this way. For example someone who already knows assembly and begins to learn C may assume that all address manipulations which would be legal in assembly are also legal using C
    pointers. The correct remedy is not to make unwarranted assumptions to begin with, whether one learns C or any other programming language. There is an infinite number of unwarranted assumptions one can make and the C standard
    can only caution against a finite number of them.

    For example, a = b + c is precisely defined in C and C++ for
    floating point variables, but the result can be "undefined behaviour"
    for ordinary 32 bit signed integer values.

    If you want to stick to defined behaviours then you need
    to add extra code. For example, CERT recommends:

      if (((si_b > 0) && (si_a > (INT_MAX - si_b))) ||
          ((si_b < 0) && (si_a < (INT_MIN - si_b)))) {
        /* Handle error */
      } else {
        sum = si_a + si_b;
      }

    Whether you need to add code such as the above will depend on what you
    already know about the types and values of si_a and si_b .

  • From Thomas Koenig@21:1/5 to Spiros Bousbouras on Sat Jan 8 09:31:06 2022
    Spiros Bousbouras <spibou@gmail.com> schrieb:
    On Thu, 6 Jan 2022 16:43:05 -0000 (UTC)
    Thomas Koenig <tkoenig@netcologne.de> wrote:
    David Brown <david.brown@hesbynett.no> schrieb:

    There is no need to memorize undefined behaviours for a language -
    indeed, such a thing is impossible since everything not defined by a
    language standard is, by definition, undefined behaviour. (C and C++
    are not special here - the unusual thing is just that their standards
    say this explicitly.)

    This is a rather C-centric view of things. The Fortran standard
    uses a different model.

    There are constraints, which are numbered. Any violation of such
    a constraint needs to be reported by the compiler ("processor",
    in Fortran parlance). If it fails to do so, this is a bug in
    the compiler.

    There are also phrases which have "shall" or "shall not". If this
    is violated, this is an error in the program. Catching such a
    violation is a good thing from quality of implementation standpoint,
    but is not required. Many run-time errors such as array overruns
    fall into this category.

    This seems to me exactly like the C model. What difference do you see ?

    First, I see a difference in result. Highly intelligent and
    knowledgeable people argue vehemently about whether a program should be
    able to use undefined behavior or not, and a lot of vitriol is directed
    against compiler writers who use the assumption that undefined
    behavior cannot happen in their compilers for optimization,
    especially if it turns out that existing code was broken and no
    longer works after a compiler upgrade (just read a few of Linus
    Torvalds' comments on that matter).

    I see C conflating two separate concepts: program errors and
    behavior that is outside the standard. "Undefined behavior is
    always a programming error" does not work; that would make

    #include <unistd.h>
    #include <string.h>

    int main()
    {
        char a[] = "Hello, world!\n";
        write (1, a, strlen(a));
        return 0;
    }

    not more and not less erroneous than

    int main()
    {
        int *p = 0;
        *p = 42;
    }

    whereas I would argue that there is an important difference between
    the two.

    If the C standard replaced "the behavior is undefined" with "the
    program is in error, and the subsequent behavior is undefined"
    or something along those lines, the discussion would be much
    muted.

    (Somebody may point out to me that this is what the standard is
    actually saying. If so, that would sort of reinforce my argument
    that it should be clearer :-)

  • From Anton Ertl@21:1/5 to David Brown on Sat Jan 8 17:52:02 2022
    David Brown <david.brown@hesbynett.no> writes:
    Undefined behaviour, as far as language standards are concerned, is omnipresent in programming - for all languages.

    Please prove this astounding assertion. My impression is that managed languages define everything, at least to some extent, and leave
    nothing undefined. If they allowed nasal demons, the appeal of
    managed languages would evaporate instantly.

    - anton
    --
    M. Anton Ertl
    anton@mips.complang.tuwien.ac.at
    http://www.complang.tuwien.ac.at/anton/
    [Things like .NET define a lot but they still are at the mercy
    of their environment when you ask for a variable sized chunk of
    storage. -John]

  • From Spiros Bousbouras@21:1/5 to Thomas Koenig on Sat Jan 8 22:28:00 2022
    On Sat, 8 Jan 2022 09:31:06 -0000 (UTC)
    Thomas Koenig <tkoenig@netcologne.de> wrote:
    Spiros Bousbouras <spibou@gmail.com> schrieb:
    On Thu, 6 Jan 2022 16:43:05 -0000 (UTC)
    Thomas Koenig <tkoenig@netcologne.de> wrote:
    This is a rather C-centric view of things. The Fortran standard
    uses a different model.

    There are constraints, which are numbered. Any violation of such
    a constraint needs to be reported by the compiler ("processor",
    in Fortran parlance). If it fails to do so, this is a bug in
    the compiler.

    There are also phrases which have "shall" or "shall not". If this
    is violated, this is an error in the program. Catching such a
    violation is a good thing from quality of implementation standpoint,
    but is not required. Many run-time errors such as array overruns
    fall into this category.

    This seems to me exactly like the C model. What difference do you see ?

    First, I see a difference in result. Highly intelligent and
    knowledgeable people argue vehemently about whether a program should be
    able to use undefined behavior or not, and a lot of vitriol is directed
    against compiler writers who use the assumption that undefined
    behavior cannot happen in their compilers for optimization,
    especially if it turns out that existing code was broken and no
    longer works after a compiler upgrade (just read a few of Linus
    Torvalds' comments on that matter).

    I see C conflating two separate concepts: program errors and
    behavior that is outside the standard. "Undefined behavior is
    always a programming error" does not work; that would make

    The C standard is in no position to say that some programme is in
    error. This would require near omniscience from the standard
    writers.

    #include <unistd.h>
    #include <string.h>

    int main()
    {
        char a[] = "Hello, world!\n";
        write (1, a, strlen(a));
        return 0;
    }

    not more and not less erroneous than

    int main()
    {
        int *p = 0;
        *p = 42;
    }

    whereas I would argue that there is an important difference between
    the two.

    The only difference I see between the two is that the first is defined
    by POSIX and the second is not. According to POSIX the first is required
    to print something on stdout. I cannot imagine any extension which
    would make the second programme do something useful and a conforming implementation may well compile it as essentially a no-op.

    But with something like

    int main(void) {
        int *p = 0 ;
        *p = 42 ;
        .... do other stuff ...
        return 0 ;
    }

    the C standard allows for a conforming implementation to do something
    useful like perhaps store 42 to address 0.

    If the C standard replaced "the behavior is undefined" with "the
    program is in error, and the subsequent behavior is undefined"
    or something along those lines, the discussion would be much
    muted.

    (Somebody may point out to me that this is what the standard is
    actually saying. If so, that would sort of reinforce my argument
    that it should be clearer :-)

    No , it most definitely does not say that nor could it possibly say
    that.

  • From Thomas Koenig@21:1/5 to Spiros Bousbouras on Sun Jan 9 00:09:19 2022
    Spiros Bousbouras <spibou@gmail.com> schrieb:
    On Sat, 8 Jan 2022 09:31:06 -0000 (UTC)
    Thomas Koenig <tkoenig@netcologne.de> wrote:
    Spiros Bousbouras <spibou@gmail.com> schrieb:
    On Thu, 6 Jan 2022 16:43:05 -0000 (UTC)
    Thomas Koenig <tkoenig@netcologne.de> wrote:
    This is a rather C-centric view of things. The Fortran standard
    uses a different model.

    There are constraints, which are numbered. Any violation of such
    a constraint needs to be reported by the compiler ("processor",
    in Fortran parlance). If it fails to do so, this is a bug in
    the compiler.

    There are also phrases which have "shall" or "shall not". If this
    is violated, this is an error in the program. Catching such a
    violation is a good thing from quality of implementation standpoint,
    but is not required. Many run-time errors such as array overruns
    fall into this category.

    This seems to me exactly like the C model. What difference do you see ?

    First, I see a difference in result. Highly intelligent and
    knowledgeable people argue vehemently about whether a program should be
    able to use undefined behavior or not, and a lot of vitriol is directed
    against compiler writers who use the assumption that undefined
    behavior cannot happen in their compilers for optimization,
    especially if it turns out that existing code was broken and no
    longer works after a compiler upgrade (just read a few of Linus
    Torvalds' comments on that matter).

    I see C conflating two separate concepts: program errors and
    behavior that is outside the standard. "Undefined behavior is
    always a programming error" does not work; that would make

    The C standard is in no position to say that some programme is in
    error. This would require near omniscience from the standard
    writers.

    A standard (or other specification document) is certainly able to
    state that some construct is in error. To grab an often-quoted
    example:

    J3/18-007r1, the Fortran 2018 interpretation documents, states in
    subclause 9.5.3, "Array elements and array sections",

    # The value of a subscript in an array element shall be within the
    # bounds for its dimension.

    No omniscience required to write or understand that sentence.

    This puts the burden on the programmer. The compiler might catch
    such an error and abort the program, or other unpredictable
    things such as overwriting an unrelated variable might also happen.

    Reading a language standard can be hard. Quite often, information
    is scattered throughout the text and needs to be pieced together
    to find the necessary information, especially definition of terms
    which are crucial to understanding. Most programmers do not
    read standards (at least final committee drafts can usually be
    found these days on the Internet), but compiler writers should at
    least be familiar with what they are implementing.

    Programmers often rely on books, but these can also get things wrong.

    Because programmers are human, they also can get ticked off when being
    told that a construct they have used for years has been illegal
    for decades :-|

    Having a good standard is crucial to being able to write good compilers.

  • From Spiros Bousbouras@21:1/5 to Thomas Koenig on Sun Jan 9 21:30:13 2022
    On Sun, 9 Jan 2022 00:09:19 -0000 (UTC)
    Thomas Koenig <tkoenig@netcologne.de> wrote:
    Spiros Bousbouras <spibou@gmail.com> schrieb:
    On Sat, 8 Jan 2022 09:31:06 -0000 (UTC)
    Thomas Koenig <tkoenig@netcologne.de> wrote:
    I see C conflating two separate concepts: Programm errors and
    behavior that is outside the standard. "Undefined behavior is
    always a programming error" does not work; that would make

    The C standard is in no position to say that some programme is in
    error. This would require near omniscience from the standard
    writers.

    A standard (or other specification document) is certainly able to
    state that some construct is in error. To grab an often-quoted
    example:

    J3/18-007r1, the Fortran 2018 interpretation documents, states in
    subclause 9.5.3, "Array elements and array sections",

    # The value of a subscript in an array element shall be within the
    # bounds for its dimension.

    No omniscience required to write or understand that sentence.

    This puts the burden on the programmer. The compiler might catch
    such an error and abort the program, or other unpredictable
    things such as overwriting an unrelated variable might also happen.

    I haven't read any Fortran standards so I can only go by the above quote.
    Only the programmer knows what their requirements are and why they think that the code they wrote will meet those requirements. My idea of error is that either the code does not meet the requirements or it does so only by accident and the programmer does not have a correct reasoning as to why their code
    will meet those requirements. You seem to be reading the quote as saying

    No matter what the programmer requirements and no matter what extensions
    their Fortran implementation offers , the programmer requirements will
    not be justifiably met if they use an array subscript outside the bounds
    for its dimension.

    Perhaps some Fortran implementation gives information as to the layout of distinct variables so that one knows what will be overwritten by writing off the bounds of some array and it will be overwritten in the way the programmer wants. Unlikely (especially for Fortran) but it cannot be excluded. I can imagine a C implementation for small embedded systems which does provide such information and a programmer using it to reduce the number of instructions to achieve a desired result. A more realistic example is the following :

    #include <stdio.h>

    int main(void) {
        int a = 12 , b = 14 ;
        printf("%2$d %1$d\n" , a , b) ;
        return 0 ;
    }

    The above code has undefined behaviour according to the C standard. It is defined according to POSIX. Whether it is in error depends on whether the programmer really wanted to print
    14 12

    and no standards committee can possibly know this. So I still think that your reading requires omniscience from the Fortran standard writers. But perhaps there are other parts of the standard which justify your reading. For example some parts of the Common Lisp standard do state that an implementation must
    not extend some construct to provide useful functionality beyond what the standard specifies. I don't remember precisely how it states it and I can't find those parts now.

    Reading a language standard can be hard. Quite often, information
    is scattered throughout the text and needs to be pieced together
    to find the necessary information, especially definition of terms
    which are crucial to understanding. Most programmers do not
    read standards (at least final committee drafts can usually be
    found these days on the Internet), but compiler writers should at
    least be familiar with what they are implementing.

    Programmers often rely on books, but these can also get things wrong.

    C books at least usually don't go into the fine details of undefined
    behaviour. To hone one's instincts in this area one should spend a few
    months systematically reading comp.lang.c while consulting a draft
    of the standard !

    Because programmers are human, they also can get ticked off when being
    told that a construct they have used for years has been illegal
    for decades :-|

    This may happen but my impression with C is that the strongest complaints
    come from people who

    - have read the C standard (or at least the relevant parts of it)

    - know that their code has undefined behaviour and know what the term means

    - they do not rely on any compiler extensions

    yet still feel certain (dare I say "entitled" ?) that their code ought to behave in a certain way. For an extreme example see Robert M. Hyatt of
    crafty fame (a chess programme which has won awards in the past) : http://www.open-chess.org/viewtopic.php?f=5&t=2519 .
    [Fortran used to require that arrays were stored in column major order, that double precision took twice the space of real and integer, and you were allowed to use EQUIVALENCE and adjustable dimensions in argument arrays to do overlaying
    assuming that layout. Dunno how much more modern Fortran has deprecated it. -John]

  • From David Brown@21:1/5 to Thomas Koenig on Sun Jan 9 23:00:46 2022
    On 08/01/2022 10:31, Thomas Koenig wrote:
    Spiros Bousbouras <spibou@gmail.com> schrieb:

    This seems to me exactly like the C model. What difference do you see ?

    First, I see a difference in result. Highly intelligent and
    knowledgeable people argue vehemently about whether a program should be
    able to use undefined behavior or not, and a lot of vitriol is directed
    against compiler writers who use the assumption that undefined
    behavior cannot happen in their compilers for optimization,
    especially if it turns out that existing code was broken and no
    longer works after a compiler upgrade (just read a few of Linus
    Torvalds' comments on that matter).

    People want compilers to do what the programmer meant, not what he or
    she wrote. And in particular, if a compiler did one thing once, they
    want it to continue to do the same thing with the same code - as long as
    they got what they wanted the first time round.

    This is, of course, entirely natural for humans. But it is not natural
    for computer programs like compilers.

    Linus Torvalds is known for blowing his top on matters that he either
    does not understand, or when he has mixed his personal opinions with
    facts, or while only looking at a small part of the big picture. (He is
    also known as an incredible programmer, a world-class project leader,
    and a charismatic visionary who revolutionised the software world - but
    that's beside the point here!).

    A key example of his complaints in this area revolves around a function
    that was something equivalent to:

    int foo(int * p) {
        int x = *p;
        if (!p) return -1;
        return x;
    }

    His complaint was that the compiler saw that "*p" was accessed, and
    therefore assumed "p" could not be zero and optimised away the test.
    The compiler did exactly what it was asked to do - the optimisation is perfectly valid according to the C standards and additional definitions
    given by the compiler. But it was not what the programmer wanted, and
    not what older versions of the compiler had done.
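    The conventional fix, shown here only as a sketch, is to test the
    pointer before dereferencing it; then there is no prior access for the
    compiler to reason from, and the check cannot be optimised away:

    int foo(int * p) {
        if (!p) return -1;
        return *p;
    }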

    Of course, when a new optimisation simply makes object code more
    efficient, programmers want that - they don't /always/ want the compiler
    to handle things the way older versions did. They want the compiler to
    read their minds and see what they meant to write, and generate optimal
    code for that.


    None of this is helped by the fact that C code often has to work
    efficiently on a variety of targets and compilers, and some compilers
    give extra guarantees about how they interpret code beyond the
    definitions given in the C standards. Many more compilers can be relied
    upon in practice to work in particular ways, though they don't guarantee
    or document it, and this means the most efficient code that works in
    practice on one compiler may be wrong and give incorrect results on
    another compiler. You can write C code that is correct and widely
    portable, but you can't write C code that is correct, optimally
    efficient, and widely portable.



    The big question here, is why do you think Fortran is any different? In theory, there isn't a difference - nothing you have said here convinces
    me that there is any fundamental difference between Fortran and C in
    regards to undefined behaviour. (And there's no difference in the implementations - the most commonly used Fortran compilers also handle
    C, C++, and perhaps other languages.)

    I believe it is a matter of who writes Fortran programs, and what these programs do. Now, I don't know or use Fortran myself, so I might be
    wrong here. However, it seems to me that Fortran is typically used by experienced professional programmers and for scientific or numerical programming. C is used by a much wider range of programmers, for a much
    wider range of programming tasks. I think it is inevitable that you'll
    get more people programming in C when they are not fully sure of what
    they are doing, more code where subtle mistakes can be made, more people
    using C when other languages would have been better choices, and more C programmers who are likely to blame their tools for their own mistakes.




    I see C conflating two separate concepts: program errors and
    behavior that is outside the standard. "Undefined behavior is
    always a programming error" does not work; that would make

    #include <unistd.h>
    #include <string.h>

    int main()
    {
        char a[] = "Hello, world!\n";
        write (1, a, strlen(a));
        return 0;
    }


    C does not have a "write" function in the standard library. So the
    behaviour of "write" is not defined by the C standards - but that does
    not mean the behaviour is undefined. It just means it is defined
    elsewhere, not in the C standards. If the programmer doesn't know what
    the "write" function does or how it is specified, then it might be
    undefined behaviour - certainly it is bad programming.


    not more and not less erroneous than

    int main()
    {
        int *p = 0;
        *p = 42;
    }

    whereas I would argue that there is an important difference between
    the two.


    There is no fundamental difference - if you know the behaviour is
    defined, it is defined. (The program is then correct or incorrect
    depending on how that definition matches your requirements.) If not, it
    is undefined (and incorrect). In neither case is the behaviour defined
    by the C standard, but the behaviour could be defined by something else (library documentation or external definition of "write", or a C
    compiler that specifically says it defines the behaviour of
    dereferencing null pointers).

    If the C standard replaced "the behavior is undefined" with "the
    program is in error, and the subsequent behavior is undefined"
    or something along those lines, the discussion would be much
    muted.


    That sounds like you dislike the "time travel" aspect of C's undefined behaviour. Many would agree with that - they don't like the idea that undefined behaviour later in the program can be used to change the
    behaviour of code earlier on. The C standard considers undefined
    behaviour to be program-wide - if you execute something that has
    undefined behaviour (remembering that this means there is no definition /anywhere/ of what will happen), the whole program is wrong and you
    can't expect anything from it.

    People often find this disturbing. They think perhaps it is fair enough
    that dereferencing a null pointer can crash a program, but it shouldn't
    affect things that came before it.

    However, there are two key points to think about. First, the standard's handling of undefined behaviour means that a compiler /can/ use UB to
    change the object code generated for earlier source code, not that it
    /must/ do so. A compiler always balances efficient code generation with ease-of-use and ease-of-debugging. The ideal balance point will depend
    on the programmer writing the code, so compiler flags are used to tune
    it, but surprises can still happen.

    The other point is to consider how the standards could say anything
    else. If the standards required observable behaviour to be completed
    before undefined behaviour occurred, the results would be terrible. Dereferencing a null pointer or dividing by zero could cause a complete
    crash (remember the "Windows for Warships" affair? A single divide by
    zero brought the whole ship network down, leaving it dead in the water
    for hours). That means the compiler would need to make sure any
    volatile writes had hit main memory before reading a pointer. It would
    have to ensure all file stream buffers were flushed to disk before doing
    a division. You can be sure Linus Torvalds would have a thing or two to
    say about such a compiler.

    (Somebody may point out to me that this is what the standard is
    actually saying. If so, that would sort of reinforce my argument
    that it should be clearer :-)
    [Fortran has in principle historically allowed rather aggressive optimization, e.g., A*B+A*C can turn into A*(B+C). On the other hand, in the real world, when IBM improved their optimizing compiler Fortran H into Fortran X, the developers said any new optimization had to produce bit identical results
    to what the old compiler did. So this is not a new issue. -John]

  • From David Brown@21:1/5 to Anton Ertl on Sun Jan 9 23:53:52 2022
    On 08/01/2022 18:52, Anton Ertl wrote:
    David Brown <david.brown@hesbynett.no> writes:
    Undefined behaviour, as far as language standards are concerned, is
    omnipresent in programming - for all languages.

    Please prove this astounding assertion. My impression is that managed languages define everything, at least to some extent, and leave
    nothing undefined. If they allowed nasal demons, the appeal of
    managed languages would evaporate instantly.


    Certainly managed languages define far more than unmanaged languages.
    But equally certainly, they do not define everything.

    In Python, I can write :

    x = flooble(123)

    Nowhere in any part of the documentation for Python is a definition of
    what the function "flooble" should do. Calling it is /undefined
    behaviour/ as far as the language standards are concerned.

    Certainly some aspects of calling it - such as the calling convention -
    are defined. What should happen if the function does not exist is
    defined. But the language and the standards do not define the behaviour
    of "flooble".


    Being "undefined behaviour as far as the language standards are
    concerned" does not mean you can get nasal daemons, it means that the
    language standards do not say what will happen. When one says "Division
    by 0 is undefined behaviour in C", that is what is meant - as a compiler
    or a host OS could give you well-defined and predictable behaviour when
    you attempt to divide by 0.

    A managed language may put limits on the kind of effect of undefined
    behaviour. In Python (at least, CPython), it is possible to call
    externally defined functions in shared libraries - even if the Python
    bytecode interpreter limits possible effects of pure Python code,
    calling external functions gets around those limits. I suppose you
    could have a more locked-down managed language that does not allow any
    external code, and has additional tracking on things like data space
    usage, time usage, and other resources to stop run-away code.

    Within such a closed language, you could have defined behaviour for all
    code, since any code run or functions called would be in the same
    language and have their definitions clear to the interpreter.


    Personally, I don't see minimising undefined behaviour as part of the
    appeal of managed languages. I make as much effort not to divide by
    zero or work with invalid references in my Python code as I do in my C
    code - it doesn't much matter if the program stops with Python exception
    or a crash. I use Python for the convenience of working with strings, dictionaries, and other data structures with little concern for memory management, for its libraries, and other high-level features.

    When running unknown code - such as javascript from a website - it is
    vital that the effect of any code is limited. Code may have behaviour
    that is undefined by the language standards, but it will be defined by
    other parts of the code or by its environment (browser, built-in
    libraries, etc.). And while it may crash the javascript program or hang
    the browser, it should never be able to launch nasal daemons.

  • From Thomas Koenig@21:1/5 to David Brown on Mon Jan 10 12:04:02 2022
    David Brown <david.brown@hesbynett.no> schrieb:

    The big question here, is why do you think Fortran is any different? In theory, there isn't a difference - nothing you have said here convinces
    me that there is any fundamental difference between Fortran and C in
    regards to undefined behaviour.

    I am not sure how to better explain it. I will try a bit, but
    this will be my last reply to you in this thread. We seem to have
    a fundamental difference in our understanding, and seem to be
    unable to resolve it.

    (And there's no difference in the
    implementations - the most commonly used Fortran compilers also handle
    C, C++, and perhaps other languages.)

    Sort of.

    At the risk of boring most readers of this group, a very short, but
    (hopefully) pertinent introduction of how modern compilers work:

    A front end translates the source to an abstract syntax tree (which
    you can view with gfortran with -fdump-fortran-original) and from
    that into an intermediate representation (which you can view with
    gfortran, or with gcc in general, with -fdump-tree-original).
    This intermediate representation is then optimized, in
    an architecture-independent way (usually using SSA) and then
    translated into assembler or directly to object code using a
    "back end", of which many compilers also have several.

    An example: The program

    print *,"Hello, world"
    end

    is translated into (code only)

    WRITE UNIT=6 FMT=-1
    TRANSFER 'Hello, world'
    DT_END

    and then, in the intermediate representation.

    MAIN__ ()
    {
      {
        struct __st_parameter_dt dt_parm.0;

        dt_parm.0.common.filename = &"hello.f90"[1]{lb: 1 sz: 1};
        dt_parm.0.common.line = 2;
        dt_parm.0.common.flags = 128;
        dt_parm.0.common.unit = 6;
        _gfortran_st_write (&dt_parm.0);
        _gfortran_transfer_character_write (&dt_parm.0, &"Hello, world"[1]{lb: 1 sz: 1}, 12);
        _gfortran_st_write_done (&dt_parm.0);
      }
    }

    There is no compiler (if you mean a single binary) that handles both
    C and Fortran. They are separate front ends to common middle
    and back ends.

    And there are certainly differences in the code that the front
    ends hand to the middle end, so saying that there is "no
    difference in the implementations" is not correct.

    I see C conflating two separate concepts: program errors and
    behavior that is outside the standard. "Undefined behavior is
    always a programming error" does not work; that would make

    #include <unistd.h>
    #include <string.h>

    int main()
    {
        char a[] = "Hello, world!\n";
        write (1, a, strlen(a));
        return 0;
    }


    C does not have a "write" function in the standard library. So the
    behaviour of "write" is not defined by the C standards - but that does
    not mean the behaviour is undefined.

    When interpreting a language standard, you _must_ follow the
    definitions in the standards if they exist; you cannot use everyday
    interpretations.

    Subclause 3.4.3 (N2596) defines

    # undefined behavior

    # behavior, upon use of a nonportable or erroneous program
    # construct or of erroneous data, for which this document imposes
    # no requirements

    write() is nonportable and the C standard imposes no requirements
    on it. Therefore, the program above invokes undefined behavior.



    It just means it is defined
    elsewhere, not in the C standards.

    Nope, see above.

    (If you replaced every occurrence of "undefined behavior" in the C
    standard with "WRTLPFMFT behavior" and "the behavior is undefined"
    with "the behavior is WRTLPFMFT", the meaning of the standard
    would not change.)
    [It seems like nitpicking here. Yes, the C and POSIX standards are
    different things, but we all know how common it is to use them
    together. -John]

  • From gah4@21:1/5 to Thomas Koenig on Mon Jan 10 16:58:55 2022
    On Saturday, January 8, 2022 at 10:11:55 AM UTC-8, Thomas Koenig wrote:

    (snip)

    I see C conflating two separate concepts: program errors and
    behavior that is outside the standard. "Undefined behavior is
    always a programming error" does not work; that would make

    #include <unistd.h>
    #include <string.h>

    int main()
    {
        char a[] = "Hello, world!\n";
        write (1, a, strlen(a));
        return 0;
    }

    Without the:

    #include <unistd.h>

    I agree that this would be undefined behavior. But with the include file,
    you are agreeing to use whatever standard the include file belongs to.

    The include file declares the arguments to write(), and, more importantly,
    indicates that you will either supply write() in another file or use an
    otherwise supplied library that defines it.

  • From David Brown@21:1/5 to Thomas Koenig on Tue Jan 11 18:16:28 2022
    On 10/01/2022 13:04, Thomas Koenig wrote:
    David Brown <david.brown@hesbynett.no> schrieb:

    The big question here, is why do you think Fortran is any different? In
    theory, there isn't a difference - nothing you have said here convinces
    me that there is any fundamental difference between Fortran and C in
    regards to undefined behaviour.

    I am not sure how to better explain it. I will try a bit, but
    this will be my last reply to you in this thread. We seem to have
    a fundamental difference in our understanding, and seem to be
    unable to resolve it.


    Fair enough. Maybe in a future discussion, one of us will have an
    "Aha!" moment and understand the other's viewpoint, and progress will be
    made - until then, there's no point in going around in circles. I'll
    snip bits of your post here, and try to minimise new points (unless I
    get that "Aha!") - but be sure I am reading and appreciating your entire
    post.

    (And there's no difference in the
    implementations - the most commonly used Fortran compilers also handle
    C, C++, and perhaps other languages.)

    Sort of.

    At the risk of boring most readers of this group, a very short, but (hopefully) pertinent introduction of how modern compilers work:


    There is no compiler (if you mean a single binary) that handles both
    C and Fortran. They are separate front ends to common middle
    and back ends.

    Yes. But it is the middle end that handles most of the optimisations, including those based on undefined behaviour. The front end determines
    whether code can have undefined behaviour and in what circumstances.

    C does not have a "write" function in the standard library. So the
    behaviour of "write" is not defined by the C standards - but that does
    not mean the behaviour is undefined.

    When interpreting at a language standard, you _must_ follow the
    definitions in the standards if they exist, you cannot use everyday interpretations.

    Subclause 3.4.3 (N2596) defines

    # undefined behavior

    # behavior, upon use of a nonportable or erroneous program
    # construct or of erroneous data, for which this document imposes
    # no requirements

    write() is nonportable and the C standard imposes no requirements
    on it. Therefore, the program above invokes undefined behavior.

    No. (As always, this is based on my interpretation of the standards -
    consider everything to have "IMHO" attached.) The implementation of
    "write" is outside the scope of the standards, and is therefore
    undefined as far as the standards are concerned. That does not make it undefined behaviour in the program - it just means the standards don't
    say what "write" should do.

  • From Kaz Kylheku@21:1/5 to Anton Ertl on Tue Jan 11 16:55:54 2022
    On 2022-01-08, Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    David Brown <david.brown@hesbynett.no> writes:
    Undefined behaviour, as far as language standards are concerned, is omnipresent in programming - for all languages.

    Please prove this astounding assertion. My impression is that managed languages define everything, at least to some extent, and leave
    nothing undefined. If they allowed nasal demons, the appeal of
    managed languages would evaporate instantly.

    The Lisp-like programming language Scheme has unspecified order of
    argument evaluation. And you can stuff side effects into argument
    expressions, like in C.

    Its built-in imperative operations have undefined return values.

    ANSI Common Lisp leaves the effects undefined of modifying literals,
    just like C. ANSI Lisp code that perpetrates some kind of error is
    safe only if compiled in safe mode; if you compile with reduced safety,
    e.g. (declare (optimize (safety 0))), then errors become undefined
    behavior, including type errors. If you declare that some quantity is
    a fixnum integer, and request safety 0 speed 3, and then it turns
    out that it's other than an integer, woe to that code.
    However, in these cases you're invoking the safety escape hatch;
    it's not like C where you are shackled by chains of undefined behavior
    which make themselves felt every time you squirm.

  • From Kaz Kylheku@21:1/5 to David Brown on Tue Jan 11 19:19:31 2022
    On 2022-01-11, David Brown <david.brown@hesbynett.no> wrote:
    On 10/01/2022 13:04, Thomas Koenig wrote:
    David Brown <david.brown@hesbynett.no> schrieb:

    The big question here is: why do you think Fortran is any different? In theory, there isn't a difference - nothing you have said here convinces
    me that there is any fundamental difference between Fortran and C in
    regards to undefined behaviour.

    I am not sure how to better explain it. I will try a bit, but
    this will be my last reply to you in this thread. We seem to have
    a fundamental difference in our understanding, and seem to be
    unable to resolve it.

    Fair enough. Maybe in a future discussion, one of us will have an
    "Aha!" moment and understand the other's viewpoint, and progress will be
    made - until then, there's no point in going around in circles. I'll
    snip bits of your post here, and try to minimise new points (unless I
    get that "Aha!") - but be sure I am reading and appreciating your entire post.

    (And there's no difference in the
    implementations - the most commonly used Fortran compilers also handle
    C, C++, and perhaps other languages.)

    Sort of.

    At the risk of boring most readers of this group, a very short, but
    (hopefully) pertinent introduction to how modern compilers work:


    There is no compiler (if you mean a single binary) that handles both
    C and Fortran. They are separate front ends to common middle
    and back ends.

    Yes. But it is the middle end that handles most of the optimisations, including those based on undefined behaviour. The front end determines whether code can have undefined behaviour and in what circumstances.

    More precisely, optimizations are based on the absence of undefined
    behavior: the assumption that contracts are being upheld.

    More precisely, that contracts are being upheld in the face of the
    inability to determine and diagnose statically whether they are
    violated; i.e. there is a "blind trust". (Though there do exist
    situations in which, in principle, undefined behavior is easily
    deducible at translation time, with no requirement that the compiler
    diagnose it.)
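
    (A hedged illustration, not code from the thread: here the undefined
    behaviour is obvious at translation time, yet a conforming compiler
    is not required to reject the program or even warn about it.)

    int deref_null(void)
    {
        int *p = 0;
        return *p;   /* undefined behaviour, visible statically;
                        no diagnostic is required */
    }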

    Front-ends for different languages are written to the respective
    requirements of those languages. Their first aim is to handle
    well-defined constructs and situations. They target the intermediate
    language of the compiler's middle end. That language has its own contracts.
    The front end for each respective language has to ensure that every
    situation in which behavior is defined (contract is upheld) is
    translated to reliable intermediate code whose contract is upheld.
    Care has to be taken that the intermediate code is expressed in the
    right way so that it will not change behavior in invalid ways due to optimizations.

    This leaves a lot of room for Fortran and C to have entirely different defined/undefined behaviors.

    Even the front end for one single language can have a lot of switches
    affecting what is defined or not.

    There could be a switch which says that overflowing integer addition has
    two's complement wrapping behavior. In that case, the compiler then
    selects the intermediate instructions which provide that behavior
    reliably (possibly simulating signed arithmetic with unsigned), and
    also disables any inferences in the front end that might be based on the assumption that overflow has not occurred.
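
    As a sketch of what such a switch amounts to (assumptions mine, not a
    description of any particular compiler), wrapping signed addition can
    be simulated with unsigned arithmetic, which never overflows in the
    undefined-behaviour sense:

    #include <stdint.h>

    /* Two's complement wrapping add for 32-bit signed integers.  The
       unsigned addition is fully defined; converting the result back to
       int32_t is implementation-defined in ISO C, but yields the wrapped
       value on the usual two's complement targets. */
    int32_t wrapping_add(int32_t a, int32_t b)
    {
        return (int32_t)((uint32_t)a + (uint32_t)b);
    }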

    C does not have a "write" function in the standard library. So the
    behaviour of "write" is not defined by the C standards - but that does
    not mean the behaviour is undefined.

    When interpreting a language standard, you _must_ follow the
    definitions in the standards if they exist; you cannot use everyday
    interpretations.

    Subclause 3.4.3 (N2596) defines

    # undefined behavior

    # behavior, upon use of a nonportable or erroneous program
    # construct or of erroneous data, for which this document imposes
    # no requirements

    write() is nonportable and the C standard imposes no requirements
    on it. Therefore, the program above invokes undefined behavior.

    No. (As always, this is based on my interpretation of the standards -

    Yes; using any function that is not in the C program, or in the
    standard, is ISO C undefined behavior.

    A program which includes <unistd.h> is not required to compile
    according to ISO C; it can fail with an error message about the
    header not being defined. Or, #include <unistd.h> is allowed, in
    a conforming implementation, to bring in tokens which have nothing
    to do with POSIX.

    Furthermore, a program which calls write, and does not provide such a
    function itself, is not required to successfully link. If it does link,
    there is no requirement that this symbol is a function described by
    POSIX.

    POSIX implementations have to go out of their way to allow C programs
    to use write as an external name, which ISO C allows.

    For instance, the GNU C Library defines write as a weak symbol for
    some identifier which resembles __libc_write: the "strong" symbol.

    The C library internally uses only that __libc_write: it never calls
    write, because user code could replace it:

    int write(char *x) { ... }   /* the program's own function named "write" ... */

    double write = 42.0;         /* ... or even a data object with that name */

    When the application defines the external name write, the weak symbol
    coming from glibc yields; it is suppressed in favor of the program's definition.
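
    A rough, self-contained sketch of that mechanism using GCC extensions
    (the names here are made up for illustration; this is not glibc
    source):

    /* toylib.c */
    #include <stdio.h>

    void __toylib_greet(void) { puts("library version of greet"); }

    /* Export "greet" only as a weak alias for the strong symbol, so a
       program that defines its own external "greet" overrides it at
       link time; internal library callers would use __toylib_greet and
       remain unaffected. */
    void greet(void) __attribute__((weak, alias("__toylib_greet")));

    /* main.c -- the program's own strong definition wins over the alias */
    #include <stdio.h>
    void greet(void) { puts("program version of greet"); }
    int main(void) { greet(); return 0; }

    Linking the two files together and running the result prints the
    program's version, much as described above for write and __libc_write.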

    consider everything to have "IMHO" attached.) The implementation of
    "write" is outside the scope of the standards, and is therefore
    undefined as far as the standards are concerned. That does not make it undefined behaviour in the program - it just means the standards don't
    say what "write" should do.

    Right; it's "ISO C formal undefined behavior", not "behavior that is
    not defined by any party whatsoever" ... though it could well be.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to Kaz Kylheku on Tue Jan 11 14:18:56 2022
    On Tuesday, January 11, 2022 at 11:47:26 AM UTC-8, Kaz Kylheku wrote:

    (big snip)

    This leaves a lot of room for Fortran and C to have entirely different defined/undefined behaviors.

    Even the front end for one single language can have a lot of switches affecting what is defined or not.

    I suppose so. But more usually, the compiler works to the least
    common denominator.

    For one, C requires static variables, and especially external ones, to
    be initialized to zero, but Fortran doesn't. Fortran compilers that use C
    compiler middle and back ends tend to zero such variables.

    I suspect that there are many more that I don't know about.
    As long as the cost is small, and it satisfies both standards,
    not much reason not to do it.

    Fortran has stricter rules on aliasing than C. I don't actually know
    of any effect on C programs, though it might be that compilers apply
    the same rules to C.

    One that is not C or Fortran, but IEEE 754, is the effect of
    relational operators with NaN. Comparisons with NaN,
    except for "not equal", return false. That means that compilers
    have to be careful when optimizing such comparisons, and especially that
    "greater than or equal" is not the logical complement of "less than".
    (I haven't looked at how compilers handle this, or, even more,
    how the hardware handles it.)
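
    A small C illustration of that point (behaviour as required by IEEE
    754 for quiet NaNs; the comments show the expected output):

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double x = 1.0, nan_val = NAN;

        /* every ordered comparison involving a NaN is false: */
        printf("%d %d %d %d\n",
               x < nan_val, x <= nan_val, x > nan_val, x >= nan_val); /* 0 0 0 0 */

        /* so (x >= y) is not the logical complement of (x < y): */
        printf("%d %d\n", x >= nan_val, !(x < nan_val));              /* 0 1 */

        /* == is false, != is true: */
        printf("%d %d\n", nan_val == nan_val, nan_val != nan_val);    /* 0 1 */
        return 0;
    }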

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From George Neuner@21:1/5 to 480-992-1380@kylheku.com on Tue Jan 11 22:01:37 2022
    On Tue, 11 Jan 2022 16:55:54 -0000 (UTC), Kaz Kylheku <480-992-1380@kylheku.com> wrote:

    On 2022-01-08, Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    David Brown <david.brown@hesbynett.no> writes:
    Undefined behaviour, as far as language standards are concerned, is omnipresent in programming - for all languages.

    Please prove this astounding assertion. My impression is that managed
    languages define everything, at least to some extent, and leave
    nothing undefined. If they allowed nasal demons, the appeal of
    managed languages would evaporate instantly.

    The Lisp-like programming language Scheme has unspecified order of
    argument evaluation. And you can stuff side effects into argument expressions, like in C.

    In Scheme the order of evaluation for let expressions similarly is
    unspecified.

    There is at least one Scheme which deliberately randomizes the order
    of function argument and let evaluation. And there are parallel
    Schemes which evaluate function arguments and lets in parallel.


    Its built-in imperative operations have undefined return values.

    ANSI Common Lisp leaves the effects undefined of modifying literals,
    just like C. ANSI Lisp code that perpetrates some kind of error is
    safe only if compiled in safe mode; if you compile with reduced safety,
    e.g. (declare (optimize (safety 0))), then errors become undefined
    behavior, including type errors. If you declare that some quantity is
    a fixnum integer, and request safety 0 speed 3, and then it turns
    out that it's other than an integer, woe to that code.
    However, in these cases you're invoking the safety escape hatch;
    it's not like C where you are shackled by chains of undefined behavior
    which make themselves felt every time you squirm.

    And Lisp's optimization settings can be changed per function or per
    compilation unit as well as globally. ["declaim" vs "declare"]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Koenig@21:1/5 to gah4@u.washington.edu on Wed Jan 12 19:02:48 2022
    gah4 <gah4@u.washington.edu> schrieb:
    On Tuesday, January 11, 2022 at 11:47:26 AM UTC-8, Kaz Kylheku wrote:

    (big snip)

    This leaves a lot of room for Fortran and C to have entirely different
    defined/undefined behaviors.

    Even the front end for one single language can have a lot of switches
    affecting what is defined or not.

    I suppose so. But more usually, the compiler works to the least
    common denominator.

    For one, C requires static variables, and especially external ones, to be initialized to zero, but Fortran doesn't. Fortran compilers that use C compiler middle and back ends tend to zero such variables.

    This is more a matter of operating system and linker conventions
    than of compilers.

    Looking at the ELF standard, one finds

    .bss

    This section holds uninitialized data that contribute to the program's
    memory image. By definition, the system initializes the data with zeros
    when the program begins to run. The section occupies no file space, as indicated by the section type, SHT_NOBITS.

    which, unsurprisingly, matches exactly what C is doing.

    Anybody who writes a Fortran compiler for an ELF system will
    use .bss for COMMON blocks, because it is easiest. Initialization
    with zeros then happens automatically.
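
    A quick way to see this on a typical ELF system (a sketch; the exact
    nm output and compiler defaults vary by toolchain):

    $ cat bssdemo.c
    int uninitialized_global[1000];        /* no initializer */
    int initialized_global[1000] = { 1 };  /* has an initializer */
    $ cc -c -fno-common bssdemo.c
    $ nm bssdemo.o
    0000000000000000 D initialized_global
    0000000000000000 B uninitialized_global

    The 'B' marks a .bss symbol (zero-filled at program start, occupying
    no file space), 'D' a .data symbol; with -fcommon the uninitialized
    variable would instead appear as a 'C' (common) symbol.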

    I suspect that there are many more that I don't know about.
    As long as the cost is small, and it satisfies both standards,
    not much reason not to do it.

    Fortran has stricter rules on aliasing than C. I don't actually know
    of any effect on C programs, though it might be that compilers apply
    the same rules to C.

    The rules are different, and unless C is the intermediate language,
    a good compiler will hand the corresponding hints to the middle end.
    [I have used Fortran systems that initialized otherwise undefined data to a value that would
    trap, to help find use-before-set errors. -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Thomas Koenig on Thu Jan 13 08:24:32 2022
    On 12/01/2022 20:02, Thomas Koenig wrote:
    gah4 <gah4@u.washington.edu> schrieb:
    On Tuesday, January 11, 2022 at 11:47:26 AM UTC-8, Kaz Kylheku wrote:

    For one, C requires static variables, and especially external ones, to
    be initialized to zero, but Fortran doesn't. Fortran compilers that use C
    compiler middle and back ends tend to zero such variables.

    This is more a matter of operating system and linker conventions
    than of compilers.

    Looking at the ELF standard, one finds

    .bss

    This section holds uninitialized data that contribute to the program's
    memory image. By definition, the system initializes the data with zeros
    when the program begins to run. The section occupies no file space, as indicated by the section type, SHT_NOBITS.

    which, unsurprisingly, matches exactly what C is doing.

    Anybody who writes a Fortran compiler for an ELF system will
    use .bss for COMMON blocks, because it is easiest. Initialization
    with zeros then happens automatically.

    I was under the impression that FORTRAN compilers typically put data in
    the ".common" section of object files. A key difference between .common
    and .bss is that (with standard linker setup) duplicate symbols in .bss
    are an error, while duplicate symbols in .common are merged. But in C
    startup code, .common is also zeroed (FORTRAN may have different startup
    code here - with no experience of the language, I don't know such details).

    The use of ".common" by C compilers such as gcc was common practice
    precisely to improve compatibility with FORTRAN in the early days, and
    it let people write "int global_x;" in headers and have everything work,
    rather than the correct practice of "extern int global_x;" in headers
    and a single "int global_x;" in one object file. The big disadvantages
    are that if you have "int local_x;" in two files, and don't use static,
    they'll be merged with no error, and if you have "int global_x;" in one
    file and "double global_x;" in another, it's a mess. Modern gcc now
    uses "-fno-common" to avoid this.


    I suspect that there are many more that I don't know about.
    As long as the cost is small, and it satisfies both standards,
    not much reason not to do it.

    Fortran has stricter rules on aliasing than C. I don't actually know
    of any effect on C programs, though it might be that compilers apply
    the same rules to C.

    The rules are different, and unless C is the intermediate language,
    a good compiler will hand the corresponding hints to the middle end.

    AFAIUI the difference in aliasing rules is that in FORTRAN, pointer or
    array parameters are assumed not to alias, while in C the compiler must
    assume that they might alias, unless you use "restrict". Are there
    other differences?
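
    For comparison, a minimal C sketch: "restrict" gives the C compiler
    roughly the no-aliasing guarantee that Fortran assumes for its dummy
    arguments by default.

    #include <stddef.h>

    /* Without restrict, the compiler must allow for dst and src
       overlapping; with restrict, the caller promises they do not,
       which permits Fortran-style optimisation of the loop. */
    void scale(double * restrict dst, const double * restrict src,
               size_t n, double k)
    {
        for (size_t i = 0; i < n; i++)
            dst[i] = k * src[i];
    }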

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Koenig@21:1/5 to Thomas Koenig on Thu Jan 13 11:17:13 2022
    Thomas Koenig <tkoenig@netcologne.de> schrieb:

    [I have used Fortran systems that initialized otherwise undefined
    data to a value that would trap, to help find use-before-set errors.
    -John]

    That is usually still available, but optional. A short example:

    $ cat a.f90
    program main
    print *,a
    end program main
    $ gfortran -g -ffpe-trap=invalid -finit-real=snan a.f90
    $ ./a.out

    Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

    with a backtrace pointing to the offending line.

    It does not necessarily work on COMMON blocks, though.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)