Forum: >>> Magnum BBS <<<

Word For Today: “Uglification”

From Lawrence D'Oliveiro@21:1/5 to All on Tue Mar 12 00:14:19 2024

From /usr/include/«arch»/bits/select.h on my Debian system:

#define __FD_ZERO(s) \
do { \
unsigned int __i; \
fd_set *__arr = (s); \
for (__i = 0; __i < sizeof (fd_set) / sizeof (__fd_mask); ++__i) \
__FDS_BITS (__arr)[__i] = 0; \
} while (0)

Note how this macro brings the entire expression for “s” into the
scope containing those temporary “__i” and “__arr” variables. You just better hope they won’t clash.
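
To make the clash concrete, here is a minimal sketch. COUNT_BELOW and both functions are made-up names for illustration; the macro is written the same way as __FD_ZERO, but with ordinary temporary names so the collision is easy to trigger.

```c
/* Hypothetical macro, structured like __FD_ZERO but using ordinary
   (non-reserved) names for its temporaries. */
#define COUNT_BELOW(limit, out) \
    do { \
        int i; \
        int total = 0; \
        for (i = 0; i < (limit); ++i) \
            ++total; \
        (out) = total; \
    } while (0)

/* The parameter is deliberately named i. After expansion, "(limit)"
   refers to the macro's inner i rather than the parameter, so the loop
   condition becomes "i < (i)" and the body never runs. */
int count_up_to(int i)
{
    int result = -1;
    if (i < 0)
        return result;
    COUNT_BELOW(i, result);
    return result;
}

/* The same code with a parameter name that does not collide. */
int count_up_to_safe(int n)
{
    int result = -1;
    if (n < 0)
        return result;
    COUNT_BELOW(n, result);
    return result;
}
```

With these definitions, count_up_to(5) yields 0 while count_up_to_safe(5) yields 5, and the compiler is not required to say anything about the difference.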

I think there is a clause in the C spec that says names beginning with underscores (“uglified” names, I think they’re called) are reserved
for library implementors or something. But what happens if one library implementation depends on another? What keeps the choices of names
from clashing in that situation? Just luck, I guess.

Basically, string-substitution macros are fiddly, fragile, and prone
to mysterious syntax errors in the target language if you’re not
careful. I thought initially that this was down to limitations in
the C/C++ preprocessor; maybe a more powerful one, like m4, would
help. But it turns out that is even more fiddly and fragile, and prone to mysterious syntax errors if you’re not careful.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Tue Mar 12 01:34:20 2024

On Mon, 11 Mar 2024 18:15:06 -0700, Keith Thompson wrote:

If the standard library includes code from two or more different implementers, all implementers have a very strong interest in avoiding
any clashes. I don't see a real problem here.

... until the Birthday Paradox comes into play.


From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Tue Mar 12 03:30:55 2024

On 2024-03-12, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

From /usr/include/«arch»/bits/select.h on my Debian system:

#define __FD_ZERO(s) \
do { \
unsigned int __i; \
fd_set *__arr = (s); \

This assignment has value; it checks that, loosely speaking,
s is an "assignment compatible" pointer with a fd_set *,
so that there is a diagnostic if the macro is applied to
an object of the wrong type.

for (__i = 0; __i < sizeof (fd_set) / sizeof (__fd_mask); ++__i) \
__FDS_BITS (__arr)[__i] = 0; \

Here, I would have done memset(__arr, 0, sizeof *__arr).
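
A rough sketch of both points together, using a made-up bit_set type in place of fd_set (the type, the macro name and the helper are assumptions for illustration, not glibc code):

```c
#include <string.h>

/* Hypothetical stand-in for fd_set: a fixed array of mask words. */
typedef struct {
    unsigned long words[16];
} bit_set;

/* The typed temporary makes the compiler check that s converts to
   bit_set *; memset then clears the whole object in one call. */
#define BIT_SET_ZERO(s) \
    do { \
        bit_set *bs_arr_ = (s); \
        memset(bs_arr_, 0, sizeof *bs_arr_); \
    } while (0)

/* Helper for checking the result: returns 1 if every word is zero. */
int bit_set_is_empty(const bit_set *s)
{
    size_t i;
    for (i = 0; i < sizeof s->words / sizeof s->words[0]; ++i)
        if (s->words[i] != 0)
            return 0;
    return 1;
}
```

Passing, say, an int * to BIT_SET_ZERO should draw a diagnostic (a warning or an error, depending on the compiler), whereas a bare memset(s, 0, sizeof (bit_set)) would accept any object pointer silently. A void * argument still slips through, which is why the check is only "loosely speaking" a type check.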

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca


From Lawrence D'Oliveiro@21:1/5 to James Kuyper on Tue Mar 12 06:14:58 2024

On Tue, 12 Mar 2024 01:33:00 -0400, James Kuyper wrote:

They are called "reserved identifiers", a name which more directly
addresses their purpose. They don't just start with underscores - there
are several different sets of identifiers, reserved for different
purposes. See section 7.1.3 for details. They are provided by *an* implementation. Note the use of the singular.

Looking at the C99 spec, section 7.1.3:

Also reserved for the implementor are all external identifiers
beginning with an underscore, and all other identifiers beginning
with an underscore followed by a capital letter or an underscore.
This gives a name space for writing the numerous behind-the-scenes
non-external macros and functions a library needs to do its job
properly.

The problem I have with that is the singular form of “library”. In a typical Linux distro, you could have thousands of libraries installed.
I just did this command on my Debian system:

dpkg-query -l lib*dev | wc -l

and the answer came back “1037”. The idea that a C-language
implementation and run-time environment is in any sense monolithic seems hopelessly out of touch.


From James Kuyper@21:1/5 to Lawrence D'Oliveiro on Tue Mar 12 01:33:00 2024

On 3/11/24 20:14, Lawrence D'Oliveiro wrote:

From /usr/include/«arch»/bits/select.h on my Debian system:

#define __FD_ZERO(s) \
do { \
unsigned int __i; \
fd_set *__arr = (s); \
for (__i = 0; __i < sizeof (fd_set) / sizeof (__fd_mask); ++__i) \
__FDS_BITS (__arr)[__i] = 0; \
} while (0)

Note how this macro brings the entire expression for “s” into the
scope containing those temporary “__i” and “__arr” variables. You just
better hope they won’t clash.

I think there is a clause in the C spec that says names beginning with underscores (“uglified” names, I think they’re called) are reserved
for library implementors or something. But what happens if one library implementation depends on another? What keeps the choices of names
from clashing in that situation? Just luck, I guess.

They are called "reserved identifiers", a name which more directly
addresses their purpose. They don't just start with underscores - there
are several different sets of identifiers, reserved for different
purposes. See section 7.1.3 for details. They are provided by *an* implementation. Note the use of the singular. As far as the standard is concerned, there is only one implementation that is responsible for
translating and executing a given program. What the implementation
implements is not just the C standard library, but also the C language. Libraries other than the C standard library do have implementations, but
those implementations are not what the C standard is usually talking
about when it uses that word.

The standard defines an implementation as "particular set of software,
running in a particular translation environment under particular control options, that performs translation of programs for, and supports
execution of functions in, a particular execution environment" (3.12).

Note that the software must be running before it can be called an implementation. A program that is just sitting on your computer waiting
to be executed cannot qualify. Also, if the software has options,
choosing different options when you start it up can make it a different implementation of C.

You can have an implementation of C where different parts are
implemented by different implementors - in fact, it's quite common for
the language, the C standard library, and the linker to be implemented
by different organizations. However, the combination of those parts only qualifies as a conforming implementation of C if those different parts
work together as required by the standard. Avoiding the conflicts you're talking about is a pre-requisite for doing so.

Most implementors that implement only part of a C implementation make
sure to test whether their part works together with popular
implementations of the other parts, and to document which ones they do
work with. If you cobble together a complete implementation from parts implemented by different implementors, you'd better check their
documentation to see if at least one of them has tested compatibility
with all of the others. If none of them has done such testing, you
shouldn't count on them working together as a conforming C implementation.


From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Tue Mar 12 06:15:38 2024

On 2024-03-12, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

From /usr/include/«arch»/bits/select.h on my Debian system:

#define __FD_ZERO(s) \
do { \
unsigned int __i; \
fd_set *__arr = (s); \
for (__i = 0; __i < sizeof (fd_set) / sizeof (__fd_mask); ++__i) \
__FDS_BITS (__arr)[__i] = 0; \
} while (0)

Note how this macro brings the entire expression for “s” into the
scope containing those temporary “__i” and “__arr” variables. You just
better hope they won’t clash.

I think there is a clause in the C spec that says names beginning with underscores (“uglified” names, I think they’re called) are reserved
for library implementors or something. But what happens if one library implementation depends on another? What keeps the choices of names
from clashing in that situation? Just luck, I guess.

That doesn't happen. There is only one library that's part of the
language implementation, and those identifiers are reserved for that.

Any third party library that is not part of the implementation cannot
use these identifiers with the absolute certainty that they don't clash
with anything.

Standard C doesn't offer a solution for the problem of third party library
writers needing private identifiers in their headers.


From Lawrence D'Oliveiro@21:1/5 to All on Tue Mar 12 06:21:14 2024

On Tue, 12 Mar 2024 06:14:58 -0000 (UTC), I wrote:

I just did this command on my Debian system:

dpkg-query -l lib*dev | wc -l

Let me amend that: the more accurate command (counting only installed libraries, not all the ones the package system knows about) would be

dpkg-query -l lib\*dev | grep ^i | wc -l

which produces a result of 320 on my system.


From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Tue Mar 12 07:44:19 2024

On 2024-03-12, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

The problem I have with that is the singular form of “library”. In a typical Linux distro, you could have thousands of libraries installed.
I just did this command on my Debian system:

dpkg-query -l lib*dev | wc -l

and the answer came back “1037”. The idea that a C-language implementation and run-time environment is in any sense monolithic seems hopelessly out of touch.

There is no such out-of-touch idea. In (say) a Glibc-based system, only
the GCC, Glibc and kernel headers are part of the implementation (which comprises C, POSIX plus GNU and Linux extensions), and only the GCC and
Glibc library components and their external names.

Other libraries are third parties; the __ and _[A-Z] namespace
simply doesn't belong to them.

C doesn't provide any special tools for the application developer and
third party code to avoid clashes among themselves.


From David Brown@21:1/5 to Richard Kettlewell on Tue Mar 12 11:10:13 2024

On 12/03/2024 09:03, Richard Kettlewell wrote:

Kaz Kylheku <433-929-6894@kylheku.com> writes:

On 2024-03-12, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

and the answer came back “1037”. The idea that a C-language
implementation and run-time environment is in any sense monolithic seems
hopelessly out of touch.

There is no such out-of-touch idea. In (say) a Glibc-based system, only
the GCC, Glibc and kernel headers are part of the implementation (which
comprises C, POSIX plus GNU and Linux extensions), and only the GCC and
Glibc library components and their external names.

Other libraries are third parties; the __ and _[A-Z] namespace
simply doesn't belong to them.

C doesn't provide any special tools for the application developer and
third party code to avoid clashes among themselves.

That’s true, but AFAICT it’s exactly what Lawrence is complaining about: there’s nothing in the language spec to help those thousand other
libraries avoid name clashes.

No, it is /not/ what Lawrence is complaining about. He is complaining
about clashes within the reserved namespace, because he didn't
understand the difference between the library and headers that make up a
C implementation, and other libraries that he happens to have on the
system. It's an understandable confusion, since it is an overload of
the term "library". But hopefully he understands that now.

The limited support for avoiding name clashes in C (user-level C,
outside of the implementation internals) is certainly something that he
(or others) /could/ complain about. It is a well-known issue, and it's
a shame that the C standards committee have never dealt with it. I
don't see why the language could not adopt a simple "namespace" solution
that would hugely simplify avoiding identifier clashes. (It wouldn't
help for macros, but we have inline functions to handle many cases.)
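
As an illustration of the inline-function route, here is a minimal sketch of an FD_ZERO-style operation written as a function. word_set and word_set_zero are made-up names standing in for fd_set and the macro; this is not how any particular libc does it.

```c
#include <stddef.h>

/* Hypothetical stand-in for fd_set. */
typedef struct {
    unsigned long words[16];
} word_set;

/* i and arr are ordinary locals in the function's own scope, so nothing
   in the caller's argument expression can collide with them, and the
   argument type is checked like any other function call. */
static inline void word_set_zero(word_set *arr)
{
    size_t i;
    for (i = 0; i < sizeof arr->words / sizeof arr->words[0]; ++i)
        arr->words[i] = 0;
}
```

The function name itself still lives in the global name space, so this removes the temporary-name problem but not the wider clash problem.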


From Anton Shepelev@21:1/5 to All on Tue Mar 12 17:46:00 2024

David Brown:

The limited support for avoiding name clashes in C (user-
level C, outside of the implementation internals) is
certainly something that he (or others) /could/ complain
about. It is a well-known issue, and it's a shame that
the C standards committee have never dealt with it. I
don't see why the language could not adopt a simple
"namespace" solution that would hugely simplify avoiding
identifier clashes. (It wouldn't help for macros, but we
have inline functions to handle many cases.)

My hypothetical solution is to have a single function
returning a struct with pointers to all the public functions
of a module.
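
One way this could look, as a sketch only: the "stack" module and every name in it are invented for illustration, and the accessor returns a pointer to a constant struct rather than the struct by value.

```c
/* Public interface of a made-up "stack" module. */
struct stack_api {
    void (*push)(int value);
    int  (*pop)(void);
    int  (*size)(void);
};

/* Everything below has internal linkage: none of these names
   reach the linker. */
static int stack_items[32];
static int stack_count;

static void push_impl(int value)
{
    if (stack_count < 32)
        stack_items[stack_count++] = value;
}

static int pop_impl(void)
{
    return stack_count > 0 ? stack_items[--stack_count] : 0;
}

static int size_impl(void)
{
    return stack_count;
}

/* The single externally visible function of the module. */
const struct stack_api *stack_module(void)
{
    static const struct stack_api api = { push_impl, pop_impl, size_impl };
    return &api;
}
```

A caller would write something like `const struct stack_api *s = stack_module(); s->push(3);`. Only stack_module has external linkage; the struct tag is still visible to anyone who includes the header.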

--
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments


From bart@21:1/5 to Anton Shepelev on Tue Mar 12 14:53:57 2024

On 12/03/2024 14:46, Anton Shepelev wrote:

David Brown:

The limited support for avoiding name clashes in C (user-
level C, outside of the implementation internals) is
certainly something that he (or others) /could/ complain
about. It is a well-known issue, and it's a shame that
the C standards committee have never dealt with it. I
don't see why the language could not adopt a simple
"namespace" solution that would hugely simplify avoiding
identifier clashes. (It wouldn't help for macros, but we
have inline functions to handle many cases.)

My hypothetical solution is to have a single function
returning a struct with pointers to all the public functions
of a module.

What stops that function name clashing with the single function exported
from other people's modules?


From Anton Shepelev@21:1/5 to All on Tue Mar 12 18:09:04 2024

bart:

Anton Shepelev:

David Brown:

The limited support for avoiding name clashes in C
(user-level C, outside of the implementation
internals) is certainly something that he (or others)
/could/ complain about. It is a well-known issue, and
it's a shame that the C standards committee have never
dealt with it. I don't see why the language could not
adopt a simple "namespace" solution that would hugely
simplify avoiding identifier clashes. (It wouldn't
help for macros, but we have inline functions to
handle many cases.)

My hypothetical solution is to have a single function
returning a struct with pointers to all the public
functions of a module.

What stops that function name clashing with the single
function exported from other people's modules?

A much lower probability.


From bart@21:1/5 to Anton Shepelev on Tue Mar 12 15:42:48 2024

On 12/03/2024 15:09, Anton Shepelev wrote:

bart:

Anton Shepelev:

David Brown:

The limited support for avoiding name clashes in C
(user-level C, outside of the implementation
internals) is certainly something that he (or others)
/could/ complain about. It is a well-known issue, and
it's a shame that the C standards committee have never
dealt with it. I don't see why the language could not
adopt a simple "namespace" solution that would hugely
simplify avoiding identifier clashes. (It wouldn't
help for macros, but we have inline functions to
handle many cases.)

My hypothetical solution is to have a single function
returning a struct with pointers to all the public
functions of a module.

What stops that function name clashing with the single
function exported from other people's modules?

A much lower probability.

I tried my C compiler with a couple of open source projects recently
that both failed for the same mysterious reason.

It turned out that one of them used this line:

#include "string.h"

and the other used:

#include "malloc.h"

Notice they use "..." rather than <...>. These are not the standard
headers, but user-written headers with the same names. (My compiler
looks for them in the wrong order.)

People like reusing the same popular module names so much, they will
even use the names of standard headers! Any exported function name is
likely to be linked to the name of the module.

returning a struct with pointers to all the public
functions of a module.

There may be an additional clash-point with exposing the name of the struct.


From James Kuyper@21:1/5 to Lawrence D'Oliveiro on Tue Mar 12 11:51:21 2024

On 3/12/24 02:14, Lawrence D'Oliveiro wrote:
...

The problem I have with that is the singular form of “library”. In a typical Linux distro, you could have thousands of libraries installed.
I just did this command on my Debian system:

dpkg-query -l lib*dev | wc -l

and the answer came back “1037”. The idea that a C-language implementation and run-time environment is in any sense monolithic seems hopelessly out of touch.

It was never monolithic, any more than talking about the United States
implies that the United States is a monolithic entity. The standard
allows an implementation to have many separate parts, by failing to
prohibit such a structure - but everything it says about one is about
the implementation as a whole, not the individual parts.


From Kaz Kylheku@21:1/5 to bart on Tue Mar 12 18:42:05 2024

On 2024-03-12, bart <bc@freeuk.com> wrote:

On 12/03/2024 14:46, Anton Shepelev wrote:

David Brown:

The limited support for avoiding name clashes in C (user-
level C, outside of the implementation internals) is
certainly something that he (or others) /could/ complain
about. It is a well-known issue, and it's a shame that
the C standards committee have never dealt with it. I
don't see why the language could not adopt a simple
"namespace" solution that would hugely simplify avoiding
identifier clashes. (It wouldn't help for macros, but we
have inline functions to handle many cases.)

My hypothetical solution is to have a single function
returning a struct with pointers to all the public functions
of a module.

What stops that function name clashing with the single function exported
from other people's modules?

There are multiple possible answers here.

One is that even if such functions have to be uniquely named, that is a
lesser burden than all API functions having to be uniquely named.
The probability of a clash is reduced, and at most one function
has to be renamed if it occurs.

There are ways that this single function can have the same name
in every component.

For instance, under Microsoft COM, COM DLLs provide well-known
functions like DllGetClassObject, DllRegisterServer and
DllUnregisterServer.

These don't clash since they are in different DLLs.

An application queries for the COM object using its class ID,
which is a GUID. That must be unique. DllRegisterServer registers
it in the registry.

Variations on this theme can be done in any system that has dynamic
libraries or loadable modules.


From Kaz Kylheku@21:1/5 to bart on Tue Mar 12 18:50:14 2024

On 2024-03-12, bart <bc@freeuk.com> wrote:

On 12/03/2024 15:09, Anton Shepelev wrote:

bart:

Anton Shepelev:

David Brown:

The limited support for avoiding name clashes in C
(user-level C, outside of the implementation
internals) is certainly something that he (or others)
/could/ complain about. It is a well-known issue, and
it's a shame that the C standards committee have never
dealt with it. I don't see why the language could not
adopt a simple "namespace" solution that would hugely
simplify avoiding identifier clashes. (It wouldn't
help for macros, but we have inline functions to
handle many cases.)

My hypothetical solution is to have a single function
returning a struct with pointers to all the public
functions of a module.

What stops that function name clashing with the single
function exported from other people's modules?

A much lower probability.

I tried my C compiler with a couple of open source projects recently
that both failed for the same mysterious reason.

It turned out that one of them used this line:

#include "string.h"

and the other used:

#include "malloc.h"

In the TXR project, I have a "signal.h" header, which must not resolve
to <signal.h>. I also have "time.h" and "termios.h", "glob.h",
"regex.h", "alloca.h".

Choosing header names that are distinct from an implementation's
headers is:

1) unnecessary due to the local-first search strategy of #include "..."

2) a fool's errand.

Regarding (2), no name that you choose is guaranteed not to be identical
to something in the implementation! Suppose I panic and rename
my "time.h" to "foo.h". Who is to say that some implementation doesn't
have a <foo.h> header?

There is no such rule that when you name a "whatever.h", you must
ensure there does not exist a <whatever.h>.

People like reusing the same popular module names so much, they will
even use the names of standard headers!

Sometimes deliberately so. The reason I called that header "termios.h"
is that the module it belongs to is related to the POSIX termios;
the source file is called termios.c and includes <termios.h> as
well as its own "termios.h". This makes things readable; someone
looking at the directory listing can guess that these files
constitute a module which wraps termios.

Any other naming would obscure that to some degree, other than
perhaps longer names that contain "termios" as a substring.


From Blue-Maned_Hawk@21:1/5 to David Brown on Tue Mar 12 22:07:10 2024

David Brown wrote:

The limited support for avoiding name clashes in C (user-level C,
outside of the implementation internals) is certainly something that he
(or others) /could/ complain about. It is a well-known issue, and it's
a shame that the C standards committee have never dealt with it. I
don't see why the language could not adopt a simple "namespace" solution
that would hugely simplify avoiding identifier clashes. (It wouldn't
help for macros, but we have inline functions to handle many cases.)

Many libraries put a prefix on their identifiers as a form of pseudo-namespacing.
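
A small sketch of the convention, assuming a made-up library called "imgq" (all names here are hypothetical):

```c
/* "imgq" is an invented library prefix used only for illustration. */
#define IMGQ_MAX_DEPTH 8             /* macros: upper-case prefix */

typedef struct imgq_queue {          /* types: lower-case prefix */
    int items[IMGQ_MAX_DEPTH];
    int count;
} imgq_queue;

/* Functions: same lower-case prefix. */
void imgq_init(imgq_queue *q)
{
    q->count = 0;
}

/* Returns 1 on success, 0 if the queue is full. */
int imgq_push(imgq_queue *q, int item)
{
    if (q->count >= IMGQ_MAX_DEPTH)
        return 0;
    q->items[q->count++] = item;
    return 1;
}
```

This only lowers the odds of a clash: nothing in the language stops two libraries from picking the same prefix.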

--
Blue-Maned_Hawk│shortens to Hawk│/blu.mɛin.dʰak/│he/him/his/himself/Mr. blue-maned_hawk.srht.site
The law prohibits underwater smoking!


From Lawrence D'Oliveiro@21:1/5 to James Kuyper on Tue Mar 12 21:31:28 2024

On Tue, 12 Mar 2024 11:51:21 -0400, James Kuyper wrote:

... everything it says about one is about
the implementation as a whole, not the individual parts.

You don’t see the problem with trying to avoid clashes between those parts?


From bart@21:1/5 to Kaz Kylheku on Tue Mar 12 22:29:45 2024

On 12/03/2024 18:50, Kaz Kylheku wrote:

On 2024-03-12, bart <bc@freeuk.com> wrote:

I tried my C compiler with a couple of open source projects recently
that both failed for the same mysterious reason.

It turned out that one of them used this line:

#include "string.h"

and the other used:

#include "malloc.h"

In the TXR project, I have a "signal.h" header, which must not resolve
to <signal.h>. I also have "time.h" and "termios.h", "glob.h",
"regex.h", "alloca.h".

Choosing header names that are distinct from an implementation's
headers is:

1) unnecessary due to the local-first search strategy of #include "..."

2) a fool's errand.

It's confusing. So "string.h" means the standard header, so it is the
same as <string.h>, unless it happens to find a file called string.h
amongst the project files.

That is undesirable, unless you specifically want to shadow the standard headers. In the examples I saw, that was not the case.

Regarding (2), no name that you choose is guaranteed not to be identical
to something in the implementation! Suppose I panic and rename
my "time.h" to "foo.h". Who is to say that some implementation doesn't
have a <foo.h> header?

The C implementation? Surely that will list all the system headers that
it provides; it looks quite easy to avoid a clash!

There is no such rule that when you name a "whatever.h", you must
ensure there does not exist a <whatever.h>.

You mean that programs should be allowed to do this:

#include <string.h>
#include "string.h"

With the two headers doing totally different things.

I can guess the reasons why such a rule doesn't exist, because so many
programs just carelessly used "..." instead of <...>, and they would all
break if it was imposed.

People like reusing the same popular module names so much, they will
even use the names of standard headers!

Sometimes deliberately so. The reason I called that header "termios.h"
is that the module it belongs to is related to the POSIX termios;
the source file is called termios.c and includes <termios.h> as
well as its own "termios.h". This makes things readable; someone
looking at the directory listing can guess that these files
constitute a module which wraps termios.

So, is that /your/ file termios.c, or the one that implements the POSIX
termios code?

If it is your file, does it wrap the standard one, or replace it?
Generally if you want to wrap X, you call the wrapper Y; having both
called X is troublesome. Suppose somebody wanted to wrap your X, and
they wanted theirs called X too.

Suppose you want two different wrappers for X...

Any other naming would obscure that to some degree, other than
perhaps longer names that contain "termios" as a substring.


From Lawrence D'Oliveiro@21:1/5 to Richard Kettlewell on Tue Mar 12 23:13:55 2024

On Tue, 12 Mar 2024 08:03:50 +0000, Richard Kettlewell wrote:

That’s true, but AFAICT it’s exactly what Lawrence is complaining about: there’s nothing in the language spec to help those thousand other
libraries avoid name clashes.

My specific complaint was about temporary names being used internal to
library macros. It’s a problem that is essentially impossible to solve
with string-based macro processors.


From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Wed Mar 13 00:23:58 2024

On 2024-03-12, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

On Tue, 12 Mar 2024 08:03:50 +0000, Richard Kettlewell wrote:

That’s true, but AFAICT it’s exactly what Lawrence is complaining about: there’s nothing in the language spec to help those thousand other
libraries avoid name clashes.

My specific complaint was about temporary names being used internal to library macros. It’s a problem that is essentially impossible to solve
with string-based macro processors.

Firstly, C preprocessing is not exactly "string based" but token based.

A string or token based macro preprocessor could provide a way for
macros to obtain and use generated symbols (gensyms): identifiers
that are valid in the host language for variables and other uses,
but are uniquely generated within the translation unit.

Needless to say, the standard C preprocessing doesn't provide such a
thing, which is a problem.
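
The closest approximation in standard C is to paste __LINE__ into the temporary's name. The sketch below uses made-up GS_ macro names and is not a true gensym: a caller could still own a variable with the generated name, so it reduces the odds of a clash rather than removing them. Several compilers also offer __COUNTER__ as an extension, which can take the place of __LINE__ here.

```c
/* Two-level paste so that __LINE__ is expanded before ## is applied. */
#define GS_CAT_(a, b) a##b
#define GS_CAT(a, b)  GS_CAT_(a, b)

/* Inner macro: receives the already-generated temporary name as tmp. */
#define GS_SWAP_INT_(x, y, tmp) \
    do { int tmp = (x); (x) = (y); (y) = tmp; } while (0)

/* Outer macro: builds a per-line name such as gs_tmp_42. */
#define GS_SWAP_INT(x, y) \
    GS_SWAP_INT_(x, y, GS_CAT(gs_tmp_, __LINE__))

/* The caller's own variable named tmp does not collide with the
   generated temporary. */
void gs_swap_demo(int *a, int *b)
{
    int tmp = *a;            /* caller's variable, deliberately named tmp */
    GS_SWAP_INT(tmp, *b);    /* tmp and *b exchange values */
    *a = tmp;
}
```

Had the macro declared its temporary as plain tmp, the expansion would contain `int tmp = (tmp);`, which initializes the temporary from itself.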

But this issue is independent of weaknesses caused by macro processing
being token or character based. A structural macro preprocessor that
doesn't provide gensyms or any equivalent form of hygiene would have the
same problem.


From Kaz Kylheku@21:1/5 to bart on Wed Mar 13 02:53:26 2024

On 2024-03-12, bart <bc@freeuk.com> wrote:

On 12/03/2024 18:50, Kaz Kylheku wrote:

On 2024-03-12, bart <bc@freeuk.com> wrote:

I tried my C compiler with a couple of open source projects recently
that both failed for the same mysterious reason.

It turned out that one of them used this line:

#include "string.h"

and the other used:

#include "malloc.h"

In the TXR project, I have a "signal.h" header, which must not resolve
to <signal.h>. I also have "time.h" and "termios.h", "glob.h",
"regex.h", "alloca.h".

Choosing header names that are distinct from an implementation's
headers is:

1) unnecessary due to the local-first search strategy of #include "..."

2) a fool's errand.

It's confusing. So "string.h" means the standard header, so it is the
same as <string.h>, unless it happens to find a file called string.h
amongst the project files.

It's not confusing at all. In projects under my control, you would
never see #include "string.h" where the intent is to include <string.h>.
It is a lousy practice.

That is undesirable, unless you specifically want to shadow the standard headers. In the examples I saw, that was not the case.

You cannot shadow the standard headers when they are correctly included
using #include <...>, unless you resort to compiler specific tricks,
like reconfiguring the <...> search to look in a specified directory.

Regarding (2), no name that you choose is guaranteed not to be identical
to something in the implementation! Suppose I panic and rename
my "time.h" to "foo.h". Who is to say that some implementation doesn't
have a <foo.h> header?

The C implementation? Surely that will list all the system headers that
it provides; it looks quite easy to avoid a clash!

But there is no clash to avoid. A local header file that accidentally
has the same name as something in your /usr/include or whatever
is no problem at all, if you refer to it using #include "...".

But if there were a clash to be avoided, it would be tricky.

Not all headers are documented, so you would have to actually go looking
into the header file installation.

The information there is not reliable for portability, because all you
learn is what files are present in that installation.

There is no such rule that when you name a "whatever.h", you must
ensure there does not exist a <whatever.h>.

You mean that programs should be allowed to do this:

#include <string.h>
#include "string.h"

No, I mean that programs /are/.

With the two headers doing totally different things.

I can guess the reasons why such a rule doesn't exist, because so many programs just carelessly used "..." instead of <...>, and they would all break if it was imposed.

Those programs don't break, because if "string.h" doesn't exist,
then it is re-tried as if it were <string.h> (effectively).

(I don't think it's the best design; it would be better if "..."
and <...> looked in separate places with no fallback from one
to the other, such that #include "stdio.h" programs not
referencing a local file would break.)

So, is that /your/ file termios.c, or the one that implements the POSIX termios code?

Since my project isn't an operating system, or library for one,
but a dynamic language runtime, it has to be the former.


From James Kuyper@21:1/5 to Lawrence D'Oliveiro on Wed Mar 13 03:36:11 2024

On 3/12/24 17:31, Lawrence D'Oliveiro wrote:

On Tue, 12 Mar 2024 11:51:21 -0400, James Kuyper wrote:

... everything it says about one is about
the implementation as a whole, not the individual parts.

You don’t see the problem with trying to avoid clashes between those parts?

Yes, I do, and so do implementors. Avoiding those clashes is their responsibility. They are supposed to test their partial implementations
with implementations of other parts of C, and document which
combinations they claim qualify as conforming implementations of C. I'm
not saying this is required by the C standard, but only by their general responsibility to produce a usable product.

The C standard only governs things which claim to be conforming
implementations of C. It cannot constrain things for which no such claim
has been made. If you want to rely upon guarantees provided by the C
standard, only use things which claim to meet its requirements.

Therefore, your responsibility is to read the documentation of any
partial implementations you put together, and only put together partial implementations for which such a claim has been made.

Or, if you choose to mix and match partial implementations willy-nilly,
own your choice and don't complain about the results.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Michael S@21:1/5 to bart on Wed Mar 13 14:12:48 2024

On Wed, 13 Mar 2024 11:45:31 +0000
bart <bc@freeuk.com> wrote:

This is something new I saw today: suppose I have hello.c in a
directory (hello.c uses '#include <stdio.h>').

If I create an empty file called 'stdio.h', then 4 compilers I tried
all picked up that file instead of their official stdio.h. That looks
a dangerous practice to me.

It also seems, for a <...> file, to ignore the official repository
and look first within the user's project. So what exactly is the
difference between <...> and "..."? Is it just an extra set of backup
paths to look if it can't find anything within the user's files?

(The 5th compiler I tried ignored it and worked as normal; that was
mine. I can make it fail using my '-ext' option to look elsewhere
than the official headers location. I don't make a distinction
between <...> and "...".)

I just tried three compilers and [in absence of -I options] all 3 work
as expected, i.e. ignored stdio.h in current directory.
None of the three was of the variety that you appear to prefer.
Mine's are mundane stuff.

However all three took local file when I had given them an option -I.
Not sure what to make of this. Whatever happens with
non-default options is probably in "implementation-defined"
domain as far as the C Standard is concerned, but I still
expected that such common option as -I would not affect standard
headers.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From bart@21:1/5 to Keith Thompson on Wed Mar 13 11:45:31 2024

On 12/03/2024 23:00, Keith Thompson wrote:

bart <bc@freeuk.com> writes:

That is undesirable, unless you specifically want to shadow the
standard headers. In the examples I saw, that was not the case.

You didn't mention that. If you'd tell us what project you're talking
about, maybe we could discuss it. Perhaps

That's not really relevant. Suffice that they are amateur projects and
clearly they were using string.h etc for their own purposes
without thinking.

Read section 6.10.2 of the standard. It describes the search rules for
the #include directive.

Not in N1570 it doesn't. It seems mainly concerned with the syntax.

I understand that the algorithm for finding include files was implementation-defined, and typically depended on these inputs:

* Whether the filename used "..." or <...>
* Whether the file-name specified was absolute or relative
* The path of the source file in which the #include occurs
* Possibly, the complete stack of paths for the current sequence set of
nested #includes
* Possibly, on the CWD
* On where the compiler keeps its standard headers (which in turn may
depend on OS)
* On the set of -I directives given to the compiler (this is
something outside the remit of the standard, AIUI)

To summarize, #include <foo.h> searches for a header (probably but not necessarily a file) identified by foo.h. #include "foo.h" searches for
a *file* called foo.h, and if that fails it then searches for a header identified by <foo.h>. The sequences for both searches are implementation-defined.

This is something new I saw today: suppose I have hello.c in a directory (hello.c uses '#include <stdio.h>').

If I create an empty file called 'stdio.h', then 4 compilers I tried all
picked up that file instead of their official stdio.h. That looks a
dangerous practice to me.

It also seems, for a <...> file, to ignore the official repository and
look first within the user's project. So what exactly is the difference
between <...> and "..."? Is it just an extra set of backup paths to look
if it can't find anything within the user's files?

(The 5th compiler I tried ignored it and worked as normal; that was
mine. I can make it fail using my '-ext' option to look elsewhere than
the official headers location. I don't make a distinction between <...>
and "...".)

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Kaz Kylheku@21:1/5 to Michael S on Wed Mar 13 15:48:00 2024

On 2024-03-13, Michael S <already5chosen@yahoo.com> wrote:

I just tried three compilers and [in absence of -I options] all 3 work
as expected, i.e. ignored stdio.h in current directory.
None of the three was of the variety that you appear to prefer.
Mine's are mundane stuff.

However all three took local file when I had given them an option -I.

Yes; the traditional -I option is idiotic for the majority of the use
cases for which it has actually been used. This is why GCC introduced
-iquote; that's what you want to be using within a project for
redirecting your local #include "...".

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From David Brown@21:1/5 to Michael S on Wed Mar 13 16:40:10 2024

On 13/03/2024 13:12, Michael S wrote:

On Wed, 13 Mar 2024 11:45:31 +0000
bart <bc@freeuk.com> wrote:

This is something new I saw today: suppose I have hello.c in a
directory (hello.c uses '#include <stdio.h>').

If I create an empty file called 'stdio.h', then 4 compilers I tried
all picked up that file instead of their official stdio.h. That looks
a dangerous practice to me.

I'd agree it seems a poor choice of defaults.

As you know, the details of the searching mechanism are implementation-dependent. But it is common practice to have two lists
of paths - one for "system headers" (for #include <...>), and one for
"user headers" (for #include "..."). (The C standards refer to the
former as "headers" and the latter just as "source files for inclusion",
but I think the system/user header file distinction is more user-friendly!)

The system header path typically includes first the directory or
directories containing the standard library headers, but can also
include OS-specific headers, POSIX headers, and any other headers that
are considered part of the C implementation. (For cross-compilers,
these will be directories for the target's C library headers rather
than the host files.)

The user header path will typically be just the directory of the source
file, but may also include the current directory.

Command-line flags generally allow you to replace or extend these.

A failed search for a "..." file should move on to the system header
path, but not vice-versa.

So, which compilers (and relevant flags) had the system header search
start in the current directory? It is not disallowed, AFAIUI, but it
seems a bad idea to me.

It also seems, for a <...> file, to ignore the official repository
and look first within the user's project. So what exactly is the
difference between <...> and "..."? Is it just an extra set of backup
paths to look if it can't find anything within the user's files?

(The 5th compiler I tried ignored it and worked as normal; that was
mine. I can make it fail using my '-ext' option to look elsewhere
than the official headers location. I don't make a distinction
between <...> and "...".)

I just tried three compilers and [in absence of -I options] all 3 work
as expected, i.e. ignored stdio.h in current directory.
None of the three was of the variety that you appear to prefer.
Mine's are mundane stuff.

However all three took local file when I had given them an option -I.
Not sure what to make of this.

What exactly was the "-I" option you gave (and which compiler)? If you
wrote "-I.", including the dot, then gcc will act the way you describe -
that's what the flag does. If you put other directories in the -I list,
it does not look in the current directory.

<https://gcc.gnu.org/onlinedocs/gcc/Directory-Options.html>

Whatever happens with
non-default options is probably in "implementation-defined"
domain as far as the C Standard is concerned, but I still
expected that such common option as -I would not affect standard
headers.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From bart@21:1/5 to Keith Thompson on Wed Mar 13 16:44:54 2024

On 13/03/2024 16:03, Keith Thompson wrote:

bart <bc@freeuk.com> writes:

I understand that the algorithm for finding include files was
implementation-defined, and typically depended on these inputs:

* Whether the filename used "..." or <...>
* Whether the file-name specified was absolute or relative
* The path of the source file in which the #include occurs
* Possibly, the complete stack of paths for the current sequence set of
nested #includes
* Possibly, on the CWD
* On where the compiler keeps its standard headers (which in turn may
depend on OS)
* On the set of -I directives given to the compiler (this is
something outside the remit of the standard, AIUI)

Yes -- but you left out a major part of it, that if a search for a ""
header fails, it continues by treating it as if it were a <> header.

I didn't attempt to describe the algorithm, only to list the various
variables.

If I create an empty file called 'stdio.h', then 4 compilers I tried
all picked up that file instead of their official stdio.h. That looks
a dangerous practice to me.

If they're behaving as you're describing, then they're not conforming.
I've tried gcc, clang, and tcc, and all pick up the correct <stdio.h>
header even if there's a "stdio.h" file in the current directory.

It's possible in principle that a compiler could include the current directory in the <> search path, but that would be surprising, and none
of the compilers I've tried do so.

What are these 4 compilers? Are you sure you used <stdio.h> and not "stdio.h"? Did you use any additional command line options? Might you
have set some environment variable like $C_INCLUDE_PATH that affects the behavior?

My observations were erroneous. It turned out I'd been messing about
with hello.c (when investigating these matters a few days ago) so that
it was using "stdio.h" rather than <stdio.h>.

So 3 of the 4 compilers (gcc, tcc, dmc) behave as Michael S described,
and will only look at the current directory, or wherever the source file
is, if -I. or -Ipath is used.

The 4th compiler is lccwin32 which still looks first inside the same
directory as the source file (not CWD). I now remember this from many
years ago when it caused some trouble because of a rogue stdio.h lying
about.

(The 5th compiler I tried ignored it and worked as normal; that was
mine. I can make it fail using my '-ext' option to look elsewhere than
the official headers location. I don't make a distinction between
<...> and "...".)

Perhaps I'm missing something. If your compiler doesn't distinguish
between <> and "", then #include "stdio.h" should be equivalent to
#include <stdio.h>. You say it ignores a stdio.h file in the current directory. Then how can a source file include *any* header file in the current directory?

#include <stdio.h> // This includes the system header.
#include "stdio.h" // You say this ignores any local file and
// includes the system header.
#include "foo.h" // Does this not include a local file named "foo.h"?

<...> are not special, but it knows which are the standard headers
because there is a list of them in the compiler.

So if an include file is one of those, then it will look inside the
compiler because that's where the header files are embedded. This
allows the compiler to be self-contained.

To override that, the `-ext` option is used, then an include file is
searched for like any ordinary file.

(Which is first in the directory containing the current source file,
then CWD, then any extra include paths specified.

I think it's more elaborate than that, because if this is a nested
include, there will be a stack of 'current source files' whose paths are searched in top-down order.

Probably CWD should not be part of this, but the process is already so convoluted that I can't be bothered to change it.

In my non-C language, any auxiliary files are looked in exactly ONE
location. Then you don't have the problem of C where if an include file
is missing or renamed, it might instead find an identically-named but
WRONG file in another location.)

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Nick Bowler@21:1/5 to Keith Thompson on Wed Mar 13 17:01:59 2024

On Tue, 12 Mar 2024 16:00:05 -0700, Keith Thompson wrote:

(The standard does mention the possibility that the "foo.h" search is
not supported. Any such implementation would not be able to handle user-defined header files; perhaps they would have to be installed as "headers" somehow. In every implementation I know about, the compiler
will *at least* find the foo.h file if it's in the same directory as
the file that includes it.

The POSIX-standard compiler commands (c89, c99) require #include
searches to work this way so it's not surprising that today, most
compilers work like this.

However some traditional compilers (for example, VAX C) search relative
to the current working directory of the running compiler for #include
"foo.h", rather than the directory containing the source file.

Standard C does not forbid such behaviour and standard C predates the
first version of POSIX.2 by a few years so there are probably some
standard C compilers that work like VAX C.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From David Brown@21:1/5 to Keith Thompson on Wed Mar 13 18:47:38 2024

On 13/03/2024 16:15, Keith Thompson wrote:

Michael S <already5chosen@yahoo.com> writes:
[...]

I just tried three compilers and [in absence of -I options] all 3 work
as expected, i.e. ignored stdio.h in current directory.
None of the three was of the variety that you appear to prefer.
Mine's are mundane stuff.

However all three took local file when I had given them an option -I.
Not sure what to make of this. Whatever happens with
non-default options is probably in "implementation-defined"
domain as far as the C Standard is concerned, but I still
expected that such common option as -I would not affect standard
headers.

According to the GNU cpp manual, the "-I" option prepends directories to
the search path used for <> headers, and the "-iquote" option prepends directories to the search path used for "" headers.

I find that a bit surprising.

You are not quite correct - and I find /that/ a bit surprising!

The -I option applies equally to <...> and "..." headers.

I had never heard of the "-iquote"
option.

The complete description is here: <https://gcc.gnu.org/onlinedocs/gcc/Directory-Options.html>. (It is the
same in the cpp manual, but it is better to look at the compiler manual
for this kind of thing - after all, the gcc driver program can pass
different options to the cpp subprogram.)

To save people looking it up:

-I dir
-iquote dir
-isystem dir
-idirafter dir

Add the directory dir to the list of directories to be searched for
header files during preprocessing. If dir begins with ‘=’ or $SYSROOT,
then the ‘=’ or $SYSROOT is replaced by the sysroot prefix; see
--sysroot and -isysroot.

Directories specified with -iquote apply only to the quote form of
the directive, #include "file". Directories specified with -I, -isystem,
or -idirafter apply to lookup for both the #include "file" and #include
<file> directives.

You can specify any number or combination of these options on the
command line to search for header files in several directories. The
lookup order is as follows:

For the quote form of the include directive, the directory of
the current file is searched first.
For the quote form of the include directive, the directories
specified by -iquote options are searched in left-to-right order, as
they appear on the command line.
Directories specified with -I options are scanned in
left-to-right order.
Directories specified with -isystem options are scanned in left-to-right order.
Standard system directories are scanned.
Directories specified with -idirafter options are scanned in left-to-right order.

You can use -I to override a system header file, substituting your
own version, since these directories are searched before the standard
system header file directories. However, you should not use this option
to add directories that contain vendor-supplied system header files; use -isystem for that.

The -isystem and -idirafter options also mark the directory as a
system directory, so that it gets the same special treatment that is
applied to the standard system directories.

If a standard system include directory, or a directory specified
with -isystem, is also specified with -I, the -I option is ignored. The directory is still searched but as a system directory at its normal
position in the system include chain. This is to ensure that GCC’s
procedure to fix buggy system headers and the ordering for the
#include_next directive are not inadvertently changed. If you really
need to change the search order for system directories, use the
-nostdinc and/or -isystem options.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From David Brown@21:1/5 to Keith Thompson on Wed Mar 13 22:07:23 2024

On 13/03/2024 19:56, Keith Thompson wrote:

David Brown <david.brown@hesbynett.no> writes:

On 13/03/2024 16:15, Keith Thompson wrote:

Michael S <already5chosen@yahoo.com> writes:
[...]

I just tried three compilers and [in absence of -I options] all 3 work
as expected, i.e. ignored stdio.h in current directory.
None of the three was of the variety that you appear to prefer.
Mine's are mundane stuff.

However all three took local file when I had given them an option -I.
Not sure what to make of this. Whatever happens with
non-default options is probably in "implementation-defined"
domain as far as the C Standard is concerned, but I still
expected that such common option as -I would not affect standard
headers.

According to the GNU cpp manual, the "-I" option prepends
directories to
the search path used for <> headers, and the "-iquote" option prepends
directories to the search path used for "" headers.
I find that a bit surprising.

You are not quite correct - and I find /that/ a bit surprising!

The -I option applies equally to <...> and "..." headers.

I believe that's consistent with what I wrote, though I probably wasn't
clear enough.

Yes, after reading your expanded explanation, I agree with you here
(both parts). Thanks for that clarification.

To expand on it:

There are two lists of locations (typically directories). Let's call
them the <>-list (described in N1570 6.10.2p2) and the ""-list
(described in N1570 6.10.2p3).

#include <foo.h> searches the <>-list.

#include "foo.h" searches the ""-list; if that fails, it then searches
the <>-list as if for #include <foo.h>.

The -I option prepends directories to the <>-list, which means it
affects both #include <foo.h> and #include "foo.h". But for
#include "foo.h", a foo.h file in the same directory as the including
file will always be found first if it exists.

[...]

The complete description is here:
<https://gcc.gnu.org/onlinedocs/gcc/Directory-Options.html>.

[...]
[slightly reformatting quoted text]

1. For the quote form of the include directive, the directory of
the current file is searched first.

2. For the quote form of the include directive, the directories
specified by -iquote options are searched in left-to-right
order, as they appear on the command line.

3. Directories specified with -I options are scanned in
left-to-right order.

4. Directories specified with -isystem options are scanned in
left-to-right order.

5. Standard system directories are scanned.

6. Directories specified with -idirafter options are scanned in
left-to-right order.

What I called the ""-list is described in steps 1-2.
What I called the <>-list is described in steps 3-6.
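
As a rough model of that split (not GCC's actual implementation), the six documented steps can be written as a table, with the quote form starting at step 1 and the angle form starting at step 3. The names include_form, lookup_stages and stages_for are invented for this sketch:

```c
#include <stddef.h>

/* Which spelling of the directive is being resolved. */
enum include_form { FORM_QUOTE, FORM_ANGLE };

#define STAGE_COUNT 6

/* The six lookup steps from the GCC manual, in documented order. */
static const char *const lookup_stages[STAGE_COUNT] = {
    "directory of the current file", /* quote form only */
    "-iquote directories",           /* quote form only */
    "-I directories",
    "-isystem directories",
    "standard system directories",
    "-idirafter directories",
};

/* Copies the stages searched for the given form into out[] and
   returns how many there are. */
size_t stages_for(enum include_form form, const char *out[STAGE_COUNT])
{
    size_t first = (form == FORM_QUOTE) ? 0 : 2;
    size_t count = 0;
    for (size_t i = first; i < STAGE_COUNT; ++i)
        out[count++] = lookup_stages[i];
    return count;
}
```

The quote form walks all six stages and the angle form only the last four, which is why a directory given with -I affects both kinds of directive.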

[...]

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Wed Mar 13 21:51:24 2024

On Wed, 13 Mar 2024 08:06:28 -0700, Keith Thompson wrote:

Any library from outside the implementation ...

So you are now distinguishing between libraries from “outside the implementation” from those that are within it? And saying that those from “outside” have no right to use mechanisms (such as they are) to minimize name conflicts with regular user code?

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Wed Mar 13 22:10:04 2024

On 2024-03-13, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

On Wed, 13 Mar 2024 03:36:11 -0400, James Kuyper wrote:

Yes, I do, and so do implementors. Avoiding those clashes is their
responsibility.

Implementors of the C standard? What about providers of other libraries?

One possible point of view is that the integrators who put together a
GNU/Linux distro effectively take on the role of C implementors.

If a clash takes place among any libraries in Debian or Alpine or GNU
Guix or what have you, you can regard that as a bug in the distro. The
distro can fix it however they see fit: apply a local patch to one or
more libraries, and possibly get that upstreamed, or not.

(It makes sense to get that upstreamed, because other distros are all
building most of the same libraries; a clash between libraries can
affect any distro the same way as any other.)

In any case, the C standard doesn't distinguish any party other than implementor and user. Libraries that are not in the implementation are
in the program being presented for translation and linkage, and
clashes in the program are the program's problem.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Lawrence D'Oliveiro@21:1/5 to James Kuyper on Wed Mar 13 21:52:27 2024

On Wed, 13 Mar 2024 03:36:11 -0400, James Kuyper wrote:

Yes, I do, and so do implementors. Avoiding those clashes is their responsibility.

Implementors of the C standard? What about providers of other libraries?

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Kaz Kylheku@21:1/5 to Lawrence D'Oliveiro on Wed Mar 13 22:16:08 2024

On 2024-03-13, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

On Wed, 13 Mar 2024 08:06:28 -0700, Keith Thompson wrote:

Any library from outside the implementation ...

So you are now distinguishing between libraries from “outside the implementation” from those that are within it? And saying that those from “outside” have no right to use mechanisms (such as they are) to minimize name conflicts with regular user code?

Libraries that are not part of the implementation are part of the
C program being presented to the C implementation for translation and
linkage.

They are subject to the requirements placed on a program.

If a program uses identifiers that, for instance, start with double underscores, then its behavior is undefined, and that goes for those
parts of the program which are libraries.

From the C point of view, external libraries are just translation units
that have been translated and retained in that form, accompanied by
header files.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Wed Mar 13 22:33:02 2024

Lawrence D'Oliveiro <ldo@nz.invalid> writes:

On Wed, 13 Mar 2024 08:06:28 -0700, Keith Thompson wrote:

Any library from outside the implementation ...

So you are now distinguishing between libraries from “outside the implementation” from those that are within it? And saying that those from “outside” have no right to use mechanisms (such as they are) to minimize name conflicts with regular user code?

The implementation does not include third-party libraries.

Third party libraries are allowed to use any mechanism they
like to minimize name conflicts other than prefixing with
two underscores.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Lawrence D'Oliveiro@21:1/5 to Scott Lurndal on Wed Mar 13 22:50:07 2024

On Wed, 13 Mar 2024 22:33:02 GMT, Scott Lurndal wrote:

Third party libraries are allowed to use any mechanism they like to
minimize name conflicts other than prefixing with two underscores.

But there is no other such mechanism available.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Wed Mar 13 23:15:01 2024

Lawrence D'Oliveiro <ldo@nz.invalid> writes:

On Wed, 13 Mar 2024 22:33:02 GMT, Scott Lurndal wrote:

Third party libraries are allowed to use any mechanism they like to
minimize name conflicts other than prefixing with two underscores.

But there is no other such mechanism available.

Of course there is. The most simple is to prefix
any external symbols with a library specific
prefix.

Or to suffix any symbol with two underscore characters
and a library-specific string.

Or obfuscate the extern symbols in the library and use #define
macros to map a readable name to the obfuscated symbol
in the corresponding header file(s).

As has been pointed out, extending the double-leading
underscore mechanism outside the implementation requires
some central authority to manage names across all
third-party libraries. (which is generally true for any prefix
mechanism, whether it is __ or fubar_).

We seem to have survived just fine without any such
disambiguation mechanism to date.
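
A small sketch of the first and third options above, using a made-up library called "fubar". Every identifier here is hypothetical. One function simply carries the library prefix; the other has an obfuscated external name that a header macro maps to a readable one:

```c
#include <stddef.h>

/* Option 1: library-specific prefix on an external function. */
int fubar_count_nonzero(const int *values, size_t n)
{
    int count = 0;
    for (size_t i = 0; i < n; ++i)
        if (values[i] != 0)
            ++count;
    return count;
}

/* Option 3: the external symbol is deliberately obscure... */
int fubar_x7q_sum(const int *values, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; ++i)
        total += values[i];
    return total;
}

/* ...and the library's header maps a readable name onto it. */
#define fubar_sum fubar_x7q_sum
```

Neither approach touches the reserved __ namespace. Both only reduce the odds of a collision, since nothing stops another library from picking the same prefix.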

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From James Kuyper@21:1/5 to Lawrence D'Oliveiro on Wed Mar 13 20:47:11 2024

On 3/13/24 17:52, Lawrence D'Oliveiro wrote:

On Wed, 13 Mar 2024 03:36:11 -0400, James Kuyper wrote:

Yes, I do, and so do implementors. Avoiding those clashes is their
responsibility.

Implementors of the C standard? What about providers of other libraries?

Avoiding conflicts is their responsibility, obviously. As a general
rule, they do so by choosing a library-specific prefix to use on all identifiers that might otherwise cause problems.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Kaz Kylheku@21:1/5 to Keith Thompson on Thu Mar 14 00:40:31 2024

On 2024-03-13, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

Lawrence D'Oliveiro <ldo@nz.invalid> writes:

On Wed, 13 Mar 2024 22:33:02 GMT, Scott Lurndal wrote:

Third party libraries are allowed to use any mechanism they like to
minimize name conflicts other than prefixing with two underscores.

But there is no other such mechanism available.

Are you aware that working third party libraries exist, and name
collisions are fairly rare? How do you think that's possible?

It's possible because firstly, even if there are collisions latent
in the library mixture, not all libraries are used together in one
application. E.g. in an OS distro installation there might be a thousand libraries, but no application uses all thousand.

Secondly, even if two libraries are used in the same application, where
those libraries have a header-file-level clash, the clash only occurs if
their headers are included in the same translation unit in that program.

Thirdly, mere linkage of two libraries into the same program can only
cause a clash if it involves an external name.

Fourth, even if two libraries have a clashing external name, I think
that under certain dynamic linking paradigms, this is only a problem if
that name is used. If the same name refers to multiple entities, there
is an ambiguity, but if the program doesn't use that name, then the
ambiguity doesn't matter.

Fifth, if we are talking specifically about names used by macros for
naming local symbols inserted into the program, libraries not in the C implementation in fact can get away with using the __ space. If these identifiers don't land on a compiler keyword, there is no actual
problem. Now a third party library could choose such a name inside its
macro such that the C library has also used the same name inside its
macro. But for that to cause a clash, the macros have to be nested
together.
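
For illustration, here is the simpler argument-capture version of the same hazard, the one the thread started with. ZERO_ALL and its temporary i_ are hypothetical names chosen for this sketch. They are not taken from any real header, and they deliberately avoid the reserved __ space:

```c
#include <stddef.h>

/* Hypothetical macro with a local temporary named i_. */
#define ZERO_ALL(arr, n) \
    do { \
        size_t i_; \
        for (i_ = 0; i_ < (n); ++i_) \
            (arr)[i_] = 0; \
    } while (0)

/* No clash: the caller's variable has a different name, so all three
   elements are zeroed and the sum is 0. */
int zero_without_clash(void)
{
    int buf[3] = {7, 7, 7};
    size_t count = 3;
    ZERO_ALL(buf, count);
    return buf[0] + buf[1] + buf[2];
}

/* Clash: the caller's i_ is shadowed inside the macro body, so (n)
   refers to the macro's own i_ and the loop test becomes i_ < i_.
   Nothing is zeroed and the sum stays 21. */
int zero_with_clash(void)
{
    int buf[3] = {7, 7, 7};
    size_t i_ = 3;
    ZERO_ALL(buf, i_);
    return (i_ == 3) ? buf[0] + buf[1] + buf[2] : -1;
}
```

The clash compiles without complaint under default warning settings and fails silently, which is the argument for keeping such temporaries in a namespace ordinary code is told not to use.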

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Kaz Kylheku@21:1/5 to Keith Thompson on Fri Mar 15 03:17:51 2024

On 2024-03-15, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

scott@slp53.sl.home (Scott Lurndal) writes:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[...]

For context, here's the entire file from my system (Ubuntu 24.0.4,
package libc6-dev:amd64 2.35-0ubuntu3.6). I get the impression that the
author(s) decided not to use memset to avoid the required #include,
which might increase compilation times for code that indirectly includes
this header. (I offer no opinion on whether that's a good tradeoff.)
[...]

An older version did use memset(). It was changed to use a loop in
1997, with a commit message that included:
"Don't use memset to prevent prototype trouble, use simple loop."
It may have been to avoid problems with pre-ANSI C compilers that didn't
support prototypes. That's still speculation on my part.

Here's the full fc20 version:
$ rpm -q -f /usr/include/bits/select.h
glibc-headers-2.18-19.fc20.x86_64

[...]

#if defined __GNUC__ && __GNUC__ >= 2

Whoever wrote this didn't know that if __GNUC__ doesn't exist, it will
expand as 0, which is false, so this is equivalent to just

#if __GNUC__ >= 2

[...]

#else /* ! GNU CC */

[...]

#endif /* GNU CC */

[...]

I don't see that in the GNU glibc git repo, even on branches with
"fedora" in their names. Perhaps it's a change applied by Red Hat and
not propagated upstream. (Though I'm not sure why Red Hat would need to allow for glibc not being compiled by gcc.)

That it's checking for GNU C at least 2 suggests it is an old
patch.

If you look at the source directory for Fedora's package, you see
there are a bunch of patches that get applied:

https://src.fedoraproject.org/rpms/glibc/tree/rawhide

Now if we go to "f20", a heck of a lot more patches were applied:

https://src.fedoraproject.org/rpms/glibc/tree/f20

Now, don't waste your time; it's not in any of those patches (I looked).

f20 references glibc-2.18.tar.gz, and that's where that code is found,
in this file:

./glibc-2.18/sysdeps/x86/bits/select.h

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From David Brown@21:1/5 to Keith Thompson on Fri Mar 15 09:22:54 2024

On 15/03/2024 04:44, Keith Thompson wrote:

Kaz Kylheku <433-929-6894@kylheku.com> writes:
[...]

#if defined __GNUC__ && __GNUC__ >= 2

Whoever wrote this didn't know that if __GNUC__ doesn't exist, it will
expand as 0, which is false, so this is equivalent to just

#if __GNUC__ >= 2

Or they did know that and decided that the longer version would be clearer.

Or they did know, and decided they did not want a spurious warning when compiling with "-Wundef" that generates a warning before replacing
undefined identifiers with 0 in #if directives. Personally, I always
use -Wundef in my own code, because I think the "default to 0" treatment
makes it far too easy for typos to go unnoticed. I have no idea if the
glibc folk agree with that and like to use -Wundef themselves, or if
they just like to make their code as "warning-proof" as possible.
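
A small sketch of the behaviour under discussion, assuming a compiler that accepts -Wundef (gcc and clang do). The macro names HYPOTHETICAL_VERSION and HYPOTHETICAL_LEVEL are made up for the example, and HYPOTHETICAL_VERSION is deliberately left undefined.

```c
/* HYPOTHETICAL_VERSION is intentionally never defined. */

/* Guarded form: "defined" short-circuits, so the undefined name is
   never evaluated and -Wundef has nothing to complain about. */
#if defined HYPOTHETICAL_VERSION && HYPOTHETICAL_VERSION >= 2
#  define GUARDED_BRANCH 1
#else
#  define GUARDED_BRANCH 0
#endif

/* Unguarded form: the undefined name is replaced by 0, giving the same
   result, but -Wundef warns about it. */
#if HYPOTHETICAL_VERSION >= 2
#  define UNGUARDED_BRANCH 1
#else
#  define UNGUARDED_BRANCH 0
#endif

/* The case -Wundef is meant to catch: the macro is defined, but the
   test misspells it, so the condition is silently false. */
#define HYPOTHETICAL_LEVEL 3
#if HYPOTHETICAL_LEVLE >= 2   /* typo, evaluates as 0 >= 2 */
#  define TYPO_BRANCH 1
#else
#  define TYPO_BRANCH 0
#endif
```

Without -Wundef this compiles silently and all three macros end up as 0; with -Wundef the second and third #if lines each draw a warning, while the guarded form stays quiet.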

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From David Brown@21:1/5 to Keith Thompson on Fri Mar 15 09:15:53 2024

On 15/03/2024 01:11, Keith Thompson wrote:

scott@slp53.sl.home (Scott Lurndal) writes:

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

For context, here's the entire file from my system (Ubuntu 24.0.4,
package libc6-dev:amd64 2.35-0ubuntu3.6). I get the impression that the
author(s) decided not to use memset to avoid the required #include,
which might increase compilation times for code that indirectly includes
this header. (I offer no opinion on whether that's a good tradeoff.)

Note that __FD_ZERO is very clearly *not* intended to be invoked by
arbitrary code.

```

[code snipped]

```

That code is only selected if it is not compiled with
gcc. If it is gcc 2 or later, the header file uses

# define __FD_ZERO(fdsp) \
do { \
int __d0, __d1; \
__asm__ __volatile__ ("cld; rep; " __FD_ZERO_STOS \
: "=c" (__d0), "=D" (__d1) \
: "a" (0), "0" (sizeof (fd_set) \
/ sizeof (__fd_mask)), \
"1" (&__FDS_BITS (fdsp)[0]) \
: "memory"); \
} while (0)

Oh? I don't see that code anywhere in the current glibc sources, in any older version of bits/select.h, or anywhere under /usr/include on my
system.

I see it in my older Mint system (based on Ubuntu bionic 18.04 LTS), but
not my newer one (based on Ubuntu jammy 22.04 LTS). So it looks like it
was an optimisation that was useful in the past, but newer gcc gives the
same or better code from the pure C code. Keeping it in C rather than
inline assembly gives the compiler more information, even if the
generated object code is still the same, so that's always a good thing.

(And full memory clobbers are undesirable in performance code because they
can mean the compiler loses useful knowledge or has to re-load other
data from memory.)
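
A rough sketch of that last point, assuming a GNU-compatible compiler for the asm statement. The function names and the WORDS constant are hypothetical, and this is not glibc code. Both functions return the same value; the difference is only in what the optimiser is allowed to assume.

```c
#define WORDS 16u

/* Pure C: the compiler sees exactly which objects are written. With
   restrict it knows *other is untouched by the loop, so it is free to
   reuse the first load instead of reading memory again. */
static long zero_then_read_plain (unsigned long *restrict words,
                                  const long *restrict other)
{
    unsigned int i;
    long before = *other;
    for (i = 0; i < WORDS; ++i)
        words[i] = 0;
    return before + *other;
}

/* Same logic with a "memory" clobber: the empty asm tells the compiler
   that any memory may have changed, so *other has to be loaded again
   after it. */
static long zero_then_read_clobber (unsigned long *restrict words,
                                    const long *restrict other)
{
    unsigned int i;
    long before = *other;
    for (i = 0; i < WORDS; ++i)
        words[i] = 0;
#if defined __GNUC__
    __asm__ __volatile__ ("" : : : "memory");
#endif
    return before + *other;
}
```

Comparing the generated assembly for the two (for example with gcc -O2 -S) is one way to see whether the extra load appears on a given compiler version.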

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)
