If the standard library includes code from two or more different implementers, all implementers have a very strong interest in avoiding
any clashes. I don't see a real problem here.
They are called "reserved identifiers", a name which more directly
addresses their purpose. They don't just start with underscores - there
are several different sets of identifiers, reserved for different
purposes. See section 7.1.3 for details.
They are provided by *an* implementation. Note the use of the singular.
From /usr/include/«arch»/bits/select.h on my Debian system:
#define __FD_ZERO(s) \
  do { \
    unsigned int __i; \
    fd_set *__arr = (s); \
    for (__i = 0; __i < sizeof (fd_set) / sizeof (__fd_mask); ++__i) \
      __FDS_BITS (__arr)[__i] = 0; \
  } while (0)
Note how this macro brings the entire expression for “s” into the
scope containing those temporary “__i” and “__arr” variables. You just
better hope they won’t clash.
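To make the worry concrete, here is a minimal sketch of the capture problem. It assumes nothing about glibc: CLEAR_PLAIN, CLEAR_PREFIXED, count_nonzero and the demo functions are hypothetical names invented for illustration, and the temporary is an ordinary i rather than a reserved __i, because user code is not allowed to define reserved identifiers itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro, not from glibc: the loop temporary is called i. */
#define CLEAR_PLAIN(arr, n)                                  \
    do {                                                     \
        size_t i;                                            \
        for (i = 0; i < (n); ++i)                            \
            (arr)[i] = 0;                                    \
    } while (0)

/* Same macro with an unlikely temporary name (still only a convention). */
#define CLEAR_PREFIXED(arr, n)                               \
    do {                                                     \
        size_t clear_prefixed_i_;                            \
        for (clear_prefixed_i_ = 0; clear_prefixed_i_ < (n); \
             ++clear_prefixed_i_)                            \
            (arr)[clear_prefixed_i_] = 0;                    \
    } while (0)

/* Count the nonzero elements left in buf. */
static size_t count_nonzero(const int *buf, size_t len)
{
    size_t count = 0;
    for (size_t k = 0; k < len; ++k)
        if (buf[k] != 0)
            ++count;
    return count;
}

/* The caller's i is captured by the macro's own i: the loop condition
   becomes i < i, which is false at once, so nothing is cleared. */
size_t demo_plain(void)
{
    int buf[4] = {1, 2, 3, 4};
    size_t i = 4;
    CLEAR_PLAIN(buf, i);
    return count_nonzero(buf, i);
}

/* With a temporary name the caller does not use, all four are cleared. */
size_t demo_prefixed(void)
{
    int buf[4] = {1, 2, 3, 4};
    size_t i = 4;
    CLEAR_PREFIXED(buf, i);
    return count_nonzero(buf, i);
}

void check_capture(void)
{
    assert(demo_plain() == 4);     /* silently cleared nothing */
    assert(demo_prefixed() == 0);  /* cleared everything */
}
```

Calling check_capture() from main should pass. The prefixed variant only lowers the odds of a clash; it cannot rule one out, which is the point being made about text-substitution macros.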
I think there is a clause in the C spec that says names beginning with underscores (“uglified” names, I think they’re called) are reserved
for library implementors or something. But what happens if one library implementation depends on another? What keeps the choices of names
from clashing in that situation? Just luck, I guess.
The problem I have with that is the singular form of “library”. In a typical Linux distro, you could have thousands of libraries installed.
I just did this command on my Debian system:
dpkg-query -l lib*dev | wc -l
and the answer came back “1037”. The idea that a C-language implementation and run-time environment is in any sense monolithic seems hopelessly out of touch.
Kaz Kylheku <433-929-6894@kylheku.com> writes:
> On 2024-03-12, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
>> and the answer came back “1037”. The idea that a C-language
>> implementation and run-time environment is in any sense monolithic seems
>> hopelessly out of touch.
> There is no such out-of-touch idea. In (say) a Glibc-based system, only
> the GCC, Glibc and kernel headers are part of the implementation (which
> comprises C, POSIX plus GNU and Linux extensions), and only the GCC and
> Glibc library components and their external names.
> Other libraries are third parties; the __ and _[A-Z] namespace
> simply doesn't belong to them.
> C doesn't provide any special tools for the application developer and
> third party code to avoid clashes among themselves.
That’s true, but AFAICT it’s exactly what Lawrence is complaining about: there’s nothing in the language spec to help those thousand other
libraries avoid name clashes.
The limited support for avoiding name clashes in C (user-
level C, outside of the implementation internals) is
certainly something that he (or others) /could/ complain
about. It is a well-known issue, and it's a shame that
the C standards committee have never dealt with it. I
don't see why the language could not adopt a simple
"namespace" solution that would hugely simplify avoiding
identifier clashes. (It wouldn't help for macros, but we
have inline functions to handle many cases.)
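As a rough sketch of the "inline functions" remark: a statement macro like the __FD_ZERO one can often be written as a static inline function, whose locals live in their own scope. The type demo_set and the functions below are hypothetical stand-ins, not glibc's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an fd_set-like type; not glibc's definition. */
typedef struct {
    unsigned long bits[4];
} demo_set;

/* The argument is evaluated in the caller before the call, so it can
   never see (or be captured by) the local variable i declared here. */
static inline void demo_set_zero(demo_set *set)
{
    size_t i;
    for (i = 0; i < sizeof set->bits / sizeof set->bits[0]; ++i)
        set->bits[i] = 0;
}

/* Clears sets[i] using the caller's own i, then returns how many words
   of sets[1] are still nonzero. */
size_t demo_inline(void)
{
    demo_set sets[2] = {{{1, 2, 3, 4}}, {{5, 6, 7, 8}}};
    size_t i = 1;
    size_t nonzero = 0;

    demo_set_zero(&sets[i]);

    assert(sets[0].bits[0] == 1);   /* the other set is untouched */
    for (size_t k = 0; k < 4; ++k)
        nonzero += (sets[1].bits[k] != 0);
    return nonzero;
}
```

Here the caller's i and the function's i never meet, so no naming convention is needed for the temporary. This does nothing for the function's own external or file-scope name, which is the part a namespace feature would address.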
David Brown:
> [...] I don't see why the language could not adopt a simple
> "namespace" solution that would hugely simplify avoiding
> identifier clashes. [...]
My hypothetical solution is to have a single function
returning a struct with pointers to all the public functions
of a module.
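A minimal sketch of what such a module might look like, assuming a made-up module called vec2. The names struct vec2_api, vec2_module and check_vec2 are illustrative only and do not refer to any existing library.

```c
#include <assert.h>

/* Hypothetical module "vec2": the table of its public functions. */
struct vec2_api {
    int (*dot)(int ax, int ay, int bx, int by);
    int (*cross)(int ax, int ay, int bx, int by);
};

/* The implementations have internal linkage, so their short names are
   invisible to the linker and cannot clash with other modules. */
static int dot(int ax, int ay, int bx, int by)
{
    return ax * bx + ay * by;
}

static int cross(int ax, int ay, int bx, int by)
{
    return ax * by - ay * bx;
}

/* The only function the module exports. */
struct vec2_api vec2_module(void)
{
    struct vec2_api api = { dot, cross };
    return api;
}

/* A caller picks whatever local name it likes for the table. */
void check_vec2(void)
{
    struct vec2_api v = vec2_module();
    assert(v.dot(1, 2, 3, 4) == 11);
    assert(v.cross(1, 2, 3, 4) == -2);
}
```

Under this scheme only vec2_module (and the struct tag, in the header) remains as an ordinary global identifier, so the clash surface shrinks to one or two names per module rather than disappearing.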
On 12/03/2024 14:46, Anton Shepelev wrote:
> My hypothetical solution is to have a single function
> returning a struct with pointers to all the public
> functions of a module.
What stops that function name clashing with the single
function exported from other people's modules?
bart:
> What stops that function name clashing with the single
> function exported from other people's modules?
A much lower probability.
On 12/03/2024 15:09, Anton Shepelev wrote:
> A much lower probability.
I tried my C compiler with a couple of open source projects recently
that both failed for the same mysterious reason.
It turned out that one of them used this line:
#include "string.h"
and the other used:
#include "malloc.h"
People like reusing the same popular module names so much, they will
even use the names of standard headers!
On 2024-03-12, bart <bc@freeuk.com> wrote:
> I tried my C compiler with a couple of open source projects recently
> that both failed for the same mysterious reason.
> It turned out that one of them used this line:
> #include "string.h"
> and the other used:
> #include "malloc.h"
In the TXR project, I have a "signal.h" header, which must not resolve
to <signal.h>. I also have "time.h" and "termios.h", "glob.h",
"regex.h", "alloca.h".
Choosing header names that are distinct from an implementation's
headers is:
1) unnecessary due to the local-first search strategy of #include "..."
2) a fool's errand.
Regarding (2), no name that you choose is guaranteed not to be identical
to something in the implementation! Suppose I panic and rename
my "time.h" to "foo.h". Who is to say that some implementation doesn't
have a <foo.h> header?
There is no such rule that when you name a "whatever.h", you must
ensure there does not exist a <whatever.h>.
> People like reusing the same popular module names so much, they will
> even use the names of standard headers!
Sometimes deliberately so. The reason I called that header "termios.h"
is that the module it belongs to is related to the POSIX termios;
the source file is called termios.c and includes <termios.h> as
well as its own "termios.h". This makes things readable; someone
looking at the directory listing can guess that these files
constitute a module which wraps termios.
Any other naming would obscure that to some degree, other than
perhaps longer names that contain "termios" as a substring.
On Tue, 12 Mar 2024 08:03:50 +0000, Richard Kettlewell wrote:
> That’s true, but AFAICT it’s exactly what Lawrence is complaining about:
> there’s nothing in the language spec to help those thousand other
> libraries avoid name clashes.
My specific complaint was about temporary names being used internal to library macros. It’s a problem that is essentially impossible to solve
with string-based macro processors.
On 12/03/2024 18:50, Kaz Kylheku wrote:
> In the TXR project, I have a "signal.h" header, which must not resolve
> to <signal.h>. I also have "time.h" and "termios.h", "glob.h",
> "regex.h", "alloca.h".
> Choosing header names that are distinct from an implementation's
> headers is:
> 1) unnecessary due to the local-first search strategy of #include "..."
> 2) a fool's errand.
It's confusing. So "string.h" means the standard header, so it is the
same as <string.h>, unless it happens to find a file called string.h
amongst the project files.
That is undesirable, unless you specifically want to shadow the standard headers. In the examples I saw, that was not the case.
> Regarding (2), no name that you choose is guaranteed not to be identical
> to something in the implementation! Suppose I panic and rename
> my "time.h" to "foo.h". Who is to say that some implementation doesn't
> have a <foo.h> header?
The C implementation? Surely that will list all the system headers that
it provides; it looks quite easy to avoid a clash!
> There is no such rule that when you name a "whatever.h", you must
> ensure there does not exist a <whatever.h>.
You mean that programs should be allowed to do this:
#include <string.h>
#include "string.h"
With the two headers doing totally different things.
I can guess why such a rule doesn't exist: so many programs carelessly use "..." instead of <...> that they would all break if it were imposed.
> the source file is called termios.c and includes <termios.h> as
> well as its own "termios.h".
So, is that /your/ file termios.c, or the one that implements the POSIX termios code?
On Tue, 12 Mar 2024 11:51:21 -0400, James Kuyper wrote:
> ... everything it says about one is about
> the implementation as a whole, not the individual parts.
You don’t see the problem with trying to avoid clashes between those parts?
This is something new I saw today: suppose I have hello.c in a
directory (hello.c uses '#include <stdio.h>').
If I create an empty file called 'stdio.h', then 4 compilers I tried
all picked up that file instead of their official stdio.h. That looks
a dangerous practice to me.
It also seems, for a <...> file, to ignore the official repository
and look first within the user's project. So what exactly is the
difference between <...> and "..."? Is it just an extra set of backup
paths to look if it can't find anything within the user's files?
(The 5th compiler I tried ignored it and worked as normal; that was
mine. I can make it fail using my '-ext' option to look elsewhere
than the official headers location. I don't make a distinction
between <...> and "...".)
bart <bc@freeuk.com> writes:
> That is undesirable, unless you specifically want to shadow the
> standard headers. In the examples I saw, that was not the case.
You didn't mention that. If you'd tell us what project you're talking
about, maybe we could discuss it. Perhaps
Read section 6.10.2 of the standard. It describes the search rules for
the #include directive.
To summarize, #include <foo.h> searches for a header (probably but not necessarily a file) identified by foo.h. #include "foo.h" searches for
a *file* called foo.h, and if that fails it then searches for a header identified by <foo.h>. The sequences for both searches are implementation-defined.
On Wed, 13 Mar 2024 11:45:31 +0000
bart <bc@freeuk.com> wrote:
> This is something new I saw today: suppose I have hello.c in a
> directory (hello.c uses '#include <stdio.h>').
> If I create an empty file called 'stdio.h', then 4 compilers I tried
> all picked up that file instead of their official stdio.h. That looks
> a dangerous practice to me.
> [...]
I just tried three compilers and [in the absence of -I options] all three work
as expected, i.e. they ignored stdio.h in the current directory.
None of the three was of the variety that you appear to prefer.
Mine are mundane stuff.
However, all three took the local file when I gave them a -I option.
Not sure what to make of this. Whatever happens with non-default
options is probably in the "implementation-defined" domain as far as
the C Standard is concerned, but I still expected that such a common
option as -I would not affect standard headers.
bart <bc@freeuk.com> writes:
> I understand that the algorithm for finding include files was
> implementation-defined, and typically depended on these inputs:
> * Whether the filename used "..." or <...>
> * Whether the file-name specified was absolute or relative
> * The path of the source file in which the #include occurs
> * Possibly, the complete stack of paths for the current sequence set of
>   nested #includes
> * Possibly, on the CWD
> * On where the compiler keeps its standard headers (which in turn may
>   depend on OS)
> * On the set of -I directives given to the compiler (this is
>   something outside the remit of the standard, AIUI)
Yes -- but you left out a major part of it, that if a search for a ""
header fails, it continues by treating it as if it were a <> header.
> If I create an empty file called 'stdio.h', then 4 compilers I tried
> all picked up that file instead of their official stdio.h. That looks
> a dangerous practice to me.
If they're behaving as you're describing, then they're not conforming.
I've tried gcc, clang, and tcc, and all pick up the correct <stdio.h>
header even if there's a "stdio.h" file in the current directory.
It's possible in principle that a compiler could include the current directory in the <> search path, but that would be surprising, and none
of the compilers I've tried do so.
What are these 4 compilers? Are you sure you used <stdio.h> and not "stdio.h"? Did you use any additional command line options? Might you
have set some environment variable like $C_INCLUDE_PATH that affects the behavior?
> (The 5th compiler I tried ignored it and worked as normal; that was
> mine. I can make it fail using my '-ext' option to look elsewhere than
> the official headers location. I don't make a distinction between
> <...> and "...".)
Perhaps I'm missing something. If your compiler doesn't distinguish
between <> and "", then #include "stdio.h" should be equivalent to
#include <stdio.h>. You say it ignores a stdio.h file in the current directory. Then how can a source file include *any* header file in the current directory?
#include <stdio.h> // This includes the system header.
#include "stdio.h" // You say this ignores any local file and
// includes the system header.
#include "foo.h" // Does this not include a local file named "foo.h"?
(The standard does mention the possibility that the "foo.h" search is
not supported. Any such implementation would not be able to handle
user-defined header files; perhaps they would have to be installed as
"headers" somehow. In every implementation I know about, the compiler
will *at least* find the foo.h file if it's in the same directory as
the file that includes it.)
Michael S <already5chosen@yahoo.com> writes:
> [...]
> I just tried three compilers and [in absence of -I options] all 3 work
> as expected, i.e. ignored stdio.h in current directory.
> None of the three was of the variety that you appear to prefer.
> Mine's are mundane stuff.
> However all three took local file when I had given them an option -I.
> Not sure what to make of this. Whatever happens with
> non-default options is probably in "implementation-defined"
> domain as far as the C Standard is concerned, but I still
> expected that such common option as -I would not affect standard
> headers.
According to the GNU cpp manual, the "-I" option prepends directories to
the search path used for <> headers, and the "-iquote" option prepends directories to the search path used for "" headers.
I find that a bit surprising. I had never heard of the "-iquote"
option.
David Brown <david.brown@hesbynett.no> writes:
> On 13/03/2024 16:15, Keith Thompson wrote:
>> [...]
>> According to the GNU cpp manual, the "-I" option prepends
>> directories to the search path used for <> headers, and the
>> "-iquote" option prepends directories to the search path used
>> for "" headers.
>> I find that a bit surprising.
> You are not quite correct - and I find /that/ a bit surprising!
> The -I option applies equally to <...> and "..." headers.
I believe that's consistent with what I wrote, though I probably wasn't
clear enough.
To expand on it:
There are two lists of locations (typically directories). Let's call
them the <>-list (described in N1570 6.10.2p2) and the ""-list
(described in N1570 6.10.2p3).
#include <foo.h> searches the <>-list.
#include "foo.h" searches the ""-list; if that fails, it then searches
the <>-list as if for #include <foo.h>.
The -I option prepends directories to the <>-list, which means it
affects both #include <foo.h> and #include "foo.h". But for
#include "foo.h", a foo.h file in the same directory as the including
file will always be found first if it exists.
[...]
> The complete description is here:
> <https://gcc.gnu.org/onlinedocs/gcc/Directory-Options.html>.
[slightly reformatting quoted text]
1. For the quote form of the include directive, the directory of
the current file is searched first.
2. For the quote form of the include directive, the directories
specified by -iquote options are searched in left-to-right
order, as they appear on the command line.
3. Directories specified with -I options are scanned in
left-to-right order.
4. Directories specified with -isystem options are scanned in
left-to-right order.
5. Standard system directories are scanned.
6. Directories specified with -idirafter options are scanned in
left-to-right order.
What I called the ""-list is described in steps 1-2.
What I called the <>-list is described in steps 3-6.
[...]
On Wed, 13 Mar 2024 03:36:11 -0400, James Kuyper wrote:
> Yes, I do, and so do implementors. Avoiding those clashes is their
> responsibility.
Implementors of the C standard? What about providers of other libraries?
On Wed, 13 Mar 2024 08:06:28 -0700, Keith Thompson wrote:
> Any library from outside the implementation ...
So you are now distinguishing libraries from “outside the implementation” from those that are within it? And saying that those from “outside” have no right to use mechanisms (such as they are) to minimize name conflicts with regular user code?
Lawrence D'Oliveiro:
> So you are now distinguishing libraries from “outside the
> implementation” from those that are within it? And saying that those from
> “outside” have no right to use mechanisms (such as they are) to minimize
> name conflicts with regular user code?
Third party libraries are allowed to use any mechanism they like to
minimize name conflicts other than prefixing with two underscores.
On Wed, 13 Mar 2024 22:33:02 GMT, Scott Lurndal wrote:
> Third party libraries are allowed to use any mechanism they like to
> minimize name conflicts other than prefixing with two underscores.
But there is no other such mechanism available.
Lawrence D'Oliveiro <ldo@nz.invalid> writes:
> On Wed, 13 Mar 2024 22:33:02 GMT, Scott Lurndal wrote:
>> Third party libraries are allowed to use any mechanism they like to
>> minimize name conflicts other than prefixing with two underscores.
> But there is no other such mechanism available.
Are you aware that working third party libraries exist, and name
collisions are fairly rare? How do you think that's possible?
scott@slp53.sl.home (Scott Lurndal) writes:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>> [...]
>>> For context, here's the entire file from my system (Ubuntu 24.0.4,
>>> package libc6-dev:amd64 2.35-0ubuntu3.6). I get the impression that the
>>> author(s) decided not to use memset to avoid the required #include,
>>> which might increase compilation times for code that indirectly includes
>>> this header. (I offer no opinion on whether that's a good tradeoff.)
>> [...]
>> An older version did use memset(). It was changed to use a loop in
>> 1997, with a commit message that included:
>> "Don't use memset to prevent prototype trouble, use simple loop."
>> It may have been to avoid problems with pre-ANSI C compilers that didn't
>> support prototypes. That's still speculation on my part.
> Here's the full fc20 version:
> $ rpm -q -f /usr/include/bits/select.h
> glibc-headers-2.18-19.fc20.x86_64
> #if defined __GNUC__ && __GNUC__ >= 2
> [...]
> #else /* ! GNU CC */
> [...]
> #endif /* GNU CC */
> [...]
I don't see that in the GNU glibc git repo, even on branches with
"fedora" in their names. Perhaps it's a change applied by Red Hat and
not propagated upstream. (Though I'm not sure why Red Hat would need to allow for glibc not being compiled by gcc.)
Kaz Kylheku <433-929-6894@kylheku.com> writes:
> [...]
>> #if defined __GNUC__ && __GNUC__ >= 2
> Whoever wrote this didn't know that if __GNUC__ doesn't exist, it will
> expand as 0, which is false, so this is equivalent to just
> #if __GNUC__ >= 2
Or they did know that and decided that the longer version would be clearer.
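The equivalence is easy to check. This is a small sketch under one assumption: HYPOTHETICAL_COMPILER_VERSION is a made-up identifier that is deliberately never defined, standing in for __GNUC__ on a compiler that does not define it. The other macro and function names are likewise illustrative.

```c
#include <assert.h>

/* HYPOTHETICAL_COMPILER_VERSION is intentionally left undefined. */

/* Long form: "defined X" is 0, so the && already yields 0. */
#if defined HYPOTHETICAL_COMPILER_VERSION && HYPOTHETICAL_COMPILER_VERSION >= 2
#define LONG_FORM_TAKEN 1
#else
#define LONG_FORM_TAKEN 0
#endif

/* Short form: in #if, an identifier that is not a macro is replaced
   by 0, so this evaluates 0 >= 2. */
#if HYPOTHETICAL_COMPILER_VERSION >= 2
#define SHORT_FORM_TAKEN 1
#else
#define SHORT_FORM_TAKEN 0
#endif

int long_form_taken(void)  { return LONG_FORM_TAKEN; }
int short_form_taken(void) { return SHORT_FORM_TAKEN; }

void check_forms(void)
{
    assert(long_form_taken() == 0);
    assert(short_form_taken() == 0);
}
```

Both branches come out the same. One practical difference, as far as I know, is that GCC's -Wundef option warns about the short form but not the long one, which may be another reason to prefer the longer spelling.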
scott@slp53.sl.home (Scott Lurndal) writes:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>> [code snipped]
>> Note that __FD_ZERO is very clearly *not* intended to be invoked by
>> arbitrary code.
> That code is only selected if it is not compiled with
> gcc. If it is gcc 2 or later, the header file uses
> # define __FD_ZERO(fdsp) \
>   do { \
>     int __d0, __d1; \
>     __asm__ __volatile__ ("cld; rep; " __FD_ZERO_STOS \
>                           : "=c" (__d0), "=D" (__d1) \
>                           : "a" (0), "0" (sizeof (fd_set) \
>                                           / sizeof (__fd_mask)), \
>                             "1" (&__FDS_BITS (fdsp)[0]) \
>                           : "memory"); \
>   } while (0)
Oh? I don't see that code anywhere in the current glibc sources, in any older version of bits/select.h, or anywhere under /usr/include on my
system.