• ld order of .o and .a

    From Siri Cruise@21:1/5 to Kaz Kylheku on Thu Feb 11 18:51:30 2021
    In article <20210211160324.919@kylheku.com>,
    Kaz Kylheku <563-365-8930@kylheku.com> wrote:

    You can use the -( and -) arguments to parenthesize groups of .a
    files which are then mutually resolved.

    I'll have to remember that if it comes up again.

    Using this option has a significant performance cost. It is best

    I'm more concerned about the cognitive costs than the performance
    cost.

    --
    :-<> Siri Seal of Disavowal #000-001. Disavowed. Denied. Deleted. @
    'I desire mercy, not sacrifice.' /|\ Discordia: not just a religion but also a parody. This post / \
    I am an Andrea Doria sockpuppet. insults Islam. Mohammed

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Philip Guenther@21:1/5 to Siri Cruise on Thu Feb 11 21:20:41 2021
    On Thursday, February 11, 2021 at 7:26:46 PM UTC-8, Siri Cruise wrote:
    In article
    <87v9ayk...@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rwei...@talktalk.net> wrote:

    That's the GNU binutils linker and changing the default behaviour would probably break backwards-compatibility.
    I don't see how. Modern machines have enough memory to put
    definitions in a hash table by name and object file. If the
    library name is repeated, a double indexing allows the previous
    definitions to be deleted as the new object definitions are
    loaded. I have no pity for anyone using load order for such esoteric
    reasons that that would still break.

    I started unices in 1980. I appreciated at the time why ld had to
    be stupid. I didn't expect people to demand ld remain stupid when vm
    broke the 65KB and then 4MB boundaries.

    "History and economics matter"

    I assume I'm mistaken in part of this, but I believe that binutils ld exists because Cygnus needed a linker and had (created) the paying market of developers for embedded CPUs to support the core development; porting to the commercial unices was a stream
    of contributions on top of that, but the base exists because there were enough companies who could spend less by paying Cygnus to provide the common work.

    The second linker, gold, exists because a single company (Google) had resources to spare and saw a benefit worth the cost, to the point that I have heard that for some period of time (and still?) Chrome on { Android, Linux } could only be linked with
    gold and not with binutils ld because it's Just Too Big.

    However, gold is *not* a 100% replacement for binutils ld, both because of what might be called "differences in implementation defined behavior" (for the targets it supports) and because the latter supports _many_ more targets. The former means that
    making it the default would cause stuff to break which worked before and thus cost OS integrator effort, while the latter meant that no matter how much effort was spent on that you could never drop binutils ld. As an OS integrator, if you'll always have
    to deal with binutils ld then why waste time on gold? Let that cost be carried by the software projects which required it.

    The third linker, lld in the llvm project, again seems to exist because a company had needs not met by binutils ld (license, performance, support for execute-only-memory, whatever) and funded it and then the developers found an umbrella (llvm project)
    under which that could be shared and become a more inclusive community. Unlike gold, the differences from binutils ld (license, performance, etc.) were substantial enough for at least some OS integrators to get behind it and help build its community.
    But still, it hasn't killed binutils ld because it is not a complete replacement.


    Meanwhile, for the majority of developers NONE OF THIS MATTERS. Most developers never do anything big enough or complex enough for the linker to matter enough for them to spend any time arguing for a faster linker, particularly if there are others
    arguing against that for reasons of license ("politics"), compatibility, or disenfranchisement (platform support). When concentrated costs are weighed against diverse benefits (or concentrated benefits vs diverse costs), the concentrated forces tend to
    win because it _matters_ to them.

    So, we all have to deal with a sucky default linker on Linux because the groups that benefit from binutils being the default are concentrated while very few of the rest of us suffer enough to spend effort fighting about it.


    Philip Guenther

  • From Philip Guenther@21:1/5 to Siri Cruise on Thu Feb 11 21:31:00 2021
    On Thursday, February 11, 2021 at 4:05:51 PM UTC-8, Siri Cruise wrote:
    In article
    <87mtwaw...@doppelsaurus.mobileactivedefense.com>,
    Rainer Weikusat <rwei...@talktalk.net> wrote:

    Siri Cruise <chine...@yahoo.com> writes:
    Is linux ld sensitive to the file order the way macosx ld isn't?
    That would be so 1970s.

    I get a file list like -Llib -lgc -lcord ... -lm -liconv ... x.o
    y.o z.o . Loads fine on macosx. On linux it whines about missing
    symbols that I verify with nm are in libgc.a .

    It is.
    I had two cc in the makefile. To test I changed the one not
    called and missed the other. So changing the order didn't appear
    to work. It wasn't until I changed the order in both places that
    I realised linux is still using an ancient and archaic ld.

    Now I get to fix new and exciting crap.
    #define _GNU_SOURCE
    #define __USE_GNU

    Hmm, in all the glibc versions I have on hand, only _GNU_SOURCE is necessary; if that's defined then the internal <features.h> header will then define __USE_GNU, such that having defined the former (which has other effects too) the latter will always be
    defined. That matches the glibc info <spit> documentation only covering the former.
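
The claim above can be checked with a small probe. This is only a sketch: the helper names `use_gnu_defined` and `is_glibc` are made up for illustration, and the result depends on `_GNU_SOURCE` being defined before the first system header is included in the translation unit.

```c
/* Must precede every system header, or the feature-test macro is ignored. */
#define _GNU_SOURCE
#include <stdio.h>

/* Hypothetical helper: 1 if the C library headers defined __USE_GNU, else 0. */
int use_gnu_defined(void)
{
#ifdef __USE_GNU
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if the headers identify themselves as glibc, else 0. */
int is_glibc(void)
{
#ifdef __GLIBC__
    return 1;
#else
    return 0;
#endif
}
```

On a glibc system, calling `use_gnu_defined()` from a file laid out like this should return 1 without `__USE_GNU` ever being defined by hand. On other C libraries the macro is typically absent and the function returns 0.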

    If you've found a case where both of those are required, the glibc people will presumably take it as bug and request an example so they can fix it (...or explain how what you're doing is wrong...)


    Philip Guenther

  • From Siri Cruise@21:1/5 to Philip Guenther on Thu Feb 11 22:31:59 2021
    In article
    <e54a46f6-a9f9-4dc4-b723-0fa2c5f179c4n@googlegroups.com>,
    Philip Guenther <guenther@gmail.com> wrote:

    If you've found a case where both of those are required, the glibc people will presumably take it as bug and request an example so they can fix it (...or explain how what you're doing is wrong...)

    My experience is bug reporting is designed to discourage bug
    reporting. And I had never got a response. So I don't do it any
    longer. I patch with a few choice expletives and continue. I got
    my ed scripts to force configure and makefiles to actually work.

    The only response I got was with exiv2. If I find any new
    problems there, I will give them the fix.

    If people really want feedback, they have to acknowledge feedback
    and also realise sometimes the people reporting know more than
    they do.

    --
    :-<> Siri Seal of Disavowal #000-001. Disavowed. Denied. Deleted. @
    'I desire mercy, not sacrifice.' /|\ Discordia: not just a religion but also a parody. This post / \
    I am an Andrea Doria sockpuppet. insults Islam. Mohammed

  • From Nicolas George@21:1/5 to All on Sun Feb 14 22:39:03 2021
    "James K. Lowden", in message <20210214171905.192f2c39d7f7aa37e75c6cf8@speakeasy.net>, wrote:
    You're not suggesting, are you, that I can't write my own memcpy, put
    it in memcpy.a, and link it into my C program, overriding the function defined in libc?

    Actually, no, you cannot do that: if you properly included <string.h>, the compiler is allowed to replace a call to memcpy with any code that has the
    same semantics, including inline code without any function call, or calls to differently named functions. And in practice, they do.

  • From James K. Lowden@21:1/5 to Kaz Kylheku on Sun Feb 14 17:19:05 2021
    On Sun, 14 Feb 2021 20:35:06 -0000 (UTC)
    Kaz Kylheku <563-365-8930@kylheku.com> wrote:

    If it doesn't work that way, how is the linker supposed to choose
    between two implementations of a function defined in two
    libraries?

    "error: multiply defined symbol foo", I would hope.

    No, of course not. The One Definition Rule applies to data, not
    functions.

    You're not suggesting, are you, that I can't write my own memcpy, put
    it in memcpy.a, and link it into my C program, overriding the function
    defined in libc?

    One of us doesn't understand the other. I prepared for that person to
    be me, but right now I'm baffled at the discussion.

    --jkl

  • From Kaz Kylheku@21:1/5 to James K. Lowden on Mon Feb 15 00:08:36 2021
    On 2021-02-14, James K. Lowden <jklowden@speakeasy.net> wrote:
    On Sun, 14 Feb 2021 20:35:06 -0000 (UTC)
    Kaz Kylheku <563-365-8930@kylheku.com> wrote:

    If it doesn't work that way, how is the linker supposed to choose
    between two implementations of a function defined in two
    libraries?

    "error: multiply defined symbol foo", I would hope.

    No, of course not. The One Definition Rule applies to data, not
    functions.

    A rule by that name is found in C++. C++ does not allow multiply
    defined functions, other than inline functions.

    $ g++ multiple.cc
    multiple.cc: In function ‘void foo()’:
    multiple.cc:5:6: error: redefinition of ‘void foo()’
    void foo()
    ^~~
    multiple.cc:1:6: note: ‘void foo()’ previously defined here
    void foo()
    ^~~

    C++ has overloading; a function is multiply defined if the same overload
    of it is multiply defined.

    You're not suggesting, are you, that I can't write my own memcpy, put
    it in memcpy.a, and link it into my C program, overriding the function defined in libc?

    Since something like that could happen by accident, there is something
    to be said for having it diagnosed if it literally occurs as above.

    Yet, some sort of documented extension can exist for doing the override.

    It could just be the above, with the presence of the diagnostic.

    In the GNU/Linux environment, if you write your own functions that clash
    with glibc functions, you're not "overriding" anything. Libc calls the
    original functions using their internal names.

    This is necessary for basic ISO C conformance. A strictly conforming ISO
    C program can define a function called isatty() or read(). These
    definitions must not be mistakenly called by, say, stdio routines.

    The mechanism is supported by "weak symbols". When you write a read()
    function that's an ordinary symbol which "wins" over the weak symbol.
    That's a dynamic linking concept which has nothing to do with static
    "ld" or its library order.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygna: Cygwin Native Application Library: http://kylheku.com/cygnal
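
The internal-name arrangement described in the post above can be sketched in a single file with the GCC/Clang `weak` and `alias` attributes. It assumes an ELF target such as Linux. All three names (`lib_internal_read`, `lib_read`, `lib_fill_buffer`) are hypothetical stand-ins, not glibc's actual symbols, and this is not a copy of how glibc is written.

```c
/* Strong definition under an internal name; the library's own code calls
   this name and nothing else. */
long lib_internal_read(void)
{
    return 42;
}

/* Public name: only a weak alias for the internal function. If another
   object in the link supplies an ordinary (strong) lib_read, the linker
   prefers that one over this alias. */
long lib_read(void) __attribute__((weak, alias("lib_internal_read")));

/* A library routine. Because it calls the internal name, a program that
   defines its own lib_read cannot change what this function does. */
long lib_fill_buffer(void)
{
    return lib_internal_read();
}
```

With no competing definition, `lib_read()` and `lib_fill_buffer()` both end up in `lib_internal_read()`. Demonstrating the override itself would need a second translation unit that defines a strong `lib_read`, which is left out here.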

  • From Ben Bacarisse@21:1/5 to James K. Lowden on Mon Feb 15 02:01:18 2021
    "James K. Lowden" <jklowden@speakeasy.net> writes:

    On Sun, 14 Feb 2021 20:35:06 -0000 (UTC)
    Kaz Kylheku <563-365-8930@kylheku.com> wrote:

    If it doesn't work that way, how is the linker supposed to choose
    between two implementations of a function defined in two
    libraries?

    "error: multiply defined symbol foo", I would hope.

    No, of course not. The One Definition Rule applies to data, not
    functions.

    Maybe I've lost the context, but in C it applies to both. The trouble
    is that the rule is expressed in terms of "external definitions" and
    that is a concept that applies to the program text. The rule is probably
    supposed to include definitions in libraries (where there is no program
    text) but a lawyer could probably argue it either way.

    Anyway, the rule (however one interprets the case of libraries) applies
    to any identifier with external linkage -- both objects and functions.

    --
    Ben.

  • From James Kuyper@21:1/5 to James K. Lowden on Sun Feb 14 22:58:56 2021
    On 2/14/21 5:19 PM, James K. Lowden wrote:
    On Sun, 14 Feb 2021 20:35:06 -0000 (UTC)
    Kaz Kylheku <563-365-8930@kylheku.com> wrote:

    If it doesn't work that way, how is the linker supposed to choose
    between two implementations of a function defined in two
    libraries?

    "error: multiply defined symbol foo", I would hope.

    No, of course not. The One Definition Rule applies to data, not
    functions.

    You're not suggesting, are you, that I can't write my own memcpy, put
    it in memcpy.a, and link it into my C program, overriding the function defined in libc?

    One of us doesn't understand the other. I prepared for that person to
    be me, but right now I'm baffled at the discussion.

    The following citations are from the C2011 standard:

    "An external definition is an external declaration that is also a
    definition of a function (other than an inline definition) or an object.
    If an identifier declared with external linkage is used in an expression
    (other than as part of the operand of a sizeof or _Alignof operator
    whose result is an integer constant), somewhere in the entire
    program there shall be exactly one external definition for the
    identifier; otherwise, there shall be no more than one." (6.9p5).

    Notice that the rules are the same whether the identifier identifies a
    function or an object.

    "If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a
    constraint or runtime-constraint is violated, the behavior is
    undefined." (4p2).

    "All identifiers with external linkage in any of the following
    subclauses (including the future library directions) and errno are
    always reserved for use as identifiers with external linkage." (7.1.3p1)

    Note that memcpy is one of the identifiers with external linkage in the
    following subclauses.

    "If the program declares or defines an identifier in a context in which
    it is reserved (other than as allowed by 7.1.4) ... the behavior is
    undefined." (7.1.3p2).

    Providing a definition for memcpy with external linkage is not one of
    the cases allowed by 7.1.4.

    It's quite common for implementations to define the behavior when
    multiple definitions are available, in a way that's convenient. The
    behavior you describe is one example of that. However, they are not
    required to do so.

  • From Kaz Kylheku@21:1/5 to James Kuyper on Mon Feb 15 15:27:09 2021
    On 2021-02-15, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    It's quite common for implementations to define the behavior when
    multiple definitions are available, in a way that's convenient.

    That way remains convenient only when:

    1. the definitions are all controlled by your program, and do not
    clash with anything in the system.

    2. the definitions are identical (or "similar" according to some rules).

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygna: Cygwin Native Application Library: http://kylheku.com/cygnal

  • From James K. Lowden@21:1/5 to Kaz Kylheku on Mon Feb 15 15:44:27 2021
    On Mon, 15 Feb 2021 00:08:36 -0000 (UTC)
    Kaz Kylheku <563-365-8930@kylheku.com> wrote:

    You're not suggesting, are you, that I can't write my own memcpy,
    put it in memcpy.a, and link it into my C program, overriding the
    function defined in libc?

    Since something like that could happen by accident, there is something
    to be said for having it diagnosed if it literally occurs as above.

    Yet, some sort of documented extension can exist for doing the
    override.

    It could just be the above, with the presence of the diagnostic.

    I should have known better than to pick a C standard library function.
    OTOH, I learned something. After 4 decades, the only reasonable
    conclusion I can draw is that C programming is unlearnable.

    On the 3rd hand, I've been overriding functions since ever, and don't
    ever remember seeing a linker error if the same function appeared in
    multiple libraries. Here is a quick example:

    $ for F in nsymbol.c [ab].c
    do nl $F; done
    1 extern void foo(void);

    2 int main(int argc, char *argv[])
    3 {
    4 foo();
    5 return 0;
    6 }
    1 #include <stdio.h>

    2 void foo(void) {
    3 printf( "using %s:%s\n", __FILE__, __func__ );
    4 }
    1 #include <stdio.h>

    2 void foo(void) {
    3 printf( "using %s:%s\n", __FILE__, __func__ );
    4 }

    $ make -B nsymbol
    cc -c -o a.o a.c
    ar -r liba.a a.o
    cc -c -o b.o b.c
    ar -r libb.a b.o
    cc -std=c11 -onsymbol nsymbol.c -L. -la -lb

    $ ./nsymbol
    using a.c:foo

    That is the only behavior I ever remember seeing, and the only behavior
    I'd ever expect.

    We're outside the rules of the C standard here, afaik, and into the
    woolly world of whatever it is linkers do. I contend it's normal, and
    useful, and traditional for the linker to resolve symbols from
    libraries in the order in which the libraries are presented, full
    stop. Perhaps that behavior was the product of limited resources in
    the PDP/11 era, but it is also convenient, flexible, and
    deterministic.

    --jkl

  • From Rainer Weikusat@21:1/5 to James K. Lowden on Mon Feb 15 22:40:41 2021
    "James K. Lowden" <jklowden@speakeasy.net> writes:

    [...]

    I contend it's normal, and
    useful, and traditional for the linker to resolve symbols from
    libraries in the order in which the libraries are presented, full
    stop. Perhaps that behavior was the product of limited resources in
    the PDP/11 era, but it is also convenient, flexible, and
    deterministic.

    ,----
    | The argument routines are concatenated in the order specified. The entry
    | point of the output is the beginning of the first routine. If any
    | argument is a library, it is searched, and only those routines defining
    | an unresolved external reference are loaded. If any routine loaded from
    | a library refers to an undefined symbol which does not become defined by
    | the end of the library, the library is searched again.
    `----

    ld(1) manpage, Unix 1st ed, as available here: http://man.cat-v.org/unix-1st/1/ld

    This behaviour predates C by some time.
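
The search rule quoted from the manpage can be modelled in a few lines of C. This is a toy sketch, not how any real ld is implemented. The types and functions (`struct object`, `struct input`, `link_inputs`, `provider_of`) are invented for illustration, and symbols are plain strings. Duplicate definitions among loaded objects are silently ignored rather than diagnosed. The sample objects mirror the nsymbol/a/b example earlier in the thread, with the libc references left out.

```c
#include <string.h>

#define MAX_SYMS  8    /* per-object symbol slots, NULL-terminated */
#define MAX_TABLE 32   /* toy limit; this sketch does no bounds checking */

struct object {                    /* one relocatable object file */
    const char *name;
    const char *defs[MAX_SYMS];    /* symbols it defines */
    const char *refs[MAX_SYMS];    /* symbols it references */
};

struct input {                     /* one command-line argument */
    int is_archive;                /* 0: plain .o, 1: .a library */
    const struct object *members;
    int count;
};

struct link_state {
    const char *defined[MAX_TABLE];
    const char *provider[MAX_TABLE];   /* object that supplied defined[i] */
    int ndefined;
    const char *undefined[MAX_TABLE];
    int nundefined;
};

static int find(const char **table, int n, const char *sym)
{
    for (int i = 0; i < n; i++)
        if (strcmp(table[i], sym) == 0)
            return i;
    return -1;
}

/* Load one object: record its definitions, then queue unresolved references. */
static void load_object(struct link_state *st, const struct object *o)
{
    for (int i = 0; o->defs[i]; i++) {
        int u = find(st->undefined, st->nundefined, o->defs[i]);
        if (u >= 0)                          /* resolved: drop from the list */
            st->undefined[u] = st->undefined[--st->nundefined];
        if (find(st->defined, st->ndefined, o->defs[i]) < 0) {
            st->defined[st->ndefined] = o->defs[i];
            st->provider[st->ndefined++] = o->name;   /* first definition wins */
        }
    }
    for (int i = 0; o->refs[i]; i++)
        if (find(st->defined, st->ndefined, o->refs[i]) < 0 &&
            find(st->undefined, st->nundefined, o->refs[i]) < 0)
            st->undefined[st->nundefined++] = o->refs[i];
}

static int defines_undefined(struct link_state *st, const struct object *o)
{
    for (int i = 0; o->defs[i]; i++)
        if (find(st->undefined, st->nundefined, o->defs[i]) >= 0)
            return 1;
    return 0;
}

/* One left-to-right pass; returns how many symbols are still undefined. */
int link_inputs(struct link_state *st, const struct input *in, int n)
{
    memset(st, 0, sizeof *st);
    for (int i = 0; i < n; i++) {
        int pulled;
        do {   /* an archive is rescanned until it contributes nothing new */
            pulled = 0;
            for (int m = 0; m < in[i].count; m++) {
                const struct object *o = &in[i].members[m];
                if (!in[i].is_archive || defines_undefined(st, o)) {
                    load_object(st, o);
                    pulled = in[i].is_archive;
                }
            }
        } while (pulled);
    }
    return st->nundefined;
}

/* Name of the object that supplied sym, or NULL if it was never defined. */
const char *provider_of(struct link_state *st, const char *sym)
{
    int d = find(st->defined, st->ndefined, sym);
    return d >= 0 ? st->provider[d] : NULL;
}

/* Sample inputs modelled on the thread's example. */
const struct object nsymbol_o = { "nsymbol.o", { "main" }, { "foo" } };
const struct object a_o = { "a.o", { "foo" }, { NULL } };
const struct object b_o = { "b.o", { "foo" }, { NULL } };
```

Feeding the model `nsymbol.o`, then an archive holding `a.o`, then one holding `b.o` leaves nothing undefined and reports `a.o` as the provider of `foo`. Putting an archive ahead of `nsymbol.o` leaves `foo` undefined, because nothing had asked for it when the archive was scanned.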
