• Re: Bug#1030785: -ffile-prefix-map option and reproducibility

    From Stéphane Glondu@21:1/5 to All on Tue Feb 14 09:10:01 2023
    XPost: linux.debian.maint.ocaml.maint

    Hi,

    On 08/02/2023 at 10:58, Emilio Pozuelo Monfort wrote:
    > What is the purpose of having the build flags in a file in the .deb?

    ocamlc can act as a driver for the C compiler, to compile C stubs with
    the "right" flags. These flags are basically the CFLAGS ocaml was
    compiled with, plus some additional OCaml-specific flags (like -fwrapv).

    But I wonder now: shouldn't the CFLAGS part be queried from the
    environment at use time, rather than at OCaml compilation time? This
    situation, where a non-C compiler calls a C compiler, must come up
    often... I don't know what the practice is elsewhere.

    If I push the reasoning to the extreme, why doesn't gcc take into
    account CFLAGS itself?


    Cheers,

    --
    Stéphane

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Peter Pentchev@21:1/5 to All on Tue Feb 14 09:30:01 2023

    On Tue, Feb 14, 2023 at 09:04:47AM +0100, Stéphane Glondu wrote:
    > But I wonder now: shouldn't the CFLAGS part be queried from the
    > environment at use time, rather than at OCaml compilation time? This
    > setting of a non-C compiler calling a C compiler must happen often... I
    > don't know what's the practice elsewhere.

    I can't speak for many other systems, but at least with Perl's XS
    (the standard way to write Perl modules, parts of which are compiled C
    code) and two of the ways this can be done in Python, it is the same:
    the C compiler's name and flags are recorded at the time the "wrapper"
    is built, so that they can be used later. At least IMHO, this has two
    main benefits:
    - people (or programs) that want to build a Perl/Python/OCaml/whatever
      module do not have to know anything about C compilers, flags,
      optimization, language-specific libraries[1], etc. The consumers tell
      their own language ecosystem "look, I need to compile something that
      has C parts in it, go do something about it", and the wrapper for the
      C compiler knows exactly what to do
    - everything is compiled using the same compiler[2], the same
      optimization flags, the same libraries, etc., so one module that has C
      parts in it can call the C functions from another module directly with
      no fear of any kind of calling convention mismatch or whatever
    - to reinforce the previous point: everything is compiled using the same
      recorded settings *no matter what* environment variables or paths there
      may be in the current user's execution environment, so that nothing
      will break if somebody has the wrong environment variable defined when
      they build a module a couple of months later. Of course, there are
      some cases when some such systems allow additional flags to be
      specified or overridden, but IMHO it is good that this must be done
      explicitly and is most often not done at all, so that everything is
      built in the same way and it can all work together

    [1] At least with Perl and Python, C code very often invokes functions
    from the Perl/Python standard library: the C code does not know how to
    create a list or how to invoke a Perl/Python function, so it has to use
    the language's internal libraries for that.

    [2] Well, okay, that's not strictly true, since the binary called "gcc"
    now may not be the same one that was provided by the C compiler package
    a couple of months ago, but it ought to be guaranteed to generate
    compatible code and object files.

    So yeah, I'd say that "record the C compiler flags at build time" is
    pretty much standard practice for such interface providers.
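
    As a rough illustration of the "record at build time, reuse later"
    pattern described above, here is a minimal Python sketch. The RECORDED
    table and the compile_command() helper are hypothetical names made up
    for this example; they are not the API of Perl, Python or OCaml. The
    point is only that the caller's environment is never consulted and that
    extra flags have to be passed explicitly.

```python
import shlex

# Hypothetical values standing in for what a language runtime might
# record when it is itself built; not taken from any real package.
RECORDED = {
    "CC": "gcc",
    "CFLAGS": "-O2 -g -fwrapv",
}


def compile_command(source, extra_cflags=None):
    """Build a C compiler command line from the recorded settings.

    The caller's environment (e.g. $CFLAGS) is deliberately ignored:
    additional flags only apply when they are passed in explicitly.
    """
    cmd = [RECORDED["CC"], *shlex.split(RECORDED["CFLAGS"])]
    if extra_cflags:
        cmd += shlex.split(extra_cflags)
    return cmd + ["-c", source]


if __name__ == "__main__":
    # Default: recorded settings only.
    print(shlex.join(compile_command("stubs.c")))
    # Explicit opt-in to one additional flag.
    print(shlex.join(compile_command("stubs.c", extra_cflags="-DDEBUG")))
```

    Whether explicit overrides should append to or replace the recorded
    flags is a design choice that differs between systems; this sketch
    simply appends.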

    G'luck,
    Peter

    --
    Peter Pentchev roam@ringlet.net roam@debian.org pp@storpool.com
    PGP key: http://people.FreeBSD.org/~roam/roam.key.asc
    Key fingerprint 2EE7 A7A5 17FC 124C F115 C354 651E EFB0 2527 DF13

  • From Simon McVittie@21:1/5 to Peter Pentchev on Tue Feb 14 11:00:01 2023

    On Tue, 14 Feb 2023 at 10:21:31 +0200, Peter Pentchev wrote:
    > [...] the C compiler's name and flags are recorded at the time the
    > "wrapper" is built, so that they can be used later. [...]

    I think this is very common, but really a bit too simplistic. Some of
    the compiler flags are part of an interface between components, and for
    those flags, it makes sense to record them in the compiler-wrapper or
    in some metadata file like a pkg-config .pc file; but some of them are
    private implementation details of the component currently being compiled,
    and don't make sense to replicate between components.

    For instance, -lpython3.11 is certainly part of the interface, -O2 -g
    is not part of the interface *on Debian* but might be on other platforms
    (on glibc systems there's only one C runtime library, but Microsoft
    compilers use different and incompatible runtime libraries for release
    and debug builds), but facts about the build directory like
    -I/tmp/python3.11-3.11.2/Include or
    -ffile-prefix-map=/tmp/python3.11-3.11.2=. are certainly not part of
    the interface.

    Neither are warning-control options like -Wall or
    -Werror=implicit-function-declaration, really: just because the Python
    maintainers want warnings or even errors when compiling Python, that
    doesn't necessarily mean it's right to require all Python extension
    modules to be built with those same warnings.
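
    To make the distinction above concrete, here is a small Python sketch
    that sorts flags into the categories just described. The classify_flag()
    function and its rules are hypothetical and cover only the examples
    mentioned in this message; a real tool would need a much more careful
    list.

```python
def classify_flag(flag, build_dir):
    """Classify one compiler flag, following the examples in the thread.

    The rules are illustrative assumptions, not an exhaustive policy.
    """
    # Anything that mentions the build directory is a fact about one build.
    if build_dir in flag:
        return "build-private"
    # Warning control is the upstream maintainers' own choice
    # (-Wl, -Wa, and -Wp, pass options to other tools and are not warnings).
    if flag.startswith("-W") and not flag.startswith(("-Wl,", "-Wa,", "-Wp,")):
        return "build-private"
    # Libraries to link against are part of the interface.
    if flag.startswith("-l"):
        return "interface"
    # Optimization and debug levels matter on some platforms only.
    if flag.startswith(("-O", "-g")):
        return "platform-dependent"
    return "unclassified"


if __name__ == "__main__":
    build_dir = "/tmp/python3.11-3.11.2"
    flags = [
        "-lpython3.11",
        "-O2",
        "-g",
        "-I/tmp/python3.11-3.11.2/Include",
        "-ffile-prefix-map=/tmp/python3.11-3.11.2=.",
        "-Wall",
        "-Werror=implicit-function-declaration",
    ]
    for flag in flags:
        print(f"{classify_flag(flag, build_dir):20} {flag}")
```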

    smcv

  • From Peter Pentchev@21:1/5 to Simon McVittie on Tue Feb 14 16:10:01 2023

    On Tue, Feb 14, 2023 at 09:51:50AM +0000, Simon McVittie wrote:
    > I think this is very common, but really a bit too simplistic. Some of
    > the compiler flags are part of an interface between components, and for
    > those flags, it makes sense to record them in the compiler-wrapper or
    > in some metadata file like a pkg-config .pc file; but some of them are
    > private implementation details of the component currently being
    > compiled, and don't make sense to replicate between components.
    > [...]

    Right, when I said "record the compiler flags", I did not mean "and
    always pass them on verbatim". I think you may already know this, since
    you talk about Python, but yeah, in Python's case things are really not
    that simple. This command:

    python3 -c 'import pprint; import sysconfig; pprint.pp(dict(item for item in sysconfig.get_config_vars().items() if "CFLAGS" in item[0]));'

    ...displays all of the "system configuration variables" (pretty much
    exactly the things recorded at Python build time) that have "CFLAGS" in
    their name, and at least with Python 3.11 in testing, there are *a lot*
    of those. Some of them are obviously module-specific configuration for
    the various Python standard library modules, but there are others, too.

    Other systems record compiler (and linker, etc.) flags with different
    granularity, but yes, you are correct that it makes a lot of sense to
    take care over what is recorded and how.
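
    For readers who prefer a script to a one-liner, the same query can be
    sketched as below. It uses only sysconfig.get_config_vars() from the
    standard library; the recorded_cflags_vars() helper name is made up for
    this example, and the variables it finds depend entirely on how the
    local Python was built (the result may well be empty on some platforms).

```python
import sysconfig


def recorded_cflags_vars():
    """Return the build-time configuration variables with CFLAGS in their name."""
    config = sysconfig.get_config_vars()
    return {name: value for name, value in config.items() if "CFLAGS" in name}


if __name__ == "__main__":
    # Print one variable per line, sorted by name for easier comparison
    # between two Python builds.
    for name, value in sorted(recorded_cflags_vars().items()):
        print(f"{name} = {value!r}")
```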

    G'luck,
    Peter

    --
    Peter Pentchev roam@ringlet.net roam@debian.org pp@storpool.com
    PGP key: http://people.FreeBSD.org/~roam/roam.key.asc
    Key fingerprint 2EE7 A7A5 17FC 124C F115 C354 651E EFB0 2527 DF13

  • From Niko Tyni@21:1/5 to Peter Pentchev on Tue Feb 14 20:10:02 2023
    On Tue, Feb 14, 2023 at 05:07:49PM +0200, Peter Pentchev wrote:

    > Other systems record compiler (and linker, etc) flags with different
    > granularity, but yes, you are correct that it makes a lot of sense to
    > take care what is recorded and how.

    Indeed.

    FWIW what we do with Perl is to filter away those flags that come from
    dpkg-buildflags [1], but record the others. The dpkg-buildflags ones
    get passed into XS module package builds separately by debhelper, so
    packages can individually opt out of things like hardening if necessary
    via the normal interface (DEB_BUILD_MAINT_OPTIONS etc.)

    There's some background that led to this in #657853.

    Some important flags that really need to be recorded are those that
    affect the Perl <> XS module binary interface, in particular the LFS
    ones (-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64).

    [1] https://sources.debian.org/src/perl/5.36.0-7/debian/rules/#L182
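
    The filtering idea described above can be sketched in a few lines of
    Python. This is not a copy of what the Perl packaging does in
    debian/rules; the flags_to_record() helper and the sample flag strings
    are assumptions made up for illustration. On a Debian system the
    distribution-provided flags could be obtained from
    "dpkg-buildflags --get CFLAGS" instead of the hard-coded string.

```python
import shlex


def flags_to_record(build_cflags, distro_cflags):
    """Drop distribution-injected flags from the list that gets recorded.

    build_cflags:  everything the interpreter was compiled with
    distro_cflags: the part injected by the distribution's build system
    """
    injected = set(shlex.split(distro_cflags))
    return [flag for flag in shlex.split(build_cflags) if flag not in injected]


if __name__ == "__main__":
    # Sample values made up for this example.
    distro = "-g -O2 -fstack-protector-strong -Wformat -Werror=format-security"
    used = distro + " -fwrapv -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64"
    # Only the flags that did not come from the distribution remain,
    # including the LFS defines that affect the binary interface.
    print(flags_to_record(used, distro))
```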

    --
    Niko Tyni ntyni@debian.org
