• Bug#876055: Environment variable handling for reproducible builds

    From Russ Allbery@21:1/5 to All on Mon Sep 18 02:30:01 2017
    XPost: linux.debian.bugs.dist

    Package: debian-policy
    Version: 4.1.0.0
    Severity: normal

    Currently, Debian Policy requires that all environment variables be held
    the same across builds for the build to be expected to be reproducible.
    However, the current approach of some reproducible build tools is
    instead to enumerate a set of fixed environment variables and allow other
    variables to vary.

    We should ideally converge on a single approach to environment variables
    and build reproducibility, and make it easy for tools to implement that
    approach.

    I think the alternatives are:

    1. Enumerate environment variables to hold fixed. This is better in
    the sense that it allows packages to be reproducible under more
    situations, but it's unstable in the sense that we'll never be able to
    enumerate all environment variables that might possibly affect the
    build. It's also not testable in the sense that we can't set every
    possible environment variable.

    2. Set the entire environment to the environment specified in buildinfo
    when doing a reproducible build. I think this is conceptually the
    simplest, but it means that we should make every tool that builds
    official Debian packages use the same environment variable logic so
    that the buildinfo file completely captures the environment (without
    leaking random, inappropriate things into buildinfo). It also means
    effectively giving up on debian/rules build being a path for making a
    reproducible build, since we don't have control over that environment,
    but I think it will be hard to make that work anyway.

    3. List a set of environment variables that are permitted to vary in the
    reproducible build policy, and then have reproducible builds clean the
    environment except for that set and then apply the buildinfo environment
    variable set. This is very similar to 2. I think the primary advantage
    is that it lets us require packages build reproducibly in the presence
    of some settings that logically should not affect the build (USER, HOME,
    etc.), at the cost of making reproducible builds harder to achieve.
    It's mostly testable, in that one can try reproducible builds with
    various settings for those variables, although it would be hard to catch
    corner cases where only a specific setting causes issues.
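
    The environment construction described in alternative 3 can be sketched
    as follows. This is only an illustration under stated assumptions: the
    function name and the PERMITTED_TO_VARY set are hypothetical (USER and
    HOME are the examples named above; a real list would come from Policy),
    and the buildinfo environment is modelled as a plain mapping.

```python
from typing import Mapping

# Hypothetical set of variables permitted to vary. USER and HOME are the
# examples given in the text; the real list would be defined by Policy.
PERMITTED_TO_VARY = frozenset({"USER", "HOME"})


def reproducible_build_env(
    current: Mapping[str, str],
    buildinfo_env: Mapping[str, str],
    permitted: frozenset = PERMITTED_TO_VARY,
) -> dict:
    """Clean the environment except for the permitted set, then apply buildinfo."""
    # Step 1: drop every variable that is not explicitly permitted to vary.
    env = {name: value for name, value in current.items() if name in permitted}
    # Step 2: apply the recorded buildinfo values; they take precedence.
    env.update(buildinfo_env)
    return env
```

    Under these assumptions, alternative 2 is the same function called with
    an empty permitted set, so that only the buildinfo values survive.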

    I personally lean towards 2, which is consistent with what's in Policy
    right now, but I can see definite merits in 3. I believe the
    reproducible builds project is currently sort of doing 1, but I have a
    hard time seeing how to make that viable on the testing side.

    -- System Information:
    Debian Release: buster/sid
    APT prefers unstable
    APT policy: (990, 'unstable'), (1, 'experimental')
    Architecture: amd64 (x86_64)
    Foreign Architectures: i386

    Kernel: Linux 4.12.0-1-amd64 (SMP w/4 CPU cores)
    Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
    Shell: /bin/sh linked to /bin/dash
    Init: systemd (via /run/systemd/system)

    Versions of packages debian-policy depends on:
    ii libjs-sphinxdoc 1.6.3-2

    debian-policy recommends no packages.

    Versions of packages debian-policy suggests:
    pn doc-base <none>

    -- no debconf information

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Paul Sherwood@21:1/5 to Russ Allbery on Mon Sep 18 10:30:02 2017
    XPost: linux.debian.bugs.dist

    On 2017-09-18 00:26, Russ Allbery wrote:
    2. Set the entire environment to the environment specified in buildinfo
    when doing a reproducible build. I think this is conceptually the
    simplest, but it means that we should make every tool that builds
    official Debian packages use the same environment variable logic so
    that the buildinfo file completely captures the environment (without
    leaking random, inappropriate things into buildinfo). It also means
    effectively giving up on debian/rules build being a path for making a
    reproducible build, since we don't have control over that environment,
    but I think it will be hard to make that work anyway.

    FWIW this is the approach we've taken on both of the Baserock build
    tools, and for BuildStream [1].

    Given that it's trivially easy for a build script to try to call out to
    the internet (eg fetch tarball, git clone), or look for custom
    environment variables, we think it's clearly safest to put everything in
    a sandbox and be explicit about resources, network and environment
    variables.
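
    As a minimal sketch of the "be explicit about environment variables"
    part only (it does nothing about network or filesystem access, which a
    real sandbox would also have to restrict), a build step can be started
    with an environment that is passed in full rather than inherited. The
    helper name run_with_explicit_env is hypothetical and is not taken from
    Baserock or BuildStream.

```python
import subprocess
import sys
from typing import Mapping, Sequence


def run_with_explicit_env(command: Sequence[str], env: Mapping[str, str]) -> str:
    """Run command with only the variables in env; nothing is inherited."""
    result = subprocess.run(
        list(command),
        env=dict(env),  # replaces the parent's environment instead of extending it
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # The child sees LANG because it was listed, and no inherited PATH.
    child = "import os; print(os.environ.get('LANG')); print('PATH' in os.environ)"
    print(run_with_explicit_env([sys.executable, "-c", child], {"LANG": "C.UTF-8"}))
```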

    br
    Paul

    [1] https://wiki.gnome.org/Projects/BuildStream/

  • From Daniel Kahn Gillmor@21:1/5 to Russ Allbery on Tue Sep 19 01:10:02 2017
    XPost: linux.debian.bugs.dist

    On Sun 2017-09-17 16:26:25 -0700, Russ Allbery wrote:
    I personally lean towards 2, which is consistent with what's in Policy
    right now, but I can see definite merits in 3. I believe the
    reproducible builds project is currently sort of doing 1, but I have a
    hard time seeing how to make that viable on the testing side.

    Thanks for raising this question, Russ!

    I'm not sure that we should let lack of exhaustive testing push us away
    from (1). (1) is in principle the right thing -- it's easy to make a
    build reproducible if we tell people that they have to do exactly one
    specific thing. But we generally want people to be able to run
    heterogeneous systems, and not to force them into one particular
    environment.

    Consider someone who wants to see more logging from a build, for
    example. There could be an environment variable that encourages the
    toolchain to log more, but doesn't affect the binary objects created by
    the build. By going with choices (2) or (3) we effectively dismiss even
    considering the reproducibility of those builds, which seems like a
    shame.

    Does everything in policy need to be rigorously testable? Or is it OK
    to have Policy state the desired outcome even if we don't know how (or
    don't have the resources) to test it fully today?

    I'd prefer for policy to be able to make strong advisory statements even
    without us being able to test them mechanically. This is already the
    case for (for example) "preferred form of modification" -- it's partly
    testable, but will never be 100% testable, and will always require
    research and discussion and thinking for the corner cases. Yet we
    continue to aim for it.

    Policy should be aiming high, not lowering the bar to meet what's
    concretely testable.

    --dkg

  • From Russ Allbery@21:1/5 to Daniel Kahn Gillmor on Tue Sep 19 01:50:01 2017
    XPost: linux.debian.bugs.dist

    Daniel Kahn Gillmor <dkg@fifthhorseman.net> writes:
    On Sun 2017-09-17 16:26:25 -0700, Russ Allbery wrote:

    I personally lean towards 2, which is consistent with what's in Policy
    right now, but I can see definite merits in 3. I believe the
    reproducible builds project is currently sort of doing 1, but I have a
    hard time seeing how to make that viable on the testing side.

    Thanks for raising this question, Russ!

    I'm not sure that we should let lack of exhaustive testing push us away
    from (1). (1) is in principle the right thing -- it's easy to make a
    build reproducible if we tell people that they have to do exactly one
    specific thing. But we generally want people to be able to run
    heterogeneous systems, and not to force them into one particular
    environment.

    Well... I would argue that the amount of time and effort that's gone into
    this project shows that it's not that easy to make a build reproducible
    even when telling people to do exactly one thing. :) But I get your
    point.

    Consider someone who wants to see more logging from a build, for
    example. There could be an environment variable that encourages the
    toolchain to log more, but doesn't affect the binary objects created by
    the build. By going with choices (2) or (3) we effectively dismiss even
    considering the reproducibility of those builds, which seems like a
    shame.

    This is the case for (2), but not for (3). Indeed, this is exactly the
    distinction between (2) and (3). It does mean that discovery of any new
    such environment variable would require a change to our whitelist in
    approach (3), so there would be some lag and the whitelist would become
    long over time (with a corresponding testing load). But (3) does try to
    achieve that use case without trying to anticipate any possible
    environment variable setting. It lets us be reactive to newly-discovered
    environment variables across which we want to stay reproducible.

    Does everything in policy need to be rigorously testable? Or is it OK
    to have Policy state the desired outcome even if we don't know how (or
    don't have the resources) to test it fully today?

    I don't think everything has to be rigorously testable, but I do think
    it's a useful canary. If I can't test something, I start wondering
    whether that means I have problems with my underlying assumptions.

    In particular, for (1), we have no comprehensive list of environment
    variables that affect the behavior of tools, and that list would be
    difficult to create. Many pieces of software add their own environment
    variables with little coordination, and many of those variables could
    possibly affect tool output.

    I feel like the work for (1) and for (3) ends up being comparable; for (1)
    we have to maintain a blacklist, and for (3) we have to maintain a
    whitelist. But (3) is testable, whereas (1) is inherently aspirational
    and will always have to be aspirational. We're endlessly going to be
    discovering some other environment variable that changes tool output.

    I'm also unsure that (1) is even what we want to claim. Do we really want
    to say that builds are always reproducible if you don't change this short
    list of environment variables, no matter whatever other environment
    variables you set? There's some appeal in this for the end user, but it
    feels very frustrating for the package maintainer. At first glance, as a
    package maintainer, I'd think I'd have to maintain a huge blacklist of
    environment variables that I've discovered affect my toolchain somewhere,
    and explicitly unset them all in debian/rules. This doesn't feel like a
    good use of anyone's time (and may actually *break* other,
    non-reproducibility-related things that people want to do with my
    package).

    --
    Russ Allbery (rra@debian.org) <http://www.eyrie.org/~eagle/>

  • From Vagrant Cascadian@21:1/5 to Vagrant Cascadian on Tue Sep 19 06:40:02 2017
    XPost: linux.debian.bugs.dist

    On 2017-09-18, Vagrant Cascadian wrote:
    On 2017-09-18, Russ Allbery wrote:
    Daniel Kahn Gillmor <dkg@fifthhorseman.net> writes:
    On Sun 2017-09-17 16:26:25 -0700, Russ Allbery wrote:

    Does everything in policy need to be rigorously testable? Or is it OK
    to have Policy state the desired outcome even if we don't know how (or
    don't have the resources) to test it fully today?

    I don't think everything has to be rigorously testable, but I do think
    it's a useful canary. If I can't test something, I start wondering
    whether that means I have problems with my underlying assumptions.

    In particular, for (1), we have no comprehensive list of environment
    variables that affect the behavior of tools, and that list would be
    difficult to create. Many pieces of software add their own environment
    variables with little coordination, and many of those variables could
    possibly affect tool output.

    There is a huge difference between variables that *might* affect the
    build as an unintended input that gets stored in the resulting packages
    in some manner, and variables that are designed to change the behavior of
    parts of the build toolchain.

    I consider unintended variables that affect the build output a bug, and
    variables designed and intended to change the behavior of the toolchain
    expected, reasonable behavior.

    Ok, after discussing on IRC a bit, I figured it might be worth expanding
    on that point a bit...


    The environment variables (and other variations) used by the
    reproducible builds test infrastructure are listed at:

    https://tests.reproducible-builds.org/debian/index_variations.html

    I'll try and summarize the rationale for each of the variables used,
    many of which have had actual impacts on the result of the builds:


    CAPTURE_ENVIRONMENT, BUILDUSERID, BUILDUSERNAME

    Some builds capture the entire environment, or most of the environment;
    setting arbitrary environment variables can help detect this.

    TZ

    The timezone used can change the results of embedded timestamps.

    LANG, LANGUAGE, LC_ALL

    The locale and language settings definitely change the strings embedded
    in some binaries, if tool output is translated.

    PATH, USER, HOME

    Some builds embed these.

    DEB_BUILD_OPTIONS=parallel=N

    The level of parallelism can change the build output, although other
    values in DEB_BUILD_OPTIONS might be reasonably expected to change
    output (e.g. noautodbgsym).


    None of the above variables should change the resulting built package,
    with the possible exception of some other values of DEB_BUILD_OPTIONS.

    On the other hand, I would expect variables such as CC, MAKE,
    CROSS_COMPILE, CFLAGS, etc. to reasonably and likely change the result
    of the built package. They are, in a sense, part of the build toolchain
    environment.
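
    The kind of variation test described above can be sketched roughly as
    follows. This is not the tests.reproducible-builds.org implementation:
    the build is modelled as a hypothetical callable that takes an
    environment and returns the output bytes, and the two sets of values are
    made up for illustration.

```python
import hashlib
from typing import Callable, Dict, Mapping

# Two hypothetical environments that differ only in variables which, per the
# list above, should not change the built package. The values are invented.
VARIATION_A = {"TZ": "UTC", "LANG": "C.UTF-8", "USER": "builder1",
               "HOME": "/nonexistent/first"}
VARIATION_B = {"TZ": "Pacific/Kiritimati", "LANG": "fr_CH.UTF-8",
               "USER": "builder2", "HOME": "/nonexistent/second"}


def is_reproducible(
    build: Callable[[Mapping[str, str]], bytes],
    base_env: Mapping[str, str],
) -> bool:
    """Build twice under varied environments and compare output digests."""
    digests = []
    for variation in (VARIATION_A, VARIATION_B):
        env: Dict[str, str] = dict(base_env)
        env.update(variation)  # toolchain variables in base_env stay fixed
        digests.append(hashlib.sha256(build(env)).hexdigest())
    return digests[0] == digests[1]
```

    Variables of the second class (CC, CFLAGS and so on) go in base_env and
    are deliberately held constant across both runs, so a build that depends
    on them still counts as reproducible in this sketch.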


    Without generating comprehensive blacklists and/or whitelists, is it
    plausible to come up with a policy description of the above two classes
    of variables? Given the above lists, it seems relatively obvious to me
    that there are basically two classes of variables, but I'm at a loss for
    how to really describe it in policy.

    You could give a reasonable test of:

    Is this variable intended to change the results of the binary, or is
    it changing the build as an unintended side-effect?

    That does require reasoned interpretation, though. I envision such tests
    being used in bug reports relating to reproducibility issues, on a
    case-by-case basis.


    It doesn't solve the testability issue on a policy level, but that could
    possibly be addressed outside of policy through best practices for
    reproducibility documentation.


    live well,
    vagrant

  • From Vagrant Cascadian@21:1/5 to Russ Allbery on Tue Sep 19 04:10:02 2017
    XPost: linux.debian.bugs.dist

    On 2017-09-18, Russ Allbery wrote:
    Daniel Kahn Gillmor <dkg@fifthhorseman.net> writes:
    On Sun 2017-09-17 16:26:25 -0700, Russ Allbery wrote:

    I personally lean towards 2, which is consistent with what's in Policy
    right now, but I can see definite merits in 3. I believe the
    reproducible builds project is currently sort of doing 1, but I have a
    hard time seeing how to make that viable on the testing side.

    Thanks for raising this question, Russ!

    Indeed!


    I'm not sure that we should let lack of exhaustive testing push us away
    from (1). (1) is in principle the right thing -- it's easy to make a
    build reproducible if we tell people that they have to do exactly one
    specific thing. But we generally want people to be able to run
    heterogeneous systems, and not to force them into one particular
    environment.

    Well... I would argue that the amount of time and effort that's gone into
    this project shows that it's not that easy to make a build reproducible
    even when telling people to do exactly one thing. :) But I get your
    point.

    Much of the work has already been done by aspirational, principled
    folks... :)


    Does everything in policy need to be rigorously testable? Or is it OK
    to have Policy state the desired outcome even if we don't know how (or
    don't have the resources) to test it fully today?

    I don't think everything has to be rigorously testable, but I do think
    it's a useful canary. If I can't test something, I start wondering
    whether that means I have problems with my underlying assumptions.

    In particular, for (1), we have no comprehensive list of environment
    variables that affect the behavior of tools, and that list would be
    difficult to create. Many pieces of software add their own environment
    variables with little coordination, and many of those variables could
    possibly affect tool output.

    There is a huge difference between variables that *might* affect the
    build as an unintended input that gets stored in the resulting packages
    in some manner, and variables that are designed to change the behavior of
    parts of the build toolchain.

    I consider unintended variables that affect the build output a bug, and
    variables designed and intended to change the behavior of the toolchain
    expected, reasonable behavior.


    I feel like the work for (1) and for (3) ends up being comparable; for (1)
    we have to maintain a blacklist, and for (3) we have to maintain a
    whitelist. But (3) is testable, whereas (1) is inherently aspirational
    and will always have to be aspirational. We're endlessly going to be
    discovering some other environment variable that changes tool output.

    Well, there can be a testable, automatable standard, and a higher,
    aspirational standard in parallel.

    Which largely seems consistent with what's already in policy... but I'm
    not sure it's appropriate to codify these whitelists or blacklists in
    policy.


    I'm also unsure that (1) is even what we want to claim. Do we really want
    to say that builds are always reproducible if you don't change this short
    list of environment variables, no matter whatever other environment
    variables you set?

    I don't think we want to make absolute claims; reproducible builds is
    about having greater confidence that the binaries are produced from the
    source, not absolute confidence.

    The ideal is to have as many builds as possible corroborated from a
    diverse group of build machines, developers, third-parties,
    sophisticated end-users, legal jurisdictions, etc.


    There's some appeal in this for the end user, but it
    feels very frustrating for the package maintainer. At first glance, as a
    package maintainer, I'd think I'd have to maintain a huge blacklist of
    environment variables that I've discovered affect my toolchain somewhere,
    and explicitly unset them all in debian/rules. This doesn't feel like a
    good use of anyone's time (and may actually *break* other,
    non-reproducibility-related things that people want to do with my
    package).

    In practice, for the vast majority of packages in Debian, it is a
    relatively small number of environment variables to get fairly solid
    reproducibility coverage... at least from what we've seen so far.

    The hard part is actually continuing to tease them out...


    live well,
    vagrant

  • From Ximin Luo@21:1/5 to All on Tue Sep 19 12:20:01 2017
    XPost: linux.debian.bugs.dist

    Simon McVittie:
    On Mon, 18 Sep 2017 at 18:00:51 -0700, Vagrant Cascadian wrote:
    [..]

    I consider unintended variables that affect the build output a bug, and
    variables designed and intended to change the behavior of the toolchain
    expected, reasonable behavior.

    There is a *huge* number of variables that are intended to change
    behaviour, and may or may not affect the behaviour of this specific
    package. Which of your categories are these in?

    For example, basically any well-behaved programming language or
    programming-language-like environment has an equivalent of PYTHONPATH,
    PERL5LIB, PKG_CONFIG_PATH and similar variables, [..]

    Similarly, there is an intractably huge number of environment variables
    that can affect the result of Automake and make. Do you know about all
    of them? Including RM, PC, AR, LOADLIBES (and those are just for make's
    implicit rules)? [..]


    I agree with this and this matches my own thoughts back in:

    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=844431#324
    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=844431#369

    I think the assumption has to be that every environment variable is
    potentially intended to affect the build unless otherwise stated [..]
    [..] It would be most useful if we were to identify a
    restricted subset of environment variables for which there is consensus
    that the variable is meant to be merely user preference and shouldn't
    affect the build [..]

    Perhaps those variables should be a whitelist, or perhaps there is
    some wording for Policy that would identify them while excluding the
    legitimately build-affecting ones - but either way I think the
    assumption should be "there is a limited subset of environment
    variables that are required to preserve reproducibility when varied,
    and the rest are uninteresting".


    These variables shouldn't be a whitelist because different buildsystems
    can invent their own variables to affect themselves all the time. We
    can't really "predict" something like PERL5LIB.

    However, neither should it be a blacklist, because different run-time
    programs invent their own variables all the time to affect themselves,
    but in a way that really should not affect build processes. I have to set
    LANG=XX.YY in my user environment; that doesn't mean that all my builds
    should run differently from those of people in other countries.

    Therefore, I think it is better to try to reach some wording for Policy
    that communicates *intent*. Then, tools like dpkg-buildflags can have
    their own envvars that they force-set, which would be a subset of the
    ones allowed by Policy. Tools like reprotest can vary certain envvars
    that "obviously" shouldn't affect the build, like LC_ALL, USER, etc. Then
    in the middle there will be certain variables like RM and AR that could
    affect the build, which should be clear from the Policy wording, but that
    are too cumbersome for dpkg-buildpackage to enumerate in a full whitelist
    and force-set to a fixed value.

    Interpreter variables like PERL5LIB and PYTHONPATH we would have to
    assume fall in the first category ("they are allowed to affect the build
    output") even though arguably they are also "run-time variables", because
    they are very tied to the interpreter and probably only developers really
    want to set them for specific purposes.

    So let's throw some wording out there already. To quote my earlier proposal:

    I would suggest amending:

    - a set of environment variable values; and
    + a set of reserved environment variable values; and

    then later:

    + A "reserved" environment variable is defined as DEB_*, DPKG_*,
      SOURCE_DATE_EPOCH, BUILD_PATH_PREFIX_MAP, variables listed by
      dpkg-buildflags and other variables explicitly used by buildsystems to
      affect build output, excluding any variables used by non-build programs
      to affect their behaviour. Explicitly, this excludes TERM, HOME,
      LOGNAME, USER [..]
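
    The enumerable part of that wording could be checked mechanically along
    the following lines. This is a sketch only: the function name is
    hypothetical, the dpkg-buildflags entries are an illustrative subset
    rather than the tool's full list, and the open-ended clause ("other
    variables explicitly used by buildsystems") cannot be captured by a
    fixed table, so a False result here does not prove a variable is
    unreserved.

```python
from fnmatch import fnmatchcase

# Names and patterns taken directly from the proposed wording.
RESERVED_PATTERNS = ("DEB_*", "DPKG_*", "SOURCE_DATE_EPOCH", "BUILD_PATH_PREFIX_MAP")
# Assumption: an illustrative subset of the variables dpkg-buildflags lists.
BUILDFLAGS_VARIABLES = frozenset({"CFLAGS", "CPPFLAGS", "CXXFLAGS", "LDFLAGS"})
# The variables the wording explicitly excludes.
EXPLICITLY_EXCLUDED = frozenset({"TERM", "HOME", "LOGNAME", "USER"})


def is_listed_reserved(name: str) -> bool:
    """Return True if name matches the enumerable part of the definition."""
    if name in EXPLICITLY_EXCLUDED:
        return False
    if name in BUILDFLAGS_VARIABLES:
        return True
    # Case-sensitive glob match, since environment variable names are
    # case-sensitive.
    return any(fnmatchcase(name, pattern) for pattern in RESERVED_PATTERNS)
```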

    (The last time I erroneously included PATH in the final "excluded" list -
    because we have varied PATH but in a really trivial way on tests.r-b.org
    for ages - but I now agree with you that we shouldn't expect
    reproducibility when PATH is varied.)

    My reasoning, as echoed by others on this thread already, was:

    some other variables are used by non-build tools, such as LC_*, USER,
    etc. Since they affect non-build programs, they possibly may be set in a
    developer's normal environment, so just running "debian/rules build" will
    pick these up. Then, the build should stay the same despite these other
    variables.

    X

    --
    GPG: ed25519/56034877E1F87C35
    GPG: rsa4096/1318EFAC5FBBDBCE
    https://github.com/infinity0/pubkeys.git

  • From Ximin Luo@21:1/5 to All on Tue Sep 19 12:50:02 2017
    XPost: linux.debian.bugs.dist

    Ximin Luo:
    [..]

    (The last time I erroneously included PATH in the final "excluded" list -
    because we have varied PATH but in a really trivial way on tests.r-b.org
    for ages - but I now agree with you that we shouldn't expect
    reproducibility when PATH is varied.)


    Actually thinking about it a bit more, the PATH point is a little subtle.
    It's certainly the case that (case a:) if I set PATH=/a vs PATH=/b and
    the files installed underneath /a and /b are *different*, then the build
    output must be "allowed" to vary since it will of course vary.

    However if (case b:) I set PATH=/a vs PATH=/b but the files underneath
    those paths are exactly the same, then is it right that the build output
    still is allowed to vary - e.g. might embed the strings "/a" vs "/b" in
    them? I think this subtlety is why we had been varying PATH for ages on
    tests.r-b.org, because we were confusing the two questions together.

    A strict interpretation of reproducibility might say "(a) can vary"
    except "(b) remains fixed", and vary PATH after checking that the
    different values still point to exactly the same file-trees.

    However for the purposes of this thread, to simplify the discussion, I
    think we can for now say "(a) can vary, including (b)", so that we can
    ignore the semantics of PATH variables. Once we solve the other issues we
    could revisit this one.
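
    For the strict interpretation of case (b), the "same file-trees" check
    could look roughly like this. It is a sketch under simplifying
    assumptions: each PATH value is a single directory, only relative file
    names and file contents are compared (permissions, ownership and symlink
    targets are ignored), and the function names are hypothetical.

```python
import hashlib
import os


def tree_digest(root: str) -> str:
    """Digest of the relative paths and contents of all files under root."""
    digest = hashlib.sha256()
    for directory, subdirs, files in os.walk(root):
        subdirs.sort()  # sort in place so the traversal order is deterministic
        for filename in sorted(files):
            path = os.path.join(directory, filename)
            relative = os.path.relpath(path, root)
            digest.update(relative.encode("utf-8", "surrogateescape") + b"\0")
            with open(path, "rb") as handle:
                digest.update(hashlib.sha256(handle.read()).digest())
    return digest.hexdigest()


def same_file_trees(path_a: str, path_b: str) -> bool:
    """True if both directories hold identical files at identical relative paths."""
    return tree_digest(path_a) == tree_digest(path_b)
```

    Only when same_file_trees returns True would a test harness then vary
    PATH between the two directories and expect identical build output.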

    X

    --
    GPG: ed25519/56034877E1F87C35
    GPG: rsa4096/1318EFAC5FBBDBCE
    https://github.com/infinity0/pubkeys.git

  • From Simon McVittie@21:1/5 to Vagrant Cascadian on Tue Sep 19 11:10:02 2017
    XPost: linux.debian.bugs.dist

    (Re-sending this to the bug rather than to debian-policy, sorry for the
    duplicate on -policy)

    On Mon, 18 Sep 2017 at 18:00:51 -0700, Vagrant Cascadian wrote:
    There is a huge difference between variables that *might* affect the
    build as an unintended input that gets stored in the resulting packages
    in some manner, and variables that are designed to change the behavior of
    parts of the build toolchain.

    I consider unintended variables that affect the build output a bug, and
    variables designed and intended to change the behavior of the toolchain
    expected, reasonable behavior.

    There is a *huge* number of variables that are intended to change
    behaviour, and may or may not affect the behaviour of this specific
    package. Which of your categories are these in?

    For example, basically any well-behaved programming language or
    programming-language-like environment has an equivalent of PYTHONPATH,
    PERL5LIB, PKG_CONFIG_PATH and similar variables, which will pull in
    arbitrary code (perhaps from /opt or ~/.local or something) with
    arbitrary behaviour; and increasingly many tools respect the XDG Base
    Directory spec, so XDG_DATA_HOME and XDG_DATA_DIRS provide a search path
    for things normally found in /usr. I don't think it's desirable for
    maintainers to feel that every debian/rules needs to start with something
    like this (note this list is very incomplete, I could list dozens like
    this without trying very hard):

    undefine GI_TYPELIB_PATH
    undefine LD_LIBRARY_PATH
    undefine PERL5LIB
    undefine PKG_CONFIG_PATH
    undefine PYTHONPATH
    export PATH = /usr/bin:/usr/sbin:/bin:/sbin
    export XDG_DATA_DIRS = /usr/share

    ... and indeed if a maintainer did that, that would make it needlessly
    difficult for another maintainer to test-build their package against
    their new version of Perl or Python or GObject-Introspection or whatever.

    Similarly, there is an intractably huge number of environment variables
    that can affect the result of Automake and make. Do you know about all
    of them? Including RM, PC, AR, LOADLIBES (and those are just for make's
    implicit rules)?

    I think the assumption has to be that every environment variable is
    potentially intended to affect the build unless otherwise stated,
    because the set of environment variables that *could* affect the build
    is extremely large. It would be most useful if we were to identify a
    restricted subset of environment variables for which there is consensus
    that the variable is meant to be merely user preference and shouldn't
    affect the build - even better if there's some document like the devref
    that lists whether it is more appropriate for a package maintainer to
    unset each of those variables or reset them to some initial value if
    they become a problem.

    Perhaps those variables should be a whitelist, or perhaps there is
    some wording for Policy that would identify them while excluding the
    legitimately build-affecting ones - but either way I think the
    assumption should be "there is a limited subset of environment
    variables that are required to preserve reproducibility when varied,
    and the rest are uninteresting".

    The environment variables that are not sanitised by debuild
    might be an interesting starting point for classification -
    we know that for debuild users, the rest do not matter in
    practice. Dpkg::Build::Info::get_build_env_whitelist() is probably
    another interesting set (particularly since it's used by recent sbuild).

    In practice, for the vast majority of packages in Debian, it is a
    relatively small number of environment variables to get fairly solid
    reproducibility coverage... at least from what we've seen so far.

    Set PERL5LIB to a location containing libraries that change gtk-doc
    behaviour. All packages that use gtk-doc in buster (where gtk-doc was
    written in Perl) are now unreproducible.

    Set PYTHONPATH to a location containing libraries that change gtk-doc
    behaviour. All packages that use gtk-doc in sid (where gtk-doc was
    translated into Python) are now unreproducible.

    Set LD_LIBRARY_PATH or LD_PRELOAD to interpose libc functions and change
    their behaviour. Basically everything is now unreproducible.

    Set SHELL to a non-POSIX shell. Basically everything is potentially now
    unreproducible.

    I don't think trying to address those is a route that Debian should go
    down.

    Regards,
    smcv

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ximin Luo@21:1/5 to All on Tue Sep 19 13:30:02 2017
    XPost: linux.debian.bugs.dist

    Russ Allbery:
    [..] It does mean that discovery of any new
    such environment variable would require a change to our whitelist in
    approach (3), so there would be some lag and the whitelist would become
    long over time (with a corresponding testing load). But (3) does try to
    achieve that use case without trying to anticipate any possible
    environment variable setting. It lets us be reactive to newly-discovered
    environment variables across which we want to stay reproducible.


    I can also see the merits of your suggestion (3), but I don't think it would be appropriate to hard-code the list in Policy: it would be too hard to change, and people might end up relying on a very incomplete list and then doing stupid stuff that runs counter to the original intention of the discussions around the policy. It would be better to find a generic wording (with some examples) similar to what I suggested elsewhere.

    Does everything in Policy need to be rigorously testable? Or is it ok
    to have Policy state the desired outcome even if we don't know how (or
    don't have the resources) to test it fully today?

    I don't think everything has to be rigorously testable, but I do think
    it's a useful canary. If I can't test something, I start wondering
    whether that means I have problems with my underlying assumptions.

    [..]

    The "strict" interpretation is in principle testable though - we just have to collect enough environment variables and decide which category they fall under, and add that logic to our build tools.

    I think in these early days, it would be fine for public package builders and reproducibility testers to do (3) as you suggested, i.e.

    - clean the environment
    - set certain variables to a fixed value (the "whitelist") and record these in buildinfo
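
    The two steps above can be sketched in a few lines of Python. This is only an illustration, not what any existing builder does: the names FIXED_ENV, clean_build_env and buildinfo_environment are hypothetical, the variables and values in the whitelist are placeholders, and the output only roughly follows the style of the Environment field in a .buildinfo file.

```python
# Hypothetical whitelist: variable name -> fixed value used for every build.
# The entries here are placeholders, not an agreed list.
FIXED_ENV = {
    "LC_ALL": "C.UTF-8",
    "PATH": "/usr/sbin:/usr/bin:/sbin:/bin",
    "TZ": "UTC",
}


def clean_build_env(fixed=FIXED_ENV):
    """Return a build environment holding only the whitelisted variables.

    The result is built from scratch rather than copied from os.environ,
    so nothing from the caller's environment can leak into the build.
    """
    return dict(fixed)


def buildinfo_environment(env):
    """Format env roughly in the style of a buildinfo Environment field."""
    lines = ["Environment:"]
    for name in sorted(env):
        # Escape backslashes first, then double quotes.
        value = env[name].replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f' {name}="{value}"')
    return "\n".join(lines)


if __name__ == "__main__":
    print(buildinfo_environment(clean_build_env()))
```

    The dictionary returned by clean_build_env() could then be passed as the env argument of subprocess.run() when starting the build, so that the recorded environment and the one actually used are the same object.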

    This "loose" interpretation of reproducibility still gives us some useful results, as well as testable reproducibility for end users, but as I said I don't think this should be Policy, since the whitelist should be expanding quite quickly, especially early on.

    OTOH, developer reproducibility checkers (such as reprotest) can be a little bit more strict. I can imagine something like:

    - reprotest runs 3 builds:
      - build 0 with current env
      - build 1 with current env + varying some "blacklist" envvars
      - build 2 with current env + varying some "non-whitelist" envvars

    If there are differences between build 1 and build 2, then reprotest reports "unexpected envvar $XXX affected the build" and the developer can then either submit it for inclusion on the "whitelist" or the "blacklist" based on the Policy wording. If it ends up on the blacklist then they would also have to fix their own package to be invariant under that envvar.

    So over time, this way we can build up a blacklist and a whitelist. But it shouldn't be in the original policy. And I don't think what I suggested above is a particularly disruptive or surprising process, especially since the "public" builders would only do the "looser" interpretation so people aren't bothered by bogus "unreproducible" reports.
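
    As a rough illustration of the three-build idea, here is a minimal Python sketch. It is not reprotest code: the function names are hypothetical, the build is modelled as a plain callable from an environment dict to output bytes, and each experiment is simply compared against the control build, which is only one possible way to do the comparison.

```python
def vary(env, names, value="varied-by-checker"):
    """Return a copy of env with each variable in names set to value."""
    varied = dict(env)
    for name in names:
        varied[name] = value
    return varied


def check_env_reproducibility(build, base_env, blacklist, unknown):
    """Run three builds and report which experiments changed the output.

    build     -- callable taking an environment dict, returning output bytes
    blacklist -- variables the package is expected to be invariant under
    unknown   -- made-up variables on neither list ("non-whitelist")
    """
    control = build(base_env)                        # build 0
    with_blacklist = build(vary(base_env, blacklist))  # build 1
    with_unknown = build(vary(base_env, unknown))      # build 2

    problems = []
    if with_blacklist != control:
        problems.append("blacklist")
    if with_unknown != control:
        problems.append("non-whitelist")
    return problems


if __name__ == "__main__":
    # Toy "build" whose output depends on LANG only.
    def toy_build(env):
        return env.get("LANG", "").encode()

    base = {"LANG": "C", "PATH": "/usr/bin"}
    print(check_env_reproducibility(toy_build, base, ["LANG"], ["UNKNOWN_1"]))
```

    An empty result means neither experiment changed the output. A "blacklist" entry means the package itself needs fixing, while a "non-whitelist" entry means the variable involved still has to be classified.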

    X

    --
    GPG: ed25519/56034877E1F87C35
    GPG: rsa4096/1318EFAC5FBBDBCE
    https://github.com/infinity0/pubkeys.git

  • From Ximin Luo@21:1/5 to All on Mon Oct 2 23:20:03 2017
    XPost: linux.debian.bugs.dist

    Ximin Luo:
    [..]

    OTOH, developer reproducibility checkers (such as reprotest) can be a little bit more strict. I can imagine something like:

    - reprotest runs 3 builds:
      - build 0 with current env
      - build 1 with current env + varying some "blacklist" envvars
      - build 2 with current env + varying some "non-whitelist" envvars

    If there are differences between build 1 and build 2, then reprotest reports "unexpected envvar $XXX affected the build" and the developer can then either submit it for inclusion on the "whitelist" or the "blacklist" based on the Policy wording. If it ends up on the blacklist then they would also have to fix their own package to be invariant under that envvar.

    So over time, this way we can build up a blacklist and a whitelist. But it shouldn't be in the original policy. And I don't think what I suggested above is a particularly disruptive or surprising process, especially since the "public" builders would only do the "looser" interpretation so people aren't bothered by bogus "unreproducible" reports.


    I've implemented this in reprotest here in the "env-build" branch: https://anonscm.debian.org/cgit/reproducible/reprotest.git/log/?h=env-build

    It requires the python3-rstr package which is currently in NEW or you can get it here:
    https://people.debian.org/~infinity0/apt/pool/main/p/python-rstr/

    Run it like this:

    $ PYTHONPATH=$PWD python3 -m reprotest --env-build 'env > out || true' out
    [..]
    --- /tmp/tmp1ujyb3xp/control
    +++ /tmp/tmp1ujyb3xp/experiment-blacklist
    ├── source-root
    │ ├── out
    │ │ @@ -1,57 +1,47 @@
    [.. big diff ..]
    Unreproducible even when varying blacklisted envvars: BROWSER, CLUTTER_IM_MODULE, COLORTERM, COLUMNS, DATEMSK, DBUS_SESSION_BUS_ADDRESS, [..] ftp_proxy, http_proxy, https_proxy
    This may or may not be caused by other factors; try re-running this again with --vary=-all
    # exit code 1

    $ PYTHONPATH=$PWD python3 -m reprotest --env-build 'env | grep UNKNOWN > out || true' out
    [..]
    --- /tmp/tmp2m24l442/control
    +++ /tmp/tmp2m24l442/experiment-non-whitelist
    ├── source-root
    │ ├── out
    │ │ @@ -0,0 +1,10 @@
    │ │ +00000000: 5245 5052 4f54 4553 545f 4341 5054 5552 REPROTEST_CAPTUR
    │ │ +00000010: 455f 454e 5649 524f 4e4d 454e 545f 554e E_ENVIRONMENT_UN
    │ │ +00000020: 4b4e 4f57 4e5f 314b 6254 4a76 6362 6749 KNOWN_1KbTJvcbgI
    │ │ +00000030: 464a 7661 394a 364d 6762 417a 7a57 5377 FJva9J6MgbAzzWSw
    │ │ +00000040: 5f5