• Bug#844431: Revised patch: seeking seconds

    From Russ Allbery@21:1/5 to Ximin Luo on Wed Aug 16 19:50:02 2017
    XPost: linux.debian.bugs.dist

    Ximin Luo <infinity0@debian.org> writes:

    Fair enough. I actually spotted that but thought it was better to get
    "something" into Policy rather than nitpick. I guess other people were
    thinking similar things. Well, lesson learnt, I will be more forceful
    next time.

    The sentence I amended said "most environment variables" so our intent
    is clear. If we want to fix this now, I would suggest amending:

    - a set of environment variable values; and
    + a set of reserved environment variable values; and

    then later:

    + A "reserved" environment variable is defined as DEB_*, DPKG_*,
    SOURCE_DATE_EPOCH, BUILD_PATH_PREFIX_MAP, variables listed by
    dpkg-buildflags, and other variables explicitly used by buildsystems
    to affect build output, excluding any variables used by non-build
    programs to affect their behaviour. Explicitly, this excludes TERM,
    HOME, LOGNAME, USER, PATH and likely any variables matching *PATH.
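
    The enumerable part of this proposed definition can be read as a
    name-matching rule. A minimal Python sketch follows, assuming the
    wildcard patterns from the text (with DPKG_ read as a prefix). The
    names is_reserved, RESERVED_PATTERNS and EXCLUDED_PATTERNS are
    hypothetical, and letting exclusions win over inclusions is an
    assumption. The open-ended clauses ("variables listed by
    dpkg-buildflags", "other variables explicitly used by buildsystems")
    are left out because they cannot be written as fixed patterns.

```python
from fnmatch import fnmatchcase

# Patterns taken from the proposed wording; DPKG_ is read here as a prefix.
RESERVED_PATTERNS = (
    "DEB_*",
    "DPKG_*",
    "SOURCE_DATE_EPOCH",
    "BUILD_PATH_PREFIX_MAP",
)

# Names the proposal explicitly excludes, plus anything ending in PATH.
EXCLUDED_PATTERNS = ("TERM", "HOME", "LOGNAME", "USER", "*PATH")


def is_reserved(name: str) -> bool:
    """Return True if `name` matches the enumerable part of the definition.

    Assumption: an explicit exclusion wins over a reserved pattern.
    """
    if any(fnmatchcase(name, pattern) for pattern in EXCLUDED_PATTERNS):
        return False
    return any(fnmatchcase(name, pattern) for pattern in RESERVED_PATTERNS)
```

    Under these assumptions DEB_BUILD_OPTIONS is reserved, while USER and
    LD_LIBRARY_PATH are not.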

    We intentionally didn't spell this out in this much detail because it
    felt better to defer this (stricter) bar until we have documentation of
    the *.buildinfo file, and also because we were worried about the list
    changing (once it goes into Policy, it's more irritating to change). The
    current standard in Policy is intentionally weaker than this in order to
    be simpler.

    I still lean towards taking this approach, because I'm pretty worried
    about the scope of:

    other variables explicitly used by buildsystems to affect build output

    That's not really an enumerable list. My recommendation, if you want to
    allow some environment variables to vary without affecting
    reproducibility, is to explicitly list the set of environment variables
    that can vary, rather than trying to list the ones that have to remain
    fixed.
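
    The inverse approach, listing the variables that are allowed to vary,
    could be checked mechanically. A minimal Python sketch, assuming a
    hypothetical allowlist MAY_VARY populated with names mentioned earlier
    in this thread; neither the list nor the function name comes from
    Policy.

```python
# Hypothetical allowlist: variables a build must tolerate being changed.
MAY_VARY = frozenset({"TERM", "HOME", "LOGNAME", "USER"})


def unexpected_differences(env_a: dict, env_b: dict) -> set:
    """Names whose values differ between two environments and are not allowlisted."""
    names = set(env_a) | set(env_b)
    # A variable missing on one side counts as differing.
    differing = {name for name in names if env_a.get(name) != env_b.get(name)}
    return differing - MAY_VARY
```

    Two builds would then only be expected to match when this function
    returns an empty set for their environments.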

    But, more fundamentally, I'm dubious that weakening the environment
    variable set is a good use of anyone's time. Why not define reproducible
    builds as setting a specific set of environment variables and no others?
    We're long past the point where building packages in an isolated
    environment with a fixed set of environment variables is a great hardship
    or even particularly unusual. I think the effort would be better spent on
    fixing (with enumerated exceptions) the set of environment variables set
    by buildds, sbuild, pbuilder, and other infrastructure that builds
    packages than in making packages tolerate random environment variables
    being set during the build. It's really hard to track down all the
    environment variable settings that might affect Autoconf, the build
    tools, document formatters, and so forth.
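
    Running a build with "a specific set of environment variables and no
    others" might look like the following Python sketch. The contents of
    FIXED_ENV and the name run_in_fixed_env are hypothetical; a real
    builder such as sbuild or pbuilder chooses its own values.

```python
import subprocess

# Hypothetical fixed environment; the values are arbitrary examples.
FIXED_ENV = {
    "PATH": "/usr/sbin:/usr/bin:/sbin:/bin",
    "LC_ALL": "C.UTF-8",
    "SOURCE_DATE_EPOCH": "1502841600",
}


def run_in_fixed_env(cmd):
    """Run `cmd` with FIXED_ENV only.

    Passing env= replaces the child's environment entirely, so nothing
    from the caller's environment (USER, HOME, CFLAGS, ...) is inherited.
    """
    return subprocess.run(
        cmd,
        env=dict(FIXED_ENV),
        capture_output=True,
        text=True,
        check=True,
    )
```

    A call such as run_in_fixed_env(["debian/rules", "build"]) would then
    see the same variables regardless of who invokes it.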

    --
    Russ Allbery (rra@debian.org) <http://www.eyrie.org/~eagle/>

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ximin Luo@21:1/5 to All on Wed Aug 16 21:30:03 2017
    XPost: linux.debian.bugs.dist

    Russ Allbery:
    Ximin Luo <infinity0@debian.org> writes:

    Fair enough. I actually spotted that but thought it was better to get
    "something" into Policy rather than nitpick. I guess other people were
    thinking similar things. Well, lesson learnt, I will be more forceful
    next time.

    The sentence I amended said "most environment variables" so our intent
    is clear. If we want to fix this now, I would suggest amending:

    - a set of environment variable values; and
    + a set of reserved environment variable values; and

    then later:

    + A "reserved" environment variable is defined as DEB_*, DPKG_*,
    SOURCE_DATE_EPOCH, BUILD_PATH_PREFIX_MAP, variables listed by
    dpkg-buildflags, and other variables explicitly used by buildsystems
    to affect build output, excluding any variables used by non-build
    programs to affect their behaviour. Explicitly, this excludes TERM,
    HOME, LOGNAME, USER, PATH and likely any variables matching *PATH.

    We intentionally didn't spell this out in this much detail because it
    felt better to defer this (stricter) bar until we have documentation of
    the *.buildinfo file, and also because we were worried about the list
    changing (once it goes into Policy, it's more irritating to change). The
    current standard in Policy is intentionally weaker than this in order to
    be simpler.

    I still lean towards taking this approach, because I'm pretty worried
    about the scope of:

    other variables explicitly used by buildsystems to affect build output

    That's not really an enumerable list. My recommendation, if you want to allow some environment variables to vary without affecting
    reproducibility, is to explicitly list the set of environment variables
    that can vary, rather than trying to list the ones that have to remain
    fixed.


    Intuitively it feels weird to say "if you vary USER, the output must
    remain fixed", but also "if you vary RANDOMUNIQUESPECIALSNOWFLAKEVARIABLE
    then the output is allowed to change".

    Certain environment variables, like CFLAGS, have by convention come to
    affect a build, and even debuild(1) doesn't clear them, although it
    clears the other envvars. That is what I was going on.

    But, more fundamentally, I'm dubious that weakening the environment
    variable set is a good use of anyone's time. Why not define reproducible
    builds as setting a specific set of environment variables and no others?
    We're long past the point where building packages in an isolated
    environment with a fixed set of environment variables is a great hardship
    or even particularly unusual. I think the effort would be better spent on
    fixing (with enumerated exceptions) the set of environment variables set
    by buildds, sbuild, pbuilder, and other infrastructure that builds
    packages than in making packages tolerate random environment variables
    being set during the build. It's really hard to track down all the
    environment variable settings that might affect Autoconf, the build
    tools, document formatters, and so forth.


    My proposal was the opposite: to *strengthen* the definition that was
    already accepted. I *don't* think we should track down all those
    variables and make packages immune to them; that is why I added "other
    variables explicitly used by buildsystems to affect build output" etc.
    OTOH, some other variables are used by non-build tools, such as LC_*,
    USER, etc. Since they affect non-build programs, they may well be set in
    a developer's normal environment, so just running "debian/rules build"
    will pick these up. The build output should stay the same regardless of
    these other variables.

    If a build tool needs to be run in a specific locale, it should either use a locale-independent sorting program, or set LC_ALL explicitly itself regardless of what the parent environment says.
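
    Both options can be sketched in Python. This is a minimal
    illustration, not any particular build tool's code; the function
    names env_with_pinned_locale and stable_order are hypothetical.

```python
import os


def env_with_pinned_locale(parent_env=None):
    """Copy of the parent environment with LC_ALL forced to "C"."""
    env = dict(os.environ if parent_env is None else parent_env)
    # LC_ALL takes precedence over LANG and the individual LC_* variables,
    # so whatever the parent had set no longer matters to the child.
    env["LC_ALL"] = "C"
    return env


def stable_order(names):
    """Sort strings by code point; sorted() on str does not consult the locale."""
    return sorted(names)
```

    A tool that shells out to a locale-sensitive program would pass
    env_with_pinned_locale() as the child's environment, while a tool that
    only needs a stable ordering can sort in-process.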

    This doesn't stop us from using a fixed or mostly-clean environment in
    sbuild, pbuilder, debuild, etc.

    Now that I think about it however, it's probably not reasonable to
    expect that the output remains the same when PATH is changed. On
    tests.r-b.org we vary it by appending a dummy value [1] but if the user
    adds their own stuff to the beginning then the output may well change.
    There is probably no point in trying to prevent that in all packages. In
    a sense, it does very much affect what build tools are run, even though
    non-build programs also use it. However, my gut feeling still says that
    it's not right for the locale (LC_*) to affect a build process. I will
    try to think of a more precise way to express this difference.
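
    The kind of variation described here, appending a dummy entry to PATH
    and comparing outputs, can be sketched in Python. This is not the
    tests.reproducible-builds.org implementation; the dummy directory and
    all function names are hypothetical.

```python
import hashlib
import os
import subprocess


def with_dummy_path_suffix(env, dummy="/nonexistent/dummy"):
    """Copy of `env` whose PATH has a dummy directory appended, never prepended."""
    varied = dict(env)
    current = varied.get("PATH")
    varied["PATH"] = current + os.pathsep + dummy if current else dummy
    return varied


def output_digest(cmd, env):
    """SHA-256 of the command's stdout when run under `env`."""
    out = subprocess.run(cmd, env=env, capture_output=True, check=True).stdout
    return hashlib.sha256(out).hexdigest()


def same_output_when_path_varies(cmd):
    """True if `cmd` produces identical output with and without the dummy entry."""
    base = dict(os.environ)
    varied = with_dummy_path_suffix(base)
    return output_digest(cmd, base) == output_digest(cmd, varied)
```

    Appending leaves command lookup unchanged for anything found earlier
    in PATH, which is why this variation is milder than a user prepending
    their own directories.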

    X

    [1] https://tests.reproducible-builds.org/debian/index_variations.html

    --
    GPG: ed25519/56034877E1F87C35
    GPG: rsa4096/1318EFAC5FBBDBCE
    https://github.com/infinity0/pubkeys.git

  • From Chris Lamb@21:1/5 to All on Thu Aug 17 03:50:02 2017
    XPost: linux.debian.bugs.dist

    Hi Bill,

    Now compare with reproducible build. You get some error report you
    cannot reproduce, do some change following the help provided and
    hope for the best. Then some day later you get the same error
    report.

    I'd dearly love to know when/where this occurred if you can provide a reference.

    This is not our, and certainly not my own, intention when filing reproducibility-related bugs, which always include a well-intentioned
    patch.


    Best wishes,

    --
    ,''`.
    : :' : Chris Lamb
    `. `'` lamby@debian.org / chris-lamb.co.uk
    `-

  • From Bill Allombert@21:1/5 to Chris Lamb on Sun Aug 20 22:30:01 2017
    XPost: linux.debian.bugs.dist

    On Wed, Aug 16, 2017 at 05:40:23PM -0700, Chris Lamb wrote:
    Hi Bill,

    Now compare with reproducible build. You get some error report you
    cannot reproduce, do some change following the help provided and
    hope for the best. Then some day later you get the same error
    report.

    I'd dearly love to know when/where this occurred if you can provide a reference.

    This happens for errors listed on the reproducible-builds.org website.
    I am not talking about bug reports here.

    This is not our, and certainly not my own, intention when filing reproducibility-related bugs, which always include a well-intentioned
    patch.

    I know from experience that you and the reproducible-builds team file
    excellent bug reports with good patches. That is not the issue, but you
    do not need policy to continue doing that.

    However, are maintainers not expected to make their packages
    policy-compliant without waiting for a bug report?
    Are maintainers not supposed to be proactive and try to fix issues
    they become aware of without waiting for someone to file a bug?
    Are users not allowed to file bugs when packages do not seem to comply
    with documented expectations?

    As I said, if this policy is only meant to be a vehicle for the
    reproducible-builds team, then it is fine by me. However, if it is
    meant for a general audience, then it is premature.
    It would be best to focus first on requiring all generators to be
    deterministic. Once this step is reached (we are already close thanks to
    the reproducible-builds team's work), it will be much easier for
    maintainers to deal with reproducibility issues because they will not
    need workarounds.

    Cheers,
    --
    Bill. <ballombe@debian.org>

    Imagine a large red swirl here.

  • From Chris Lamb@21:1/5 to All on Mon Aug 21 03:40:01 2017
    XPost: linux.debian.bugs.dist

    Bill,

    Now compare with reproducible build. You get some error report you
    cannot reproduce, do some change following the help provided and
    hope for the best. Then some day later you get the same error
    report.

    I'd dearly love to know when/where this occurred if you can provide a reference.

    This happens for errors listed on the reproducible-builds.org website.
    I am not talking about bug reports here.

    Oh, I see. I was interpreting your "same error report" and similar
    phrasings to mean some active, human-driven notification and interaction
    such as a bug report or private mail, rather than something cold and
    automated from a not — yet! — perfect CI framework.

    Thank you for clarifying, but please do also bear with us whilst we
    improve the reliability of the jenkins.debian.org results.


    Best wishes,

    --
    ,''`.
    : :' : Chris Lamb
    `. `'` lamby@debian.org / chris-lamb.co.uk
    `-
