Forum: >>> Magnum BBS <<<

Bug#876055: Environment variable handling for reproducible builds

From Russ Allbery@21:1/5 to All on Mon Sep 18 02:30:01 2017

XPost: linux.debian.bugs.dist

Package: debian-policy
Version: 4.1.0.0
Severity: normal

Currently, Debian Policy requires all environment variables be held the
same across builds for the build to be expected to be reproducible.
However, the current approach of some reproducible build tools is to
instead enumerate a set of fixed environment variables and allow other
variables to vary.

We should ideally converge on a single approach to environment variables
and build reproducibility and make it easy for tools to implement that
approach.

I think the alternatives are:

1. Enumerate environment variables to hold fixed. This is better in
the sense that it allows packages to be reproducible under more
situations, but it's unstable in the sense that we'll never be able to
enumerate all environment variables that might possibly affect the
build. It's also not testable in the sense that we can't set every
possible environment variable.

2. Set the entire environment to the environment specified in buildinfo
when doing a reproducible build. I think this is conceptually the
simplest, but it means that we should make every tool that builds
official Debian packages use the same environment variable logic so
that the buildinfo file completely captures the environment (without
leaking random, inappropriate things into buildinfo). It also means
effectively giving up on debian/rules build being a path for making a
reproducible build, since we don't have control over that environment,
but I think it will be hard to make that work anyway.

3. List a set of environment variables that are permitted to vary in the
reproducible build policy, and then have reproducible builds clean the
environment except for that set and then apply the buildinfo environment
variable set. This is very similar to 2. I think the primary advantage
is that it lets us require packages build reproducibly in the presence
of some settings that logically should not affect the build (USER, HOME,
etc.), at the cost of making reproducible builds harder to achieve.
It's mostly testable, in that one can try reproducible builds with
various settings for those variables, although it would be hard to catch
corner cases where only a specific setting causes issues.
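
To make the difference between the three alternatives concrete, here is
a minimal Python sketch of how a build tool might construct the build
environment under each one. The variable sets, values, and function
names are hypothetical and chosen only for illustration; none of them
come from Policy, dpkg, or any existing tool.

```python
# Hypothetical example sets; not taken from Policy or any real tool.
FIXED_VARS = {"LC_ALL": "C.UTF-8", "TZ": "UTC"}  # pinned under approach 1
MAY_VARY = {"USER", "HOME"}                      # allowed to vary under approach 3


def env_fixed_list(current):
    """Approach 1: pin an enumerated set, let everything else through."""
    env = dict(current)
    env.update(FIXED_VARS)
    return env


def env_from_buildinfo(buildinfo_env):
    """Approach 2: use exactly the recorded environment and nothing else."""
    return dict(buildinfo_env)


def env_clean_except(current, buildinfo_env):
    """Approach 3: keep only the may-vary set from the caller, then apply
    the recorded environment on top."""
    env = {k: v for k, v in current.items() if k in MAY_VARY}
    env.update(buildinfo_env)
    return env
```

Under approach 1 an unknown variable such as FOO=1 reaches the build
untouched; under approaches 2 and 3 it is dropped, and only approach 3
lets the caller's own USER and HOME through.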

I personally lean towards 2, which is consistent with what's in Policy
right now, but I can see definite merits in 3. I believe the
reproducible builds project is currently sort of doing 1, but I have a
hard time seeing how to make that viable on the testing side.

-- System Information:
Debian Release: buster/sid
APT prefers unstable
APT policy: (990, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.12.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages debian-policy depends on:
ii libjs-sphinxdoc 1.6.3-2

debian-policy recommends no packages.

Versions of packages debian-policy suggests:
pn doc-base <none>

-- no debconf information

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Paul Sherwood@21:1/5 to Russ Allbery on Mon Sep 18 10:30:02 2017

On 2017-09-18 00:26, Russ Allbery wrote:

2. Set the entire environment to the environment specified in buildinfo
when doing a reproducible build. I think this is conceptually the
simplest, but it means that we should make every tool that builds
official Debian packages use the same environment variable logic so
that the buildinfo file completely captures the environment (without
leaking random, inappropriate things into buildinfo). It also means
effectively giving up on debian/rules build being a path for making a
reproducible build, since we don't have control over that environment,
but I think it will be hard to make that work anyway.

FWIW this is the approach we've taken on both of the Baserock build
tools, and for BuildStream [1].

Given that it's trivially easy for a build script to try to call out to
the internet (eg fetch tarball, git clone), or look for custom
environment variables, we think it's clearly safest to put everything in
a sandbox and be explicit about resources, network and environment
variables.

br
Paul

[1] https://wiki.gnome.org/Projects/BuildStream/

From Daniel Kahn Gillmor@21:1/5 to Russ Allbery on Tue Sep 19 01:10:02 2017

On Sun 2017-09-17 16:26:25 -0700, Russ Allbery wrote:

I personally lean towards 2, which is consistent with what's in Policy
right now, but I can see definite merits in 3. I believe the
reproducible builds project is currently sort of doing 1, but I have a
hard time seeing how to make that viable on the testing side.

Thanks for raising this question, Russ!

I'm not sure that we should let lack of exhaustive testing push us away
from (1). (1) is in principle the right thing -- it's easy to make a
build reproducible if we tell people that they have to do exactly one
specific thing. But we generally want people to be able to run
heterogenous systems, and not to force them into one particular
environment.

Consider someone who wants to see more logging from a build, for
example. There could be an environment variable that encourages the
toolchain to log more, but doesn't affect the binary objects created by
the build. By going with choices (2) or (3) we effectively dismiss even
considering the reproducibility of those builds, which seems like a
shame.

Does everything in policy need to be rigorously testable? or is it ok
to have Policy state the desired outcome even if we don't know how (or
don't have the resources) to test it fully today.

I'd prefer for policy to be able to make strong advisory statements even
without us being able to test them mechanically. This is already the
case for (for example) "preferred form of modification" -- it's partly
testable, but will never be 100% testable, and will always require
research and discussion and thinking for the corner cases. Yet we
continue to aim for it.

Policy should be aiming high, not lowering the bar to meet what's
concretely testable.

--dkg

From Russ Allbery@21:1/5 to Daniel Kahn Gillmor on Tue Sep 19 01:50:01 2017

Daniel Kahn Gillmor <dkg@fifthhorseman.net> writes:

On Sun 2017-09-17 16:26:25 -0700, Russ Allbery wrote:

I personally lean towards 2, which is consistent with what's in Policy
right now, but I can see definite merits in 3. I believe the
reproducible builds project is currently sort of doing 1, but I have a
hard time seeing how to make that viable on the testing side.

Thanks for raising this question, Russ!

I'm not sure that we should let lack of exhaustive testing push us away
from (1). (1) is in principle the right thing -- it's easy to make a
build reproducible if we tell people that they have to do exactly one
specific thing. But we generally want people to be able to run
heterogenous systems, and not to force them into one particular
environment.

Well... I would argue that the amount of time and effort that's gone into
this project shows that it's not that easy to make a build reproducible
even when telling people to do exactly one thing. :) But I get your
point.

Consider someone who wants to see more logging from a build, for
example. There could be an environment variable that encourages the
toolchain to log more, but doesn't affect the binary objects created by
the build. By going with choices (2) or (3) we effectively dismiss even
considering the reproducibility of those builds, which seems like a
shame.

This is the case for (2), but not for (3). Indeed, this is exactly the
distinction between (2) and (3). It does mean that discovery of any new
such environment variable would require a change to our whitelist in
approach (3), so there would be some lag and the whitelist would become
long over time (with a corresponding testing load). But (3) does try to
achieve that use case without trying to anticipate any possible
environment variable setting. It lets us be reactive to newly-discovered
environment variables across which we want to stay reproducible.

Does everything in policy need to be rigorously testable? or is it ok
to have Policy state the desired outcome even if we don't know how (or
don't have the resources) to test it fully today.

I don't think everything has to be rigorously testable, but I do think
it's a useful canary. If I can't test something, I start wondering
whether that means I have problems with my underlying assumptions.

In particular, for (1), we have no comprehensive list of environment
variables that affect the behavior of tools, and that list would be
difficult to create. Many pieces of software add their own environment
variables with little coordination, and many of those variables could
possibly affect tool output.

I feel like the work for (1) and for (3) ends up being comparable; for (1)
we have to maintain a blacklist, and for (3) we have to maintain a
whitelist. But (3) is testable, whereas (1) is inherently aspirational
and will always have to be aspirational. We're endlessly going to be
discovering some other environment variable that changes tool output.

I'm also unsure that (1) is even what we want to claim. Do we really want
to say that builds are always reproducible if you don't change this short
list of environment variables, no matter whatever other environment
variables you set? There's some appeal in this for the end user, but it
feels very frustrating for the package maintainer. At first glance, as a
package maintainer, I'd think I'd have to maintain a huge blacklist of
environment variables that I've discovered affect my toolchain somewhere,
and explicitly unset them all in debian/rules. This doesn't feel like a
good use of anyone's time (and may actually *break* other,
non-reproducibility-related things that people want to do with my
package).

--
Russ Allbery (rra@debian.org) <http://www.eyrie.org/~eagle/>

From Vagrant Cascadian@21:1/5 to Vagrant Cascadian on Tue Sep 19 06:40:02 2017

On 2017-09-18, Vagrant Cascadian wrote:

On 2017-09-18, Russ Allbery wrote:

Daniel Kahn Gillmor <dkg@fifthhorseman.net> writes:

On Sun 2017-09-17 16:26:25 -0700, Russ Allbery wrote:

Does everything in policy need to be rigorously testable? or is it ok
to have Policy state the desired outcome even if we don't know how (or
don't have the resources) to test it fully today.

I don't think everything has to be rigorously testable, but I do think
it's a useful canary. If I can't test something, I start wondering
whether that means I have problems with my underlying assumptions.

In particular, for (1), we have no comprehensive list of environment
variables that affect the behavior of tools, and that list would be
difficult to create. Many pieces of software add their own environment
variables with little coordination, and many of those variables could
possibly affect tool output.

There is a huge difference between variables that *might* affect the
build as an unintended input that gets stored in the resulting packages
in some manner, and variables that are designed to change the behavior
of parts of the build toolchain.

I consider unintended variables that affect the build output a bug, and
variables designed and intended to change the behavior of the toolchain
expected, reasonable behavior.

Ok, after discussing on IRC a bit, I figured it might be worth expanding
on that point a bit...

The environment variables (and other variations) used by the
reproducible builds test infrastructure:

https://tests.reproducible-builds.org/debian/index_variations.html

I'll try and summarize the rationale for each of the variables used,
many of which have had actual impacts on the result of the builds:

CAPTURE_ENVIRONMENT, BUILDUSERID, BUILDUSERNAME

Some builds capture the entire environment, or most of the environment;
setting arbitrary environment variables can help detect this.

TZ

The timezone used can change the results of embedded timestamps.

LANG, LANGUAGE, LC_ALL

The locale and language settings definitely change the strings embedded
in some binaries, if tool output is translated.

PATH, USER, HOME

Some builds embed these.

DEB_BUILD_OPTIONS=parallel=N

The level of parallelism can change the build output, although other
values in DEB_BUILD_OPTIONS might be reasonably expected to change
output (e.g. noautodbgsym).

None of the above variables should change the resulting built package,
with the possible exception of some other values of DEB_BUILD_OPTIONS.

On the other hand, I would expect variables such as CC, MAKE,
CROSS_COMPILE, CFLAGS, etc. to reasonably and likely change the result
of the built package. They are, in a sense, part of the build toolchain
environment.
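
As a rough illustration of how a test harness might vary the first
class of variables while leaving the toolchain variables alone, here is
a small Python sketch. The specific values are made up for this example
and are not the ones used on tests.reproducible-builds.org; a value of
None stands for "leave the variable unset".

```python
# Hypothetical variation values, loosely modelled on the list above.
# Each entry maps a variable to its (first build, second build) value.
VARIATIONS = {
    "TZ": ("UTC", "Pacific/Kiritimati"),
    "LANG": ("C", "fr_CH.UTF-8"),
    "USER": ("builder1", "builder2"),
    "HOME": ("/nonexistent/first", "/nonexistent/second"),
    "CAPTURE_ENVIRONMENT": (None, "should not be captured"),
}


def varied_environments(base):
    """Return two environments that differ only in VARIATIONS.

    Everything else in base (CC, CFLAGS, ...) is copied unchanged into
    both, since those variables are expected to affect the build.
    """
    first, second = dict(base), dict(base)
    for name, (a, b) in VARIATIONS.items():
        for env, value in ((first, a), (second, b)):
            if value is None:
                env.pop(name, None)
            else:
                env[name] = value
    return first, second
```

If the two builds produce different packages, one of the varied
variables leaked into the output, which is the "unintended input" case.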

Without generating comprehensive blacklists and/or whitelists, is it
plausible to come up with a policy description of the above two classes
of variables? Given the above lists, it seems relatively obvious to me
that there are basically two classes of variables, but I'm at a loss for
how to really describe it in policy.

You could give a reasonable test of:

Is this variable intended to change the results of the binary, or is
it changing the build as an unintended side-effect?

That does require reasoned interpretation, though. I envision such tests
being used in bug reports relating to reproducibility issues, on a
case-by-case basis.

It doesn't solve the testability issue on a policy level, but that could
possibly be addressed outside of policy through best practices for
reproducibility documentation.

live well,
vagrant

From Vagrant Cascadian@21:1/5 to Russ Allbery on Tue Sep 19 04:10:02 2017

On 2017-09-18, Russ Allbery wrote:

Daniel Kahn Gillmor <dkg@fifthhorseman.net> writes:

On Sun 2017-09-17 16:26:25 -0700, Russ Allbery wrote:

I personally lean towards 2, which is consistent with what's in Policy
right now, but I can see definite merits in 3. I believe the
reproducible builds project is currently sort of doing 1, but I have a
hard time seeing how to make that viable on the testing side.

Thanks for raising this question, Russ!

Indeed!

I'm not sure that we should let lack of exhaustive testing push us away
from (1). (1) is in principle the right thing -- it's easy to make a
build reproducible if we tell people that they have to do exactly one
specific thing. But we generally want people to be able to run
heterogenous systems, and not to force them into one particular
environment.

Well... I would argue that the amount of time and effort that's gone into
this project shows that it's not that easy to make a build reproducible
even when telling people to do exactly one thing. :) But I get your
point.

Much of the work has already been done by aspirational, principled
folks... :)

Does everything in policy need to be rigorously testable? or is it ok
to have Policy state the desired outcome even if we don't know how (or
don't have the resources) to test it fully today.

I don't think everything has to be rigorously testable, but I do think
it's a useful canary. If I can't test something, I start wondering
whether that means I have problems with my underlying assumptions.

In particular, for (1), we have no comprehensive list of environment
variables that affect the behavior of tools, and that list would be
difficult to create. Many pieces of software add their own environment
variables with little coordination, and many of those variables could
possibly affect tool output.

There is a huge difference between variables that *might* affect the
build as an unintended input that gets stored in the resulting packages
in some manner, and variables that are designed to change the behavior
of parts of the build toolchain.

I consider unintended variables that affect the build output a bug, and
variables designed and intended to change the behavior of the toolchain
expected, reasonable behavior.

I feel like the work for (1) and for (3) ends up being comparable; for (1)
we have to maintain a blacklist, and for (3) we have to maintain a
whitelist. But (3) is testable, whereas (1) is inherently aspirational
and will always have to be aspirational. We're endlessly going to be
discovering some other environment variable that changes tool output.

Well, there can be a testable, automatable standard, and a higher,
aspirational standard in parallel.

Which largely seems consistent with what's already in policy... but I'm
not sure it's appropriate to codify these whitelists or blacklists in
policy.

I'm also unsure that (1) is even what we want to claim. Do we really want
to say that builds are always reproducible if you don't change this short
list of environment variables, no matter whatever other environment
variables you set?

I don't think we want to make absolute claims; reproducible builds is
about having greater confidence that the binaries are produced from the
source, not absolute confidence.

The ideal is to have as many builds as possible corroborated from a
diverse group of build machines, developers, third-parties,
sophisticated end-users, legal jurisdictions, etc.

There's some appeal in this for the end user, but it feels very
frustrating for the package maintainer. At first glance, as a package
maintainer, I'd think I'd have to maintain a huge blacklist of
environment variables that I've discovered affect my toolchain somewhere,
and explicitly unset them all in debian/rules. This doesn't feel like a
good use of anyone's time (and may actually *break* other,
non-reproducibility-related things that people want to do with my
package).

In practice, for the vast majority of packages in Debian, it is a
relatively small number of environment variables to get fairly solid
reproducibility coverage... at least from what we've seen so far.

The hard part is actually continuing to tease them out...

live well,
vagrant

From Ximin Luo@21:1/5 to All on Tue Sep 19 12:20:01 2017

Simon McVittie:

On Mon, 18 Sep 2017 at 18:00:51 -0700, Vagrant Cascadian wrote:

[..]

I consider unintended variables that affect the build output a bug, and
variables designed and intended to change the behavior of the toolchain
expected, reasonable behavior.

There is a *huge* number of variables that are intended to change
behaviour, and may or may not affect the behaviour of this specific
package. Which of your categories are these in?

For example, basically any well-behaved programming language or
programming-language-like environment has an equivalent of PYTHONPATH,
PERL5LIB, PKG_CONFIG_PATH and similar variables, [..]

Similarly, there is an intractably huge number of environment variables
that can affect the result of Automake and make. Do you know about all
of them? Including RM, PC, AR, LOADLIBES (and those are just for make's
implicit rules)? [..]

I agree with this and this matches my own thoughts back in:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=844431#324
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=844431#369

I think the assumption has to be that every environment variable is
potentially intended to affect the build unless otherwise stated [..]
[..] It would be most useful if we were to identify a
restricted subset of environment variables for which there is consensus
that the variable is meant to be merely user preference and shouldn't
affect the build [..]

Perhaps those variables should be a whitelist, or perhaps there is
some wording for Policy that would identify them while excluding the
legitimately build-affecting ones - but either way I think the
assumption should be "there is a limited subset of environment
variables that are required to preserve reproducibility when varied,
and the rest are uninteresting".

These variables shouldn't be a whitelist because different buildsystems
can invent their own variables to affect themselves all the time. We
can't really "predict" something like PERL5LIB.

However, neither should it be a blacklist, because different run-time
programs invent their own variables all the time to affect themselves,
but in a way that really should not affect build processes. I have to
set LANG=XX.YY in my user environment; that doesn't mean that all my
builds should run differently from those of people in other countries.

Therefore, I think it is better to try to reach some wording for Policy
that communicates *intent*. Then, tools like dpkg-buildflags can have
their own envvars that they force-set, which would be a subset of the
ones allowed by Policy. Tools like reprotest can vary certain envvars
that "obviously" shouldn't affect the build, like LC_ALL, USER, etc.
Then in the middle there will be certain variables like RM and AR that
could affect the build, which should be clear from the Policy wording,
but which are too cumbersome for dpkg-buildpackage to enumerate in a
full whitelist and force-set to a fixed value.

Interpreter variables like PERL5LIB and PYTHONPATH we would have to
assume fall in the first category ("they are allowed to affect the build
output") even though arguably they are also "run-time variables",
because they are very tied to the interpreter and probably only
developers really want to set them for specific purposes.

So let's throw some wording out there already. To quote my earlier proposal:

I would suggest amending:

- a set of environment variable values; and
+ a set of reserved environment variable values; and

then later:

+ A "reserved" environment variable is defined as DEB_*, DPKG_*,
  SOURCE_DATE_EPOCH, BUILD_PATH_PREFIX_MAP, variables listed by
  dpkg-buildflags and other variables explicitly used by buildsystems to
  affect build output, excluding any variables used by non-build
  programs to affect their behaviour. Explicitly, this excludes TERM,
  HOME, LOGNAME, USER [..]
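
For illustration only, the proposed definition could be approximated in
Python roughly as follows. The function name is_reserved and the
"extra" parameter are hypothetical; "extra" stands in for the parts of
the wording that cannot be enumerated statically (variables listed by
dpkg-buildflags and other variables explicitly used by buildsystems).
The excluded set contains only the four names spelled out above, since
the original list is elided.

```python
from fnmatch import fnmatchcase

# Patterns named explicitly in the proposed wording.
RESERVED_PATTERNS = (
    "DEB_*",
    "DPKG_*",
    "SOURCE_DATE_EPOCH",
    "BUILD_PATH_PREFIX_MAP",
)

# Variables the wording explicitly excludes (the original list is elided).
EXCLUDED = {"TERM", "HOME", "LOGNAME", "USER"}


def is_reserved(name, extra=()):
    """Return True if name counts as "reserved" under this sketch.

    extra is a caller-supplied iterable of additional glob patterns,
    e.g. names reported by the build tooling in use.
    """
    if name in EXCLUDED:
        return False
    patterns = RESERVED_PATTERNS + tuple(extra)
    return any(fnmatchcase(name, pattern) for pattern in patterns)
```

With this sketch, DEB_BUILD_OPTIONS and SOURCE_DATE_EPOCH are reserved,
HOME is not, and something like CFLAGS is reserved only if the caller
passes it in via "extra".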

(The last time I erroneously included PATH in the final "excluded" list
- because we have varied PATH but in a really trivial way on
tests.r-b.org for ages - but I now agree with you that we shouldn't
expect reproducibility when PATH is varied.)

My reasoning, as echoed by others on this thread already, was:

some other variables are used by non-build tools, such as LC_*, USER,
etc. Since they affect non-build programs, they possibly may be set in a
developer's normal environment, so just running "debian/rules build"
will pick these up. Then, the build should stay the same despite these
other variables.

X

--
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git

From Ximin Luo@21:1/5 to All on Tue Sep 19 12:50:02 2017

Ximin Luo:

[..]

(The last time I erroneously included PATH in the final "excluded" list
- because we have varied PATH but in a really trivial way on
tests.r-b.org for ages - but I now agree with you that we shouldn't
expect reproducibility when PATH is varied.)

Actually thinking about it a bit more, the PATH point is a little
subtle. It's certainly the case that (case a:) if I set PATH=/a vs
PATH=/b and the files installed underneath /a and /b are *different*,
then the build output must be "allowed" to vary since they will of
course vary.

However if (case b:) I set PATH=/a vs PATH=/b but the files underneath
those paths are exactly the same, then is it right that the build output
still is allowed to vary - e.g. might embed the strings "/a" vs "/b" in
them? I think this subtlety is why we had been varying PATH for ages on
tests.r-b.org, because we were confusing the two questions together.

A strict interpretation of reproducibility might say "(a) can vary"
except "(b) remains fixed", and vary PATH after checking that the
different values still point to exactly the same file-trees.

However for the purposes of this thread, to simplify the discussion, I
think we can for now say "(a) can vary, including (b)", so that we can
ignore the semantics of PATH variables. Once we solve the other issues
we could revisit this one.
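
To illustrate what "checking that the different values still point to
exactly the same file-trees" could mean in practice, here is a rough
Python sketch that tells case (a) from case (b). It is an assumption
about how such a check might look, not something any existing tool is
known to do: it hashes only relative file names and contents, follows
symlinks, and ignores permissions and ownership. The function names are
hypothetical.

```python
import hashlib
import os


def tree_digest(root):
    """Hash the relative names and contents of all files under root."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            digest.update(rel.encode() + b"\0")
            with open(full, "rb") as handle:
                digest.update(handle.read())
            digest.update(b"\0")
    return digest.hexdigest()


def path_trees_equal(path_a, path_b):
    """True if two PATH values list directories with identical contents,
    component by component (case b); False otherwise (case a)."""
    dirs_a = path_a.split(os.pathsep)
    dirs_b = path_b.split(os.pathsep)
    if len(dirs_a) != len(dirs_b):
        return False
    return all(tree_digest(a) == tree_digest(b) for a, b in zip(dirs_a, dirs_b))
```

A tester following the strict interpretation would vary PATH only when
path_trees_equal() returns True for the two values.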

X

--
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git

From Simon McVittie@21:1/5 to Vagrant Cascadian on Tue Sep 19 11:10:02 2017

(Re-sending this to the bug rather than to debian-policy, sorry for the
duplicate on -policy)

On Mon, 18 Sep 2017 at 18:00:51 -0700, Vagrant Cascadian wrote:

There is a huge difference between variables that *might* affect the
build as an unintended input that gets stored in the resulting packages
in some manner, and variables that are designed to change the behavior
of parts of the build toolchain.

I consider unintended variables that affect the build output a bug, and
variables designed and intended to change the behavior of the toolchain
expected, reasonable behavior.

There is a *huge* number of variables that are intended to change
behaviour, and may or may not affect the behaviour of this specific
package. Which of your categories are these in?

For example, basically any well-behaved programming language or
programming-language-like environment has an equivalent of PYTHONPATH,
PERL5LIB, PKG_CONFIG_PATH and similar variables, which will pull in
arbitrary code (perhaps from /opt or ~/.local or something) with
arbitrary behaviour; and increasingly many tools respect the XDG Base
Directory spec, so XDG_DATA_HOME and XDG_DATA_DIRS provide a search path
for things normally found in /usr. I don't think it's desirable for
maintainers to feel that every debian/rules needs to start with something
like this (note this list is very incomplete, I could list dozens like
this without trying very hard):

undefine GI_TYPELIB_PATH
undefine LD_LIBRARY_PATH
undefine PERL5LIB
undefine PKG_CONFIG_PATH
undefine PYTHONPATH
export PATH = /usr/bin:/usr/sbin:/bin:/sbin
export XDG_DATA_DIRS = /usr/share

... and indeed if a maintainer did that, that would make it needlessly
difficult for another maintainer to test-build their package against
their new version of Perl or Python or GObject-Introspection or whatever.

Similarly, there is an intractably huge number of environment variables
that can affect the result of Automake and make. Do you know about all
of them? Including RM, PC, AR, LOADLIBES (and those are just for make's
implicit rules)?

I think the assumption has to be that every environment variable is
potentially intended to affect the build unless otherwise stated,
because the set of environment variables that *could* affect the build
is extremely large. It would be most useful if we were to identify a
restricted subset of environment variables for which there is consensus
that the variable is meant to be merely user preference and shouldn't
affect the build - even better if there's some document like the devref
that lists whether it is more appropriate for a package maintainer to
unset each of those variables or reset them to some initial value if
they become a problem.

Perhaps those variables should be a whitelist, or perhaps there is
some wording for Policy that would identify them while excluding the
legitimately build-affecting ones - but either way I think the
assumption should be "there is a limited subset of environment
variables that are required to preserve reproducibility when varied,
and the rest are uninteresting".

The environment variables that are not sanitised by debuild
might be an interesting starting point for classification -
we know that for debuild users, the rest do not matter in
practice. Dpkg::Build::Info::get_build_env_whitelist() is probably
another interesting set (particularly since it's used by recent sbuild).

In practice, for the vast majority of packages in Debian, it is a
relatively small number of environment variables to get fairly solid
reproducibility coverage... at least from what we've seen so far.

Set PERL5LIB to a location containing libraries that change gtk-doc
behaviour. All packages that use gtk-doc in buster (where gtk-doc was
written in Perl) are now unreproducible.

Set PYTHONPATH to a location containing libraries that change gtk-doc
behaviour. All packages that use gtk-doc in sid (where gtk-doc was
translated into Python) are now unreproducible.

Set LD_LIBRARY_PATH or LD_PRELOAD to interpose libc functions and change
their behaviour. Basically everything is now unreproducible.

Set SHELL to a non-POSIX shell. Basically everything is potentially now
unreproducible.

I don't think trying to address those is a route that Debian should go
down.

Regards,
smcv

From Ximin Luo@21:1/5 to All on Tue Sep 19 13:30:02 2017

Russ Allbery:

[..] It does mean that discovery of any new
such environment variable would require a change to our whitelist in
approach (3), so there would be some lag and the whitelist would become
long over time (with a corresponding testing load). But (3) does try to
achieve that use case without trying to anticipate any possible
environment variable setting. It lets us be reactive to newly-discovered
environment variables across which we want to stay reproducible.

I can also see the merits in your (3) suggestion but I don't think it
would be appropriate to hard-code the list in Policy, because it would
be too hard to change it and then people might end up relying on a
very-incomplete list and then do stupid stuff that was counter to the
original intention of the discussions around the policy. It would be
better to find a generic wording (with some examples) similar to what I
suggested elsewhere.

Does everything in Policy need to be rigorously testable? Or is it OK
to have Policy state the desired outcome even if we don't know how (or
don't have the resources) to test it fully today?

I don't think everything has to be rigorously testable, but I do think
it's a useful canary. If I can't test something, I start wondering
whether that means I have problems with my underlying assumptions.

[..]

The "strict" interpretation is in principle testable though - we just have to collect enough environment variables and decide which category they fall under, and add that logic to our build tools.

I think in these early days, it would be fine for public package builders and reproducibility testers to do (3) as you suggested, i.e.

- clean the environment
- set certain variables to a fixed value (the "whitelist") and record these in buildinfo
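
The two steps above can be sketched roughly as follows. The variable names and values in FIXED_ENV are purely illustrative (no such list has been agreed in this thread), and the output format is only loosely modelled on the Environment field of a .buildinfo file.

```python
# Hypothetical whitelist: variables pinned to a fixed value for every build.
# Names and values are examples only, not taken from Policy.
FIXED_ENV = {
    "LC_ALL": "C.UTF-8",
    "PATH": "/usr/sbin:/usr/bin:/sbin:/bin",
    "TZ": "UTC",
}


def clean_build_env(fixed=FIXED_ENV):
    """Return a build environment containing only the pinned variables.

    Nothing is inherited from the caller's environment.
    """
    return dict(fixed)


def buildinfo_environment(env):
    """Render env as sorted NAME="value" lines under an Environment: header."""
    lines = ["Environment:"]
    for name in sorted(env):
        # Escape backslashes and double quotes so values stay unambiguous.
        value = env[name].replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f' {name}="{value}"')
    return "\n".join(lines)
```

The dict returned by clean_build_env() can be passed as the env argument of subprocess.run, so the build sees exactly what gets recorded and nothing else.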

This "loose" interpretation of reproducibility still gives us some useful results, as well as testable reproducibility for end users, but as I said I don't think this should be Policy since the whitelist should be expanding quite quickly, especially early on.

OTOH, developer reproducibility checkers (such as reprotest) can be a little bit more strict. I can imagine something like:

- reprotest runs 3 builds:
  - build 0 with current env
  - build 1 with current env + varying some "blacklist" envvars
  - build 2 with current env + varying some "non-whitelist" envvars

If there are differences between build 1 and build 2, then reprotest reports "unexpected envvar $XXX affected the build" and the developer can then either submit it for inclusion on the "whitelist" or the "blacklist" based on the Policy wording. If it ends up on the blacklist then they would also have to fix their own package to be invariant under that envvar.
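
A rough sketch of the three-build idea, to make it concrete. This is not reprotest code: BLACKLIST, the EXAMPLE_UNKNOWN_ prefix and all function names are invented for illustration, and it assumes each varied build is compared against build 0 (the control) and that a POSIX shell is available.

```python
import os
import secrets
import subprocess

# Hypothetical blacklist: variables a build is expected to be invariant under.
BLACKLIST = ["BROWSER", "COLUMNS", "http_proxy"]


def run_build(command, env):
    """Run a shell build command under env and return its captured stdout."""
    result = subprocess.run(
        command, shell=True, env=env, capture_output=True, check=False
    )
    return result.stdout


def experiment_envs(base):
    """Build the environments for build 0, build 1 and build 2."""
    control = dict(base)
    blacklist = dict(base)
    for name in BLACKLIST:
        blacklist[name] = "varied-" + secrets.token_hex(4)
    non_whitelist = dict(base)
    # A randomly named variable stands in for "anything not on the whitelist".
    non_whitelist["EXAMPLE_UNKNOWN_" + secrets.token_hex(4)] = secrets.token_hex(4)
    return {
        "control": control,
        "experiment-blacklist": blacklist,
        "experiment-non-whitelist": non_whitelist,
    }


def compare_builds(command, base=None):
    """Return, per experiment, whether its output differs from the control."""
    envs = experiment_envs(os.environ if base is None else base)
    outputs = {name: run_build(command, env) for name, env in envs.items()}
    return {
        name: outputs[name] != outputs["control"]
        for name in outputs
        if name != "control"
    }
```

For example, compare_builds('echo "$COLUMNS"') should flag only the blacklist experiment, while a command that dumps the whole environment should flag both.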

So over time, this way we can build up a blacklist and a whitelist. But it shouldn't be in the original policy. And I don't think what I suggested above is a particularly disruptive or surprising process, especially since the "public" builders would only do the "looser" interpretation so people aren't bothered by bogus "unreproducible" reports.

X

--
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git

From Ximin Luo@21:1/5 to All on Mon Oct 2 23:20:03 2017

XPost: linux.debian.bugs.dist

Ximin Luo:

[..]

OTOH, developer reproducibility checkers (such as reprotest) can be a little bit more strict. I can imagine something like:

- reprotest runs 3 builds:
  - build 0 with current env
  - build 1 with current env + varying some "blacklist" envvars
  - build 2 with current env + varying some "non-whitelist" envvars

If there are differences between build 1 and build 2, then reprotest reports "unexpected envvar $XXX affected the build" and the developer can then either submit it for inclusion on the "whitelist" or the "blacklist" based on the Policy wording. If it ends up on the blacklist then they would also have to fix their own package to be invariant under that envvar.

So over time, this way we can build up a blacklist and a whitelist. But it shouldn't be in the original policy. And I don't think what I suggested above is a particularly disruptive or surprising process, especially since the "public" builders would only do the "looser" interpretation so people aren't bothered by bogus "unreproducible" reports.

I've implemented this in reprotest here in the "env-build" branch: https://anonscm.debian.org/cgit/reproducible/reprotest.git/log/?h=env-build

It requires the python3-rstr package, which is currently in NEW; alternatively, you can get it here:
https://people.debian.org/~infinity0/apt/pool/main/p/python-rstr/

Run it like this:

$ PYTHONPATH=$PWD python3 -m reprotest --env-build 'env > out || true' out
[..]
--- /tmp/tmp1ujyb3xp/control
+++ /tmp/tmp1ujyb3xp/experiment-blacklist
├── source-root
│ ├── out
│ │ @@ -1,57 +1,47 @@
[.. big diff ..]
Unreproducible even when varying blacklisted envvars: BROWSER, CLUTTER_IM_MODULE, COLORTERM, COLUMNS, DATEMSK, DBUS_SESSION_BUS_ADDRESS, [..] ftp_proxy, http_proxy, https_proxy
This may or may not be caused by other factors; try re-running this again with --vary=-all
# exit code 1

$ PYTHONPATH=$PWD python3 -m reprotest --env-build 'env | grep UNKNOWN > out || true' out
[..]
--- /tmp/tmp2m24l442/control
+++ /tmp/tmp2m24l442/experiment-non-whitelist
├── source-root
│ ├── out
│ │ @@ -0,0 +1,10 @@
│ │ +00000000: 5245 5052 4f54 4553 545f 4341 5054 5552 REPROTEST_CAPTUR
│ │ +00000010: 455f 454e 5649 524f 4e4d 454e 545f 554e E_ENVIRONMENT_UN
│ │ +00000020: 4b4e 4f57 4e5f 314b 6254 4a76 6362 6749 KNOWN_1KbTJvcbgI
│ │ +00000030: 464a 7661 394a 364d 6762 417a 7a57 5377 FJva9J6MgbAzzWSw
│ │ +00000040: 5f5
