I feel like there is probably consensus against the use of PyPI-provided
upstream source tarballs, in favour of what will usually be a GitHub
release tarball, so I made an MR to this effect (a moderate
recommendation rather than a "must" directive):
https://salsa.debian.org/python-team/tools/python-modules/-/merge_requests/16
Comments, corrections, requests for additional information, and
objections welcome :-) I'm also curious whether there isn't consensus at
this point and whether this requires further discussion.
I work on a vast ecosystem of Python-based projects which consider
the sdist tarballs they upload to PyPI to be their official release
tarballs, because they encode information otherwise only available
in revision control metadata (version information, change history,
copyright holders). The proposal is somewhat akin to saying that a
tarball created via `make dist` is unsuitable for packaging.
"GitHub tarballs" (aside from striking me as a blatant endorsement
of a wholly non-free software platform) lack this metadata, being
only a copy of the file contents from source control while missing
other relevant context Git would normally provide.
On 2021-06-25 16:42:42 -0400 (-0400), Nicholas D Steeves wrote:
On Fri, 25 Jun 2021, Jeremy Stanley wrote:
I tend to agree about PyPI being the official releases for a lot of projects. "GitHub tarballs" also tend to include other undesirable stuff for distribution like upstream CI/CD configuration files, etc.
On Fri, 25 Jun 2021 at 16:42:42 -0400, Nicholas D Steeves wrote:
I feel like there is probably consensus against the use of PyPi-provided
upstream source tarballs in preference for what will usually be a GitHub
release tarball
This is not really consistent with what devref says:
The defining characteristic of a pristine source tarball is that the
.orig.tar.{gz,bz2,xz} file is byte-for-byte identical to a tarball
officially distributed by the upstream author
— https://www.debian.org/doc/manuals/developers-reference/best-pkging-practices.en.html#best-practices-for-orig-tar-gz-bz2-xz-files
Sites like GitHub and GitLab that generate tarballs from git contents
don't (can't?) guarantee that the exported tarball will never change -
I'm fairly sure `git archive` doesn't try to make that guarantee - so
it seems hard to say that the official source code release artifact is
always the one that appears as a side-effect of the upstream project's
git hosting platform.
That doesn't *necessarily* mean that the equivalent of a `git archive`
is always the wrong thing (and indeed there are a lot of packages where
it's the only reasonably easily-obtained thing that is suitable for our
requirements), but I don't think it's as simple or clear-cut as you
are implying.
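As a rough illustration of the byte-for-byte criterion devref describes, here is a minimal Python sketch. It assumes nothing about how any particular hosting platform builds its archives; it only shows that the same payload, compressed twice with different gzip header timestamps, unpacks identically yet is not byte-identical. The helper names and the payload are hypothetical.

```python
import gzip
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of an archive's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def is_byte_identical(orig: bytes, upstream: bytes) -> bool:
    """True only if the two archives match byte for byte."""
    return sha256_hex(orig) == sha256_hex(upstream)


# Stand-in for an uncompressed tar stream exported from a repository.
payload = b"pretend this is an uncompressed tar stream"

# Two "exports" of identical content; only the gzip header timestamp differs.
first_export = gzip.compress(payload, mtime=1_600_000_000)
second_export = gzip.compress(payload, mtime=1_700_000_000)

same_content = gzip.decompress(first_export) == gzip.decompress(second_export)
same_bytes = is_byte_identical(first_export, second_export)
```

Here `same_content` is true while `same_bytes` is false, which is the situation a regenerated archive can end up in relative to an earlier download.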
devref also says:
A repackaged .orig.tar.{gz,bz2,xz} ... should, except where impossible
for legal reasons, preserve the entire building and portability
infrastructure provided by the upstream author. For example, it is
not a sufficient reason for omitting a file that it is used only
when building on MS-DOS. Similarly, a Makefile provided by upstream
should not be omitted even if the first thing your debian/rules does
is to overwrite it by running a configure script.
I think devref goes too far on this - for projects where the official upstream release artifact contains a significant amount of content we
don't want (convenience copies, portability glue, generated files, etc.), checking the legal status of everything can end up being more work than
the actual packaging, and that's work that isn't improving the quality of
our operating system (which is, after all, the point).
However, PyPI sdist archives are (at least in some cases) upstream's
official source code release artifact, so I think a blanket recommendation that we ignore them probably goes too far in the other direction.
I'd prefer to mention both options and have "use your best judgement,
like you have to do for every other aspect of the packaging" as a recommendation :-)
A recommendation is non-binding, and the intent of this proposal is to
say that the most "sourceful" form of source is the *most* suitable for
Debian packages. The inverse of this is that `make dist` is less
suitable for Debian packages. Neither formulation of this premise
applies to a scope outside of Debian. In other words, just because a
particular form of source packaging and distribution is not considered
ideal in Debian does not in any way comment on its suitability for other
purposes. Would you prefer to see a note like "PyPI is a good thing for
the Python ecosystem, but sdists are not the preferred form of Debian
source tarballs"?
It's also worth mentioning that upstream's "official release"[...]
preference is not necessarily relevant to a Debian context. Take
for example the case where upstream exclusively supports a Flatpak
and/or Snap package...
Thinking about an ideal solution, and the interesting PBR case, I
remember that gbp is supposed to be able to associate gbp tags with
upstream commits (or possibly tags), so maybe it's also possible to do
this:
1. When gbp import-orig finds a new release
2. Fetch upstream remote as well
3. Run PBR against the upstream release tag
4. Stage the resulting file(s)
5. Either append them to the upstream tarball before committing to the
pristine-tar branch, or generate the upstream tarball from the
upstream branch (intent being that the upstream branch's HEAD should
be identical to the contents of that tarball)
6. Gbp creates upstream/x.y tag
7. Gbp merges to Debian packaging branch.
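The steps above could be sketched roughly as the command plan below. This is only one reading of the idea, under several assumptions: the git remote is called `upstream`, the project uses PBR through `setup.py` so that building an sdist from the tag produces the extra files, and the resulting tarball is named `<package>-<version>.tar.gz`. The function name, the worktree path, and the package name in the usage lines are hypothetical. The plan is only built as data, not executed; the upstream tag and the merge to the packaging branch are left to `gbp import-orig` itself.

```python
def import_plan(package, version, tag, worktree="../upstream-export"):
    """Return (working directory, argv) pairs for a PBR-aware import.

    Assumes a git remote named "upstream" and a PBR-based setup.py.
    """
    sdist = f"{worktree}/dist/{package}-{version}.tar.gz"
    return [
        # Step 2: fetch the upstream remote, including release tags.
        (".", ["git", "fetch", "--tags", "upstream"]),
        # Step 3: check out the release tag in a separate worktree ...
        (".", ["git", "worktree", "add", "--detach", worktree, tag]),
        # ... and let PBR generate its files while building an sdist there.
        (worktree, ["python3", "setup.py", "sdist"]),
        # Steps 5-7: import the tarball, record it with pristine-tar, and
        # link the import commit to the upstream tag.
        (".", ["gbp", "import-orig", "--pristine-tar",
               f"--upstream-vcs-tag={tag}", sdist]),
    ]


plan = import_plan("foo", "1.0", "v1.0")
for cwd, argv in plan:
    print(f"(in {cwd}) {' '.join(argv)}")
```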
And yes, I agree moderate is better, but I must sadly confess
ignorance to the technical reasons why PyPI is sometimes more
appropriate. Without technical reasons it seems like a case of
ideological compromise (based on the standards I've been mentored
to and the feedback I've received over the years).
1. Cryptographically signed tags in a Git repository, with
versioning, revision history, release notes and authorship either
embedded within or tied to the Git metadata.
2. Cryptographically signed tarballs of the file tree corresponding
to a tag in the Git repository, with versioning, revision
history, release notes and authorship extracted into files
included directly within the tarball.
Saying that a raw dump of the file content from a revision control
system is recommended over using upstream's sdists presumes all
upstreams are the same. They're not, and which is preferable (or
doable, or even legal) differs from one to another. Just because
some sdists, or even many, are not suitable as a basis for packaging
doesn't mean that sdists are a bad idea to base packages on. Yes,
basing packages on bad sdists is bad, it's hard to disagree with
that.
The proposal is somewhat akin to saying that a
tarball created via `make dist` is unsuitable for packaging.
Does PyPi provide immutable releases?
On Jun 25, 2021, at 11:47 PM, Brian Thompson <brian@hashvault.io> wrote:
On Fri, Jun 25, 2021 at 07:01:39PM -0400, Nicholas D Steeves wrote:
Does PyPi provide immutable releases?
From experience, I can tell you that yes, releases cannot be overwritten,
but they can be "yanked". PyPI states that a yanked release is:
"A release that is always ignored by an installer, unless it is the
only release that matches a version specifier (using either '==' or
'===')."
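The yanking rule quoted above can be sketched as a small selection function. This is a simplified model, not how any real installer is implemented: versions are assumed to be plain dotted integers, and the `Release` type and `pick_release` name are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Release(NamedTuple):
    version: str
    yanked: bool


def version_key(version: str) -> tuple:
    # Simplification: only handles dotted integers such as "1.2.3".
    return tuple(int(part) for part in version.split("."))


def pick_release(releases: Sequence[Release],
                 pinned: Optional[str] = None) -> Optional[Release]:
    """Pick a release, ignoring yanked ones unless exactly pinned."""
    if pinned is not None:
        # An exact pin ("==" or "===") can still select a yanked release,
        # because it is the only release matching the specifier.
        for release in releases:
            if release.version == pinned:
                return release
        return None
    # Without an exact pin, yanked releases are never candidates.
    candidates = [r for r in releases if not r.yanked]
    if not candidates:
        return None
    return max(candidates, key=lambda r: version_key(r.version))


releases = [Release("1.0", False), Release("1.1", True)]
latest = pick_release(releases)              # the non-yanked 1.0
exact = pick_release(releases, pinned="1.1")  # the yanked 1.1, by exact pin
```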
--
Best regards,
Brian T
On Fri, Jun 25, 2021 at 11:42 PM Jeremy Stanley wrote:[..]
[...]2. Cryptographically signed tarballs of the file tree corresponding
to a tag in the Git repository, with versioning, revision
history, release notes and authorship extracted into files
included directly within the tarball.
I would like to see #2 split into two separate tarballs, one for the
exact copy of the git tree and one containing the data about the other tarball. Then use dpkg-source v3 secondary tarballs to add the data
about the git repo to the Debian source package.
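For reference, the file naming that dpkg-source's 3.0 (quilt) format uses for additional upstream tarballs can be sketched as follows. The source package name `python-foo` and the component name `vcs-metadata` are made up for illustration; the only rule encoded here is that a component name may contain alphanumerics and hyphens.

```python
import re


def orig_tarball_names(source, upstream_version, components=(), ext="tar.gz"):
    """Names of the main orig tarball plus one per extra component."""
    names = [f"{source}_{upstream_version}.orig.{ext}"]
    for component in components:
        # Component names are limited to alphanumerics and hyphens.
        if not re.fullmatch(r"[A-Za-z0-9-]+", component):
            raise ValueError(f"invalid component name: {component!r}")
        names.append(f"{source}_{upstream_version}.orig-{component}.{ext}")
    return names


names = orig_tarball_names("python-foo", "1.2.3", ["vcs-metadata"])
# ['python-foo_1.2.3.orig.tar.gz', 'python-foo_1.2.3.orig-vcs-metadata.tar.gz']
```

When the source package is unpacked, the contents of the component tarball end up in a subdirectory named after the component, so the git-derived data would sit beside the plain tree rather than being mixed into it.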
Probably we should start systematically comparing upstream VCS repos
with upstream sdists and reacting to the differences. So far, I've
reacted by ignoring the sdists completely.
To me, the most important thing is that all packages must at least
run the upstream testsuite when it exists (I'm planning on writing
a policy proposal saying this after the freeze). If PyPi releases
include them, I think it's fine (but they often don't).
If we package from PyPI sdists that don't contain the test suite, that[...]
results in packages without any tests, and that isn't good.
Also, I'm not sure, but the docs aren't on PyPI either, are they?
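Whether a given sdist ships tests can be checked mechanically. The sketch below uses a naming heuristic only (a `test` or `tests` directory, or files named `test_*.py` or `*_test.py`); real projects may lay out their tests differently, so treat the patterns as assumptions.

```python
import fnmatch
import io
import tarfile
from pathlib import PurePosixPath

TEST_DIR_NAMES = {"test", "tests"}
TEST_FILE_PATTERNS = ("test_*.py", "*_test.py")


def looks_like_test(path: str) -> bool:
    """Heuristic: does this archive path look like part of a test suite?"""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    if any(part in TEST_DIR_NAMES for part in parts[:-1]):
        return True
    return any(fnmatch.fnmatch(parts[-1], pat) for pat in TEST_FILE_PATTERNS)


def sdist_has_tests(tarball_bytes: bytes) -> bool:
    """True if any regular file in the archive looks like a test."""
    with tarfile.open(fileobj=io.BytesIO(tarball_bytes), mode="r:*") as tar:
        return any(
            member.isfile() and looks_like_test(member.name)
            for member in tar.getmembers()
        )
```

A check like this could run before importing a new upstream release, so that a missing test suite is noticed at import time rather than after the package is built.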