• Automated copyright reviews using REUSE/SPDX as alternative to DEP-5

    From Stephan Lachnit@21:1/5 to All on Wed Jan 26 13:00:03 2022
    Since I feel this fits the current discussion on the mailing list,
    let me quickly introduce you to an idea I had for a while to improve
    the copyright review situation.
    TLDR: for projects using REUSE, we could generate d/copyright
    automatically and approve the copyright check in NEW automatically.

    - What is REUSE?
    The REUSE specification [1] makes copyright information machine-readable
    in the source files themselves. It is straightforward to implement: add
    (e.g.) "SPDX-FileCopyrightText: 2019 Jane Doe <jane@example.com>" and
    "SPDX-License-Identifier: GPL-3.0-or-later" as comments to your source
    file's header and you are done. The license
    identifiers are standardized by the SPDX [2] and are similar to what
    we already use in Debian (see also [7], although a bit outdated).
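
    As a rough illustration of why these headers are easy to process, here
    is a minimal sketch that pulls the two tags named above out of a file's
    text. The function name `extract_reuse_tags` and the regular expression
    are assumptions made for this example; this is not how the reuse tool
    itself is implemented.

```python
import re

# Matches the two tags named in the text, wherever they sit in a comment line.
TAG_RE = re.compile(r"SPDX-(FileCopyrightText|License-Identifier):\s*(.+?)\s*$")


def extract_reuse_tags(text):
    """Return (copyrights, license_expressions) found in a file's text."""
    copyrights, licenses = [], []
    for line in text.splitlines():
        match = TAG_RE.search(line)
        if not match:
            continue
        tag, value = match.groups()
        # Drop a trailing comment terminator such as "*/" or "-->".
        value = re.sub(r"\s*(\*/|-->)$", "", value)
        if tag == "FileCopyrightText":
            copyrights.append(value)
        else:
            licenses.append(value)
    return copyrights, licenses


EXAMPLE = """\
# SPDX-FileCopyrightText: 2019 Jane Doe <jane@example.com>
# SPDX-License-Identifier: GPL-3.0-or-later

print("hello")
"""

copyrights, licenses = extract_reuse_tags(EXAMPLE)
print(copyrights)
print(licenses)
```

    The same function also works for block comments, e.g. a C file whose
    first line is "/* SPDX-License-Identifier: MIT */".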

    The spec is made by the Free Software Foundation Europe (FSFE) and is
    already implemented in several projects [3]. They also provide a tool
    (available as "reuse" in Debian [4]) which can lint a source tree for
    REUSE compliance and export the license information to an
    SPDX bill of materials.

    - What is an SPDX bill of materials?
    It is a machine-readable format that specifies the licenses of each
    file in tag/value style like DEP-5. However compared to DEP-5 it is
    much less human readable, i.e. it includes much more meta information,
    and does not contain the license texts. One useful aspect is that it
    also includes the checksum of each file. I have appended an example of
    what such a document might look like below.

    The spec is from the Software Package Data Exchange (SPDX), a project
    hosted by the Linux Foundation. The spec is also available as ISO/IEC 5962:2021.

    - What has this to do with Debian?
    My idea is to allow SPDX documents in addition to DEP-5. The advantage
    is that - if supported upstream - REUSE can generate such reports
    automatically during package build time, so there is no need to write d/copyright manually anymore. It is also much less error-prone, as
    this can be done every time there is a new source package and does not
    suffer from human mistakes like forgetting to check some files during
    the copyright review.

    The license identifiers can be parsed to check if the package falls
    under free/contrib or non-free (except when custom licenses are used).
    Packages leveraging REUSE could skip the manual d/copyright check in NEW
    entirely, even when it is a new source package. Writing a sanity
    validator would not be a hard task; one probably exists already.
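
    A minimal sketch of such a sanity check, assuming a hand-written
    allow-list: `ASSUMED_FREE` is purely hypothetical and is not an official
    Debian list, and the expression handling is deliberately naive (it only
    splits on the AND/OR/WITH operators). Anything not on the list,
    including custom "LicenseRef-" identifiers and exception names, is
    reported for manual review.

```python
import re

# HYPOTHETICAL allow-list for illustration only; not an official Debian list.
ASSUMED_FREE = {"MIT", "GPL-3.0-or-later", "Apache-2.0", "CC0-1.0"}

# Operators that may appear in an SPDX license expression.
OPERATORS = {"AND", "OR", "WITH"}


def license_ids(expression):
    """Split a license expression into its identifiers (naive tokenizer)."""
    tokens = re.findall(r"[A-Za-z0-9.+-]+", expression)
    return [token for token in tokens if token not in OPERATORS]


def needs_manual_review(expressions):
    """Return the sorted identifiers that the allow-list does not cover."""
    unknown = set()
    for expression in expressions:
        for ident in license_ids(expression):
            if ident not in ASSUMED_FREE:
                unknown.add(ident)
    return sorted(unknown)


print(needs_manual_review(["MIT", "(GPL-3.0-or-later OR Apache-2.0)"]))
print(needs_manual_review(["MIT AND LicenseRef-Custom"]))
```

    An empty result would mean every identifier is on the list; a non-empty
    one names what a human would still have to look at.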

    Note that since the licenses are not part of d/copyright anymore,
    those have to be provided in another way. REUSE specifies that
    licenses are in a top-level folder called "LICENSES", so we could
    simply install that folder along the copyright file. We could also
    depend on the "spdx-licenses" package [5] and symlink all non-custom
    licenses to reduce duplicate files, however since a license usually
    needs to be shipped with any code/binary distribution this might get a
    bit complicated.

    Another, IMHO less preferred, way would be to write a converter tool
    from SPDX to DEP-5, but still do auto-approvals. Such a converter tool
    has been proposed before [6].
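
    To give an idea of what such a converter involves, here is a minimal
    sketch that groups per-file records by copyright and license and prints
    DEP-5 style paragraphs. The record layout (dicts with "path",
    "copyright" and "license" keys) and the function name `to_dep5` are
    assumptions for this example. It passes SPDX identifiers through
    unchanged and omits the stand-alone License paragraphs with the license
    texts, so its output is not a complete d/copyright.

```python
from collections import defaultdict

FORMAT_URL = "https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/"


def to_dep5(records):
    """Render per-file records as DEP-5 style Files paragraphs."""
    groups = defaultdict(list)
    for record in records:
        path = record["path"]
        if path.startswith("./"):
            path = path[2:]
        # Files with identical copyright and license share one paragraph.
        groups[(record["copyright"], record["license"])].append(path)

    paragraphs = [f"Format: {FORMAT_URL}"]
    for (copyright_, license_), paths in sorted(groups.items()):
        # Continuation lines of a multi-line field start with a space.
        files = "\n ".join(sorted(paths))
        paragraphs.append(
            f"Files: {files}\nCopyright: {copyright_}\nLicense: {license_}"
        )
    return "\n\n".join(paragraphs) + "\n"


RECORDS = [
    {
        "path": "./update_copyright_years.py",
        "copyright": "2022 Stephan Lachnit <stephanlachnit@debian.org>",
        "license": "MIT",
    },
]

print(to_dep5(RECORDS))
```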

    - Final thoughts:
    Besides the quality-of-life improvements, using this also has the
    advantage of using an industry standard, i.e. shared work on tooling.
    I heard that Fedora is also thinking about implementing this idea.

    I've been in contact with one of the people responsible at the FSFE for a
    while, and they really like this idea and are open to suggestions from
    our side if we need any particular changes to the tooling. I already
    have a couple of changes we need in mind, in particular with regard
    to adding copyright of the debian folder without adding a header to
    each file, but upstream already has some ideas for that.

    Note that I don't want DEP-5 to go away - it is unlikely that every
    project will follow the REUSE spec and writing an SPDX document by
    hand has no significant advantages over DEP-5. Besides, using the file-exclusion function in DEP-5 for uscan is quite useful for ds/dfsg
    packages (although that could also be moved to an external file).

    For now, let me just hear what you think about this idea in general.
    If someone would be willing to help in this endeavor (e.g. creating
    dh_reuse, writing a DEP), let me know.

    Regards,
    Stephan

    [1] https://reuse.software/spec/
    [2] https://spdx.dev/licenses/
    [3] https://api.reuse.software/projects
    [4] https://tracker.debian.org/pkg/reuse
    [5] https://tracker.debian.org/pkg/spdx-licenses
    [6] https://wiki.debian.org/SPDX
    [7] https://wiki.debian.org/Proposals/CopyrightFormat#Differences_between_DEP5_and_SPDX


    Example for SPDX bill of materials:
    """
    SPDXVersion: SPDX-2.1
    DataLicense: CC0-1.0
    SPDXID: SPDXRef-DOCUMENT
    DocumentName: u2
    DocumentNamespace: http://spdx.org/spdxdocs/spdx-v2.1-0ed6ddb2-edbd-4664-8b7e-029432c8e421
    Creator: Person: Anonymous ()
    Creator: Organization: Anonymous ()
    Creator: Tool: reuse-0.14.0
    Created: 2022-01-26T10:42:59Z
    CreatorComment: <text>This document was created automatically using
    available reuse information consistent with REUSE.</text>
    Relationship: SPDXRef-DOCUMENT describes SPDXRef-3c8056cd1f4f60322830f1e79d55ea13

    FileName: ./update_copyright_years.py
    SPDXID: SPDXRef-3c8056cd1f4f60322830f1e79d55ea13
    FileChecksum: SHA1: 65fc75079eb9d85953b39c6fb832e86c7b7e113a
    LicenseConcluded: NOASSERTION
    LicenseInfoInFile: MIT
    FileCopyrightText: <text>SPDX-FileCopyrightText: 2022 Stephan Lachnit <stephanlachnit@debian.org></text>
    """
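
    For anyone who wants to experiment with such a document, the following
    is a minimal sketch of a tag/value reader. It only handles what appears
    in the example above (one "Tag: value" per line, multi-line
    <text>...</text> values, and file sections that start at "FileName");
    the function name `parse_tag_value` is an assumption for this example
    and the sketch does not cover the full SPDX specification.

```python
def parse_tag_value(document):
    """Split an SPDX tag/value document into (header, files)."""
    header, files = {}, []
    current = header
    lines = iter(document.splitlines())
    for line in lines:
        line = line.strip()
        if not line or ":" not in line:
            continue
        tag, value = line.split(":", 1)
        value = value.strip()
        # <text>...</text> values may continue over several lines.
        if value.startswith("<text>"):
            while "</text>" not in value:
                value += "\n" + next(lines, "</text>").strip()
            value = value[len("<text>"):value.index("</text>")]
        # Each "FileName" tag opens a new per-file section.
        if tag == "FileName":
            current = {}
            files.append(current)
        # Tags such as "Creator" may repeat, so every value is a list.
        current.setdefault(tag, []).append(value)
    return header, files


SAMPLE = """\
SPDXVersion: SPDX-2.1
Creator: Tool: reuse-0.14.0
CreatorComment: <text>This document was created automatically using
available reuse information consistent with REUSE.</text>

FileName: ./update_copyright_years.py
FileChecksum: SHA1: 65fc75079eb9d85953b39c6fb832e86c7b7e113a
LicenseInfoInFile: MIT
FileCopyrightText: <text>SPDX-FileCopyrightText: 2022 Stephan Lachnit <stephanlachnit@debian.org></text>
"""

header, files = parse_tag_value(SAMPLE)
for entry in files:
    print(entry["FileName"][0], entry["LicenseInfoInFile"])
```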

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Max Mehl@21:1/5 to All on Wed Jan 26 14:30:01 2022
    Thank you Stephan for bringing REUSE into the discussion and Cc'ing me
    (I am not part of this list). Please let me just add two small things to
    your otherwise encompassing mail.

    ~ Stephan Lachnit [2022-01-26 12:49 +0100]:
    - What is REUSE?

    If you have ~15 min to spare and prefer video intros to text, this
    recording of a recent talk could be for you [^1]. Otherwise, please
    check out the official website [^2] that also offers a tutorial that
    should get you up to speed.

    Note that I don't want DEP-5 to go away - it is unlikely that every
    project will follow the REUSE spec and writing an SPDX document by
    hand has no significant advantages over DEP-5. Besides, using the file-exclusion function in DEP-5 for uscan is quite useful for ds/dfsg packages (although that could also be moved to an external file).

    FWIW, as you may have already noticed, REUSE makes use of DEP-5 as well,
    as one (and honestly the least preferred) of the three ways you can
    label your files. We have a better file-based format in the works [^3],
    and would probably also provide a converter from DEP-5 to this new
    REUSE.yaml format.

    That would mean that the REUSE helper tool in the future could take
    DEP-5 files, convert them to the modern format, and run a lint to check
    whether everything is fine – and if you want, also generate an SBOM.

    But already now, a DEP-5 file could be provided to REUSE. One would have
    to check whether the ones Debian provides would work in the default
    location for DEP-5 files in REUSE (`.reuse/dep5`). If not, I suspect
    there would be no large changes needed.

    With this, I just would like to emphasise that Debian's extra care about
    proper licensing is a great plus and comes in handy if you were to
    streamline and extend it by widely supported best practices like REUSE.
    As Stephan said, I'd be thrilled to work together with you to make
    licensing and copyright in Debian and ideally also upstream easier and
    more understandable for users and developers.

    Best,
    Max


    [^1]: https://www.sfscon.it/talks/reuse/

    [^2]: https://reuse.software/

    [^3]: https://github.com/fsfe/reuse-docs/issues/81

    --
    Max Mehl - Programme Manager - Free Software Foundation Europe
    Contact and information: https://fsfe.org/about/mehl | @mxmehl
    Become a supporter of software freedom: https://fsfe.org/join

  • From Max Mehl@21:1/5 to All on Wed Jan 26 16:00:02 2022
    ~ Stephan Lachnit [2022-01-26 15:20 +0100]:
    But already now, a DEP-5 file could be provided to REUSE. One would have
    to check whether the ones Debian provides would work in the default
    location for DEP-5 files in REUSE (`.reuse/dep5`). If not, I suspect
    there would be no large changes needed.

    Probably too technical at this stage, but a conversion tool in
    combination with the yaml format could actually be quite useful.
    E.g. one could have a debian/REUSE.yaml sub-file for the copyright
    information of the package build files and a debian/REUSE-source.yaml
    file in case the source does not follow the REUSE spec. If the
    reuse-tool would have an option to specify a different file for the
    root REUSE.yaml, we could actually use it for all packages with
    relatively low migration work on the maintainer side.

    Perhaps it's really too early, but hey. The current idea for REUSE.yaml
    is that, unlike the DEP-5 implementation, you can have multiple of
    these files in your project. Each file can describe files relative to
    it, but not in parent directories.

    Consider you have a sub-directory with binary files you'd like to mark,
    e.g. icons you also re-use in other projects. You could create a
    REUSE.yaml file in the same directory describing them. Whenever you
    copy the directory into another project, the licensing/copyright info
    is retained.

    This is the reason why it should not describe parent directories.
    Therefore, such a file in /debian could only cover files in this
    directory. But I understand that in this special scenario it would be
    useful. I could think of ways in which either Debian or REUSE could
    introduce some hacks around it, but before we elaborate on this I think
    there are other
    more fundamental decisions Debian would have to make first ;)
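
    The scoping rule described above (a file may describe paths next to or
    below it, never in parent directories) can be sketched as a simple path
    check. The function name `covers` is an assumption for this example,
    paths are assumed to be normalized and relative to the project root,
    and the REUSE.yaml format itself is still only a proposal, so this
    models the rule and nothing more.

```python
from pathlib import PurePosixPath


def covers(metadata_file, target):
    """True if `target` lies in the directory of `metadata_file` or below it."""
    base = PurePosixPath(metadata_file).parent
    return base in PurePosixPath(target).parents


# A file in debian/ can describe the packaging, but not the upstream source.
print(covers("debian/REUSE.yaml", "debian/rules"))
print(covers("debian/REUSE.yaml", "src/main.c"))
# A copied icons/ directory keeps its information wherever it ends up.
print(covers("icons/REUSE.yaml", "icons/app/logo.png"))
```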

    Best,
    Max

  • From Stephan Lachnit@21:1/5 to max.mehl@fsfe.org on Wed Jan 26 15:30:02 2022
    On Wed, Jan 26, 2022 at 1:59 PM Max Mehl <max.mehl@fsfe.org> wrote:

    FWIW, as you may have already noticed, REUSE makes use of DEP-5 as well,
    as one (and honestly the least preferred) of the three ways how you can
    label your files. We have a better file-based format in the works [^3],
    and would probably also provide a converter from DEP-5 to this new
    REUSE.yaml format.

    That would mean that the REUSE helper tool in the future could take
    DEP-5 files, convert them to the modern format, and run a lint to check
    whether everything is fine – and if you want, also generate an SBOM.

    But already now, a DEP-5 file could be provided to REUSE. One would have
    to check whether the ones Debian provides would work in the default
    location for DEP-5 files in REUSE (`.reuse/dep5`). If not, I suspect
    there would be no large changes needed.

    Probably too technical at this stage, but a conversion tool in
    combination with the yaml format could actually be quite useful.
    E.g. one could have a debian/REUSE.yaml sub-file for the copyright
    information of the package build files and a debian/REUSE-source.yaml
    file in case the source does not follow the REUSE spec. If the
    reuse-tool would have an option to specify a different file for the
    root REUSE.yaml, we could actually use it for all packages with
    relatively low migration work on the maintainer side.

    Regards,
    Stephan

  • From Phil Morrell@21:1/5 to Stephan Lachnit on Thu Jan 27 00:50:02 2022
    https://matija.suklje.name/how-and-why-to-properly-write-copyright-statements-in-your-code

    TLDR: I think REUSE.software is a bad idea that is worse than what
    Debian already invented with Machine-readable debian/copyright file. I
    guess if upstream uses it, there's no reason not to ignore that as a
    source of copyright assertions.

    On Wed, Jan 26, 2022 at 12:49:34PM +0100, Stephan Lachnit wrote:
    It is straightforward to implement, add (e.g.)
    "SPDX-FileCopyrightText: 2019 Jane Doe <jane@example.com>" and
    "SPDX-License-Identifier: GPL-3.0-or-later" as comments to your source
    file's header and you are done.

    I *am* a big fan of SPDX-License-Identifier, but the above being
    straightforward is only true for the most trivial of examples. REUSE
    advocates sprinkling .license files around your repo for e.g. logos
    and other binaries. Same story with multiple authors: they recommend
    using multiple FileCopyrightText tags initially, then splitting them out
    to a separate AUTHORS file and using something like "Project X contributors".

    https://reuse.software/tutorial/#binary-and-uncommentable-files

    Ultimately, when everything becomes too much, REUSE falls back to
    recommending Debian's copyright format anyway! So even if upstream sees
    the value in taking some copyright busywork off our hands, why not
    suggest they just use it in the first place in e.g. the LICENSE file.

    https://reuse.software/faq/#bulk-license

    My idea is to allow SPDX documents in addition to DEP-5.

    Firstly, I didn't think it was called DEP-5 anymore - it was accepted
    into policy in 2012 as "copyright-format" titled "Machine-readable debian/copyright file", so no longer a proposal for enhancement. This
    would be a minor pedantic point (a colloquialism) except for the fact
    that REUSE encourages it as part of their interface: `.reuse/dep5`.

    Note that I don't want DEP-5 to go away - it is unlikely that every
    project will follow the REUSE spec and writing an SPDX document by
    hand has no significant advantages over DEP-5. Besides, using the file-exclusion function in DEP-5 for uscan is quite useful for ds/dfsg packages (although that could also be moved to an external file).

    I think this undermines your previous point about it being less prone to
    failure - if we could trust upstream assertions on copyright, the NEW
    review wouldn't be a problem in the first place. The point about uscan
    is interesting, since if upstream does take on the hard work of license
    verification such that packaging is just checking, then they're unlikely
    to have any Files-Excluded, so that would have to be merged somehow.

  • From Russ Allbery@21:1/5 to Phil Morrell on Thu Jan 27 01:10:02 2022
    Phil Morrell <debian@emorrp1.name> writes:

    The point about uscan is interesting, since if upstream does take on the
    hard work of license verification such that packaging is just checking,
    then they're unlikely to have any Files-Excluded, so that would have to
    be merged somehow.

    My intuition (I admit that I haven't done a survey) is that Files-Excluded
    is less frequently used for cases where upstream has not done license verification and is more frequently used for cases where upstream
    disagrees with Debian over what is and is not free software. (IETF RFCs
    are a sadly common example.)

    --
    Russ Allbery (rra@debian.org) <https://www.eyrie.org/~eagle/>

  • From nick black@21:1/5 to All on Thu Jan 27 01:30:02 2022
    Russ Allbery left as an exercise for the reader:
    My intuition (I admit that I haven't done a survey) is that Files-Excluded
    is less frequently used for cases where upstream has not done license verification and is more frequently used for cases where upstream
    disagrees with Debian over what is and is not free software. (IETF RFCs
    are a sadly common example.)

    just as a personal example, i've got a fairly elaborate
    Files-Excluded [0] for freely-distributed but DFSG-incompatible
    media included with my upstream tarballs. doing so was easier
    than trying to recreate e.g. jpegs as GIMP xcfs.

    [0] https://salsa.debian.org/debian/notcurses/-/blob/master/debian/copyright#L9

    --
    nick black -=- https://www.nick-black.com
    to make an apple pie from scratch,
    you need first invent a universe.

  • From Stephan Lachnit@21:1/5 to debian@emorrp1.name on Thu Jan 27 11:30:02 2022
    On Thu, Jan 27, 2022 at 12:39 AM Phil Morrell <debian@emorrp1.name> wrote:

    TLDR: I think REUSE.software is a bad idea that is worse than what
    Debian already invented with Machine-readable debian/copyright file. I
    guess if upstream uses it, there's no reason not to ignore that as a
    source of copyright assertions.

    I expected some concerns about the complexity of the SPDX document,
    but certainly not about standardized copyright information in source
    files.

    Yes, Debian may have invented the machine-readable copyright bill, but
    not machine-readable copyright information in source files. This is
    what REUSE is all about, and it greatly reduces manual labor - I don't understand how this can be seen as bad.

    On Thu, Jan 27, 2022 at 12:39 AM Phil Morrell <debian@emorrp1.name> wrote:

    I *am* a big fan of SPDX-License-Identifier, but the above being straightforward is only true for the most trivial of examples. REUSE
    advocate for sprinkling .license files around your repo for e.g. logos
    and other binaries. Same story with multiple authors, they recommend
    using multiple FileCopyrightText's initially, then split it out to a
    separate AUTHORS file and use something like "Project X contributors".

    No, it does not only work for trivial examples. Take any project with
    a significant amount of code, e.g. [1], and most of the time you will
    find that every source file has the copyright information in the
    header. The problem is, there has been no standardized way to parse
    them. That's why we have tools like licensecheck that try to find it
    out. With REUSE, it gets much much easier.

    Wrt the .license files: yes they're ugly, but still better than no
    automation at all. With the new yaml spec, I suspect that these will
    go away.
    Wrt multiple authors: this is not the fault of REUSE, but just how
    copyright works.

    Ultimately, when everything becomes too much, REUSE falls back to recommending Debian's copyright format anyway! So even if upstream sees
    the value in taking some copyright busywork off our hands, why not
    suggest they just use it in the first place in e.g. the LICENSE file.

    Sigh, yes, because Debian's format is afaik the only standardized,
    easy to parse format out there. But the reason why it is there is
    *not* for "when everything becomes too much", but for files that you
    cannot or don't want to alter. For example, if you regularly import
    3rd-party code that does not follow REUSE and you don't want to edit
    the header all the time. Note that if everyone used REUSE, that would not be
    a problem. Another example is when you have tiny example code or
    configs that you want to present to a user, but without any
    distracting comments (think beginner tutorials).

    However, they want to switch from DEP-5 to a more flexible (i.e.
    non-central, relocatable) spec [2]. And there is good reason to do so:
    for example we as Debian can specify the copyright information from
    our packaging separate from the upstream code, without conflict. DEP-5
    does not allow that.

    Firstly, I didn't think it was called DEP-5 anymore - it was accepted
    into policy in 2012 as "copyright-format" titled "Machine-readable debian/copyright file", so no longer a proposal for enhancement. This
    would be a minor pedantic point (a colloquialism) except for the fact
    that REUSE encourages it as part of their interface: `.reuse/dep5`.

    Yes it is called "Machine-readable debian/copyright file Version 1.0",
    but everybody knows it _is_ DEP-5, it is even in the spec in the
    second sentence of the abstract. The spec _is_ still DEP-5, being
    accepted doesn't change that.

    I think this undermines your previous point about it being less prone to failure - if we could trust upstream assertions on copyright, the NEW
    review wouldn't be a problem in the first place.

    I strongly disagree. First of all, upstream knows way better where
    they copy the code from than packagers do. And projects that use REUSE
    are more likely to write that down somewhere than your average NPM
    package that puts an "under MIT license" in the readme and copies
    minified code from everywhere.
    And as a second point, if you write a debian/copyright, you are most
    likely to trust what is in the header, and I suspect the copyright
    review in NEW is no different in this regard. I mean how can one
    even know if the copyright information is wrong?
    Yes there are cases where copyright information is missing and one can
    try to search for it, I've done this more than once, but if a project uses
    REUSE headers, this doesn't happen.

    Regards,
    Stephan

    [1] https://gitlab.cern.ch/geant4/geant4
    [2] https://github.com/fsfe/reuse-docs/issues/81

  • From Phil Morrell@21:1/5 to Stephan Lachnit on Fri Jan 28 09:50:01 2022
    On Thu, Jan 27, 2022 at 11:27:45AM +0100, Stephan Lachnit wrote:
    On Thu, Jan 27, 2022 at 12:39 AM Phil Morrell <debian@emorrp1.name> wrote:

    TLDR: I think REUSE.software is a bad idea that is worse than what
    Debian already invented with Machine-readable debian/copyright file. I guess if upstream uses it, there's no reason not to ignore that as a
    source of copyright assertions.

    I expected some concerns about the complexity of the SPDX document,
    but certainly not about standardized copyright information in source
    files.

    Yes, Debian may have invented the machine-readable copyright bill, but
    not machine-readable copyright information in source files.

    Erm, no that's not what I'm saying? I'll requote my agreement with

    I *am* a big fan of SPDX-License-Identifier

    I will admit I'm somewhat skeptical about how often file-level copies
    happen these days, rather than folder-level or whole project forks. But
    it's easy enough to convince people to slap a single-line license
    comment in to avoid ambiguity.

    what REUSE is all about, and it greatly reduces manual labor - I don't understand how this can be seen as bad.

    Because being REUSE-compliant IMO greatly *increases* manual labor as
    soon as you're dealing with non-text forms, multiple authors and
    aggregation of differing copyright assertions. These are all things that
    the debian copyright-format has already solved without (as much) manual busywork, so if upstream is agreeable to formally documenting their
    copyrights, I'd much rather they just used that format in LICENSE.

    Firstly, I didn't think it was called DEP-5 anymore - it was accepted
    into policy in 2012 as "copyright-format" titled "Machine-readable debian/copyright file", so no longer a proposal for enhancement. This
    would be a minor pedantic point (a colloquialism) except for the fact
    that REUSE encourages it as part of their interface: `.reuse/dep5`.

    Yes it is called "Machine-readable debian/copyright file Version 1.0",
    but everybody knows it _is_ DEP-5, it is even in the spec in the
    second sentence of the abstract.

    Sure, and that's fine as a colloquialism, but you haven't addressed my objection to REUSE formalising that as part of the spec.

    The spec _is_ still DEP-5, being accepted doesn't change that.

    Sure it does, compare `#files-field` in both specs, from 2019 policy
    upgrading checklist 4.4.1. Perhaps that change should have bumped a
    version number, but it's a bit late now.

    I think this undermines your previous point about it being less prone to failure - if we could trust upstream assertions on copyright, the NEW review wouldn't be a problem in the first place.

    I strongly disagree. First of all, upstream knows way better where
    they copy the code from than packagers do.
    ...
    And as a second point, if you write a debian/copyright, you are most
    likely to trust what is in the header, and I suspect the copyright
    review in NEW is not different from this regard. I mean how can one
    even know if the copyright information is wrong?
    Yes there are cases where copyright information is missing and one can
    try to search it, I've done this not just once, but if a project uses
    REUSE headers, this doesn't happen.

    That has not been my experience for projects without a long history:
    they tend to not care about copyright initially, slap a generic
    assertion on it at some point, but without noticing they've included
    e.g. an embedded copy of zlib or, less formally, used an image with
    vague gratis-use terms but no formal license.

    It's only later, or under ITP scrutiny, that some confusion
    over pedigree comes to light, someone fires off an email to an early
    contributor and gets the accurate license information. Or for Debian,
    the path gets added to Files-Excluded and patched out of compilation.

    And projects that use REUSE
    are more likely to write that somewhere down as your average NPM
    package that puts a "under MIT license" in the readme and copies
    minified code from everywhere.

    Sure, but instead of wasting my time encouraging upstream to become
    REUSE-compliant, I would much rather promote a better standard like
    Debian's. I was curious and found approximately 40 instances of REUSE in
    codesearch, but multiple thousands of the (optional) copyright-format.

  • From Stephan Lachnit@21:1/5 to debian@emorrp1.name on Sat Jan 29 11:40:01 2022
    On Fri, Jan 28, 2022 at 9:42 AM Phil Morrell <debian@emorrp1.name> wrote:

    On Thu, Jan 27, 2022 at 11:27:45AM +0100, Stephan Lachnit wrote:
    On Thu, Jan 27, 2022 at 12:39 AM Phil Morrell <debian@emorrp1.name> wrote:

    TLDR: I think REUSE.software is a bad idea that is worse than what
    Debian already invented with Machine-readable debian/copyright file. I guess if upstream uses it, there's no reason not to ignore that as a source of copyright assertions.

    I expected some concerns about the complexity of the SPDX document,
    but certainly not about standardized copyright information in source
    files.

    Yes, Debian may have invented the machine-readable copyright bill, but
    not machine-readable copyright information in source files.

    Erm, no that's not what I'm saying? I'll requote my agreement with

    I *am* a big fan of SPDX-License-Identifier

    Yes I saw that line, but you also wrote
    TLDR: I think REUSE.software is a bad idea

    I apologize for the misunderstanding. Maybe next time write something
    like "While I am a big fan of copyright information in source files, I
    find certain aspects of REUSE bad".
    There may be valid concerns about the spec, but they are not really
    relevant to my proposal: it is mainly about allowing a
    different copyright format than [DEP-5 style], which _can_ be created
    automatically via REUSE.

    I will admit I'm somewhat skeptical in how often file-level copies
    happen these days, rather than folder-level or whole project forks. But
    it's easy enough to convince people to slap a single-line license
    comment in to avoid ambiguity.

    Obviously we as Debian are not big fans of file-level copies anyway,
    but let's just say that REUSE wasn't written just for Debian. There
    are enough industry projects that use tons of imported code whether we
    like it or not, and that is certainly better with standardized copyright
    information than without.

    what REUSE is all about, and it greatly reduces manual labor - I don't understand how this can be seen as bad.

    Because being REUSE-compliant IMO greatly *increases* manual labor as
    soon as you're dealing with non-text forms, multiple authors and
    aggregation of differing copyright assertions. These are all things that
    the debian copyright-format has already solved without (as much) manual busywork, so if upstream is agreeable to formally documenting their copyrights, I'd much rather they just used that format in LICENSE.

    But it does not increase the manual labor for us! It actually
    decreases our work, that's what this is all about!

    The main point of my proposal: we, as package maintainers, don't have
    to do the bulk work anymore, upstream does it. We can just use this
    information which we would have done by hand otherwise. This is not
    about pushing REUSE to upstream projects from our side at all, but
    rather using it downstream to decrease manual labor if it already
    exists upstream.

    Yes it is called "Machine-readable debian/copyright file Version 1.0",
    but everybody knows it _is_ DEP-5, it is even in the spec in the
    second sentence of the abstract.

    Sure, and that's fine as a colloquialism, but you haven't addressed my objection to REUSE formalising that as part of the spec.

    If you look at [1]:
    Definitions
    [...]
    DEP5 — Machine-readable debian/copyright file, Version 1.0. Where the REUSE Specification and DEP5 state different things, the REUSE Specification takes precedence. Specifically in the case of the Copyright and License tags.

    And they link to the proper spec, so it is nothing but an abbreviation.

    The spec _is_ still DEP-5, being accepted doesn't change that.

    Sure it does, compare `#files-field` in both specs, from 2019 policy upgrading checklist 4.4.1. Perhaps that change should have bumped a
    version number, but it's a bit late now.

    Oh, thanks, I didn't know that!

    That has not been my experience for projects without a long history,
    they tend to not care about copyright initially, slap a generic
    assertion on it at some point, but without noticing they've included
    e.g. an embedded copy of zlib or less formally - used an image with a
    vague gratis use but omitting a formal license.

    It's only either later, or from the ITP scrutiny that some confusion
    over pedigree comes to light, someone fires off an email to an early contributor and gets the accurate license information. Or for Debian,
    the path gets added to Files-Excluded and patched out of compilation.

    Yes, surely copyright assertion mistakes happen from time to time. But
    these can happen everywhere, whether they slap a generic assertion on
    it or not. Just using the information REUSE provides doesn't mean that
    the code is free from any review, just from the tedious copyright
    review. If one detects an embedded copy of zlib, or really any other
    embedded code, this needs to be addressed anyway. Detecting these has
    nothing to do with any automated copyright review tools, but rather if
    a maintainer can actually detect the code.

    Maybe I should clarify what I mean by automated: I want to automate
    the process of creating and updating d/copyright, as well as the
    review in NEW.
    I consider making sure that the source code actually uses REUSE
    correctly still a duty of the maintainer. If you think we can't trust
    our maintainers enough, I'm open to discussing the idea that new source
    packages still need a manual copyright review in NEW, which would be
    mostly equivalent to the current situation except that updates to
    debian/copyright can be enforced even if there is no new binary (which
    currently is not the case btw).

    And projects that use REUSE
    are more likely to write that down somewhere than your average NPM
    package that puts an "under MIT license" in the readme and copies
    minified code from everywhere.

    Sure, but instead of wasting my time encouraging upstream to become REUSE-compliant, I would much rather promote a better standard like
    Debian's. I was curious and found approximately 40 instances of REUSE in codesearch, but multiple thousands of the (optional) copyright-format.

    First of all I don't want to force any maintainer to promote REUSE
    upstream - this is entirely up to the individual. I also don't want to
    force maintainers to use REUSE if supported upstream, if they want
    they could still do it with [DEP-5]. Again, this proposal is for an
    *alternative* to the current way. I don't see the need to deprecate
    our current system.

    Second of all - feel free to promote Debian's spec! I won't stop or
    even discourage you or anyone else from implementing it - any standard is
    better than the "mess" we have right now. Even REUSE was influenced by
    it. But the [DEP-5] spec is clearly designed to be included in
    Debian's packaging, and less so for outside use.

    Quote from the spec [2]:
    Establishes a standard, machine-readable format for debian/copyright files within Debian packages

    Imagine an upstream project using [DEP-5]. Afaik there is no tool that
    can merge it with the copyright information for the packaging in the
    debian/ folder, so that we can actually use it for automation of
    upstream copyright information. Yes one can copy and paste, which is
    better than nothing, but worse than what REUSE aims to offer. Clearly,
    the specs have a different scope: REUSE for upstream use, [DEP-5] for downstream use.


    To stress it again: This is *not* about deprecating the current
    [DEP-5] spec, whether REUSE is a good spec, if there could be a better
    spec or if upstream projects should use it. This *is* about using the information REUSE provides if supported upstream as a way to
    automatically create copyright information. In particular, the
    underlying question is if we want to allow the *SPDX* standard (not
    REUSE) as an alternative way for developers to provide
    debian/copyright. REUSE naturally appears in this process as it allows
    such SPDX documents to be created automatically, but REUSE itself has no
    direct impact on usage of the SPDX spec in debian/copyright.

    Regards,
    Stephan


    [1] https://reuse.software/spec/
    [2] https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Jonas Smedegaard@21:1/5 to All on Tue Feb 8 16:40:02 2022
    Quoting Stephan Lachnit (2022-01-26 12:49:34)
    - What is an SPDX bill of materials?
    It is a machine-readable format that specifies the licenses of each
    file in tag/value style like DEP-5. However compared to DEP-5 it is
    much less human readable, i.e. it includes much more meta information,
    and does not contain the license texts.

    - What has this to do with Debian?
    My idea is to allow SPDX documents in addition to DEP-5. The advantage
    is that - if supported upstream - REUSE can generate such reports automatically during package build time, so there is no need to write d/copyright manually anymore.

    I am sceptical towards this proposal.

    An important feature to me with current machine-readable format is that
    really it is machine-and-human-readable.

    Another important feature to me is that there is only one format (in
    addition to unformatted content, which hopefully we can put past us at
    some point).

    Today, I can as DD help proof-read and change *any* package in Debian.

    If we permit a debian/copyright format that is not human-readable, it
    means that I cannot confidently proof-read and change the contents of
    the debian subdir without the help of machine-parsers, and I would need
    to know two formats with different goals.

    I would like to instead welcome the REUSE developers in helping Debian
    evolve next version of the existing machine-readable format to better
    align with SPDX.


    - Jonas

    --
    * Jonas Smedegaard - idealist & Internet-arkitekt
    * Tlf.: +45 40843136 Website: http://dr.jones.dk/

    [x] quote me freely [ ] ask before reusing [ ] keep private
  • From Scott Kitterman@21:1/5 to All on Tue Feb 8 17:10:01 2022
    On Tuesday, February 8, 2022 10:39:36 AM EST Jonas Smedegaard wrote:
    Quoting Stephan Lachnit (2022-01-26 12:49:34)

    - What is an SPDX bill of materials?
    It is a machine-readable format that specifies the licenses of each
    file in tag/value style like DEP-5. However compared to DEP-5 it is
    much less human readable, i.e. it includes much more meta information,
    and does not contain the license texts.

    - What has this to do with Debian?
    My idea is to allow SPDX documents in addition to DEP-5. The advantage
    is that - if supported upstream - REUSE can generate such reports automatically during package build time, so there is no need to write d/copyright manually anymore.

    I am sceptical towards this proposal.

    An important feature to me with current machine-readable format is that really it is machine-and-human-readable.

    Another important feature to me is that there is only one format (in
    addition to unformatted content, which hopefully we can put past us at
    some point).

    Today, I can as DD help proof-read and change *any* package in Debian.

    If we permit a debian/copyright format that is not human-readable, it
    means that I cannot confidently proof-read and change the contents of
    the debian subdir without the help of machine-parsers, and I would need
    to know two formats with different goals.

    I would like to instead welcome the REUSE developers in helping Debian
    evolve next version of the existing machine-readable format to better
    align with SPDX.

    Since Debian policy requires verbatim copies of licenses (or links to
    /usr/share/common-licenses), I think any policy compliant debian/copyright will
    have to be human readable, but I'm not that familiar with SPDX, so maybe it will surprise me.

    It would be good to understand how this proposal supports Debian Policy.

    Scott K


  • From Stephan Lachnit@21:1/5 to jonas@jones.dk on Tue Feb 8 19:00:02 2022
    Hi Jonas,

    On Tue, Feb 8, 2022 at 4:39 PM Jonas Smedegaard <jonas@jones.dk> wrote:

    I am sceptical towards this proposal.

    An important feature to me with current machine-readable format is that really it is machine-and-human-readable.

    Thank you for your input! I'm aware of this concern, however I think
    it is not something that can't be solved.

    For one, while not as trivial to understand as the current machine-readable
    copyright, it's still "human-readable" (i.e. a tag:value style text
    file). I would make the following comparison: if you only know Python
    (DEP-5), C++ (SPDX) might look a bit weird, but you can get the gist
    of it.

    However, I also think the human-readable aspect is less important here
    because it is an output format. What I mean with this is that the
    information is already there in a human readable way: either via REUSE
    or in the file headers directly. While it is theoretically possible to
    write SPDX documents by hand, I would not treat them with the same
    trust as one created by REUSE.
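
    As an illustration of how the information in file headers can be read by a
    machine, here is a minimal sketch. It assumes the two tag names quoted in
    the first mail of this thread; the function name `scan_header`, the
    30-line limit and the sample file are made up for the example, and this
    is not how the `reuse` tool itself is implemented.

```python
import re

# Tag names as quoted in the first mail of this thread.
TAG_RE = re.compile(r"SPDX-(FileCopyrightText|License-Identifier):\s*(.+?)\s*$")


def scan_header(text, max_lines=30):
    """Collect copyright and license tags from the first lines of a file.

    max_lines is an arbitrary cut-off chosen for this sketch.
    """
    found = {"copyright": [], "license": []}
    for line in text.splitlines()[:max_lines]:
        match = TAG_RE.search(line)
        if match is None:
            continue
        key = "copyright" if match.group(1) == "FileCopyrightText" else "license"
        value = match.group(2)
        # Drop a trailing block-comment closer such as "*/" if present.
        if value.endswith("*/"):
            value = value[:-2].rstrip()
        found[key].append(value)
    return found


# Hypothetical file content, written by hand for this example.
sample = """\
# SPDX-FileCopyrightText: 2019 Jane Doe <jane@example.com>
# SPDX-License-Identifier: GPL-3.0-or-later
print("hello")
"""

print(scan_header(sample))
```

    The sketch only finds tags that are present; it cannot tell whether a file
    without tags contains embedded third-party code, which is the part that
    stays with the maintainer.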

    Another important feature to me is that there is only one format (in
    addition to unformatted content, which hopefully we can put past us at
    some point).

    Today, I can as DD help proof-read and change *any* package in Debian.

    Regarding reviews: I plan to write an SPDX-to-DEP5 converter anyway to
    get a better feel for the spec. I will probably also write a copyright
    review tool that will show you the copyright header of each file based
    on DEP5 or SPDX information for validation / manual review. This will
    make proof-reading copyright information much easier.
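
    To give an idea of what such a converter involves, here is a rough sketch,
    not the planned tool. It assumes an SPDX tag/value document whose file
    entries use the `FileName`, `LicenseInfoInFile` and `FileCopyrightText`
    tags, with single-line `<text>...</text>` values. The function names and
    the sample document are made up, and SPDX license identifiers are copied
    through unchanged rather than mapped to Debian short names.

```python
import re
from collections import defaultdict

FIELD_RE = re.compile(r"^(\w+):\s*(.*)$")


def parse_spdx_files(text):
    """Return one dict per file entry found in an SPDX tag/value document."""
    entries, current = [], None
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match is None:
            continue
        tag, value = match.groups()
        if tag == "FileName":
            current = {"name": value, "license": "", "copyright": []}
            entries.append(current)
        elif current is None:
            continue  # document-level tags before the first file entry
        elif tag == "LicenseInfoInFile" and not current["license"]:
            # Simplification: only the first license tag per file is kept.
            current["license"] = value
        elif tag == "FileCopyrightText":
            # Single-line <text>...</text> values only; multi-line is not handled.
            current["copyright"].append(re.sub(r"</?text>", "", value).strip())
    return entries


def to_dep5(entries):
    """Group files by (license, copyright) and emit DEP-5 style stanzas."""
    groups = defaultdict(list)
    for entry in entries:
        name = entry["name"]
        if name.startswith("./"):
            name = name[2:]
        groups[(entry["license"], tuple(entry["copyright"]))].append(name)
    stanzas = [
        "Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/"
    ]
    for (license_id, holders), names in groups.items():
        stanzas.append("\n".join([
            "Files: " + " ".join(sorted(names)),
            "Copyright: " + "\n           ".join(holders),
            "License: " + license_id,
        ]))
    return "\n\n".join(stanzas) + "\n"


# Hand-written sample, not actual output of the reuse tool.
sample = """\
SPDXVersion: SPDX-2.1
FileName: ./src/main.c
FileChecksum: SHA1: da39a3ee5e6b4b0d3255bfef95601890afd80709
LicenseInfoInFile: GPL-3.0-or-later
FileCopyrightText: <text>2019 Jane Doe <jane@example.com></text>
FileName: ./src/util.c
FileChecksum: SHA1: da39a3ee5e6b4b0d3255bfef95601890afd80709
LicenseInfoInFile: GPL-3.0-or-later
FileCopyrightText: <text>2019 Jane Doe <jane@example.com></text>
"""

print(to_dep5(parse_spdx_files(sample)))
```

    Both sample files end up in a single Files stanza because they share
    license and copyright holder. The verbatim license texts that Policy
    requires are not produced by this sketch and would still have to be
    added as standalone License paragraphs.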

    But to stress this again: the goal is to *replace* the manual
    copyright reviews by something much better: automatic copyright
    reviews. There are three areas of interest for copyright information:
    a) for developers writing it b) for the user receiving it and c) the
    legal side.

    Regarding a: By hand DEP5 is better, but for automation SPDX is equally good.
    Regarding b: I think they don't care anyway. Which user really reads
    debian/copyright? If at all, you are interested in the
    copyright of a certain library you wish to use, but this doesn't
    require the extensive file-by-file information of DEP5. Most likely
    the documentation provides much clearer information.
    Regarding c: SPDX is as good as DEP5 if not even better due to file hashes.

    If we permit a debian/copyright format that is not human-readable, it
    means that I cannot confidently proof-read and change the contents of
    the debian subdir without the help of machine-parsers, and I would need
    to know two formats with different goals.

    I don't see the problem with machine parsers. We already use a lot of
    different tools for our processes (git, dput, dpkg, debhelper,
    lintian, uscan, a mail program, a text editor, ...), adding one more
    shouldn't be a big deal. It needs to be provided of course, but I plan
    to do that.

    I would like to instead welcome the REUSE developers in helping Debian
    evolve next version of the existing machine-readable format to better
    align with SPDX.

    While this would be nice, I think this is just unrealistic. While I
    may implement DEP5 output to REUSE, I still want to use SPDX because
    it is already an existing industry standard having all the "features"
    we want. Adding things like file hashes and referencing / merging
    other DEP5 documents is certainly possible, but it would make the format
    less readable and in the end it would just be SPDX with a different look.


    On Tue, Feb 8, 2022 at 5:00 PM Scott Kitterman <debian@kitterman.com> wrote:

    Since Debian policy requires verbatim copies of licenses (or links to
    /usr/share/common-licenses), I think any policy compliant debian/copyright will
    have to be human readable, but I'm not that familiar with SPDX, so maybe it will surprise me.

    You can find an example in my initial mail [1].

    It would be good to understand how this proposal supports Debian Policy.

    It would require a minor change: putting the verbatim license texts in
    a single file is not possible anymore. But I don't see why just copying
    the licenses to "/usr/share/doc/PACKAGE/licenses/LICENSE" in addition
    to the SPDX formatted debian/copyright would be any worse than the
    current way.


    Regards,
    Stephan

    [1] https://lists.debian.org/debian-devel/2022/01/msg00309.html

  • From Scott Kitterman@21:1/5 to All on Tue Feb 8 19:20:01 2022
    On Tuesday, February 8, 2022 12:53:22 PM EST Stephan Lachnit wrote:
    On Tue, Feb 8, 2022 at 5:00 PM Scott Kitterman <debian@kitterman.com> wrote:
    Since Debian policy requires verbatim copies of licenses (or links to
    /usr/share/common-licenses), I think any policy compliant debian/copyright
    will have to be human readable, but I'm not that familiar with SPDX, so
    maybe it will surprise me.

    You can find an example in my initial mail [1].

    It would be good to understand how this proposal supports Debian Policy.

    It would require a minor change: putting the verbatim license texts in
    a single file is not possible anymore. But I don't see why just copying
    the licenses to "/usr/share/doc/PACKAGE/licenses/LICENSE" in addition
    to the SPDX formatted debian/copyright would be any worse than the
    current way.


    Regards,
    Stephan

    [1] https://lists.debian.org/debian-devel/2022/01/msg00309.html

    Personally, I don't view that as a minor change.

    I think before starting a DEP on this you ought to work out the policy implications. Currently any package using your proposed approach would be instantly RC buggy.

    Scott K

  • From Stephan Lachnit@21:1/5 to debian@kitterman.com on Tue Feb 8 19:30:01 2022
    The easy solution would just be to allow both: either only a single file with verbatim text, or an SPDX document with licenses in a separate folder.

    Regards,
    Stephan

    On Tue, 8 Feb 2022, 19:12 Scott Kitterman, <debian@kitterman.com> wrote:

    On Tuesday, February 8, 2022 12:53:22 PM EST Stephan Lachnit wrote:
    On Tue, Feb 8, 2022 at 5:00 PM Scott Kitterman <debian@kitterman.com> wrote:
    Since Debian policy requires verbatim copies of licenses (or links to
    /usr/share/common-licenses), I think any policy compliant debian/copyright
    will have to be human readable, but I'm not that familiar with SPDX, so
    maybe it will surprise me.

    You can find an example in my initial mail [1].

    It would be good to understand how this proposal supports Debian Policy.

    It would require a minor change: putting the verbatim license texts in
    a single file is not possible anymore. But I don't see why just copying
    the licenses to "/usr/share/doc/PACKAGE/licenses/LICENSE" in addition
    to the SPDX formatted debian/copyright would be any worse than the
    current way.


    Regards,
    Stephan

    [1] https://lists.debian.org/debian-devel/2022/01/msg00309.html

    Personally, I don't view that as a minor change.

    I think before starting a DEP on this you ought to work out the policy implications. Currently any package using your proposed approach would be instantly RC buggy.

    Scott K


  • From Scott Kitterman@21:1/5 to All on Tue Feb 8 20:00:01 2022
    Technically it would be the simplest, but there's a process for policy changes that's more involved than writing emails to d-devel. I'm suggesting you
    engage with it on this topic if you want the results of your work to be usable in Debian.

    Scott K

    On Tuesday, February 8, 2022 1:27:19 PM EST Stephan Lachnit wrote:
    The easy solution would just be to allow both: either only a single file with verbatim text, or an SPDX document with licenses in a separate folder.

    Regards,
    Stephan

    On Tue, 8 Feb 2022, 19:12 Scott Kitterman, <debian@kitterman.com> wrote:
    On Tuesday, February 8, 2022 12:53:22 PM EST Stephan Lachnit wrote:
    On Tue, Feb 8, 2022 at 5:00 PM Scott Kitterman <debian@kitterman.com> wrote:
    Since Debian policy requires verbatim copies of licenses (or links to
    /usr/share/common-licenses), I think any policy compliant debian/copyright
    will have to be human readable, but I'm not that familiar with SPDX, so
    maybe it will surprise me.

    You can find an example in my initial mail [1].

    It would be good to understand how this proposal supports Debian Policy.

    It would require a minor change: putting the verbatim license texts in
    a single file is not possible anymore. But I don't see why just copying
    the licenses to "/usr/share/doc/PACKAGE/licenses/LICENSE" in addition
    to the SPDX formatted debian/copyright would be any worse than the
    current way.


    Regards,
    Stephan

    [1] https://lists.debian.org/debian-devel/2022/01/msg00309.html

    Personally, I don't view that as a minor change.

    I think before starting a DEP on this you ought to work out the policy implications. Currently any package using your proposed approach would be instantly RC buggy.

    Scott K



  • From Scott Kitterman@21:1/5 to All on Tue Feb 8 20:50:02 2022
    On Tuesday, February 8, 2022 2:38:29 PM EST Russ Allbery wrote:
    Scott Kitterman <debian@kitterman.com> writes:
    Technically it would be the simplest, but there's a process for policy changes that's more involved than writing emails to d-devel. I'm suggesting you engage with it on this topic if you want the results of
    your work to be usable in Debian.

    Speaking as a Policy Editor, the way that Stephan is engaging with this process is fine. We're in the early stages of discussion and
    understanding the shape of the problem. Writing emails to debian-devel,
    and writing a DEP, are exactly the right way to engage with Policy on this topic right now.

    Thanks. I haven't tried to do it myself before and didn't know.

    Scott K

  • From Russ Allbery@21:1/5 to Scott Kitterman on Tue Feb 8 20:40:01 2022
    Scott Kitterman <debian@kitterman.com> writes:

    Technically it would be the simplest, but there's a process for policy changes that's more involved than writing emails to d-devel. I'm
    suggesting you engage with it on this topic if you want the results of
    your work to be usable in Debian.

    Speaking as a Policy Editor, the way that Stephan is engaging with this
    process is fine. We're in the early stages of discussion and
    understanding the shape of the problem. Writing emails to debian-devel,
    and writing a DEP, are exactly the right way to engage with Policy on this topic right now.

    --
    Russ Allbery (rra@debian.org) <https://www.eyrie.org/~eagle/>

  • From Russ Allbery@21:1/5 to Stephan Lachnit on Tue Feb 8 20:50:02 2022
    Stephan Lachnit <stephanlachnit@debian.org> writes:

    It would require a minor change: putting the verbatim license texts in a
    single file is not possible anymore. But I don't see why just copying the
    licenses to "/usr/share/doc/PACKAGE/licenses/LICENSE" in addition to the
    SPDX formatted debian/copyright would be any worse than the current way.

    I recommend thinking about how to generate an existing debian/copyright
    file and putting the SPDX-formatted one in a different location. You're
    going to want to decouple the new work from any transition that requires
    other people change tools. There are a lot of tools that make assumptions about debian/copyright, and trying to track them all down will be counterproductive for trying to make forward progress on exposing
    information in a more interoperable format.

    The way I see this, there are three different things that have been
    discussed on this thread:

    1. Consuming upstream data in SPDX or REUSE format and automating the
    generation of required Debian files using it.

    2. Exposing SPDX data for our packages for downstream consumption.

    3. Aligning the standard way to present Debian copyright information with
    other standards.

    I can tell you from prior experience with DEP-5 that 3 is wildly
    controversial and will produce the most pushback. There are a lot of
    packagers who flatly refuse to even use the DEP-5 format, and (pace
    Jonas's aspirations) I don't expect that to change any time soon.

    I think that's fine for what you're trying to accomplish, because I think
    1 and 2 are where the biggest improvements can be found. As long as your system for managing copyright and license information can spit out a DEP-5 debian/copyright file (in its current or a minorly modified format, not
    with new files elsewhere that would have to be extracted from the
    package), you are then backward-compatible with the existing system and
    that takes 3 largely off the table (which is what you want). Then you can demonstrate the advantages of the new system and people can choose to be convinced by those advantages or not, similar to DEP-5, and we can reach
    an organic consensus without anyone feeling like they're forced to change
    how they do things.

    --
    Russ Allbery (rra@debian.org) <https://www.eyrie.org/~eagle/>

  • From Jonas Smedegaard@21:1/5 to All on Tue Feb 8 20:40:02 2022
    Hi Stephan,

    Quoting Stephan Lachnit (2022-02-08 18:53:22)
    On Tue, Feb 8, 2022 at 4:39 PM Jonas Smedegaard <jonas@jones.dk>
    wrote:

    I am sceptical towards this proposal.

    An important feature to me with current machine-readable format is
    that really it is machine-and-human-readable.

    Thank you for your input! I'm aware of this concern, however I think
    it is not something that can't be solved.

    For one, while not as trivial to understand as the current machine-readable copyright, it's still "human-readable" (i.e. a tag:value style text
    file). I would do the following comparison: if you only know Python
    (DEP-5), C++ (SPDX) might look a bit weird, but you can get the gist
    of it.

    For starters, the format adds one SHA1 hash per source file, right?

    Sure I can "just" ignore all FileChecksum: lines, but anyone working
    with XML will know that plaintext does not equal human-readable.


    However, I also think the human-readable aspect is less important here because it is an output format. What I mean with this is that the information is already there in a human readable way: either via REUSE
    or in the file headers directly. While it is theoretically possible to
    write SPDX documents by hand, I would not treat them with the same
    trust as one created by REUSE.

    Here you seem to assume that humans need not be involved in authoring
    the contents or at least that human-facing interfaces for smart tools
    exist and are expressive enough to cover all that is needed.

    That is quite an assumption, I dare say.

    Writing the debian/copyright file for ghostscript took quite some time.
    Singularity is imminent, I know, and I wouldn't mind machines taking
    over the task of classifying rights statements, when they are up to the
    task - but until then I will want to proof-read and intervene as needed.
    My own experience is that they are not yet there - you seem to claim
    they have already surpassed humans for this task...

    Can you show me (off list if too long for an attachment) what your new
    not-really-needing-manual-editing file for ghostscript looks like, so I
    can compare with my lesser trusted human-laboured product?


    Another important feature to me is that there is only one format (in addition to unformatted content, which hopefully we can put past us
    at some point).

    Today, I can as DD help proof-read and change *any* package in
    Debian.

    Regarding reviews: I plan to write a SPDX-to-DEP5 converter anyway to
    get a better feel for the spec. I will probably also write a copyright review tool that will show you the copyright header of each file based
    on DEP5 or SPDX information for validation / manual review. This will
    make proof-reading copyright information much easier.

    Seems you are now talking not about a format, but a detection mechanism.


    But to stress this again: the goal is to *replace* the manual
    copyright reviews by something much better: automatic copyright
    reviews.

    Great. But that is orthogonal to switching format: detection tools can
    serialize their findings in the current machine-readable format, either
    by themselves or, for tools that can only output REUSE format *AND* if
    that output fully covers Debian needs, by having other tools reformat
    that output to the current format.

    My point here is not that there is no benefit in using REUSE. My point
    is that detecting rights information is orthogonal to serializing it.


    There are three areas of interest for copyright information:
    a) for developers writing it b) for the user receiving it and c) the
    legal side.

    Regarding a: By hand DEP5 is better, but for automation SPDX is equally good.
    Regarding b: I think they don't care anyway. Like which user reads the debian/copyright really? If at all, you are interested in the
    copyright of a certain library you wish to use, but this doesn't
    require the extensive file-by-file information of DEP5. Most likely
    the documentation provides much clearer information.
    Regarding c: SPDX is as good as DEP5 if not even better due to file hashes.

    So the new format is at best "equally good" as the current format, except that it outperforms the current format by adding file hashes.

    That is probably a simplification. Ok, let's then use it as an example:

    You can add file hashes to debian/copyright files *today* - the standard permits unofficial fields, and we could then elevate certain fields to
    make them official in a later revision of the current format.

    Adding hashes would clutter the files, making them less readable, but in
    your argument that's a feature with no real drawback, so let's play
    along for now.

    Any feature improvements that cannot be an evolution of current format?


    If we permit a debian/copyright format that is not human-readable,
    it means that I cannot confidently proof-read and change the
    contents of the debian subdir without the help of machine-parsers,
    and I would need to know two formats with different goals.

    I don't see the problem with machine parsers. We already use a lot of different tools for our processes (git, dput, dpkg, debhelper,
    lintian, uscan, a mail program, a text editor, ...), adding one more shouldn't be a big deal. It needs to be provided of course, but I plan
    to do that.

    Only 2 of those you list are mandatory: dpkg and RFC822 email - the rest
    are optional, some quite popular but even then routinely bypassed.


    I would like to instead welcome the REUSE developers in helping
    Debian evolve next version of the existing machine-readable format
    to better align with SPDX.

    While this would be nice, I think this is just unrealistic. While I
    may implement DEP5 output to REUSE, I still want to use SPDX because
    it is already an existing industry standard having all the "features"
    we want. Adding things like file hashes and referencing / merging
    other DEP5 documents is certainly possible, but it would make the format
    less readable and in the end just SPDX looking different.

    How do you know that SPDX already covers all the features we want?

    And if it does, then how is SPDX not a simple superset of the current
    format, and therefore a simple matter of identifying and adding missing pieces?

    I would be quite happy if our work on evolving debian/copyright would
    result in a future revision being identical to REUSE format.

    What I dislike is requiring all developers to master 3 formats instead
    of currently only two: freeform-human-only and (also-)machine-readable.

    Current format was designed to a) cover the existing needs of Debian,
    and b) not discourage too many developers from using it - to raise the likelihood of a future possibility that we fully embrace it as the one
    single format for us all to use.

    As for a) it might turn out that Debian is not special - i.e. that all
    our needs are fully covered by the industry standard that sprung up
    inspired by our earlier work. That would be great. Let's explore if
    that is a fact. I invite you to explore that by taking our existing
    format and morphing it step by step, checking at each step whether we lose something and, if so, whether that is acceptable.

    As for b) I highly doubt that those insisting on writing their copyright
    files by hand would embrace a lesser-human-readable format instead.
    Please note that those that would happily embrace *any* format which
    would relieve them of doing work themselves do not count here - they
    already use "licensecheck --deb-machine *" or one of the wrappers for
    that command.


    - Jonas

    --
    * Jonas Smedegaard - idealist & Internet-arkitekt
    * Tlf.: +45 40843136 Website: http://dr.jones.dk/

    [x] quote me freely  [ ] ask before reusing  [ ] keep private
  • From Russ Allbery@21:1/5 to Russ Allbery on Tue Feb 8 21:40:02 2022
    Russ Allbery <rra@debian.org> writes:

    I can tell you from prior experience with DEP-5 that 3 is wildly controversial and will produce the most pushback. There are a lot of packagers who flatly refuse to even use the DEP-5 format, and (pace
    Jonas's aspirations) I don't expect that to change any time soon.

    That last parenthetical was in retrospect probably not very clear,
    particularly to people who aren't familiar with that specific English
    idiom. For the record, this is "pace" in the sense of:

    https://www.merriam-webster.com/word-of-the-day/pace-2017-09-28

    What I meant to express is that I realize that Jonas is actively working
    on this and is hopeful, and I would love for him to be successful, but I
    am more pessimistic. But I wanted to acknowledge that he disagrees and I
    may well be wrong.

    --
    Russ Allbery (rra@debian.org) <https://www.eyrie.org/~eagle/>

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Adam Borowski@21:1/5 to All on Tue Feb 8 23:10:01 2022
    Guys, once again we had a complaint about forcing people to waste their time
    on copyright matters then wait months or years for review of said matters
    -- just for the discussion to degenerate into a proposal to bring even MORE copyright into our life!

    - What is REUSE?
    The REUSE specification [1] is a specification to make copyright machine-readable in the source files itself. It is straightforward to implement, add (e.g.) "SPDX-FileCopyrightText: 2019 Jane Doe
    [...]
    The spec is made by the Free Software Foundation Europe (FSFE) and is
    already implemented in several projects [3].

    ... and this proposal includes gems such as an extra copyright-only file per every image. Dᴏ ɴᴏᴛ ᴡᴀɴᴛ!

    The Social Contract says clearly:
    "Our priorities are our users and free software"
    -- NOT copyright lawyers.

    I, and probably others, consider copyright to be a crime against humanity
    -- and this is not just a figure of speech[1]. We are forced to comply
    with these laws, risking fines and violence if we don't, but why should
    we put more effort than the minimum necessary?

    Other distributions have proven that doing less than we do is enough. And
    even perfect compliance is no defense against a multi-decade lawsuit that generates massive costs.

    And us poring over copyrights of every file does cost us a lot -- the
    time of developers, especially highly skilled ones, does cost quite a
    penny. That most of Debian work is volunteer doesn't lessen the value of
    that time. Let's not waste it.


    Meow!

    [1]. diff(humans vs other animals) =~ transmission of ideas, etc
    --
    ⢀⣴⠾⠻⢶⣦⠀ Aryans: split from other Indo-Europeans ~2900-2000BC → Ural →
    ⣾⠁⢠⠒⠀⣿⡁ Bactria → settled 2000-1000BC in northwest India. ⢿⡄⠘⠷⠚⠋⠀ Gypsies: came ~1000AD from northern India; aryan. ⠈⠳⣄⠀⠀⠀⠀ Germans: IE people who came ~2800BC to Scandinavia; not aryan.

  • From Russ Allbery@21:1/5 to Adam Borowski on Tue Feb 8 23:40:01 2022
    Adam Borowski <kilobyte@angband.pl> writes:

    Guys, once again we had a complaint about forcing people to waste their
    time on copyright matters then wait months or years for review of said matters -- just for the discussion to degenerate into a proposal to bring
    even MORE copyright into our life!

    To be very clear, the reason why I am interested in SPDX and REUSE is so
    that I can do *less* copyright, specifically so that I can automate
    generation of debian/copyright for my packages from information that
    upstream already tracks and never have to think about it, touch it, or
    update it.

    This doesn't achieve less copyright law in the world (I fear nothing we do technically can do that since that's a political and legal issue), but it achieves the goal of me spending less total time dealing with copyright
    and licensing issues than I do now.

    So I think we may be more aligned than it may appear. Yes, I think
    everyone knows that SPDX and REUSE can specify a ton of extra stuff that
    we aren't tracking now. The goal (at least as I see it) is *not* to make everyone track all that stuff. It's to (a) use that work if it already
    exists instead of making folks waste time repeating that work, and (b)
    expose the information that we did have to gather in a standard format so
    that other people can similarly have less copyright in *their* life by automating their pain as well.

    But to be very clear, what I'm hoping to get out of this is less time
    spent on copyright and licensing, not more.

    --
    Russ Allbery (rra@debian.org) <https://www.eyrie.org/~eagle/>

  • From Gard Spreemann@21:1/5 to Adam Borowski on Wed Feb 9 11:00:01 2022
    Adam Borowski <kilobyte@angband.pl> writes:

    Guys, once again we had a complaint about forcing people to waste their time on copyright matters then wait months or years for review of said matters
    -- just for the discussion to degenerate into a proposal to bring even MORE copyright into our life!

    - What is REUSE?
    The REUSE specification [1] is a specification to make copyright
    machine-readable in the source files itself. It is straightforward to
    implement, add (e.g.) "SPDX-FileCopyrightText: 2019 Jane Doe
    [...]
    The spec is made by the Free Software Foundation Europe (FSFE) and is
    already implemented in several projects [3].

    ... and this proposal includes gems such as an extra copyright-only file per every image. Dᴏ ɴᴏᴛ ᴡᴀɴᴛ!

    The Social Contract says clearly:
    "Our priorities are our users and free software"
    -- NOT copyright lawyers.

    I, and probably others, consider copyright to be a crime against humanity
    -- and this is not just a figure of speech[1]. We are forced to comply
    with these laws, risking fines and violence if we don't, but why should
    we put more effort than the minimum necessary?

    To better protect our users as they exercise point 1 of the DFSG,
    obviously! We can disagree about what the right amount of such effort
    is, and its diminishing returns, but I don't see how the
    intention/objective itself is at question here.


    -- Gard



  • From Stephan Lachnit@21:1/5 to rra@debian.org on Thu Feb 10 12:00:01 2022
    On Tue, Feb 8, 2022 at 8:45 PM Russ Allbery <rra@debian.org> wrote:

    I recommend thinking about how to generate an existing debian/copyright
    file and putting the SPDX-formatted one in a different location. You're going to want to decouple the new work from any transition that requires other people change tools. There are a lot of tools that make assumptions about debian/copyright, and trying to track them all down will be counterproductive for trying to make forward progress on exposing
    information in a more interoperable format.

    The way I see this, there are three different things that have been
    discussed on this thread:

    1. Consuming upstream data in SPDX or REUSE format and automating the
    generation of required Debian files using it.

    2. Exposing SPDX data for our packages for downstream consumption.

    3. Aligning the standard way to present Debian copyright information with
    other standards.

    I can tell you from prior experience with DEP-5 that 3 is wildly controversial and will produce the most pushback. There are a lot of packagers who flatly refuse to even use the DEP-5 format, and (pace
    Jonas's aspirations) I don't expect that to change any time soon.

    I think that's fine for what you're trying to accomplish, because I think
    1 and 2 are where the biggest improvements can be found. As long as your system for managing copyright and license information can spit out a DEP-5 debian/copyright file (in its current or a minorly modified format, not
    with new files elsewhere that would have to be extracted from the
    package), you are then backward-compatible with the existing system and
    that takes 3 largely off the table (which is what you want). Then you can demonstrate the advantages of the new system and people can choose to be convinced by those advantages or not, similar to DEP-5, and we can reach
    an organic consensus without anyone feeling like they're forced to change
    how they do things.

    Thanks for this input, Russ! I think you're right: it will be easier
    to output the DEP5 format in addition to SPDX at the beginning, and
    see from there how it works. I would install the source SPDX document
    then to /usr/share/doc/PACKAGE/copyright_spdx in addition to the DEP5
    file in the usual place.

    I will write a SPDX -> DEP5 tool for this, which should be "fairly
    trivial". Regarding concerns about the different file formats SPDX can
    come in: for us only the tag:value format makes sense, I don't want to
    support other formats.
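
    As a rough illustration of why such a converter can be small, here is a
    minimal Python sketch that reads per-file entries from an SPDX tag:value
    document and groups them into DEP5-style stanzas. It is not the tool
    announced above. The sample document is hand-written, the function names
    parse_files and to_dep5 are made up for this example, and the sketch
    assumes single-line <text> values. It also copies SPDX license
    identifiers verbatim instead of mapping them to DEP5 short names, and it
    does not emit the standalone License paragraphs a complete
    debian/copyright needs.

```python
import re
from collections import defaultdict

# Hand-written sample in SPDX tag:value style (illustrative, not tool output).
SAMPLE_SPDX = """\
SPDXVersion: SPDX-2.1
DataLicense: CC0-1.0

FileName: ./src/main.c
SPDXID: SPDXRef-main
FileChecksum: SHA1: 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
LicenseInfoInFile: GPL-3.0-or-later
FileCopyrightText: <text>2019 Jane Doe <jane@example.com></text>

FileName: ./src/util.c
SPDXID: SPDXRef-util
FileChecksum: SHA1: 2ef7bde608ce5404e97d5f042f95f89f1c232871
LicenseInfoInFile: GPL-3.0-or-later
FileCopyrightText: <text>2019 Jane Doe <jane@example.com></text>

FileName: ./doc/logo.svg
SPDXID: SPDXRef-logo
FileChecksum: SHA1: da39a3ee5e6b4b0d3255bfef95601890afd80709
LicenseInfoInFile: CC0-1.0
FileCopyrightText: <text>2020 John Roe</text>
"""


def parse_files(spdx_text):
    """Return one dict per FileName entry (hypothetical helper)."""
    files = []
    current = None
    for line in spdx_text.splitlines():
        tag, sep, value = line.partition(": ")
        if not sep:
            continue  # blank line or something that is not tag: value
        if tag == "FileName":
            name = value[2:] if value.startswith("./") else value
            current = {"name": name, "licenses": [], "copyright": ""}
            files.append(current)
        elif current is None:
            continue  # document-level tags before the first file
        elif tag == "LicenseInfoInFile":
            current["licenses"].append(value)
        elif tag == "FileCopyrightText":
            # Assumes the <text>...</text> wrapper fits on one line.
            current["copyright"] = re.sub(r"</?text>", "", value).strip()
    return files


def to_dep5(files):
    """Group files with identical copyright and license into stanzas."""
    groups = defaultdict(list)
    for entry in files:
        key = (entry["copyright"], " and ".join(entry["licenses"]))
        groups[key].append(entry["name"])
    header = ("Format: https://www.debian.org/doc/packaging-manuals/"
              "copyright-format/1.0/")
    stanzas = [header]
    for (holder, license_expr), names in groups.items():
        stanzas.append("Files: " + " ".join(sorted(names)) + "\n"
                       "Copyright: " + holder + "\n"
                       "License: " + license_expr)
    return "\n\n".join(stanzas) + "\n"


if __name__ == "__main__":
    print(to_dep5(parse_files(SAMPLE_SPDX)))
```

    Grouping by identical (copyright, license) pairs is one possible design
    choice; it keeps the output short, at the cost of long Files: lines for
    large packages.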

    On Tue, Feb 8, 2022 at 8:36 PM Jonas Smedegaard <jonas@jones.dk> wrote:

    For starters, the format adds one SHA1 hash per source file, right?

    Yes, one checksum per file.

    Sure I can "just" ignore all FileChecksum: lines, but anyone working
    with XML will know that plaintext does not equal human-readable.

    This comparison is a bit off, XML is a representation. The SPDX format
    I want to use is tag:value just like DEP5, so in this regard
    "human-readable". There is more cruft content, but it takes less than
    5 minutes to understand where the per file copyright and license
    information is.

    However, I also think the human-readable aspect is less important here
    because it is an output format. What I mean by this is that the
    information is already there in a human-readable way: either via REUSE
    or in the file headers directly. While it is theoretically possible to
    write SPDX documents by hand, I would not treat them with the same
    trust as one created by REUSE.

    Here you seem to assume that humans need not be involved in authoring
    the contents or at least that human-facing interfaces for smart tools
    exist and are expressive enough to cover all that is needed.

    That is quite an assumption, I dare say.

    I think this is a misconception: I don't want people to write SPDX
    documents by hand at all. IMHO for this scenario, DEP5 is still
    superior (that's e.g. why REUSE can also use DEP5).

    Writing the debian/copyright file for ghostscript took quite some time. Singularity is imminent, I know, and I wouldn't mind machines taking
    over the task of classifying rights statements, when they are up to the
    task - but until then I will want to proof-read and intervene as needed.
    My own experience is that they are not yet there - you seem to claim
    they have already surpassed humans for this task...

    Can you show me (off list if too long for an attachment) what your new not-really-needing-manual-editing file for ghostscript looks like, so I
    can compare with my lesser trusted human-laboured product?

    No, because if ghostscript doesn't have the information to
    automatically generate a SPDX document, don't do it by hand, use DEP5
    instead.

    What you can do is to put your DEP5 in .reuse/dep5 in the top-level
    dir and run "reuse spdx" if you want to see how it looks.

    Regarding reviews: I plan to write a SPDX-to-DEP5 converter anyway to
    get a better feel for the spec. I will probably also write a copyright review tool that will show you the copyright header of each file based
    on DEP5 or SPDX information for validation / manual review. This will
    make proof-reading copyright information much easier.

    Seems we are now talking not about a format, but about a detection mechanism.

    Exactly! This entire thing is not about format really, but detection mechanisms. And the standard format (outside of Debian) for "detected"
    upstream copyright information is SPDX, that's why I want to use it.

    Regarding the review tool: being able to have the checksums from the
    previous version makes it easy to only review the files where the
    checksum changed. Cool, right?
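
    The checksum-based review idea can be sketched in a few lines of Python.
    This is only an illustration under assumptions, not the planned review
    tool: the two sample documents are hand-written, the digests in them are
    placeholders, and the function names file_checksums and files_to_review
    are invented for this example.

```python
# Hand-written SPDX tag:value fragments for two versions of a package.
# The digests are placeholders, not real SHA1 sums.
OLD_SPDX = """\
FileName: ./src/a.c
FileChecksum: SHA1: 1111111111111111111111111111111111111111

FileName: ./src/b.c
FileChecksum: SHA1: 2222222222222222222222222222222222222222

FileName: ./src/c.c
FileChecksum: SHA1: 3333333333333333333333333333333333333333
"""

NEW_SPDX = """\
FileName: ./src/a.c
FileChecksum: SHA1: 1111111111111111111111111111111111111111

FileName: ./src/b.c
FileChecksum: SHA1: 4444444444444444444444444444444444444444

FileName: ./src/d.c
FileChecksum: SHA1: 5555555555555555555555555555555555555555
"""


def file_checksums(spdx_text):
    """Map each FileName to the list of its FileChecksum values."""
    result = {}
    name = None
    for line in spdx_text.splitlines():
        tag, sep, value = line.partition(": ")
        if not sep:
            continue  # blank line or something that is not tag: value
        if tag == "FileName":
            name = value
            result[name] = []
        elif tag == "FileChecksum" and name is not None:
            result[name].append(value)
    return result


def files_to_review(old_text, new_text):
    """Return the files a reviewer still has to look at."""
    old = file_checksums(old_text)
    new = file_checksums(new_text)
    return {
        "changed": sorted(n for n in new if n in old and old[n] != new[n]),
        "added": sorted(n for n in new if n not in old),
        "removed": sorted(n for n in old if n not in new),
    }


if __name__ == "__main__":
    for kind, names in files_to_review(OLD_SPDX, NEW_SPDX).items():
        print(kind, names)
```

    Files whose checksum is unchanged (./src/a.c in the sample) drop out of
    the review list, which is the saving described above.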

    So the new format is at best "equally good" as the current format, except that it outperforms the current format by adding file hashes.

    That is probably a simplification. Ok, let's then use it as an example:

    You can add file hashes to debian/copyright files *today* - the standard permits unofficial fields, and we could then elevate certain fields to
    make them official in a later revision of the current format.

    Adding hashes would clutter the files, making them less readable, but in
    your argument that's a feature with no real drawback, so let's play
    along for now.

    Yes I agree that adding hashes to DEP5 makes it unreadable and utterly
    annoying to maintain, that's why I don't want to add it. DEP5 is
    designed to be written by humans, SPDX is not. That's why SPDX can add
    hashes without any drawbacks.

    I don't see the problem with machine parsers. We already use a lot of different tools for our processes (git, dput, dpkg, debhelper,
    lintian, uscan, a mail program, a text editor, ...), adding one more shouldn't be a big deal. It needs to be provided of course, but I plan
    to do that.

    Only 2 of those you list are mandatory: dpkg and RFC822 email - the rest
    are optional, some quite popular but even then routinely bypassed.

    I mean if you want you can write SPDX files by hand, it's not a binary
    format. Same as you can write a Debian package without debhelper.

    How do you know that SPDX already covers all the features we want?

    What do we need? File based copyright and license information. SPDX
    offers that, and so does DEP5. In this regard, both specs have all the
    things we _need_.

    What would be nice to have? Something that allows us to do more
    automation. SPDX includes file hashes, so it can be very easily
    checked if a SPDX copyright document is valid (ignoring mistakes in
    copyright and license assertions). For DEP5, this is impossible
    without cluttering the spec with hashes.
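
    A validity check of this kind could look roughly like the Python sketch
    below. It is an assumption-laden illustration, not an existing tool: the
    function name stale_files is invented here, only SHA1 entries are
    checked, and, as said above, it cannot tell whether the copyright and
    license assertions themselves are correct.

```python
import hashlib
import tempfile
from pathlib import Path


def stale_files(spdx_text, root):
    """Return the FileName entries whose SHA1 no longer matches the tree."""
    stale = []
    name = None
    for line in spdx_text.splitlines():
        tag, sep, value = line.partition(": ")
        if not sep:
            continue  # blank line or something that is not tag: value
        if tag == "FileName":
            name = value
        elif tag == "FileChecksum" and name is not None:
            algorithm, _, expected = value.partition(": ")
            if algorithm != "SHA1":
                continue  # this sketch only checks SHA1 entries
            path = Path(root) / name
            actual = None  # a missing file counts as a mismatch
            if path.is_file():
                actual = hashlib.sha1(path.read_bytes()).hexdigest()
            if actual != expected.strip().lower():
                stale.append(name)
    return stale


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        source = Path(root) / "hello.c"
        source.write_bytes(b"int main(void) { return 0; }\n")
        digest = hashlib.sha1(source.read_bytes()).hexdigest()
        document = f"FileName: ./hello.c\nFileChecksum: SHA1: {digest}\n"
        print(stale_files(document, root))  # tree unchanged since generation
        source.write_bytes(b"int main(void) { return 1; }\n")
        print(stale_files(document, root))  # file edited after generation
```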

    And if it does, then how is SPDX not a simple superset of the current
    format, and therefore a simple matter of identifying and adding missing pieces?

    Again, we could add them, but it would make DEP5 nearly impossible to
    write by hand, something I don't want. I still have packages that I
    wouldn't convert to SPDX (by hand) anytime soon, because they don't
    offer the information to automatically create the required
    information.

    DEP5 could be used via reuse as an intermediate representation for
    developers if no REUSE information is available, but let's ignore this
    for now.

    I would be quite happy if our work on evolving debian/copyright would
    result in a future revision being identical to REUSE format.

    What I dislike is requiring all developers to master 3 formats instead
    of currently only two: freeform-human-only and (also-)machine-readable.

    No, you don't have to master SPDX! That's the point: you don't
    interact with it at all. It's created by tools, and shipped to satisfy
    the legal obligation to provide copyright information. Users don't
    care how the copyright information is shipped. As a developer, you
    just have one less thing to care about, namely writing
    debian/copyright by hand.

    Current format was designed to a) cover the existing needs of Debian,
    and b) not discourage too many developers from using it - to raise the likelihood of a future possibility that we fully embrace it as the one
    single format for us all to use.

    I don't want to force people to use this new format if they don't want
    to. I really don't care if others want to put a lot of work in debian/copyright, but I want to use tools so that I (and others that
    feel the same) don't have to handle it anymore. DEP5 is just not
    designed for this automated use case, and that's totally fine. It's
    good at what it does now, but extending it to an automated use case
    would make it bad at what it was good at: being simple (and all the
    points you mentioned).


    Regards,
    Stephan

  • From dude@21:1/5 to Gard Spreemann on Thu Feb 10 12:20:01 2022
    Linus: the (GPL 2.0) intended social contract is: “i give you
    sourcecode, give me back your changes”

    https://dwaves.de/2022/01/31/why-is-it-gnu-linux-and-not-just-linux-linus-talking-about-gpl-v3-vs-gpl-v2-the-better-one-the-social-gpl-contract-is-i-give-you-sourcecode-give-me-back-your-changes-non-free-binary/

    if the developer does not want-need changes back GPL 3.0 is also "okayish"

    the kernel licensing is also rather... complicated... (with the many
    different versions of GPL and LGPL) maybe a user can do a 30min
    entertaining sum up video explanation of this ...

    Linux kernel licensing rules

    * The Linux Kernel is provided under the terms of the GNU General
    Public License version 2 only (GPL-2.0), as provided in
    LICENSES/preferred/GPL-2.0, with an explicit syscall exception
    described in LICENSES/exceptions/Linux-syscall-note, as described in
    the COPYING file. This documentation file provides a description of
    how each source file should be annotated to make its license clear
    and unambiguous. It doesn’t replace the Kernel’s license. The license
    described in the COPYING file applies to the kernel source as a
    whole, *though individual source files can have a different license
    which is required to be compatible with the GPL-2.0*:

    GPL-1.0+ : GNU General Public License v1.0 or later
    GPL-2.0+ : GNU General Public License v2.0 or later <https://spdx.org/licenses/GPL-2.0-or-later.html>
    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/LICENSES/preferred/GPL-2.0?h=v5.17-rc2
    LGPL-2.0 : GNU Library General Public License v2 only
    LGPL-2.0+ : GNU Library General Public License v2 or later
    LGPL-2.1 : GNU Lesser General Public License v2.1 only
    LGPL-2.1+ : GNU Lesser General Public License v2.1 or later

    src: https://docs.kernel.org/process/license-rules.html

    * actually there is a whole folder “LICENSES” that is shipped with the
    kernel sources, which has the following subfolders:
    <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/LICENSES?h=v5.17-rc2>

    o deprecated
    <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/LICENSES/deprecated?h=v5.17-rc2>
    o dual
    <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/LICENSES/dual?h=v5.17-rc2>
    o exceptions
    <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/LICENSES/exceptions?h=v5.17-rc2>
    o preferred
    <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/LICENSES/preferred?h=v5.17-rc2>
    * here is a list of all sorts of free licences: https://spdx.org/licenses/
    (RSS feed: <https://lists.spdx.org/g/spdx/rss>)

    PS: yes iterating over this stuff takes time, anyone ever read the whole
    GPL 2.0?

    actually did - entertainment factor was... okay

    On 2/9/22 10:45, Gard Spreemann wrote:

    The Social Contract says clearly:
    "Our priorities are our users and free software"

  • From Andrey Rahmatullin@21:1/5 to dude on Thu Feb 10 13:40:01 2022
    On Thu, Feb 10, 2022 at 12:14:40PM +0100, dude wrote:
    Linus: the (GPL 2.0) intended social contract is: “i give you sourcecode, give me back your changes”

    https://dwaves.de/2022/01/31/why-is-it-gnu-linux-and-not-just-linux-linus-talking-about-gpl-v3-vs-gpl-v2-the-better-one-the-social-gpl-contract-is-i-give-you-sourcecode-give-me-back-your-changes-non-free-binary/

    if the developer does not want-need changes back GPL 3.0 is also "okayish"

    the kernel licensing is also rather... complicated... (with the many different versions of GPL and LGPL) maybe a user can do a 30min entertaining sum up video explanation of this ...
    This isn't related to this thread.

    PS: yes iterating over this stuff takes time, anyone ever read the whole GPL 2.0?
    Yes.

    --
    WBR, wRAR


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Jonas Smedegaard@21:1/5 to All on Thu Feb 10 13:40:02 2022
    Quoting Stephan Lachnit (2022-02-10 11:59:11)
    On Tue, Feb 8, 2022 at 8:36 PM Jonas Smedegaard <jonas@jones.dk>
    wrote:

    For starters, the format adds one SHA1 hash per source file, right?

    Yes, one checksum per file.

    Sure I can "just" ignore all FileChecksum: lines, but anyone working
    with XML will know that plaintext does not equal human-readable.

    This comparison is a bit off, XML is a representation. The SPDX format
    I want to use is tag:value just like DEP5, so in this regard "human-readable". There is more cruft content, but it takes less than
    5 minutes to understand where the per file copyright and license
    information is.

    It takes less than 5 minutes to understand where the per file copyright
    and license information is: "Just" ignore all angle-bracket markup.

    The comparison is that both XML and checksums add noise, reducing
    readability.

    ...and seems you agree:

    Yes I agree that adding hashes to DEP5 makes it unreadable and utterly annoying to maintain, that's why I don't want to add it. DEP5 is
    designed to be written by humans, SPDX is not. That's why SPDX can add hashes without any drawbacks.

    A file containing checksums for all files is much harder not only to
    write but also to read, than one without such hashes.
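To make the disagreement above concrete, here is a minimal sketch of what "ignoring the FileChecksum: lines" amounts to for an SPDX tag/value document. The sample text and the helper names `strip_checksums` and `per_file_info` are illustrative assumptions, not part of any existing tool, and the parser only handles single-line `<text>…</text>` values.

```python
# Hypothetical sample in SPDX tag/value style; paths, IDs and hashes are made up.
SAMPLE = """\
FileName: ./src/main.c
SPDXID: SPDXRef-main
FileChecksum: SHA1: da39a3ee5e6b4b0d3255bfef95601890afd80709
LicenseInfoInFile: GPL-3.0-or-later
FileCopyrightText: <text>2019 Jane Doe <jane@example.com></text>

FileName: ./README.md
SPDXID: SPDXRef-readme
FileChecksum: SHA1: 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12
LicenseInfoInFile: CC0-1.0
FileCopyrightText: <text>2019 Jane Doe <jane@example.com></text>
"""


def strip_checksums(text):
    """Return the document with every FileChecksum line dropped."""
    kept = [line for line in text.splitlines()
            if not line.startswith("FileChecksum:")]
    return "\n".join(kept)


def per_file_info(text):
    """Collect license and copyright tags per file (single-line values only)."""
    files = {}
    current = None
    for line in text.splitlines():
        tag, sep, value = line.partition(": ")
        if not sep:
            continue  # blank line or something that is not a tag/value pair
        if tag == "FileName":
            current = files.setdefault(value, {"licenses": [], "copyright": []})
        elif current is not None and tag == "LicenseInfoInFile":
            current["licenses"].append(value)
        elif current is not None and tag == "FileCopyrightText":
            # Remove the <text>...</text> wrapper around the notice.
            notice = value.replace("<text>", "").replace("</text>", "")
            current["copyright"].append(notice)
    return files


if __name__ == "__main__":
    print(strip_checksums(SAMPLE))
    for path, info in per_file_info(SAMPLE).items():
        print(path, info["licenses"], info["copyright"])
```

Whether the filtered output counts as human-readable is exactly the point under dispute; the sketch only shows that the per-file information can be located mechanically.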

    [ original criticism revived ]

    Quoting Jonas Smedegaard (2022-02-08 16:39:36)
    If we permit a debian/copyright format that is not human-readable, it
    means that I cannot confidently proof-read and change the contents of
    the debian subdir without the help of machine-parsers, and I would
    need to know two formats with different goals.

    I don't see the problem with machine parsers. We already use a lot
    of different tools for our processes (git, dput, dpkg, debhelper, lintian, uscan, a mail program, a text editor, ...), adding one
    more shouldn't be a big deal. It needs to be provided of course,
    but I plan to do that.

    Only 2 of those you list are mandatory: dpkg and RFC822 email - the
    rest are optional, some quite popular but even then routinely
    bypassed.

    I mean if you want you can write SPDX files by hand, it's not a binary format. Same as you can write a Debian package without debhelper.

    I can also transfer TCP packets using pigeons, if I seek challenges.

    My concern is not about binary versus text-based format, but about only-machine-readable versus human-and-machine-readable format.


    I would be quite happy if our work on evolving debian/copyright would result in a future revision being identical to REUSE format.

    What I dislike is requiring all developers to master 3 formats instead
    of currently only two: freeform-human-only and (also-)machine-readable.

    No, you don't have to master SPDX! That's the point: you don't
    interact with it at all. It's created by tools, and shipped to satisfy
    the legal obligation to provide copyright information. Users don't
    care how the copyright information is shipped. As a developer, you
    just have one less thing to care about, namely writing
    debian/copyright by hand.

    I need to either interact with the format or depend on tools that do.

    When I release a package to Debian, then I am responsible for that
    package being in compliance with Debian Policy - including § 2.3 about
    information that the debian/copyright file MUST cover.

    It does not matter if I write debian/copyright by hand or choose to use
    automated tools to autogenerate that file - regardless it is my
    responsibility to ensure that the contents of that file comply with Debian
    Policy.

    If I upgrade a package as an NMU, then it is my responsibility to ensure
    that the new package complies with Debian Policy - including
    debian/copyright getting updated as needed - but if the debian/copyright
    format is alien to me then I cannot do that responsibly.

    Your argument seems to be that I can simply trust SPDX/REUSE tools.
    Agreed, I can choose to trust tools doing magic for me, but that is
    exactly my concern: I am _forced_ to trust tools, where I have the
    (easy!) option of by-passing helper tools with current
    [also-]machine-readable format.


    - Jonas

    --
    * Jonas Smedegaard - idealist & Internet-arkitekt
    * Tlf.: +45 40843136 Website: http://dr.jones.dk/

    [x] quote me freely  [ ] ask before reusing  [ ] keep private

  • From Simon McVittie@21:1/5 to Stephan Lachnit on Thu Feb 10 14:20:01 2022
    On Thu, 10 Feb 2022 at 11:59:11 +0100, Stephan Lachnit wrote:
    > No, you don't have to master SPDX! That's the point: you don't
    > interact with it at all. It's created by tools, and shipped to satisfy
    > the legal obligation to provide copyright information.

    So, are you saying this is something you are doing to satisfy the "letter
    of the law" for licenses that require copyright notices to be reproduced
    alongside binaries, to avoid having that noise reduce the clarity of the
    information we are providing for other, arguably more useful reasons?

    Obviously it's not my decision to make, but I have never been particularly
    convinced by the assertion that we can fulfil the GPL's requirement for
    corresponding source to accompany binaries by offering source packages
    from the same places that ship binaries, but at the same time we cannot
    fulfil various licenses' requirement for copyright notices to accompany
    binaries by pointing to those same source packages and saying "it's
    all there".

    The usual form of words in the licenses that require copying copyright
    notices calls for the copyright notice to appear in the "documentation or
    other materials"; I think we could reasonably argue that source packages
    are a form of extremely comprehensive documentation (they document the
    precise behaviour of the binaries!) and they are certainly "other materials".

    However, if the ftp team have a problem with that reasoning, then yes,
    it seems like there could be value in having an essentially write-only
    format for the information that the license requires us to reproduce. That
    way, we fulfil the letter of the license by providing the information we
    are required to (even though it isn't particularly practically useful
    to wade through the list of around 400 potential copyright holders in
    a medium-sized package like GTK[1]), and at the same time, fulfil the
    spirit of the license by communicating the parts that practically matter
    in a more concise form (in the case of GTK, this would be "LGPL-2.1+
    and various compatible licenses", which is the information you'll get
    from basically any other distribution's GTK packaging).

    smcv

    [1] https://tracker.debian.org/media/packages/g/gtk4/copyright-4.6.0ds1-3,
    almost certainly incomplete (but neither the maintainers nor the ftp
    team noticed any omissions, which is as good as we will
    realistically get)

  • From Stephan Lachnit@21:1/5 to smcv@debian.org on Thu Feb 10 14:40:01 2022
    On Thu, Feb 10, 2022 at 2:10 PM Simon McVittie <smcv@debian.org> wrote:
    >
    > On Thu, 10 Feb 2022 at 11:59:11 +0100, Stephan Lachnit wrote:
    > > No, you don't have to master SPDX! That's the point: you don't
    > > interact with it at all. It's created by tools, and shipped to satisfy
    > > the legal obligation to provide copyright information.
    >
    > So, are you saying this is something you are doing to satisfy the "letter
    > of the law" for licenses that require copyright notices to be reproduced
    > alongside binaries, to avoid having that noise reduce the clarity of the
    > information we are providing for other, arguably more useful reasons?

    I think I don't fully get the question, but yes, essentially this
    would be a write-only format that I expect no human to read. Users
    don't care about it and it's more than enough from the legal side.

    Regards,
    Stephan

  • From Stephan Lachnit@21:1/5 to All on Sat Feb 12 02:10:02 2022
    FYI, I started working on an SPDX->DEP5 and DEP5->SPDX converter tool;
    the code (or rather a basic concept) is here [1].

    My goal is to produce an internal representation that collects
    copyright information on a per-file basis, and convert between
    SPDX/DEP5 and this format.

    Regards,
    Stephan

    [1] https://github.com/stephanlachnit/dep5convert
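As an illustration of the per-file intermediate representation described above, the following is a minimal sketch and not the code from [1]. The `FileInfo` class and `to_dep5` function are hypothetical names, and the sketch assumes one license identifier per file and makes no attempt to map SPDX identifiers (for example `GPL-3.0-or-later`) to the short names customary in DEP-5 (for example `GPL-3+`).

```python
from collections import defaultdict
from dataclasses import dataclass

DEP5_FORMAT = "https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/"


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical per-file record sitting between SPDX and DEP-5."""
    path: str
    copyrights: tuple  # one notice per entry, e.g. ("2019 Jane Doe",)
    license: str       # a single license identifier (assumption)


def to_dep5(infos):
    """Group files with identical copyright and license into one Files stanza."""
    groups = defaultdict(list)
    for info in infos:
        groups[(info.copyrights, info.license)].append(info.path)

    lines = ["Format: " + DEP5_FORMAT]
    for (copyrights, license_id), paths in groups.items():
        lines.append("")
        lines.append("Files: " + " ".join(sorted(paths)))
        # Continuation lines of a multi-line field must start with a space.
        lines.append("Copyright: " + "\n           ".join(copyrights))
        lines.append("License: " + license_id)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = [
        FileInfo("src/main.c", ("2019 Jane Doe <jane@example.com>",), "GPL-3.0-or-later"),
        FileInfo("src/util.c", ("2019 Jane Doe <jane@example.com>",), "GPL-3.0-or-later"),
        FileInfo("README.md", ("2019 Jane Doe <jane@example.com>",), "CC0-1.0"),
    ]
    print(to_dep5(example))
```

Grouping by identical (copyright, license) pairs is one possible way to keep the generated DEP-5 output short; going in the other direction would mean expanding each Files stanza back into one record per file.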
