Forum: >>> Magnum BBS <<<

multiple roles of d/copyright

From Simon McVittie@21:1/5 to Scott Kitterman on Thu Feb 10 14:30:01 2022

On Tue, 08 Feb 2022 at 08:59:23 -0500, Scott Kitterman wrote:

> From my point of view, treating something like other common classes of RC
> bugs means that the project is producing tools and processes to make
> detection of such bugs more automated to remove them from the archive,
> that developers are actively looking for them, and that they are
> routinely fixed in the normal course of Debian development.

I think part of the problem here might be that copyright information is
"social", not "technical": software authors can claim copyright and/or
authorship in various forms of human-readable, free-form text, which means
any automated detection is necessarily going to be imperfect, and as long
as our policy demands perfection, there will be a reluctance to automate
this (or at least a reluctance to say that we are automating it).

Another part of the problem is that licensing and copyright-information
bugs are not something that we are realistically going to find through
normal use of software: if GTK crashes when you print on a Tuesday, one
of our users will eventually notice, but if we have missed a copyright
holder, it's unlikely that anyone is going to notice that omission from
the list of around 400 potential copyright holders in
<https://tracker.debian.org/media/packages/g/gtk4/copyright-4.6.0ds1-3>
unless they repeat the time-consuming process of collecting possible
copyright claims from the source code (as the ftp team presumably do). I
have no idea how the maintainers of larger and more complicated packages
manage to do this, or how the ftp team manage to review larger and more
complicated packages in a finite time.
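
To illustrate why automated collection of copyright claims is necessarily
imperfect, here is a minimal Python sketch of a regex-based scanner. The
pattern and the function name find_copyright_claims are assumptions made up
for this illustration; this is not how any Debian tool or the ftp team
actually works.

```python
import re

# Heuristic pattern (an assumption for this sketch, not any real tool's rule):
# "Copyright", "(c)" or "©", then an optional year or year range, then
# whatever follows on the line is treated as the holder.
COPYRIGHT_RE = re.compile(
    r"(?:copyright\s*(?:\(c\)|©)?|\(c\)|©)\s*"
    r"(?P<years>\d{4}(?:\s*[-,]\s*\d{4})*)?[\s,]*"
    r"(?P<holder>[^\n]+)",
    re.IGNORECASE,
)


def find_copyright_claims(text):
    """Return the set of candidate copyright holders found in text."""
    claims = set()
    for line in text.splitlines():
        match = COPYRIGHT_RE.search(line)
        if match:
            # Trim comment delimiters and trailing punctuation.
            holder = match.group("holder").strip(" .*/")
            if holder:
                claims.add(holder)
    return claims


if __name__ == "__main__":
    sample = (
        "/* Copyright (C) 2019 Example Corp. */\n"
        "# (c) 2001-2004, Jane Hacker\n"
        " * Written by John Coder, 2010\n"
    )
    print(sorted(find_copyright_claims(sample)))
```

On this made-up sample the sketch picks up "Example Corp" and "Jane Hacker"
but misses the free-form "Written by John Coder" line, and a sentence such as
"See the copyright file for details" would be reported as a holder. Both
kinds of error are what you would expect from matching free-form text.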

I think the copyright file is doing several things which are perhaps in conflict:

* It lets consumers of packages know what restrictions apply to their
  use of a package
    - This requires *most* of the license information, although not
      necessarily all of it: for example, if a package like Linux is
      licensed under a mixture of GPL, LGPL, BSD and MIT licenses, it's
      usually sufficient to be aware of the most restrictive of those
      licenses, in this case GPL
    - Having too much information, however well-intentioned, actually works
      against this by making it harder to find what you need
    - I would argue that requiring the text of licenses like the CC family
      to be inlined into the copyright file works against this goal, by
      reducing the signal-to-noise ratio: if you are not familiar with a
      particular license, then obviously you will need to read its text
      to see what it means, but if you are looking at packages that have
      content under various semi-common licenses, you only need to read
      each license once
    - I would argue that requiring lists of copyright holders to be inlined
      into the same file also works against this goal, again by harming the
      signal-to-noise ratio

* It lets consumers of packages know that the package is DFSG-compliant
    - Same requirements as above

* It's a place to reproduce information that licenses require us to, like
  a comprehensive set of copyright notices (if our interpretation of the
  applicable licenses is that pointing to nearby source code and calling
  it extremely comprehensive accompanying documentation is insufficient)
    - In this role, it's essentially write-only: we're doing this because
      we have been required to do it, more than because it's practically
      useful, and I don't expect anyone to actually read this, except for
      the maintainer when collecting it and the ftp team when verifying
      that it has been collected
    - In another subthread, Stephan Lachnit suggests using the SPDX format
      for this write-only information, which I think might be intended as
      a way to eventually separate it from the other roles of d/copyright

* It gives authors due credit (which we are not *required* to do, but
  in previous discussions of d/copyright I've seen this cited as a reason
  why we *should* do this in order to be good citizens)
    - Note that collecting copyright holders is not necessarily actually
      helpful here, because that often means we are required to "credit"
      an employer, rather than mentioning the actual author
    - In a medium-sized package like GTK, it's not clear to me that a list
      of about 400 possible copyright holders is actually serving this
      purpose, because any individual contributor is lost in the noise

* It lets us meet our self-imposed rules
    - This is circular, so I'm inclined to disregard it when discussing what
      the rules should be: we should set rules because they help us to
      achieve a goal, rather than for the sake of having rules

* It lets the ftp team (or other interested reviewers) duplicate the
  info-collecting process to check that all of the above have been done
    - This is somewhat circular, because this is a way to support the other
      goals, not really a goal in its own right

* Are there other relevant goals that I've missed here?

I don't think conflating those goals and assuming they all need to be
satisfied by a single file is necessarily going to lead to meeting any
of those goals in an efficient way, let alone meeting all of them in
an efficient way.

smcv

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Scott Kitterman@21:1/5 to All on Thu Feb 10 15:10:01 2022

How about it enables the project to comply with license requirements? I may have missed it, but I don't see that on your list. Like it or not, copyright is a thing and licenses (and our compliance with their requirements) are the only things that give us the right to distribute the packages that make up Debian.

Policy 4.5.1 did relax the requirements around listing copyright holders (from the upgrading checklist):

    The copyright information for files in a package must be copied
    verbatim into "/usr/share/doc/PACKAGE/copyright" when

    1. the distribution license for those files requires that copyright
       information be included in all copies and/or binary
       distributions;

    2. the files are shipped in the binary package, either in source or
       compiled form; and

    3. the form in which the files are present in the binary package
       does not include a plain text version of their copyright
       notices.

While I think listing all the copyright holders is a good idea (for the
reasons you mention above) and do so even when it's not legally required,
the Debian policy requirement is now the minimum needed to satisfy license
requirements.
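
The three conditions quoted above combine with a logical "and", which can be
written as a small predicate. This is only a sketch of that reading: the
class PackagedFiles, its field names and the function must_copy_verbatim are
hypothetical, and deciding the value of each field for a real package still
needs human judgement about the license and the package contents.

```python
from dataclasses import dataclass


@dataclass
class PackagedFiles:
    """Facts about a group of files; all field names are hypothetical."""
    # Condition 1: the license requires copyright information in all
    # copies and/or binary distributions.
    license_requires_notice_in_copies: bool
    # Condition 2: the files are shipped in the binary package, in source
    # or compiled form.
    shipped_in_binary_package: bool
    # Negation of condition 3: the shipped form already includes a plain
    # text version of the copyright notices.
    plain_text_notice_in_binary: bool


def must_copy_verbatim(files: PackagedFiles) -> bool:
    """True when all three quoted conditions hold for these files."""
    return (
        files.license_requires_notice_in_copies
        and files.shipped_in_binary_package
        and not files.plain_text_notice_in_binary
    )
```

For example, a compiled library whose license requires notices in binary
distributions would give must_copy_verbatim(PackagedFiles(True, True, False))
== True, while scripts shipped as readable source with their notices intact
would give False.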

Scott K


From The Wanderer@21:1/5 to Scott Kitterman on Thu Feb 10 15:20:01 2022

On 2022-02-10 at 09:06, Scott Kitterman wrote:

> On Thursday, February 10, 2022 8:26:23 AM EST Simon McVittie wrote:
>
>> I think the copyright file is doing several things which are perhaps in
>> conflict:
>>
>> * It's a place to reproduce information that licenses require us to, like
>>   a comprehensive set of copyright notices (if our interpretation of the
>>   applicable licenses is that pointing to nearby source code and calling
>>   it extremely comprehensive accompanying documentation is insufficient)
>
> How about it enables the project to comply with license requirements? I
> may have missed it, but I don't see that on your list.

Isn't that the above paragraph?

--
The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man. -- George Bernard Shaw


From Scott Kitterman@21:1/5 to All on Thu Feb 10 15:30:01 2022

On Thursday, February 10, 2022 9:13:29 AM EST The Wanderer wrote:

> On 2022-02-10 at 09:06, Scott Kitterman wrote:
>
>> On Thursday, February 10, 2022 8:26:23 AM EST Simon McVittie wrote:
>>
>>> I think the copyright file is doing several things which are perhaps in
>>> conflict:
>>>
>>> * It's a place to reproduce information that licenses require us to, like
>>>   a comprehensive set of copyright notices (if our interpretation of the
>>>   applicable licenses is that pointing to nearby source code and calling
>>>   it extremely comprehensive accompanying documentation is insufficient)
>>
>> How about it enables the project to comply with license requirements? I
>> may have missed it, but I don't see that on your list.
>
> Isn't that the above paragraph?

It is. Thanks.

Scott K


