Hello!
I had a chance to have a chat with doko at DC24 and one of the things that came out was this question:
"Are there packages in the DPT that aren't maintained by their uploaders and are kept in the team because other people fix bugs and do team uploads on them?"
Let's call these "stale uploaders".
To get some data, I had a little fun and wrote a Python script [1] to match a package's git history with the Uploaders field in d/control. If someone listed in the Uploaders field hasn't made a single commit in the last 3 years, the package gets flagged.
Turns out about half of the packages in our namespace get flagged one way or another :P
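As a rough illustration of the check described above (not the actual script from [1]), the core logic could be sketched as follows. Matching uploaders to commits by e-mail address is an assumption, the function names are hypothetical, and the commit list is assumed to have been extracted from the repository beforehand (for instance from `git log` output).

```python
import re
from datetime import date


def years_ago(day, years):
    """Return the date `years` years before `day`."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year.
        return day.replace(year=day.year - years, day=28)


def parse_uploaders(field):
    """Extract lowercase e-mail addresses from an Uploaders field value."""
    return [addr.strip().lower() for addr in re.findall(r"<([^>]+)>", field or "")]


def stale_uploaders(uploaders_field, commits, today, years=3):
    """List uploaders with no commit in the last `years` years.

    `commits` is an iterable of (author_email, commit_date) pairs.
    Assumption: an uploader counts as active if any commit carries
    their e-mail address as author.
    """
    cutoff = years_ago(today, years)
    active = {email.lower() for email, when in commits if when >= cutoff}
    return [u for u in parse_uploaders(uploaders_field) if u not in active]


# Hypothetical data, for illustration only.
uploaders = "Alice Example <alice@example.org>,\n Bob Example <bob@example.org>"
commits = [
    ("Alice@example.org", date(2023, 5, 1)),
    ("bob@example.org", date(2020, 1, 1)),
    ("carol@example.org", date(2024, 8, 1)),
]
print(stale_uploaders(uploaders, commits, today=date(2024, 8, 15)))
```

A package would be flagged whenever the returned list is not empty. Note that an uploader who commits under a different address than the one in d/control would show up as a false positive with this approach.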
=====================
Some caveats:
1. To save time and disk space, I did shallow clones of the DPT's git repositories, using 2021-08-14 as a cutoff date.
This certainly creates false positives, but it seemed like a reasonable tradeoff.
2. There might be some discrepancies between packages' git repositories on Salsa and what's been uploaded to the archive. Some of these repos might not have gone through NEW at all.
3. A cursory look revealed a bunch of empty repositories [2] and packages that had been moved to other namespaces, but never removed from the DPT's namespace.
4. Packages flagged as "None" don't have an "Uploaders" field. Either:
* the DPT is the "Maintainer" and we should make sure the package has been orphaned
or
* there's a human "Maintainer" and the DPT isn't listed in Uploaders and we should fix the package.
5. Some of the packages flagged as "Debian Python Team" have the team as Uploaders. Considering the recent policy change, we should probably try to make things more uniform by having the DPT as maintainer everywhere.
=====================
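Caveats 4 and 5 amount to a small decision table on the Maintainer and Uploaders fields. A minimal sketch of that triage is shown below, assuming the team is recognised by the string "Debian Python Team" in the field value; the function name, the returned labels and the e-mail addresses are hypothetical and not taken from [1].

```python
TEAM_NAME = "Debian Python Team"


def classify(maintainer, uploaders):
    """Sort a package into one of the cases from caveats 4 and 5.

    `maintainer` and `uploaders` are the raw field values from
    d/control; `uploaders` may be None or empty when the field is absent.
    """
    team_is_maintainer = TEAM_NAME in (maintainer or "")
    if not uploaders:
        if team_is_maintainer:
            # Caveat 4, first case: nobody but the team is listed.
            return "team-only"
        # Caveat 4, second case: human Maintainer, team not listed.
        return "no-team"
    if TEAM_NAME in uploaders:
        # Caveat 5: the team is listed as an uploader, not as Maintainer.
        return "team-as-uploader"
    return "has-uploaders"


# Placeholder addresses, for illustration only.
print(classify("Debian Python Team <team@example.org>", None))
print(classify("Jane Doe <jane@example.org>", ""))
print(classify("Jane Doe <jane@example.org>", "Debian Python Team <team@example.org>"))
```

Only packages in the "has-uploaders" bucket would then go through the commit-history check; the other three need a human to look at d/control.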
All in all, a fair amount of manual work is probably needed to make this list useful and remove false positives, thus the 'QA Experiment' part in the title of this mail.
Before putting more effort into this, I wanted to hear from other team members. Do you think this is valuable work?
If I get a good enough list (again, the current one needs work), would you support removing "stale uploaders" from our team-maintained packages?
Cheers,
[1]: https://salsa.debian.org/pollo/qa-scripts/-/blob/master/dpt-stale-uploaders.py
[2]: See empty.txt. I took the liberty of removing them, as they were all older than 6 months.
As a first cut, I would exclude archived repositories for removed packages (e.g. ptable).
If the team as a whole is keeping a package up to date, I think we should be happy that the package is maintained and not expend effort to make it harder for people to do that.
I'm really not sure how you are going to sort through this. Another example is appdirs. Yes, I didn't upload it the last few times someone touched it, but it's not in need of an upload now. What it really needs is to be removed, but there's a key package that depends on it. I don't think it's stale at all in any meaningful sense, but I don't know how you would know that without spending more time on the package than it's worth.
I think it would be more useful to work on finding team packages that aren't maintained at all and have issues. Those should either be fixed or removed.
Scott K
On August 15, 2024 8:45:24 PM UTC, "Louis-Philippe Véronneau" <pollo@debian.org> wrote:
[...]