XPost: linux.debian.maint.python
On Fri, 2023-02-24 at 19:33 +0100, Paul Gevers wrote:
Hi Diane,
On 23-02-2023 08:12, Diane Trout wrote:
the version of python3-xlrd 1.2.0-3 in unstable/testing is too old to
be used with pandas 1.5.3. (See Bug #1031701).
Do I understand correctly that this isn't an issue from the point of
view of python3-xlrd and that only pandas is affected? While
investigating for this reply I noticed src:pandas doesn't even have a
dependency in any of its binaries.
It looks like the xlrd dependency was commented out because the Debian
version is too old, though apparently that was done 7 months ago.
https://salsa.debian.org/science-team/pandas/-/blob/main/debian/control#L45
Here's the pandas module that conditionally uses xlrd if it's
available.
https://salsa.debian.org/science-team/pandas/-/blob/main/pandas/io/excel/_xlrd.py
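To check whether a given system is affected, something like the following rough sketch could help. It is not how pandas does its own check; the helper names are made up for illustration, and the 2.0.1 threshold is my assumption about what pandas 1.5.x wants (the bug report only establishes that 1.2.0 is too old).

```python
from importlib import metadata

# Assumed minimum; the thread only establishes that 1.2.0 is too old
# for pandas 1.5.3. Adjust to whatever your pandas actually requires.
ASSUMED_MIN_XLRD = (2, 0, 1)


def parse_version(text):
    """Turn '1.2.0' into (1, 2, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def xlrd_status(minimum=ASSUMED_MIN_XLRD):
    """Return 'missing', 'too old', or 'ok' for the installed xlrd."""
    try:
        installed = metadata.version("xlrd")
    except metadata.PackageNotFoundError:
        return "missing"
    return "ok" if parse_version(installed) >= minimum else "too old"


if __name__ == "__main__":
    print("xlrd is", xlrd_status())
```

With the 1.2.0 currently in unstable/testing this should report "too old" under the assumed threshold.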
As it is a really common workflow to use pandas to read excel files,
it'd be nice if the version of xlrd in bookworm was compatible.
As the maintainer of pandas, do you consider it an RC issue that pandas
can't convert it? I guess not because you say "it'd be nice" and you
don't even have the required dependency. How severe do you consider
this issue for pandas? pandas has a quite extensive autopkgtest,
doesn't it cover this use case? Apparently you knew this earlier, why
do you bring this up now?
The issue is somewhere between a minor and a normal bug; it breaks a
small component of the library.
I wouldn't claim to be a maintainer of pandas; I feel Rebecca Palmer
has been doing the vast majority of the work keeping pandas updated in
Debian.
I started investigating this after my coworker ran into it while trying
to process an .xls file. And when I looked, I saw someone else had also
recently filed the same bug report.
Because of the freeze I wanted to check if it was appropriate to upload
the new version,
I'd hope that the "rules" are clear: https://release.debian.org/testing/freeze_policy.html#soft. You can
contact the Release Team if you need further clarification.
and what kind of warning I should give to the other
developers.
It depends. I'm worried about what you write below.
That's fair.
The counter argument is that xlrd's support for handling the xml based
.xlsx files was unsafe since Python 3.9, and it has been recommended to
switch to another package like openpyxl to handle xlsx files for a
while.
(Release announcement thread from xlrd mentioning the removal, which
then goes on to discuss the security issues)
https://groups.google.com/g/python-excel/c/IRa8IWq_4zk/m/Af8-hrRnAgAJ
The reason the issue doesn't show up much is that .xls files are
deprecated by nearly everyone; this only shows up when you're reading
old data or files generated by old software.
The reason this is likely a minor issue is that there's a simple
workaround, which is to convert your xls file to an xlsx file.
Here's Pandas's discussion about deprecating xlrd for xlsx files.
https://github.com/pandas-dev/pandas/issues/28547
Here's the list of packages I found that have any relationship to
python-xlrd, if it looked like the autopkgtests actually tested using
the xlrd library and what the level of declared dependency is.
(none means the package lacks autopkgtests)

package              | autopkgtest uses xlrd? | dependency
nemo                 | none                   | Recommends
odoo-14              | none                   | Depends
ofxstatement-plugins | none                   | Depends
psychopy             | unlikely               | Depends
python3-agateexcel   | yes                    | Depends
python3-canmatrix    | no                     | Recommends
python3-drslib       | no                     | Recommends
python3-glue         | yes                    | Depends
python3-pyspectral   | probably               | Suggests
python3-rows         | unlikely               | Recommends
python3-tablib       | unlikely               | Depends
visidata             | none                   | Build-Depends
vistrails            | none                   | Build-Depends
python-xrt           | none                   | Build-Depends
pyutilib             | none                   | Build-Depends
If I read everything correctly, it seems like you're too late with this
change.
With a bit more wakefulness, I looked through the packages that have
any dependency on xlrd.
I think odoo-14 is the package most likely to have issues. They use
xlrd and seem to expect to be able to read and write xls & xlsx files
using xlrd. Needless to say, updating xlrd would then break the ability
to process xlsx files. Though of course the xlrd upstream thinks that's unreliable, and I have no idea how important this feature is to them.
(the odoo repository also has tests, and someone could in theory write autopkgtests for it)
I couldn't figure out what pyspectral is doing.
These packages (ofxstatement-plugins, psychopy, python3-agateexcel,
python3-rows, python3-tablib, and visidata) appear to also depend
on/recommend openpyxl, so they likely use xlrd for .xls files and
openpyxl for .xlsx files, as xlrd has been recommending.
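For anyone unsure what that split looks like in practice, here is a minimal sketch. I have not checked that any of the packages above do it this way; the helper names are hypothetical, and the only real API relied on is the engine parameter of pandas.read_excel, which accepts "xlrd" and "openpyxl".

```python
from pathlib import Path

# One engine per file format, following the xlrd upstream recommendation:
# xlrd for legacy .xls, openpyxl for the XML based .xlsx.
ENGINE_BY_SUFFIX = {
    ".xls": "xlrd",
    ".xlsx": "openpyxl",
}


def pick_excel_engine(path):
    """Return the pandas engine name for a spreadsheet path (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    try:
        return ENGINE_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"unsupported spreadsheet suffix: {suffix!r}") from None


def read_spreadsheet(path):
    """Read a spreadsheet with pandas, choosing the engine by file suffix."""
    # Imported lazily so pick_excel_engine() works without pandas installed.
    import pandas as pd

    return pd.read_excel(path, engine=pick_excel_engine(path))
```

A caller written like this only needs a working xlrd for .xls input, which is the case that Bug #1031701 is about.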
python3-canmatrix uses a different package python3-xlsxwriter to deal
with xlsx files
https://salsa.debian.org/python-team/packages/python-canmatrix/-/blob/debian/main/setup.py#L104
Nemo looks to only be using xlrd for older .xls files, and has a
different tool for the newer files. They seem to be using mimetypes and
use this block for .xlsx files.
https://salsa.debian.org/search?search=vnd.openxmlformats-officedocument.spreadsheetml.sheet&nav_source=navbar&project_id=17703&group_id=2992&search_code=true&repository_ref=master
and this block for .xls files
https://salsa.debian.org/cinnamon-team/nemo/-/blob/master/search-helpers/mso-xls.nemo_search_helper
python3-drslib appears to be expecting to be used on .xls files.
(looking through)
https://sources.debian.org/src/drslib/0.3.1.p3-2/drslib/p_cmip5/init.py/
vistrails only lists xlrd as a build depends, and its tests seem to
think it might work with both xls and xlsx files, but the test code in
the package seems to only test xls files.
And as an aside, I found that python-xrt probably should remove
python3-xlrd from its build dependencies as the package doesn't seem
to use it.
https://codesearch.debian.net/search?q=package%3Apython-xrt+xlrd
Ultimately the argument that this is a relatively minor feature cuts
both ways. It suggests the risk of updating is relatively low, but
also that there's less reason to update.
Thank you for your time evaluating this request.
Diane