• Remaining numpy 1.24 issues on 32bit architectures

    From Andreas Tille@21:1/5 to All on Sun Jan 29 12:20:01 2023
    Hi,

    I think there are some remaining issues with the numpy 1.24 migration on
    32-bit architectures[1].

    Here is one example:

    _________ TestSequence.test_getitem_with_slice_has_positional_metadata _________

    self = <skbio.sequence.tests.test_sequence.TestSequence testMethod=test_getitem_with_slice_has_positional_metadata>

    def test_getitem_with_slice_has_positional_metadata(self):
        s = "0123456789abcdef"
        length = len(s)
        seq = Sequence(s, metadata={'id': 'id3', 'description': 'dsc3'},
                       positional_metadata={'quality': np.arange(length)})

        eseq = Sequence("012", metadata={'id': 'id3', 'description': 'dsc3'},
                        positional_metadata={'quality': np.arange(3)})
        self.assertEqual(seq[0:3], eseq)
        self.assertEqual(seq[:3], eseq)
        self.assertEqual(seq[:3:1], eseq)

        eseq = Sequence("def", metadata={'id': 'id3', 'description': 'dsc3'},
                        positional_metadata={'quality': [13, 14, 15]})
        self.assertEqual(seq[-3:], eseq)
    E AssertionError: Seque[128 chars]: int32>
    E Stats:
    E length: 3
    E ----------------[14 chars]0 def != Seque[128 chars]: int64>
    E Stats:
    E length: 3
    E ----------------[14 chars]0 def

    skbio/sequence/tests/test_sequence.py:748: AssertionError

    How can I ensure that the arrays have the same type in both cases? (I think it
    makes no difference whether it is np.int32 or np.int64 as long as they are of
    the same type.)
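
    For what it's worth, the mismatch can probably be reproduced outside
    skbio. The sketch below assumes that positional_metadata ends up in a
    pandas DataFrame (an assumption, not checked against the skbio source):
    a column built from np.arange() keeps NumPy's platform default integer,
    which is int32 on 32-bit architectures, while a column built from a plain
    Python list is inferred by pandas as int64. All variable names are only
    illustrative.

```python
import numpy as np
import pandas as pd

# Column built from np.arange(): keeps NumPy's platform default integer
# (int32 on 32-bit architectures, int64 on typical 64-bit Linux).
from_arange = pd.DataFrame({'quality': np.arange(16)})

# Column built from a plain Python list: pandas infers int64 everywhere.
from_list = pd.DataFrame({'quality': [13, 14, 15]})

arange_dtype = from_arange['quality'].dtype
list_dtype = from_list['quality'].dtype

# The values of the last three rows agree; only the dtype can differ.
tail = from_arange['quality'].iloc[-3:].to_numpy()
same_values = bool((tail == from_list['quality'].to_numpy()).all())

print(arange_dtype, list_dtype, same_values)
```

    If that assumption holds, this would explain why only the comparison
    against the list-based eseq fails, while the np.arange(3)-based ones pass.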

    Kind regards

    Andreas.

    [1] https://buildd.debian.org/status/package.php?p=python-skbio&suite=experimental

    --
    http://fam-tille.de

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andreas Tille@21:1/5 to All on Mon Jan 30 09:20:01 2023
    Hi,

    I made some tiny steps forward ("only" 84 failures instead of the 89 when I
    wrote my first mail) in the numpy 1.24 migration for 32-bit architectures,
    but I'm facing issues I have no real clue about. In

    https://salsa.debian.org/med-team/python-skbio/-/blob/master/debian/patches/numpy-1.24.patch#L123-L126

    I tried three variants of how I could fix

    _______________________ AlphaDiversityTests.test_no_ids ________________________
    self = <skbio.diversity.tests.test_driver.AlphaDiversityTests testMethod=test_no_ids>
    def test_no_ids(self):
        # expected values hand-calculated
        expected = pd.Series([3, 3, 3, 3])
        # All this does not help
        # expected = pd.Series(np.array([3, 3, 3, 3], int32))
        # actual = np.int64(alpha_diversity('observed_otus', self.table1))
        # actual = np.dtype('int64').type(alpha_diversity('observed_otus', self.table1))
        actual = alpha_diversity('observed_otus', self.table1)
        assert_series_almost_equal(actual, expected)

    skbio/diversity/tests/test_driver.py:260:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    left = 0    3
    1    3
    2    3
    3    3
    dtype: int32
    right = 0    3
    1    3
    2    3
    3    3
    dtype: int64

    def assert_series_almost_equal(left, right):
        # pass all kwargs to ensure this function has consistent behavior even if
        # `assert_series_equal`'s defaults change
        pdt.assert_series_equal(left, right,
                                check_dtype=True,
                                check_index_type=True,
                                check_series_type=True,
                                check_names=True,
                                check_exact=False,
                                check_datetimelike_compat=False,
                                obj='Series')
    E AssertionError: Attributes of Series are different
    E
    E Attribute "dtype" are different
    E [left]: int32
    E [right]: int64
    skbio/util/_testing.py:323: AssertionError
    ____________________ AlphaDiversityTests.test_observed_otus ____________________
    self = <skbio.diversity.tests.test_driver.AlphaDiversityTests testMethod=test_observed_otus>
    def test_observed_otus(self):
        # expected values hand-calculated
        expected = pd.Series([3, 3, 3, 3], index=self.sids1)
        actual = alpha_diversity('observed_otus', self.table1, self.sids1)
        assert_series_almost_equal(actual, expected)

    skbio/diversity/tests/test_driver.py:223:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    left = A    3
    B    3
    C    3
    D    3
    dtype: int32
    right = A    3
    B    3
    C    3
    D    3
    dtype: int64

    def assert_series_almost_equal(left, right):
        # pass all kwargs to ensure this function has consistent behavior even if
        # `assert_series_equal`'s defaults change
        pdt.assert_series_equal(left, right,
                                check_dtype=True,
                                check_index_type=True,
                                check_series_type=True,
                                check_names=True,
                                check_exact=False,
                                check_datetimelike_compat=False,
                                obj='Series')
    E AssertionError: Attributes of Series are different
    E
    E Attribute "dtype" are different
    E [left]: int32
    E [right]: int64
    skbio/util/_testing.py:323: AssertionError
    ______________________ AlphaDiversityTests.test_optimized ______________________
    self = <skbio.diversity.tests.test_driver.AlphaDiversityTests testMethod=test_optimized>
    def test_optimized(self):
        # calling optimized faith_pd gives same results as calling unoptimized
        # version
        optimized = alpha_diversity('faith_pd', self.table1, tree=self.tree1,
                                    otu_ids=self.oids1)

    skbio/diversity/tests/test_driver.py:265:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _


    which obviously[2] failed. I wonder whether someone might give some
    hints on how to get the dtypes consistently to one integer representation,
    which is the background of nearly all these test suite issues.

    Kind regards
    Andreas.

    [2] https://salsa.debian.org/med-team/python-skbio/-/jobs/3868951

    --
    http://fam-tille.de

  • From Nilesh Patra@21:1/5 to Andreas Tille on Mon Jan 30 12:40:01 2023
    Hi,

    On Mon, Jan 30, 2023 at 09:12:57AM +0100, Andreas Tille wrote:
    I made some tiny steps forward ("only" 84 failures instead of 89 when I
    wrote my first mail) in the numpy 1.24 migration for 32bit architectures
    but I'm facing issues I do not have a real clue for. In

    https://salsa.debian.org/med-team/python-skbio/-/blob/master/debian/patches/numpy-1.24.patch#L123-L126

    Apologies for steering the discussion in an orthogonal direction for
    once. Of course we could try fixing these, but if you look closely, skbio
    has not built on i386 since around 2016[3], and it has never built on the
    other 32-bit architectures since it entered Debian[4]. So the new upstream
    FTBFS that you point to won't really block migration in any way.

    So my question is this: why are we trying hard to fix this on 32-bit _now_,
    given that upstream support for this package has never been solid on
    32-bit archs?

    ...
    which obviously[2] failed. I wonder whether someone might give some
    hints how to get dtypes consistently to one integer representation which
    is the background of nearly all these test suite issues.

    I can think of two alternatives to fix this:

    1. The skbio source code contains a few type conversions to "int"
    (.astype(int)). On 32-bit machines this defaults to a 32-bit integer type.
    Casting explicitly to a 64-bit type instead can fix this. I recently wrote
    a similar patch for another package; see [5] if it helps.

    2. Just ignore dtypes when comparing pandas dataframes, using the
    `check_dtype` parameter. An example/reference patch is here[6].
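
    Roughly, the two alternatives look like this. It is only a sketch: the
    int32 result is simulated with an explicit cast so that it behaves the
    same on any architecture, and the variable names are made up for
    illustration rather than taken from skbio.

```python
import numpy as np
import pandas as pd
import pandas.testing as pdt

# pandas infers int64 from a plain Python list on every platform.
expected = pd.Series([3, 3, 3, 3])

# Simulated 32-bit result: on i386/armhf, .astype(int) would give int32.
actual_32 = pd.Series(np.array([3, 3, 3, 3]).astype(np.int32))

# The strict comparison is what fails in the test suite.
try:
    pdt.assert_series_equal(actual_32, expected, check_dtype=True)
    strict_check_failed = False
except AssertionError:
    strict_check_failed = True

# Alternative 1: cast to a fixed width instead of the platform "int".
actual_fixed = pd.Series(np.array([3, 3, 3, 3]).astype(np.int64))
pdt.assert_series_equal(actual_fixed, expected, check_dtype=True)

# Alternative 2: keep the int32 result and skip the dtype check.
pdt.assert_series_equal(actual_32, expected, check_dtype=False)

print(strict_check_failed, actual_fixed.dtype, actual_32.dtype)
```

    The first variant changes what the library returns, the second only
    relaxes the tests, so which one fits depends on whether the integer width
    matters to users of the result.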

    [1] https://buildd.debian.org/status/package.php?p=python-skbio&suite=experimental
    [2] https://salsa.debian.org/med-team/python-skbio/-/jobs/3868951
    [3] https://buildd.debian.org/status/logs.php?pkg=python-skbio&arch=i386
    [4] https://buildd.debian.org/status/logs.php?pkg=python-skbio&arch=armhf
    [5] https://salsa.debian.org/med-team/python-bioframe/-/blob/master/debian/patches/32-bits.patch
    [6] https://salsa.debian.org/python-team/packages/python-upsetplot/-/blob/master/debian/patches/ignore-dtype-while-asserting.patch

    --
    Best,
    Nilesh


  • From Andreas Tille@21:1/5 to Nilesh Patra on Mon Jan 30 13:50:01 2023
    Hi Nilesh,

    On Mon, Jan 30, 2023 at 05:07:37PM +0530, Nilesh Patra wrote:

    Apologies for steering the discussion in an orthogonal direction for
    once. Of course we could try fixing these, but if you look closely, skbio
    has not built on i386 since around 2016[3], and it has never built on the
    other 32-bit architectures since it entered Debian[4]. So the new upstream
    FTBFS that you point to won't really block migration in any way.

    Hmmm, I checked
    https://buildd.debian.org/status/package.php?p=python-skbio
    before I started to invest time in this, and it says `uncompiled`
    for the 32-bit architectures.

    So my question is this: Why are we trying hard to fix this on 32-bit _now_ given that the upstream support has never been solid for this package on 32-bit archs?

    I admit that 0.5.8-2 has migrated, which I did not expect, since the
    excuses contained those build problems when I looked.


    Thanks for the additional hints (and have a nice trip).

    Kind regards
    Andreas.

    --
    http://fam-tille.de
