• ngircd test suite failure on mips64el

    From James Cowgill@21:1/5 to Christoph Biedl on Tue Aug 22 20:30:02 2017
    Copy: debian-mips@lists.debian.org

    Hi,

    On 22/08/17 17:51, Christoph Biedl wrote:
    > Hello (debian-mipsel and ngircd upstream),
    >
    > my latest upload of ngircd (24-2) a few days ago failed to build on
    > mips64el, and on that architecture only. Checking the build log[0] and
    > additional tests on the Debian porter box "eller" showed the cause is
    > in the "test-mode" test of the test suite. More precisely, the
    >
    > | send "mode #usermode +a nick\r"
    > [ mode-test.e:57 ]
    >
    > does not return a string containing "482", thus this test is considered
    > failing, and so the entire build. Same for the "+q" test that follows
    > once the above "+a" test is disabled.
    > [...]
    > The previous upload of ngircd 24-1 back in January was built using
    > gcc-6 and showed no problems, also there are no changes between -1 and
    > -2 that would even remotely explain this behaviour.
    >
    > Ultimately, after rebuilding on mips64el using -O0 the test suite
    > passes. Although it's usually premature to assume it: This smells like
    > a compiler bug.
    >
    > Could anyone shed some light on this? Or perhaps extract the relevant
    > parts for a small reproducer?

    This is very likely #871514 in gcc-7. Unfortunately it is affecting a
    lot of packages in the archive at the moment causing incorrect code to
    be generated.

    The bug happens if a "small" variable is spilled to the stack. GCC may
    emit a store of the smaller size to the stack, but then load it back as
    a 64-bit integer. The top bits of the register will then contain some
    garbage on the stack and cause comparisons like in your example to fail
    (the bottom bits will be 0 but the top bits won't be).

    Thanks,
    James


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Christoph Biedl@21:1/5 to All on Tue Aug 22 21:30:02 2017
    affects 871514 + ngircd
    found 871514 7.2.0-1
    thanks

    James Cowgill wrote...

    > On 22/08/17 17:51, Christoph Biedl wrote:
    >
    > > Ultimately, after rebuilding on mips64el using -O0 the test suite
    > > passes. Although it's usually premature to assume it: This smells like
    > > a compiler bug.
    >
    > This is very likely #871514 in gcc-7. Unfortunately it is affecting a
    > lot of packages in the archive at the moment causing incorrect code to
    > be generated.

    Ouch, I see a huge binNMU job approaching.

    > The bug happens if a "small" variable is spilled to the stack. GCC may
    > emit a store of the smaller size to the stack, but then load it back as
    > a 64-bit integer. The top bits of the register will then contain some
    > garbage on the stack and cause comparisons like in your example to fail
    > (the bottom bits will be 0 but the top bits won't be).

    Thanks for sharing, this seems to be the case indeed:

    | ./src/ngircd/irc-mode.c:725
    | if(!is_oper && !is_machine && !is_owner && !is_admin) {
    | 29aa8: 16c00141 bnez s6,29fb0 <Channel_Mode+0xd58>
    | 29aac: 0257102a slt v0,s2,s7
    | ./src/ngircd/irc-mode.c:725 (discriminator 1)
    ! 29ab0: dfa20000 ld v0,0(sp)
    | 29ab4: 1440013e bnez v0,29fb0 <Channel_Mode+0xd58>
    | 29ab8: 0257102a slt v0,s2,s7
    | ./src/ngircd/irc-mode.c:725 (discriminator 2)
    ! 29abc: dfa20018 ld v0,24(sp)
    | 29ac0: 144000c5 bnez v0,29dd8 <Channel_Mode+0xb80>
    | 29ac4: 0257102a slt v0,s2,s7
    | ./src/ngircd/irc-mode.c:725 (discriminator 3)
    ! 29ac8: dfa20020 ld v0,32(sp)
    | 29acc: 10400465 beqz v0,2ac64 <Channel_Mode+0x1a0c>

    The type of the four variables is "bool", so it's certainly something
    smaller.

    Christoph

  • From Christoph Biedl@21:1/5 to All on Tue Aug 22 20:00:02 2017
    Hello (debian-mipsel and ngircd upstream),

    my latest upload of ngircd (24-2) a few days ago failed to build on
    mips64el, and on that architecture only. Checking the build log[0] and
    additional tests on the Debian porter box "eller" showed the cause is
    in the "test-mode" test of the test suite. More precisely, the

    | send "mode #usermode +a nick\r"
    [ mode-test.e:57 ]

    does not return a string containing "482", thus this test is considered
    failing, and so the entire build. Same for the "+q" test that follows
    once the above "+a" test is disabled.

    The related code is in irc-mode.c:723:

        case 'q': /* Owner */
        case 'a': /* Channel admin */
            if(!is_oper && !is_machine && !is_owner && !is_admin) {
                connected = IRC_WriteErrClient(Origin,
                    ERR_CHANOPPRIVTOOLOW_MSG,
                    Client_ID(Origin),
                    Channel_Name(Channel));
                goto chan_exit;
            }

    where ERR_CHANOPPRIVTOOLOW_MSG contains the expected "482" string
    constant.

    After adding debug print statements:

         case 'a': /* Channel admin */
    +        Log(LOG_ALERT, "is_oper = %d", is_oper);
    +        Log(LOG_ALERT, "is_machine = %d", is_machine);
    +        Log(LOG_ALERT, "is_owner = %d", is_owner);
    +        Log(LOG_ALERT, "is_admin = %d", is_admin);
             if(!is_oper && !is_machine && !is_owner && !is_admin) {

    and running the test manually, i.e.

    cd src/testsuite
    strace -s 2048 -f -tt -o ~/strace.log sh -c './start-server1 ; ./mode-test ; ./stop-server1'

    I find for both architectures the same value set (all zero). The strace
    log, trimmed for readability

    [amd64]
    read(7, "mode #usermode +a nick\r\n", 2048) = 24
    write(1, "[599904:1 7] is_oper = 0\n", 28) = 28
    write(1, "[599904:1 7] is_machine = 0\n", 31) = 31
    write(1, "[599904:1 7] is_owner = 0\n", 29) = 29
    write(1, "[599904:1 7] is_admin = 0\n", 29) = 29
    write(7, ":ngircd.test.server 482 nick #usermode :Your privileges are too low\r\n", 69 <unfinished ...>

    [mips64el]
    read(7, "mode #usermode +a nick\r\n", 2048) = 24
    write(1, "[23415:1 8] is_oper = 0\n", 27) = 27
    write(1, "[23415:1 8] is_machine = 0\n", 30) = 30
    write(1, "[23415:1 8] is_owner = 0\n", 28) = 28
    write(1, "[23415:1 8] is_admin = 0\n", 28) = 28
    write(7, ":nick!~user@127.0.0.1 MODE #usermode +a nick\r\n", 46) = 46

    strongly suggests the code on mips64el does *not* follow the if()
    statement although all preconditions are met.

    The previous upload of ngircd 24-1 back in January was built using
    gcc-6 and showed no problems, also there are no changes between -1 and
    -2 that would even remotely explain this behaviour.

    Ultimately, after rebuilding on mips64el using -O0 the test suite
    passes. Although it's usually premature to assume it: This smells like
    a compiler bug.

    Could anyone shed some light on this? Or perhaps extract the relevant
    parts for a small reproducer?

    Christoph

    [0] https://buildd.debian.org/status/fetch.php?pkg=ngircd&arch=mips64el&ver=24-2&stamp=1503078437&raw=0
