• arm64 char conversion issue

    From Mike Scott@21:1/5 to All on Mon Jan 31 10:53:05 2022
    I've hit an interesting problem. This is on fbsd13/arm64/rpi4, and
    there's something amiss with char to int conversions. Or at least, a
    difference from i386.


    The immediate symptom is that spfmilter won't start - just says
    'unrecognised option', with a seemingly empty string appended that's
    actually 0xFF.

    This comes from the getopt() loop, whose guts are

    char c;
    while ( ( c = getopt_long( argc, argv, shortopts, longopts, &idx ) ) !=
    -1 ) {
    .....


    It turns out that on the pi, the integer result from getopt is truncated
    and then zero-extended rather than sign-extended, so the comparison
    becomes 0xFF != -1, which is always true: the loop never sees the end
    of the options.


    A simple test program like
    char c = -1;
    int x = c;
    printf("c=%x x=%x\n", c, x);


    gives on the rpi (it truncates)
    c=ff x=ff

    but on an i386 (fbsd11.4) (it sign-extends)
    c=ffffffff x=ffffffff


    Any thoughts please? This seems potentially a major gotcha. The
    spfmilter code uses a char to hold getopt()'s result - as long as it
    sign-extends, no issue: but truncation will cause an error. I see the
    man page example for getopt uses an int, which is clearly safer.

    I confess I've been away from C programming for so long, I forget what
    the correct behaviour is. Any thoughts? Is spfmilter wrong, or is the
    arm64 compiler wrong?


    --
    Mike Scott
    Harlow, England

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Peter van Hooft@21:1/5 to Mike Scott on Mon Jan 31 13:02:31 2022
    On 2022-01-31, Mike Scott <usenet.16@scottsonline.org.uk.invalid> wrote:

    > I've hit an interesting problem. This is on fbsd13/arm64/rpi4, and
    > there's something amiss with char to int conversions. Or at least, a
    > difference from i386.
    >
    > The immediate symptom is that spfmilter won't start - just says
    > 'unrecognised option', with a seemingly empty string appended that's
    > actually 0xFF.
    >
    > This comes from the getopt() loop, whose guts are
    >
    > char c;
    > while ( ( c = getopt_long( argc, argv, shortopts, longopts, &idx ) ) !=
    > -1 ) {
    > .....


    Different default for -funsigned-char and -fsigned-char ?

    peter

  • From Christian Weisgerber@21:1/5 to Mike Scott on Mon Jan 31 13:25:31 2022
    On 2022-01-31, Mike Scott <usenet.16@scottsonline.org.uk.invalid> wrote:

    > I've hit an interesting problem. This is on fbsd13/arm64/rpi4, and
    > there's something amiss with char to int conversions. Or at least, a
    > difference from i386.

    On arm64, plain "char" is "unsigned char". On x86 it's "signed char".
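
    A minimal sketch of that difference, naming the signedness explicitly
    so that it behaves the same on either platform. The helper names
    widen_signed and widen_unsigned are made up for illustration, and an
    8-bit char is assumed:

```c
#include <stdio.h>

/* Store a value in an explicitly signed char, then widen it to int.
 * This is what plain "char" does on x86. */
int widen_signed(int value)
{
    signed char c = (signed char)value;     /* -1 stays -1 */
    return c;                               /* sign-extended to int */
}

/* Store a value in an explicitly unsigned char, then widen it to int.
 * This is what plain "char" does on arm64. */
int widen_unsigned(int value)
{
    unsigned char c = (unsigned char)value; /* -1 wraps to 255 (0xFF) */
    return c;                               /* zero-extended to int */
}
```

    Calling widen_signed(-1) gives -1, while widen_unsigned(-1) gives 255,
    which can never compare equal to -1.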

    > char c;
    > while ( ( c = getopt_long( argc, argv, shortopts, longopts, &idx ) ) !=
    > -1 ) {

    That's broken code. It must be "int c". Only an int will capture
    all possible char values and additionally -1.
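
    A sketch of what the corrected loop might look like. The option table
    and the function name count_verbose are hypothetical and are not taken
    from spfmilter; only the "int c" declaration is the point:

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical option table for illustration; not spfmilter's real options. */
static const struct option example_longopts[] = {
    { "verbose", no_argument, NULL, 'v' },
    { NULL, 0, NULL, 0 }
};

/* Returns the number of -v/--verbose flags seen, or -1 on an unknown option. */
int count_verbose(int argc, char **argv)
{
    int c;          /* int, not char: must hold every option char and -1 */
    int idx = 0;
    int verbose = 0;

    optind = 1;     /* start scanning at the first argument */
    while ((c = getopt_long(argc, argv, "v", example_longopts, &idx)) != -1) {
        switch (c) {
        case 'v':
            verbose++;
            break;
        default:    /* getopt_long returned '?' for an unrecognised option */
            return -1;
        }
    }
    return verbose;
}
```

    With "int c" the -1 returned at the end of the options survives the
    assignment unchanged, whichever way plain char is signed.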

    The prototypical example is this:

    while ((c = getchar()) != EOF) { /* EOF is -1 */
    ...
    }

    With "char c" this is broken for both platforms:
    * On arm64 it's an infinite loop.
    * On x86 it will prematurely terminate when it encounters a byte
    with the value 0xff.
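
    For comparison, a small sketch of the correct form. It uses getc() on
    a FILE pointer instead of getchar() so that it can be run against any
    stream; the function name count_bytes is made up for illustration:

```c
#include <stdio.h>

/* Count the bytes remaining in a stream, up to end of file. */
long count_bytes(FILE *fp)
{
    int c;          /* int: getc() returns an unsigned char value or EOF */
    long n = 0;

    while ((c = getc(fp)) != EOF)
        n++;        /* a 0xFF byte arrives as 255, distinct from EOF */
    return n;
}
```

    Because c is an int, a 0xFF byte in the input is counted like any
    other byte and the loop still ends at the real end of file.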

    Any variable that is intended to hold any character plus EOF _must_
    be declared int. This is basic C.

    Most architectures default to signed char. Notably arm and powerpc,
    in their 32 and 64-bit incarnations, default to unsigned.
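
    One way to check which default a given compiler and target use is to
    look at CHAR_MIN from <limits.h>. A minimal sketch, with a made-up
    function name:

```c
#include <limits.h>

/* Returns 1 when plain char is signed on this compiler/target, else 0. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;    /* CHAR_MIN is 0 when plain char is unsigned */
}
```

    GCC and Clang also accept -fsigned-char and -funsigned-char to override
    the default, but building with one of those only hides a bug like the
    one above; declaring the variable as int fixes it.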

    --
    Christian "naddy" Weisgerber naddy@mips.inka.de

  • From Mike Scott@21:1/5 to Christian Weisgerber on Mon Jan 31 13:57:39 2022
    On 31/01/2022 13:25, Christian Weisgerber wrote:
    > On 2022-01-31, Mike Scott <usenet.16@scottsonline.org.uk.invalid> wrote:
    >
    >> I've hit an interesting problem. This is on fbsd13/arm64/rpi4, and
    >> there's something amiss with char to int conversions. Or at least, a
    >> difference from i386.
    >
    > On arm64, plain "char" is "unsigned char". On x86 it's "signed char".
    >
    >> char c;
    >> while ( ( c = getopt_long( argc, argv, shortopts, longopts, &idx ) ) !=
    >> -1 ) {
    >
    > That's broken code. It must be "int c". Only an int will capture
    > all possible char values and additionally -1.
    >
    > The prototypical example is this:
    >
    > while ((c = getchar()) != EOF) { /* EOF is -1 */
    > ...
    > }
    >
    > With "char c" this is broken for both platforms:
    > * On arm64 it's an infinite loop.
    > * On x86 it will prematurely terminate when it encounters a byte
    > with the value 0xff.
    >
    > Any variable that is intended to hold any character plus EOF _must_
    > be declared int. This is basic C.
    >
    > Most architectures default to signed char. Notably arm and powerpc,
    > in their 32 and 64-bit incarnations, default to unsigned.


    OK, thanks. I'll contact the program's author and suggest a patch is in
    order. Worrying that no-one on the fbsd team seems to have tried running
    the code though.

    I wonder how many things like this are lurking. Lucky this one gave an
    immediate and obvious error.

    Oh, and thanks for pointing out the differing defaults on the two
    platforms.



    --
    Mike Scott
    Harlow, England
