• Help porting Ceph 16.2.6 to mips64el, mipsel and armel

    From Thomas Goirand@21:1/5 to All on Sat Nov 20 11:40:02 2021
    Hi,

    The latest Ceph doesn't build on three architectures:

    https://buildd.debian.org/status/package.php?p=ceph&suite=experimental

    (plus the unofficial ports...)

    What worries me the most is mips64el, where the linker says "undefined
    reference to `__atomic_load_16'" (and more like this). I don't
    understand, because -latomic really is passed to GCC when linking, so
    it should work.

    I'm not sure what is happening on the other architectures, though it's
    important that at least the libraries build (so that QEMU can be linked
    with librbd support).

    Help from anyone, prior to upload to Unstable to replace Ceph 14.2.21,
    would be very much appreciated.

    Cheers,

    Thomas Goirand (zigo)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bastian Blank@21:1/5 to Thomas Goirand on Sat Nov 20 12:20:01 2021
    On Sat, Nov 20, 2021 at 11:31:04AM +0100, Thomas Goirand wrote:
    What worries me the most is mips64el, where the linker says "undefined
    reference to `__atomic_load_16'" (and more like this). I don't
    understand because there really is a -latomic parameter to GCC when
    linking, so it should be working.

    Not all architectures support sub-wordsize atomic operations. The only
    way to fix that is to use a larger type (long usually).

    Bastian


  • From Simon McVittie@21:1/5 to Thomas Goirand on Sat Nov 20 13:00:01 2021
    On Sat, 20 Nov 2021 at 11:31:04 +0100, Thomas Goirand wrote:
    Latest Ceph doesn't build on 3 arch:

    https://buildd.debian.org/status/package.php?p=ceph&suite=experimental

    (plus the unofficial ports...)

    What worries me the most is mips64el, where the linker says "undefined
    reference to `__atomic_load_16'" (and more like this). I don't
    understand because there really is a -latomic parameter to GCC when
    linking, so it should be working.

    If I download and unpack libatomic1_11.2.0-10_mips64el.deb,
    libatomic1_11.2.0-10_mips64el/DEBIAN/symbols mentions symbol
    __atomic_load_16@LIBATOMIC_1.0, so -latomic should have been
    sufficient.

    I think this is a problem with compiler argument order. It looks as though
    the failing link might be this one (one long line in the log, newlines
    added here to make it more comprehensible):

    /usr/bin/c++ \
    -g \
    -O2 \
    -ffile-prefix-map=/<<PKGBUILDDIR>>=. \
    -fstack-protector-strong \
    -Wformat \
    -Werror=format-security \
    -Wdate-time \
    -D_FORTIFY_SOURCE=2 \
    -O2 \
    -g \
    -DNDEBUG \
    -Wl,-z,relro \
    -Wl,-z,now \
    -Wl,--as-needed \
    -latomic \
    -rdynamic \
    -pie \
    CMakeFiles/rbd-mirror.dir/main.cc.o \
    -o \
    ../../../bin/rbd-mirror \
    \
    -Wl,-rpath,/<<PKGBUILDDIR>>/obj-mips64el-linux-gnuabi64/lib: \
    ../../../lib/librbd_mirror_internal.a \
    ../../../lib/librbd_mirror_types.a \
    ../../../lib/librbd_api.a \
    ../../../lib/librbd_internal.a \
    ../../../lib/librbd_types.a \
    ../../../lib/libjournal.a \
    ../../../lib/liblibneorados.a \
    ../../../lib/librados.so.2.0.0 \
    ../../../lib/libosdc.a \
    ../../../lib/libcls_rbd_client.a \
    ../../../lib/libcls_lock_client.a \
    ../../../lib/libcls_journal_client.a \
    ../../../lib/libglobal.a \
    ../../../lib/libheap_profiler.a \
    /usr/lib/mips64el-linux-gnuabi64/libtcmalloc.so \
    /usr/lib/mips64el-linux-gnuabi64/libssl.so \
    /lib/mips64el-linux-gnuabi64/libcryptsetup.so \
    ../../../lib/libceph-common.so.2 \
    ../../../lib/libfmt.a \
    /usr/lib/mips64el-linux-gnuabi64/libblkid.so \
    /usr/lib/mips64el-linux-gnuabi64/libcrypto.so \
    ../../../lib/libjson_spirit.a \
    ../../../lib/libcommon_utf8.a \
    ../../../lib/liberasure_code.a \
    ../../../lib/libcrc32.a \
    ../../../lib/libarch.a \
    /usr/lib/mips64el-linux-gnuabi64/libboost_thread.so.1.74.0 \
    /usr/lib/mips64el-linux-gnuabi64/libboost_atomic.so.1.74.0 \
    /usr/lib/mips64el-linux-gnuabi64/libboost_random.so.1.74.0 \
    /usr/lib/mips64el-linux-gnuabi64/libboost_system.so.1.74.0 \
    /usr/lib/mips64el-linux-gnuabi64/libboost_program_options.so.1.74.0 \
    /usr/lib/mips64el-linux-gnuabi64/libboost_date_time.so.1.74.0 \
    /usr/lib/mips64el-linux-gnuabi64/libboost_iostreams.so.1.74.0 \
    -lstdc++fs \
    -pthread \
    /usr/lib/mips64el-linux-gnuabi64/libudev.so \
    /usr/lib/mips64el-linux-gnuabi64/libibverbs.so \
    /usr/lib/mips64el-linux-gnuabi64/librdmacm.so \
    -ldl \
    /usr/lib/mips64el-linux-gnuabi64/librt.so \
    -lresolv

    You'll notice -latomic appears *before* the various .a and .o files, but
    in general, link order matters: each object on the linker command-line is
    only used to satisfy the dependencies of objects that appear before it.
    So you'll want -latomic to appear later, next to other dependencies
    like -ldl.
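
    The left-to-right rule above can be sketched as a toy model. This is
    not how GNU ld is implemented; LinkInput, unresolved_symbols and
    demo_link_order are made-up names for illustration, and each library is
    treated as a single unit rather than as a set of archive members. It
    only shows why a library listed before its users ends up ignored:

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// One input on the link line. Hypothetical type, for illustration only.
struct LinkInput {
    std::string name;
    bool is_library;                // archive, or shared library under --as-needed
    std::set<std::string> defines;  // symbols this input provides
    std::set<std::string> needs;    // symbols this input references
};

// Walk the inputs left to right and return the symbols left undefined.
std::set<std::string> unresolved_symbols(const std::vector<LinkInput>& link_line) {
    std::set<std::string> defined;
    std::set<std::string> undefined;
    for (const LinkInput& input : link_line) {
        if (input.is_library) {
            // A library is only used if something *earlier* still needs it.
            bool wanted = false;
            for (const std::string& sym : input.defines) {
                if (undefined.count(sym) != 0) {
                    wanted = true;
                    break;
                }
            }
            if (!wanted) {
                continue;  // dropped: nothing seen so far depends on it
            }
        }
        for (const std::string& sym : input.defines) {
            defined.insert(sym);
            undefined.erase(sym);
        }
        for (const std::string& sym : input.needs) {
            if (defined.count(sym) == 0) {
                undefined.insert(sym);
            }
        }
    }
    return undefined;
}

// The two orderings discussed above, reduced to two inputs.
void demo_link_order() {
    const LinkInput atomic_lib{"-latomic", true, {"__atomic_load_16"}, {}};
    const LinkInput main_obj{"main.cc.o", false, {"main"}, {"__atomic_load_16"}};

    // -latomic first: the symbol stays undefined.
    assert(unresolved_symbols({atomic_lib, main_obj}).count("__atomic_load_16") == 1);
    // -latomic last: everything resolves.
    assert(unresolved_symbols({main_obj, atomic_lib}).empty());
}
```

    In the model, moving the library after the object that references it is
    the only change needed, which matches the advice to place -latomic next
    to -ldl at the end of the command.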

    A common reason for this to be a problem is adding -latomic (or some
    other library) to Automake's foo_LDFLAGS instead of the correct foo_LDADD
    or foo_LIBADD, or the equivalent in other build systems. I see Ceph is
    built with CMake: I don't know the specifics of how this fits together
    in CMake, but most likely it has the same distinction between
    dependencies and non-dependency linker options that e.g. Automake and
    Meson do.

    Regarding the other failing release architectures:

    On armel, the error seems to be:

    /tmp/cc3CvITT.s:2594: Error: selected processor does not support `yield' in ARM mode
    make[7]: *** [CMakeFiles/rocksdb.dir/build.make:3436: CMakeFiles/rocksdb.dir/third-party/folly/folly/synchronization/DistributedMutex.cpp.o] Error 1

    so probably there is some inline assembly that assumes ARMv6 or later and
    does not account for armel's ARMv5-based baseline.
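
    One way such inline assembly is often guarded is sketched below. This
    is an assumption about the general technique, not the code in folly or
    in any Ceph patch; cpu_relax and spin_until_set are hypothetical names.
    It relies on the __ARM_ARCH macro that GCC and Clang predefine on ARM
    targets, and falls back to a plain compiler barrier where no hint
    instruction is known to be available:

```cpp
#include <atomic>
#include <cassert>

// Emit a spin-wait hint only where the target is known to support one.
inline void cpu_relax() {
#if defined(__aarch64__) || \
    (defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7))
    __asm__ __volatile__("yield" ::: "memory");
#elif defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("pause" ::: "memory");
#else
    // ARMv5 (the armel baseline) and unknown targets: compiler barrier only.
    __asm__ __volatile__("" ::: "memory");
#endif
}

// Hypothetical caller: spin until `flag` is set or `max_spins` is reached,
// and return how many times we spun.
inline int spin_until_set(const std::atomic<bool>& flag, int max_spins) {
    assert(max_spins >= 0);
    int spins = 0;
    while (!flag.load(std::memory_order_acquire) && spins < max_spins) {
        cpu_relax();
        ++spins;
    }
    return spins;
}
```

    The ARMv7 threshold here is a conservative choice for the sketch; the
    exact cut-off a real fix should use depends on which ARM variants need
    to be supported.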

    On mipsel, it looks like 32 bits of address space might not be enough,
    so you might need to try the same tricks that e.g. webkit2gtk uses to
    save address space:

    virtual memory exhausted: Cannot allocate memory
    make[3]: *** [src/msg/CMakeFiles/common-msg-objs.dir/build.make:93: src/msg/CMakeFiles/common-msg-objs.dir/Message.cc.o] Error 1

    I hope this helps,
    smcv

  • From Thomas Goirand@21:1/5 to Simon McVittie on Sun Nov 28 16:00:02 2021
    Hi Simon,

    Thanks a lot for your help.

    On 11/20/21 12:55 PM, Simon McVittie wrote:
    [...]
    You'll notice -latomic appears *before* the various .a and .o files, but
    in general, link order matters: each object on the linker command-line is only used to satisfy the dependencies of objects that appear before it.
    So you'll want -latomic to appear later, next to other dependencies
    like -ldl.

    This was completely right, though the underlying reason seems to be
    that the CMake test fails to take mips64el into account. Hopefully,
    this patch fixes it:

    https://salsa.debian.org/ceph-team/ceph/-/blob/debian/unstable/debian/patches/cmake-test-for-16-bytes-atomic-support-on-mips-also.patch
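
    For readers unfamiliar with this kind of check, here is a rough sketch
    of the question it has to answer. It does not reproduce the patch or
    Ceph's CMake module; Pair16, may_need_libatomic and
    extra_link_flag_for_16_byte_atomics are hypothetical names. It uses the
    GCC/Clang builtin __atomic_always_lock_free to ask whether atomics of a
    given size can always be inlined. Where they cannot, the compiler may
    emit calls such as __atomic_load_16, which libatomic provides:

```cpp
#include <cstddef>

// 16-byte payload, the size behind __atomic_load_16.
struct Pair16 {
    unsigned long long lo;
    unsigned long long hi;
};

static_assert(sizeof(Pair16) == 16, "Pair16 is meant to be 16 bytes");

// True when the compiler cannot guarantee inline, lock-free atomics for T,
// so calls into libatomic may be generated for it.
template <typename T>
constexpr bool may_need_libatomic() {
    return !__atomic_always_lock_free(sizeof(T), 0);
}

// Hypothetical helper: the extra flag a build script would want to append
// *after* the objects that use 16-byte atomics.
inline const char* extra_link_flag_for_16_byte_atomics() {
    return may_need_libatomic<Pair16>() ? "-latomic" : "";
}
```

    A configure-time test usually goes one step further and actually tries
    to link a small program with and without the library, since that also
    catches the case where the library itself is missing.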

    On armel, the error seems to be:

    /tmp/cc3CvITT.s:2594: Error: selected processor does not support `yield' in ARM mode
    make[7]: *** [CMakeFiles/rocksdb.dir/build.make:3436: CMakeFiles/rocksdb.dir/third-party/folly/folly/synchronization/DistributedMutex.cpp.o] Error 1

    so probably there is some inline assembly that assumes ARMv6 or later and does not account for armel's ARMv5-based baseline.

    Hopefully, this patch fixes it: https://salsa.debian.org/ceph-team/ceph/-/blob/debian/unstable/debian/patches/only-yied-under-armv7-and-above.patch

    On mipsel, it looks like 32 bits of address space might not be enough,
    so you might need to try the same tricks that e.g. webkit2gtk uses to
    save address space:

    virtual memory exhausted: Cannot allocate memory
    make[3]: *** [src/msg/CMakeFiles/common-msg-objs.dir/build.make:93: src/msg/CMakeFiles/common-msg-objs.dir/Message.cc.o] Error 1

    I've added a few tricks that will hopefully save memory at build time
    on 32-bit architectures, though I'm not sure this will be enough.

    Anyway, I now have a different failure, on all architectures this time:

    /usr/include/boost/container/detail/copy_move_algo.hpp:1083:10: error: ‘__fallthrough__’ was not declared in this scope; did you mean ‘fallthrough’?
    1083 | BOOST_FALLTHROUGH;
    | ^~~~~~~~~~~~~~~~~

    Any idea what could cause this? To me, it looks related to the latest
    GCC update (but I'm not sure)...

    Cheers,

    Thomas Goirand (zigo)
