• RFH: porting bolt-lmm

    From Nilesh Patra@21:1/5 to All on Mon Aug 1 08:40:01 2022
    Dear arm porters,

    I was trying to port the bolt-lmm package, which currently builds only on
    amd64, i386 and ppc, to more archs - particularly arm. I am trying to work
    around the amd64-specific SIMD intrinsics with libsimde-dev (the
    SIMDEverywhere package) - Debian wiki here[1].

    I've committed the patch I used here[2]. Along with this, I added
    "-DSIMDE_ENABLE_OPENMP -fopenmp-simd" to CFLAGS and CXXFLAGS on
    the box I built this on.

    I am able to get it building on an arm64 box, but it does not seem to
    run the right way. It seems to trigger the bolt command again while
    trying to make the initial guess - I don't know why.
    I used the same command as the autopkgtest to test it out. You can
    find sample correct output for this here[3].

    Log pasted at the end of the e-mail. I'd appreciate any help.

    [1]: https://wiki.debian.org/SIMDEverywhere
    [2]: https://salsa.debian.org/med-team/bolt-lmm/-/commit/50badf3338741e739b2bb591a8930d177fa6de6c
    [3]: https://ci.debian.net/data/autopkgtest/unstable/amd64/b/bolt-lmm/24156452/log.gz

    --
    Best,
    Nilesh

    On the arm64 box:

    $ ./bolt \
    --bfile=EUR_subset \
    --phenoFile=EUR_subset.pheno2.covars \
    --exclude=EUR_subset.exclude2 \
    --phenoCol=PHENO \
    --phenoCol=QCOV1 \
    --modelSnps=EUR_subset.modelSnps2 \
    --reml \
    --numThreads=2
    +-----------------------------+
    |                       ___   |
    |   BOLT-LMM, v2.3.6   /_ /   |
    |   October 29, 2021    /_/   |
    |   Po-Ru Loh            //   |
    |                        /    |
    +-----------------------------+

    Copyright (C) 2014-2021 Harvard University.
    Distributed under the GNU GPLv3 open source license.

    Boost version: 1_74

    Command line options:

    ./bolt \
    --bfile=EUR_subset \
    --phenoFile=EUR_subset.pheno2.covars \
    --exclude=EUR_subset.exclude2 \
    --phenoCol=PHENO \
    --phenoCol=QCOV1 \
    --modelSnps=EUR_subset.modelSnps2 \
    --reml \
    --numThreads=2

    Setting number of threads to 2
    fam: EUR_subset.fam
    bim(s): EUR_subset.bim
    bed(s): EUR_subset.bed

    === Reading genotype data ===

    Total indivs in PLINK data: Nbed = 379
    Total indivs stored in memory: N = 379
    Reading bim file #1: EUR_subset.bim
    Read 54051 snps
    Total snps in PLINK data: Mbed = 54051
    Reading exclude file (SNPs to exclude): EUR_subset.exclude2
    Excluded 47959 SNP(s)
    Reading list of SNPs to include in model (i.e., GRM): EUR_subset.modelSnps2
    WARNING: SNP has been excluded: rs2176153
    WARNING: SNP has been excluded: rs77036651
    WARNING: SNP has been excluded: rs189917831
    WARNING: SNP has been excluded: rs76452819
    WARNING: SNP has been excluded: rs77203822
    Included 1331 SNP(s) in model in 2 variance component(s)
    WARNING: 10420 SNP(s) had been excluded

    Breakdown of SNP pre-filtering results:
    1331 SNPs to include in model (i.e., GRM)
    0 additional non-GRM SNPs loaded
    52720 excluded SNPs
    Allocating 1331 x 380/4 bytes to store genotypes
    Reading genotypes and performing QC filtering on snps and indivs...
    Reading bed file #1: EUR_subset.bed
    Expecting 5134845 (+3) bytes for 379 indivs, 54051 snps
    Total indivs after QC: 379
    Total post-QC SNPs: M = 1331
    Variance component 1: 660 post-QC SNPs (name: 'chr21')
    Variance component 2: 671 post-QC SNPs (name: 'chr22')
    Time for SnpData setup = 0.630153 sec

    === Reading phenotype and covariate data ===

    Read data for 373 indivs (ignored 0 without genotypes) from: EUR_subset.pheno2.covars
    Number of indivs with no missing phenotype(s) to use: 369
    NOTE: Using all-1s vector (constant term) in addition to specified covariates
    Using quantitative covariate: CONST_ALL_ONES
    Number of individuals used in analysis: Nused = 369
    Singular values of covariate matrix:
    S[0] = 19.2094
    Total covariate vectors: C = 1
    Total independent covariate vectors: Cindep = 1

    === Initializing Bolt object: projecting and normalizing SNPs ===

    Number of chroms with >= 1 good SNP: 2
    Average norm of projected SNPs: 368.000000
    Dimension of all-1s proj space (Nused-1): 368
    Time for covariate data setup + Bolt initialization = 0.0125201 sec

    Phenotype 1: N = 369 mean = -0.000706532 std = 1.02606
    Phenotype 2: N = 369 mean = 1.53117 std = 0.499705

    === Estimating variance parameters ===

    === Making initial guesses for phenotype 1 ===

    Using 3 random trials

    +-----------------------------+
    |                       ___   |
    |   BOLT-LMM, v2.3.6   /_ /   |
    |   October 29, 2021    /_/   |
    |   Po-Ru Loh            //   |
    |                        /    |
    +-----------------------------+

    Copyright (C) 2014-2021 Harvard University.
    Distributed under the GNU GPLv3 open source license.

    Boost version: 1_74

    Command line options:

    (null)

    ERROR: Use exactly one of the --bfile, --bfilegz, or --fam,bim,bed input formats
    Aborting due to error processing command line arguments
    For list of arguments, run with -h (--help) option

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Paul Wise@21:1/5 to Nilesh Patra on Tue Aug 2 06:30:01 2022
    On Mon, 2022-08-01 at 06:19 +0000, Nilesh Patra wrote:

    > I was trying to port bolt-lmm package, which currently builds only on
    > amd64, i386 and ppc to more archs - particularly arm. I am trying to
    > workaround amd64-specific SIMD intrinsics with the libsimde-dev
    > (SIMDEverywhere package) - debian wiki here[1]
    > I've committed the patch I used here[2].

    Why do you enable SIMDE_ENABLE_NATIVE_ALIASES but also rename
    everything to simde_*? Only one of those is needed. The idea of the
    SIMDE_ENABLE_NATIVE_ALIASES define is that it lets you use SIMDe
    without modifying the code, except for including the headers.

    > I am able to get it building on arm64 box, but it does
    > not seem to run the right way. It seems to trigger the bolt command
    > again while trying to make initial guess - don't know why.
    > I used the same command as used in autopkgtest to test it out.

    Your patch should not have that effect; it is quite strange. I think
    you are going to have to trace exactly how execution gets
    back into main() from elsewhere, probably using gdb or similar.
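One way to make that tracing easier is a small guard called first thing in main(), so the debugger stops exactly at the unexpected second entry. The sketch below is a hypothetical debugging aid, not part of bolt-lmm; all three function names are invented for this example. It also checks argv, since the second banner in the log prints "(null)" for the command line, which may mean argv was not valid on that entry.

```cpp
#include <atomic>
#include <cstdio>

// Hypothetical debugging aid (not part of bolt-lmm).
// Counts calls; returns 1 on the first call, 2 on the second, and so on.
int note_main_entry() {
    static std::atomic<int> entries{0};
    return ++entries;
}

// Returns true when argc/argv look like a normal process start.
bool argv_looks_sane(int argc, char **argv) {
    return argc >= 1 && argv != nullptr && argv[0] != nullptr;
}

// Meant to be called at the top of main(). Returns false, and reports on
// stderr, if main() was already entered once or argv looks invalid.
bool check_main_entry(int argc, char **argv) {
    const int n = note_main_entry();
    const bool ok = (n == 1) && argv_looks_sane(argc, argv);
    if (!ok) {
        std::fprintf(stderr, "main() entry #%d looks suspicious (argc=%d)\n",
                     n, argc);
    }
    return ok;
}
```

In main() this could be used as `if (!check_main_entry(argc, argv)) std::abort();` (with <cstdlib> included). Run under gdb, the abort stops the program at the re-entry, and `bt` then shows which call path led back into main().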

    --
    bye,
    pabs

    https://wiki.debian.org/PaulWise
  • From Nilesh Patra@21:1/5 to Paul Wise on Tue Aug 2 12:00:01 2022
    On 8/2/22 9:54 AM, Paul Wise wrote:
    > On Mon, 2022-08-01 at 06:19 +0000, Nilesh Patra wrote:
    >
    >> I was trying to port bolt-lmm package, which currently builds only on
    >> amd64, i386 and ppc to more archs - particularly arm. I am trying to
    >> workaround amd64-specific SIMD intrinsics with the libsimde-dev
    >> (SIMDEverywhere package) - debian wiki here[1]
    >> I've committed the patch I used here[2].
    >
    > Why do you enable SIMDE_ENABLE_NATIVE_ALIASES but also rename
    > everything to simde_*? Only one of those is needed. [...]

    I was trying different approaches, looks like I forgot to
    properly clean up the patch before pushing :)

    >> I am able to get it building on arm64 box, but it does
    >> not seem to run the right way. It seems to trigger the bolt command
    >> again while trying to make initial guess - don't know why.
    >> I used the same command as used in autopkgtest to test it out.
    >
    > Your patch should not have that effect, it is quite strange,

    One thing that probably wasn't clear from my previous mail: after
    applying the patch, it runs _fine_ on an amd64 machine, and probably on
    ppc too. The strange log is from arm64 -- it does not run okay there.

    > I think
    > you are going to have to trace the execution of how exactly it gets
    > back into main() from elsewhere, probably using gdb or similar.

    Well, yeah. I'll do so when I get more time.

    --
    Best,
    Nilesh