• [gentoo-dev] [PATCH v2 0/5] cargo.eclass: optimizations

    From Michał Górny@21:1/5 to All on Fri Jun 16 21:30:01 2023
    Hi,

    Changes from v1: `@` is used to separate crate names and versions
    rather than `/`. Thanks to Denis Lisov for the suggestion.

    --
    Best regards,
    Michał Górny


    Michał Górny (5):
    eclass/tests: Add a minimal benchmark for cargo.eclass
    cargo.eclass: Add variable alternative to $(cargo_crate_uris)
    cargo.eclass: Optimize GIT_CRATES check
    cargo.eclass: Support separating crate names/versions via `@`
    cargo.eclass: Mark GIT_CRATES as pre-inherit

    eclass/cargo.eclass | 127 ++++++++++++++++++++++--------------
    eclass/tests/cargo-bench.sh | 114 ++++++++++++++++++++++++++++++++
    2 files changed, 192 insertions(+), 49 deletions(-)
    create mode 100755 eclass/tests/cargo-bench.sh

    --
    2.41.0

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michał Górny@21:1/5 to All on Fri Jun 16 21:30:01 2023
    Add a helper function that sets the ${CARGO_CRATE_URIS} variable to make
    it possible to set SRC_URI without subshells. This gives a slight
    speedup (~20%):

    ```
    real 300 it/s
    user 324 it/s
    ```
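
    The difference between the two calling styles can be shown with a
    minimal sketch. The names `crate_uris_print`, `set_crate_uris` and
    `CRATE_URIS`, as well as the `example.invalid` URL layout, are
    hypothetical stand-ins and not the eclass code. The first helper prints
    its result, so the caller has to capture it with `$(...)`, which forks
    a subshell. The second stores the result in a global variable and runs
    entirely in the current shell.

    ```bash
    # Hypothetical helper, old style: prints the list, so callers need $(...).
    crate_uris_print() {
        local crate
        for crate in ${CRATES}; do
            echo "https://example.invalid/${crate}.crate"
        done
    }

    # Hypothetical helper, new style: stores the list in a global variable.
    set_crate_uris() {
        local crate
        CRATE_URIS=
        for crate in ${CRATES}; do
            CRATE_URIS+="https://example.invalid/${crate}.crate "
        done
    }

    CRATES="metal-1.2.3 bar-4.5.6"

    # Command substitution: the function body runs in a forked subshell.
    via_subshell="$(crate_uris_print)"

    # Plain call: no fork, the result is read back from the variable.
    set_crate_uris
    via_variable="${CRATE_URIS}"
    ```

    Both variants produce the same URIs. Only the second avoids the fork,
    which is the cost that matters when the code runs once per ebuild
    sourced.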

    Signed-off-by: Michał Górny <mgorny@gentoo.org>
    ---
    eclass/cargo.eclass | 48 +++++++++++++++++++++++++------------
    eclass/tests/cargo-bench.sh | 4 +++-
    2 files changed, 36 insertions(+), 16 deletions(-)

    diff --git a/eclass/cargo.eclass b/eclass/cargo.eclass
    index 991e808d453f..4e0cd1e4de70 100644
    --- a/eclass/cargo.eclass
    +++ b/eclass/cargo.eclass
    @@ -68,7 +68,7 @@ ECARGO_VENDOR="${ECARGO_HOME}/gentoo"
    # "
    # inherit cargo
    # ...
    -# SRC_URI="$(cargo_crate_uris)"
    +# SRC_URI="${CARGO_CRATE_URIS}"
    # @CODE

    # @ECLASS_VARIABLE: GIT_CRATES
    @@ -162,31 +162,31 @@ ECARGO_VENDOR="${ECARGO_HOME}/gentoo"
    # group, and then switch over to building with FEATURES=userpriv.
    # Or vice-versa.

    -# @FUNCTION: cargo_crate_uris
    +# @ECLASS_VARIABLE: CARGO_CRATE_URIS
    +# @OUTPUT_VARIABLE
    +# @DESCRIPTION:
    +# List of URIs to put in SRC_URI created from CRATES variable.
    +
    +# @FUNCTION: _cargo_set_crate_uris
    +# @USAGE: <crates>
    # @DESCRIPTION:
    # Generates the URIs to put in SRC_URI to help fetch dependencies.
    # Constructs a list of c
  • From Michał Górny@21:1/5 to All on Fri Jun 16 21:30:01 2023
    The variable needs to be set before inherit in order for
    ${CARGO_CRATE_URIS} to be set correctly. Currently, all ebuilds using
    GIT_CRATES except for one define it pre-inherit anyway, and this makes
    it consistent with CRATES.
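
    Why the ordering matters can be sketched as follows. The function
    `fake_inherit_cargo` is a hypothetical stand-in for `inherit cargo`,
    and the `crate:` / `git:` prefixes are placeholders rather than real
    URIs. The only property assumed here is that the URI list is computed
    once, at the moment of the inherit.

    ```bash
    # Hypothetical stand-in for "inherit cargo": computes the list when called.
    fake_inherit_cargo() {
        local crate
        CARGO_CRATE_URIS=
        for crate in ${CRATES}; do
            CARGO_CRATE_URIS+="crate:${crate} "
        done
        # GIT_CRATES is only consulted if it already exists at this point.
        if declare -p GIT_CRATES &>/dev/null; then
            for crate in "${!GIT_CRATES[@]}"; do
                CARGO_CRATE_URIS+="git:${crate} "
            done
        fi
    }

    # Pre-inherit: both variables are defined before the inherit.
    CRATES="metal@1.2.3"
    declare -A GIT_CRATES=( [foo]="https://example.invalid/foo;0123abc" )
    fake_inherit_cargo
    uris_pre="${CARGO_CRATE_URIS}"

    # Post-inherit: GIT_CRATES is defined too late to be picked up.
    unset GIT_CRATES
    fake_inherit_cargo
    declare -A GIT_CRATES=( [foo]="https://example.invalid/foo;0123abc" )
    uris_post="${CARGO_CRATE_URIS}"
    ```

    In this sketch `uris_pre` contains the git entry and `uris_post` does
    not, even though GIT_CRATES ends up with the same value in both cases.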

    Signed-off-by: Michał Górny <mgorny@gentoo.org>
    ---
    eclass/cargo.eclass | 1 +
    1 file changed, 1 insertion(+)

    diff --git a/eclass/cargo.eclass b/eclass/cargo.eclass
    index 8618c90bd986..2ff1f042ba79 100644
    --- a/eclass/cargo.eclass
    +++ b/eclass/cargo.eclass
    @@ -77,6 +77,7 @@ ECARGO_VENDOR="${ECARGO_HOME}/gentoo"

    # @ECLASS_VARIABLE: GIT_CRATES
    # @DEFAULT_UNSET
    +# @PRE_INHERIT
    # @DESCRIPTION:
    # Bash associative array containing all of the crates that are to be
    # fetched via git. It is used by cargo_crate_uris.
    --
    2.41.0

  • From Michał Górny@21:1/5 to All on Fri Jun 16 21:30:01 2023
    Optimize the GIT_CRATES check to call `declare -p` only if the variable
    is actually set. In the vast majority of ebuilds using cargo.eclass,
    it's not set, so the subshell-first approach is slowing things down.
    With this change, the speed improves by another ~20%:

    ```
    real 363 it/s
    user 365 it/s
    ```
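
    One way to order the checks is sketched below. The helper name
    `git_crates_is_assoc` is hypothetical and the code is not necessarily
    what the patch does. The idea is to test for existence with a builtin
    call, which does not fork, and to pay for the command substitution only
    when the variable is actually set.

    ```bash
    # Hypothetical helper: succeeds only if GIT_CRATES is an associative array.
    git_crates_is_assoc() {
        # Cheap test first: builtin with redirection, no subshell.
        # In most ebuilds GIT_CRATES is unset and we return here.
        declare -p GIT_CRATES &>/dev/null || return 1
        # Only now run the command substitution to inspect the type.
        [[ $(declare -p GIT_CRATES) == "declare -A"* ]]
    }

    # Case 1: unset variable, rejected without any fork.
    unset GIT_CRATES
    if git_crates_is_assoc; then r_unset=yes; else r_unset=no; fi

    # Case 2: plain string, set but of the wrong type.
    GIT_CRATES="not-an-array"
    if git_crates_is_assoc; then r_string=yes; else r_string=no; fi

    # Case 3: associative array, accepted.
    unset GIT_CRATES
    declare -A GIT_CRATES=( [foo]="https://example.invalid/foo;0123abc" )
    if git_crates_is_assoc; then r_assoc=yes; else r_assoc=no; fi
    ```

    Only the third case reports `yes`. The first case, which the commit
    message describes as the common one, never reaches the `$(...)`.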

    Signed-off-by: Michał Górny <mgorny@gentoo.org>
    ---
    eclass/cargo.eclass | 58 ++++++++++++++++++++++-----------------------
    1 file changed, 29 insertions(+), 29 deletions(-)

    diff --git a/eclass/cargo.eclass b/eclass/cargo.eclass
    index 4e0cd1e4de70..d97bb0df9348 100644
    --- a/eclass/cargo.eclass
    +++ b/eclass/cargo.eclass
    @@ -189,35 +189,35 @@ _cargo_set_crate_uris() {
     		CARGO_CRATE_URIS+="${url} "
     	done

    -	local git_crates_type
    -	git_crates_type="$(declare -p GIT_CRATES 2>&-)"
    -	if [[ ${git_crates_type} == "declare -A "* ]]; then
    -		local crate commit crate_uri crate_dir repo_ext feat_expr
    -
    -		for crate in "${!GIT_CRATES[@]}"; do
    -			IFS=';' read -r crate_uri commit crate_dir <<< "${GIT_CRATES[${crate}]}"
    -
    -			case "${crate_uri}" in
    -				https://github.com/*)
    -					repo_ext=".gh"
    -					repo_name="${crate_uri##*/}"
    -					crate_uri="${crate_uri%/}/archive/%commit%.tar.gz"
    -				;;
    -				https://gitlab.com/*)
    -					repo_ext=".gl"
    -					repo_name="${crate_uri##*/}"
    -					crate_uri="${crate_uri%/}/-/archive/%commit%/${repo_name}-%
  • From Michał Górny@21:1/5 to All on Fri Jun 16 21:30:01 2023
    Support specifying crate names and versions separated by the `@`
    character rather than `-`. Since `@` is not valid in crate names, this
    makes splitting the tokens trivial and free of regular expressions.
    As a result, the `@` variant is roughly 180% faster:

    ```
    * CRATES with '@' separator
    real 952 it/s
    user 952 it/s
    * CRATES with '-' separator
    real 339 it/s
    user 339 it/s
    ```
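
    The two splitting strategies can be compared in a short sketch. The
    function names `split_at` and `split_hyphen` and the regular expression
    are illustrative assumptions, not the eclass code. With `@`, plain
    parameter expansion is enough. With `-`, the separator can also occur
    inside the crate name, so a pattern match is needed to find where the
    version starts.

    ```bash
    # "name@version": parameter expansion only, no regex engine involved.
    split_at() {
        name=${1%@*}       # drop the shortest suffix matching "@*"
        version=${1##*@}   # drop the longest prefix matching "*@"
    }

    # Legacy "name-version": the hyphen is ambiguous, so match a pattern.
    # The regex below is illustrative only.
    split_hyphen() {
        local re='^(.+)-([0-9]+\.[0-9]+\.[0-9]+.*)$'
        [[ $1 =~ $re ]] || return 1
        name=${BASH_REMATCH[1]}
        version=${BASH_REMATCH[2]}
    }

    split_at "iron_oxide@0.0.1"
    at_result="${name} ${version}"

    split_hyphen "iron-oxide-0.0.1"
    hyphen_result="${name} ${version}"
    ```

    Here `at_result` is `iron_oxide 0.0.1` and `hyphen_result` is
    `iron-oxide 0.0.1`. The second call has to backtrack over the hyphen
    inside the name, which is the kind of work the `@` syntax avoids.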

    Signed-off-by: Michał Górny <mgorny@gentoo.org>
    ---
    eclass/cargo.eclass | 24 +++++---
    eclass/tests/cargo-bench.sh | 111 +++++++++++++++++++-----------------
    2 files changed, 75 insertions(+), 60 deletions(-)

    diff --git a/eclass/cargo.eclass b/eclass/cargo.eclass
    index d97bb0df9348..8618c90bd986 100644
    --- a/eclass/cargo.eclass
    +++ b/eclass/cargo.eclass
    @@ -59,12 +59,16 @@ ECARGO_VENDOR="${ECARGO_HOME}/gentoo"
    # Bash string containing all crates that are to be downloaded.
    # It is used by cargo_crate_uris.
    #
    +# Ideally, crate names and versions should be separated by a `@`
    +# character. A legacy syntax using hyphen is also supported but it is
    +# much slower.
    +#
    # Example:
    # @CODE
    # CRATES="
    -# metal-1.2.3
    -# bar-4.5.6
    -# iron_oxide-0.0.1
    +# metal@1.2.3
    +# bar@4.5.6
    +# iron_oxide@0.0.1
    # "
    # inherit cargo
    # ...
    @@ -182,10 +186,16 @@ _cargo_set_crate_uris() {
    CARGO_CRATE_URIS=
    for crate in ${crates}; do
    local name version url
    - [[ $crate =~ $regex ]] || die "Could not parse name and version from c
  • From Michał Górny@21:1/5 to All on Fri Jun 16 21:30:01 2023
    The initial results on my machine are:

    ```
    real 252 it/s
    user 289 it/s
    ```

    Signed-off-by: Michał Górny <mgorny@gentoo.org>
    ---
    eclass/tests/cargo-bench.sh | 107 ++++++++++++++++++++++++++++++++++++
    1 file changed, 107 insertions(+)
    create mode 100755 eclass/tests/cargo-bench.sh

    diff --git a/eclass/tests/cargo-bench.sh b/eclass/tests/cargo-bench.sh
    new file mode 100755
    index 000000000000..cdc5e4431c14
    --- /dev/null
    +++ b/eclass/tests/cargo-bench.sh
    @@ -0,0 +1,107 @@
    +#!/bin/bash
    +# Copyright 2023 Gentoo Authors
    +# Distributed under the terms of the GNU General Public License v2
    +
    +EAPI=8
    +source tests-common.sh || exit
    +
    +export LC_ALL=C
    +
    +ITERATIONS=1000
    +RUNS=3
    +
    +doit() {
    +	for (( i = 0; i < ITERATIONS; i++ )); do
    +		SRC_URI="
    +			$(cargo_crate_uris)
    +		"
    +	done
    +}
    +
    +timeit() {
    +	local real=()
    +	local user=()
    +	local x vr avg
    +
    +	for (( x = 0; x < RUNS; x++ )); do
    +		while read tt tv; do
    +			case ${tt} in
    +				real) real+=( ${tv} );;
    +				user) user+=( ${tv} );;
    +			esac
    +		done < <( ( time -p doit ) 2>&1 )
    +	done
    +
    +	[[ ${#real[@]} == ${RUNS} ]] || die "Did not get ${RUNS} rea