• binary_search accoring to its name

    From Bonita Montero@21:1/5 to All on Sat Nov 27 15:38:54 2021
    This is a somewhat more efficient binary_search, but it requires
    that all key in the array are unique. It's more efficient because
    it doesn't need two predicate compares to test for equality since
    it uses a predicate which returns a stong_ordering-object.
    If you're debugging it tests whether the object left and right
    from the result are less or greater to ensure uniqueness of the
    key-value.

    template<typename RandomIt, typename T, typename Pred>
    inline
    RandomIt xbinary_search( RandomIt begin, RandomIt end, T const &key,
    Pred pred )
    requires std::random_access_iterator<RandomIt>
    &&
    requires( Pred pred, typename std::iterator_traits<RandomIt>::value_type &elem, T const &key )
    {
    { pred( elem, key ) } -> std::convertible_to<std::strong_ordering>;
    }
    {
    using namespace std;
    size_t lower = 0, upper = end - begin, mid;
    strong_ordering so;
    while( lower != upper )
    {
    mid = (lower + upper) / 2;
    so = pred( begin[mid], key );
    if( so == 0 )
    {
    assert(!mid || pred( begin[mid - 1], key ) < 0);
    assert(begin + mid + 1 == end || pred( begin[mid + 1], key ) > 0);
    return begin + mid;
    }
    if( so > 0 )
    upper = mid;
    else
    lower = mid + 1;
    }
    return end;
    }

    Another advantage is, that the supplied key can have a differnt type
    than the elements in the array; so you can f.e. check for an member
    of an element.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Marcel Mueller@21:1/5 to All on Sun Nov 28 12:27:45 2021
    Am 27.11.21 um 15:38 schrieb Bonita Montero:
    This is a somewhat more efficient binary_search, but it requires
    that all key in the array are unique. It's more efficient because
    it doesn't need two predicate compares to test for equality since
    it uses a predicate which returns a stong_ordering-object.

    This is well known. But it only improves the best case if a key in an
    unique list happens to match early. In general the algorithm is still
    O(log n). Amortized you save at most 1 comparison for very large containers. This is done at the expense that for non unique keys the behavior is no
    longer well defined.

    Assuming than a three way comparison is often a bit more expensive, it
    might even be less efficient for some collections.

    Of course, there might be use cases where this is a good advice, but not
    in general.


    Another advantage is, that the supplied key can have a differnt type
    than the elements in the array; so you can f.e. check for an member
    of an element.

    Asymmetric comparers are another feature. This can be quite useful.
    Typically in case of associative containers of objects that have an
    intrinsic immutable key.
    Normally i write small container classes exactly for this purpose.
    Something like keyed_set<T,KEY_EXTRACTOR>.


    Marcel

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bonita Montero@21:1/5 to All on Sun Nov 28 17:03:05 2021
    Am 28.11.2021 um 12:27 schrieb Marcel Mueller:

    Assuming than a three way comparison is often a bit more expensive,
    it might even be less efficient for some collections.

    The result is often stored inside the CPU-flags.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Juha Nieminen@21:1/5 to Marcel Mueller on Mon Nov 29 05:57:01 2021
    Marcel Mueller <news.5.maazl@spamgourmet.org> wrote:
    Assuming than a three way comparison is often a bit more expensive, it
    might even be less efficient for some collections.

    And not only does it require a slower three-way comparison, it also uses two conditionals instead of the one that std::binary_search/std::lower_bound
    uses. Depending on the situation the additional conditional may actually
    make the code slightly slower (especially since it's likely that the CPU
    won't predict it correctly in about half the iterations, on average). Unpredictable conditionals are performance killers.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bonita Montero@21:1/5 to All on Mon Nov 29 08:26:20 2021
    Am 29.11.2021 um 06:57 schrieb Juha Nieminen:

    And not only does it require a slower three-way comparison, ...

    Check the generated code: the result of the three way comparison is
    in the flags, depending on the type.

    ..., it also uses two conditionals instead of the one that std::binary_search/std::lower_bound uses.


    The ==-conditional is predictable almost any time since it matches
    only one time in the loop. But you're right: in most cases this
    code is slower than with lower bound.
    But I've developed a lower_bound alternative which is slightly
    faster than the lower-bound alternative:

    template<typename RandomIt, typename T, typename Pred>
    inline
    RandomIt xlower_bound( RandomIt begin, RandomIt end, T const &key, Pred
    pred )
    requires std::random_access_iterator<RandomIt>
    &&
    requires( Pred pred, decltype(*RandomIt()) &elem, T const &key )
    {
    { pred( elem, key ) } -> std::convertible_to<std::strong_ordering>;
    }
    {
    using namespace std;
    assert(end >= begin);
    size_t hit = end - begin, lower = 0, upper = hit, mid;
    while( lower != upper )
    {
    mid = lower + (upper - lower) / 2;
    if( pred( begin[mid], key ) >= 0 )
    hit = mid,
    upper = mid;
    else
    lower = mid + 1;
    }
    return begin + hit;
    }

    On a vector of 1, 4 million string_view-objects I get
    about 3% more performance on my computer with clang 13.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Bonita Montero@21:1/5 to All on Mon Nov 29 14:02:43 2021
    This is a small benchmark:

    auto bench = [&]<typename SearchFn>( SearchFn searchFn ) -> double
    requires requires( SearchFn searchFn, vsv_it iterator, string_view
    const &key )
    {
    { searchFn( iterator, iterator, key ) } -> convertible_to<vsv_it>;
    }
    {
    size_t sumExist = 0;
    auto start = high_resolution_clock::now();
    for( size_t it = ITERATIONS; it; --it )
    sumExist += searchFn( strings.begin(), strings.end(), strings[uidIndex( mt )] ) != strings.end();
    double ns = (double)(int64_t)duration_cast<nanoseconds>( high_resolution_clock::now() - start ).count() / ITERATIONS;
    aSumExist += sumExist;
    return ns;
    };
    double lb = bench( []( vsv_it begin, vsv_it end, string_view const &key ) -> vsv_it
    {
    vsv_it it = lower_bound( begin, end, key );
    if( it == end && *it != key )
    return end;
    return it;
    } );
    cout << "lb: " << lb << endl;
    double xlb = bench( []( vsv_it begin, vsv_it end, string_view const &key ) -> vsv_it
    {
    vsv_it it = xlower_bound( begin, end, key,
    []( string_view &elem, string_view const &key ) -> strong_ordering {
    return elem <=> key; } );
    if( it == end && *it != key )
    return end;
    return it;
    } );
    cout << "xlb: " << xlb << endl;
    double xbs = bench( []( vsv_it begin, vsv_it end, string_view const &key )
    {
    return xbinary_search( begin, end, key,
    []( string_view &elem, string_view const &key ) -> strong_ordering {
    return elem <=> key; } );
    } );
    cout << "xbs: " << xbs << endl;

    vsv_it is vector<string_view>::iterator

    144,420,73 sorted string_view items, avgerage length: 8.68825 characters lower_bound: 1835.47ns
    xlower_bound: 1741.33ns
    xbinary_search: 1845.06

    I think with such a large dataset the time isn't decided upon the
    algorithm but upon the random memory-accesses.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)