• Not normal for the same program to be faster in C# than in C++ [Visual

    From Paolo Ferraresi@21:1/5 to All on Thu Aug 5 18:24:48 2021
    Hello, my name is Paolo Ferraresi and I program in both C# and C++, for passion/study.
    (Sorry for my bad English, I will never learn it properly.)
    I like both C# and C++ and I take no side in the language religion wars. I
    find C# very convenient for almost any application, but C++ should be
    considered when maximum efficiency and performance are required.
    I have been on vacation for a few days and, partly for fun, I wrote a few lines
    that implement the sieve of Eratosthenes.
    I almost never write exactly the same program in both C# and C++,
    but for fun I said to myself: I will write two equivalent programs,
    without .NET or STL containers, using only built-in types and arrays, and no library algorithms, only for loops over arrays.

    Here is the C# code:
    using System;
    using System.Diagnostics;
    class Program
    {
        const uint N = 2147483591; // Maximum array size in C#
        static void Main(string[] args)
        {
            Stopwatch sw = new Stopwatch();
            sw.Start();
            bool[] A = new bool[N];
            for (uint i = 2; i < N; ++i) A[i] = true;
            for (uint i = 2; i < N; ++i)
                if (A[i])
                    for (uint j = i; i * j < N; ++j)
                        A[i*j] = false;
            sw.Stop();

            Console.WriteLine("Tempo impiegato {0} ms",  // "Elapsed time {0} ms"
                sw.ElapsedMilliseconds);
            Console.Write("Premi un tasto... ");         // "Press a key... "
            Console.ReadKey();
        }
    }

    Here is the C++ code:
    #include <iostream>
    #include <iterator>
    #include <array>
    #include <vector>
    #include <chrono>
    #include <algorithm>
    using namespace std;
    int main()
    {
        const unsigned int N = 2147483591;
        auto Tstart = chrono::high_resolution_clock::now();
        bool* A = new bool[N];
        fill(A, A + N, true);
        for (unsigned int i = 2; i < N; ++i)
            if (A[i])
                for (unsigned int j = i; i * j < N; ++j)
                    A[i * j] = false;
        auto Tend = chrono::high_resolution_clock::now();
        chrono::duration<double, std::milli> diff = Tend - Tstart;
        cout << "Tempo impiegato " << diff.count() << "ms\n";  // "Elapsed time"
        delete[] A;
        cout << "Press ENTER ";
        cin.get();
        return 0;
    }

    I understand very well that a game like this proves almost nothing,
    but I remain convinced that the same code (and I repeat, almost 1:1
    the same, not arrays in one and STL in the other) cannot be faster
    in C# than in C++, so imagine my surprise when the results came out:

    C# (release build): 23093 ms, (48630 ms in debug build).
    C++(release build): 33516 ms, (44906 ms in debug build).

    I came to the conclusion that maybe the problem is specific to my setup and
    does not affect others. Apart from the absolute numbers, which will differ depending on the hardware each of us has, please try it and see whether C++
    turns out faster for you, which is honestly what I expect.
    I only switched the build from debug to release, leaving
    everything else at the defaults, except that I always select the x64 platform.

    Finally, I use Windows 10 Pro and Visual Studio 2019 Community Edition and, as
    I mentioned, the code was compiled for the x64 platform.
    The CPU is an AMD Ryzen Threadripper 3970X.

    If any of you would like to try it, could you then explain what is going wrong on my machine? Thanks, bye!
    Greetings from Italy! :)

    Paolo Ferraresi
    fp.box@alice.it
    [I would guess the difference is something unrelated to the loops, such as how the
    two runtime systems allocate a two gigabyte array. -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From George Neuner@21:1/5 to fp.box@alice.it on Thu Aug 5 22:58:02 2021
    On Thu, 5 Aug 2021 18:24:48 -0000 (UTC), Paolo Ferraresi
    <fp.box@alice.it> wrote:
    :
    a C# program
    :
    a C++ program
    :

    > I understand very well that the validity of such a game is almost null,
    > but since I remain convinced that the same code (and I repeat the same
    > almost 1:1, not that one used arrays and the other STL) cannot be faster
    > in C# than in C++, imagine the surprise when the results came out:
    >
    > C# (release build): 23093 ms, (48630 ms in debug build).
    > C++(release build): 33516 ms, (44906 ms in debug build).
    >
    > I came to the conclusion that maybe I have a specific problem from me and
    > not from others. I mean apart from the numerical values, which will be
    > different depending on the hardware each of us has, try to see if C++
    > turns out faster from you, which is what I expect, honestly.
    > Also I changed build from debug to release and that's it, leaving
    > everything default, except that I always put x64 platform.
    >
    > Finally, I use Windows 10 Pro, Visual Studio 2019 community edition and as
    > I mentioned the code was compiled for x64 platform.
    > The CPU is AMD Ryzen Threadripper 3970X.
    > [I would guess the difference is something unrelated to the loops, such as how the
    > two runtime systems allocate a two gigabyte array. -John]


    My guess is that the issue is how (and what) you are timing.


    Multitasking operating systems royally screw up attempts to accurately
    time things. If you want to compare code, you should run many
    iterations of each version and compare their /average/ running times.


    As John mentioned, you are timing allocation of the array. Heap
    management is very different in these two languages, so the time to
    allocate things is, in general, not comparable.

    You shouldn't time the array initialization either unless you do it
    the same way in both programs. The templated fill() algorithm in C++
    will not necessarily be equivalent to the inline C# code - it depends
    on your compiler settings. [see below]


    I would modify your programs like so (in pseudo):

    total = 0
    allocate array
    for N iterations
        initialize array
        start = current time
        run the sieve
        stop = current time
        total += (stop - start)
    average = total / N

    And then run for N = 50 (or more) to filter out multitasking related
    noise in the individual timings.
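
    As a rough illustration of the scheme above, here is a minimal C++ sketch. The names run_sieve, average_sieve_time and TimingResult are made up for this example. It assumes a 64-bit build and widens the index to 64 bits so that the product i * j cannot wrap around, which is a change from the posted code. The array is allocated once, re-initialised before every pass, and only the sieve itself is timed.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: sieves flags[0..n) in place and returns the number of primes.
// Returning the count gives the loops a result that the caller can use.
std::uint64_t run_sieve(std::vector<unsigned char>& flags) {
    const std::uint64_t n = flags.size();
    std::uint64_t count = 0;
    for (std::uint64_t i = 2; i < n; ++i) {
        if (!flags[i]) continue;
        ++count;
        // 64-bit product: cannot overflow for array sizes up to 2^31.
        for (std::uint64_t j = i; i * j < n; ++j) flags[i * j] = 0;
    }
    return count;
}

struct TimingResult {
    double average_ms;     // mean time of one sieve pass
    std::uint64_t primes;  // prime count from the last pass
};

TimingResult average_sieve_time(std::size_t n, int iterations) {
    assert(iterations > 0);
    std::vector<unsigned char> flags(n);  // allocated once, outside the timed region
    double total_ms = 0.0;
    std::uint64_t primes = 0;
    for (int it = 0; it < iterations; ++it) {
        std::fill(flags.begin(), flags.end(), 1);  // initialisation is not timed
        const auto start = std::chrono::steady_clock::now();
        primes = run_sieve(flags);
        const auto stop = std::chrono::steady_clock::now();
        total_ms += std::chrono::duration<double, std::milli>(stop - start).count();
    }
    return {total_ms / iterations, primes};
}
```

    Calling average_sieve_time with the array size from the original post and 50 iterations would correspond to the suggestion above. Printing the returned prime count makes sure the result is actually used.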


    To really be fair you need to find out what optimizations are being
    done by the C# and dotNET JIT compilers (which work together), and
    adjust your C++ compiler to do the equivalent. Simply doing a
    'release' compile in both languages is not sufficient: in general C++
    is harder to optimize than C#, and many of the possible optimizations
    are disabled by default because they can break code that does not
    comply with their requirements. Except in 'unsafe' code, C# largely
    makes it impossible for code to not comply with its optimization
    requirements.

    But start with more accurate timing.


    Hope this helps,
    George

  • From Dmitry A. Kazakov@21:1/5 to George Neuner on Fri Aug 6 16:14:22 2021
    On 2021-08-06 04:58, George Neuner wrote:

    > I would modify your programs like so (in pseudo):
    >
    > total = 0
    > allocate array
    > for N iterations
    >     initialize array
    >     start = current time
    >     run the sieve
    >     stop = current time
    >     total += (stop - start)
    > average = total / N

    Another technique is to factor out looping and other overheads by
    running an empty loop as a reference:

    start = current time
    for N iterations
        initialize array
        run the sieve
    end loop;
    total1 = current time - start

    start = current time
    for N iterations
        initialize array
    end loop;
    total2 = current time - start

    average = (total1 - total2) / N -- sieve only
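
    The reference-loop idea above might look roughly like this in C++. This is only a sketch: ms_since, sieve_count and sieve_only_ms are hypothetical names, a 64-bit build is assumed, and the index is 64 bits wide so the product cannot wrap.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

using Clock = std::chrono::steady_clock;

// Milliseconds elapsed since 'start'.
double ms_since(Clock::time_point start) {
    return std::chrono::duration<double, std::milli>(Clock::now() - start).count();
}

// Sieves flags[0..n) in place and returns the number of primes found.
std::uint64_t sieve_count(std::vector<unsigned char>& flags) {
    const std::uint64_t n = flags.size();
    std::uint64_t count = 0;
    for (std::uint64_t i = 2; i < n; ++i) {
        if (!flags[i]) continue;
        ++count;
        for (std::uint64_t j = i; i * j < n; ++j) flags[i * j] = 0;
    }
    return count;
}

// Returns (total1 - total2) / iterations: the estimated time of the sieve alone.
double sieve_only_ms(std::size_t n, int iterations, std::uint64_t& primes_out) {
    assert(iterations > 0);
    std::vector<unsigned char> flags(n);

    // Loop 1: initialisation plus sieve.
    Clock::time_point start = Clock::now();
    for (int it = 0; it < iterations; ++it) {
        std::fill(flags.begin(), flags.end(), 1);
        primes_out = sieve_count(flags);
    }
    const double total1 = ms_since(start);

    // Loop 2: initialisation only, used as the reference.
    start = Clock::now();
    for (int it = 0; it < iterations; ++it) {
        std::fill(flags.begin(), flags.end(), 1);
    }
    const double total2 = ms_since(start);

    return (total1 - total2) / iterations;
}
```

    For very small arrays the difference can come out slightly negative because of timer noise. An optimizer is also free to shorten the reference loop, since nothing reads the array afterwards, so the generated code is worth checking.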

    P.S. Optimizations are the usual suspects in ruining benchmark measurements.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From gah4@21:1/5 to Dmitry A. Kazakov on Sun Aug 8 12:38:51 2021
    On Sunday, August 8, 2021 at 8:14:17 AM UTC-7, Dmitry A. Kazakov wrote:

    (snip)

    > P.S. Optimizations is a usual suspect of ruining benchmark measures.

    Yes. But in this case, one might want to include some optimizations.

    Note, though, that a good optimizer could optimize out all the loops, as no output depends on them. Some programs I know output the total number
    of primes found, which stops that from happening.

    Also, compilers can do some calculations at compile time. I don't expect
    it for this, but that does ruin some benchmarks. There are stories of complicated
    benchmarks being done entirely at compile time, except for output of the result.
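
    To make the two points above concrete, here is a small C++14 sketch in which the compiler is explicitly asked to run a sieve during compilation. The names count_primes_below and primes_below_100_at_runtime are invented for this example, and the bound is kept small on purpose. Whether an optimizer would do the same unprompted for the posted programs is not something this shows.

```cpp
#include <cassert>

// Hypothetical helper: counts the primes below N (small N only, since i * i is an int).
// Being constexpr, it can be evaluated by the compiler in a constant expression.
template <int N>
constexpr int count_primes_below() {
    static_assert(N > 0, "N must be positive");
    bool composite[N] = {};
    int count = 0;
    for (int i = 2; i < N; ++i) {
        if (composite[i]) continue;
        ++count;
        for (int k = i * i; k < N; k += i) composite[k] = true;
    }
    return count;
}

// Checked during compilation: no sieve code has to run for this result.
static_assert(count_primes_below<100>() == 25, "25 primes below 100");

// An ordinary call. An optimizer is free to fold it to the constant 25 as well,
// in which case a timer around it would measure almost nothing.
int primes_below_100_at_runtime() {
    const int count = count_primes_below<100>();
    assert(count == 25);
    return count;
}
```

    Printing the count, as the programs mentioned above do, keeps the loops from being discarded as unused. It does not by itself stop the work from being done at compile time when the input is a compile-time constant.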

    I would have used a j += i loop. Not that multiply is that slow on modern processors,
    but it is a big part of the loop. One of the compilers might make that optimization for you.
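
    A minimal C++ sketch of the two inner loops, for comparison. The function names and the Flags alias are made up here, and both versions use 64-bit indices so that neither the product nor i * i can wrap for array sizes up to 2^31.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Flags = std::vector<unsigned char>;

// Inner loop shaped like the posted code: the index is recomputed as i * j each time.
Flags sieve_by_product(std::uint64_t n) {
    Flags a(n, 1);
    for (std::uint64_t i = 2; i < n; ++i)
        if (a[i])
            for (std::uint64_t j = i; i * j < n; ++j)
                a[i * j] = 0;
    return a;
}

// Same entries cleared in the same order, but the index advances by i (the "j += i" form).
Flags sieve_by_stride(std::uint64_t n) {
    Flags a(n, 1);
    for (std::uint64_t i = 2; i < n; ++i)
        if (a[i])
            for (std::uint64_t k = i * i; k < n; k += i)
                a[k] = 0;
    return a;
}

// Sanity check that both variants produce identical flag arrays.
void check_equivalence(std::uint64_t n) {
    assert(sieve_by_product(n) == sieve_by_stride(n));
}
```

    The stride form replaces two multiplications per step with one addition. A compiler may perform this strength reduction on the product form by itself, which is one way two builds of the "same" source can differ.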

    One might store the array as bits (8 bool/byte), the other as bytes. It isn't so
    obvious which one is faster, but often the 1 bool/byte is faster, until you run out
    of real memory.
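
    For reference, a bit-packed variant could be sketched as follows in C++. The names Words, get_bit, clear_bit and count_primes_packed are hypothetical. For the array size in the original post this layout needs about 256 MiB instead of about 2 GiB, at the cost of a shift and a mask on every access.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Words = std::vector<std::uint64_t>;  // 64 flags per element

// Reads flag i.
inline bool get_bit(const Words& w, std::uint64_t i) {
    assert((i >> 6) < w.size());
    return (w[i >> 6] >> (i & 63)) & 1u;
}

// Clears flag i.
inline void clear_bit(Words& w, std::uint64_t i) {
    assert((i >> 6) < w.size());
    w[i >> 6] &= ~(std::uint64_t{1} << (i & 63));
}

// Counts the primes below n using one bit per candidate.
std::uint64_t count_primes_packed(std::uint64_t n) {
    Words w((n + 63) / 64, ~std::uint64_t{0});  // all flags start set
    std::uint64_t count = 0;
    for (std::uint64_t i = 2; i < n; ++i) {
        if (!get_bit(w, i)) continue;
        ++count;
        for (std::uint64_t k = i * i; k < n; k += i) clear_bit(w, k);
    }
    return count;
}
```

    Which layout wins is something to measure rather than assume: the packed form does more work per access but touches an eighth of the memory.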

    How much real memory do you have? And the speed might depend in complicated ways on the memory management system.

    And note that you aren't comparing languages, but two compilers implementing those languages (which is why it goes here).
    [In this case, the documentation says they both allocate a byte for each bool but the other
    stuff is all possible. Also remember C++ is a traditional compiler, while C# is bytecode and JIT. -John]
