• expr {$n * -1} vs expr {-$n}

    From oleg.o.nemanov@gmail.com@21:1/5 to All on Fri Aug 27 06:50:21 2021
    Hi, all.

    What is faster: [expr {$n * -1}] or [expr {-$n}] ? :-)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Harald Oehlmann@21:1/5 to All on Fri Aug 27 16:43:20 2021
    On 27.08.2021 at 15:50, oleg.o....@gmail.com wrote:
    Hi, all.

    What is faster: [expr {$n * -1}] or [expr {-$n}] ? :-)


    Perhaps:

    tcl::mathop::* $n -1

    It may take a while until Wizards like Sergey, Donal etc. might really
    answer the question.

    You may also add the detail of whether the snippet is within a proc (e.g. byte compiled) or not. If so, it depends on how often the proc is called, i.e.
    whether byte compilation is a win or not.

    Harald

  • From Arjen Markus@21:1/5 to oleg on Fri Aug 27 07:22:05 2021
    On Friday, August 27, 2021 at 3:50:23 PM UTC+2, oleg wrote:
    Hi, all.

    What is faster: [expr {$n * -1}] or [expr {-$n}] ? :-)

    I strongly doubt the difference is measurable. But in general, the way to go is to measure it. :)

    Regards,

    Arjen

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Fri Aug 27 08:02:01 2021
    On Friday, August 27, 2021 at 17:43:24 UTC+3, Harald Oehlmann wrote:
    You may also add the detail of whether the snippet is within a proc (e.g. byte compiled) or not.
    I'm interested in both cases :-).

  • From Rich@21:1/5 to oleg.o....@gmail.com on Fri Aug 27 18:11:17 2021
    oleg.o....@gmail.com <oleg.o.nemanov@gmail.com> wrote:
    Hi, all.

    What is faster: [expr {$n * -1}] or [expr {-$n}] ? :-)

    Read up on the manpage of Tcl's 'time' command. Using it, you can
    answer your own question, on your hardware.

  • From Gerald Lester@21:1/5 to oleg.o....@gmail.com on Fri Aug 27 13:54:02 2021
    On 8/27/21 10:02 AM, oleg.o....@gmail.com wrote:
    On Friday, August 27, 2021 at 17:43:24 UTC+3, Harald Oehlmann wrote:
    You may also add the detail of whether the snippet is within a proc (e.g. byte
    compiled) or not.
    I'm interested in both cases :-).


    Then run tests

    --
    +----------------------------------------------------------------------+
    | Gerald W. Lester, President, KNG Consulting LLC |
    | Email: Gerald.Lester@kng-consulting.net |
    +----------------------------------------------------------------------+

  • From heinrichmartin@21:1/5 to Harald Oehlmann on Fri Aug 27 14:43:22 2021
    On Friday, August 27, 2021 at 4:43:24 PM UTC+2, Harald Oehlmann wrote:
    On 27.08.2021 at 15:50, oleg wrote:
    Hi, all.

    What is faster: [expr {$n * -1}] or [expr {-$n}] ? :-)

    Perhaps:

    tcl::mathop::* $n -1

    Funny enough, tcl::mathop::- serves as negative sign operator if called with a single argument. One never stops learning new stuff.
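
    A minimal sketch of that single-argument behaviour, assuming Tcl 8.5 or later (where the ::tcl::mathop namespace is available). The variable name n and the sample values are only illustrative:

    ```tcl
    # Unary use: a single argument is negated.
    set n 5
    puts [::tcl::mathop::- $n]        ;# -5

    # Binary and n-ary use: ordinary subtraction, applied left to right.
    puts [::tcl::mathop::- 10 3]      ;# 7
    puts [::tcl::mathop::- 10 3 2]    ;# 5

    # The multiplication form suggested earlier yields the same value as negation.
    puts [::tcl::mathop::* $n -1]     ;# -5
    ```

    Both spellings give the same result; the question in this thread is only which one costs less to evaluate.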

    You may also add the detail of whether the snippet is within a proc (e.g. byte compiled) or not. If so, it depends on how often the proc is called, i.e.
    whether byte compilation is a win or not.

    I am not one of the wizards, but I guess that tcl::mathop and expr (with braces) produce (nearly) identical byte code.

    As others pointed out, we can easily look behind the curtains:
    $ proc neg a {expr {-$a}}
    $ proc mult a {expr {$a*-1}}
    $ proc opneg a {::tcl::mathop::- $a}
    $ proc opmult a {::tcl::mathop::* $a -1}
    $ ::tcl::unsupported::getbytecode proc neg
    literals {} variables {{{scalar arg} a}} exception {} instructions {0 {loadScalar1 %0} 2 uminus 3 done} auxiliary {} commands {{codefrom 0 codeto 2 scriptfrom 0 scriptto 9 script {expr {-$a}}}} script {expr {-$a}} namespace :: stackdepth 1 exceptdepth 0
    $ ::tcl::unsupported::getbytecode proc mult
    literals -1 variables {{{scalar arg} a}} exception {} instructions {0 {loadScalar1 %0} 2 {push1 @0} 4 mult 5 done} auxiliary {} commands {{codefrom 0 codeto 4 scriptfrom 0 scriptto 11 script {expr {$a*-1}}}} script {expr {$a*-1}} namespace :: stackdepth 2 exceptdepth 0
    $ ::tcl::unsupported::getbytecode proc opneg
    literals {} variables {{{scalar arg} a}} exception {} instructions {0 {loadScalar1 %0} 2 uminus 3 done} auxiliary {} commands {{codefrom 0 codeto 2 scriptfrom 0 scriptto 18 script {::tcl::mathop::- $a}}} script {::tcl::mathop::- $a} namespace :: stackdepth 1 exceptdepth 0
    $ ::tcl::unsupported::getbytecode proc opmult
    literals -1 variables {{{scalar arg} a}} exception {} instructions {0 {loadScalar1 %0} 2 {push1 @0} 4 mult 5 done} auxiliary {} commands {{codefrom 0 codeto 4 scriptfrom 0 scriptto 21 script {::tcl::mathop::* $a -1}}} script {::tcl::mathop::* $a -1} namespace :: stackdepth 2 exceptdepth 0
    $ ::tcl::unsupported::getbytecode ;# expects one of {lambda method objmethod proc script}

    I'd conclude (from the instructions)
    * no diff between expr and mathop
    * uminus should be slightly faster
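
    To compare only the instruction lists without reading the whole dump, something like the following sketch may help. It assumes a Tcl build that provides ::tcl::unsupported::getbytecode (as used above) and that its result can be read as a dict, which the output above suggests. The helper name instructions is hypothetical, and anything in the unsupported namespace may change between releases:

    ```tcl
    # Hypothetical helper: return just the "instructions" entry of a proc's bytecode.
    # Relies on the unsupported getbytecode command shown in the session above.
    proc instructions {procName} {
        dict get [::tcl::unsupported::getbytecode proc $procName] instructions
    }

    proc neg  a {expr {-$a}}
    proc mult a {expr {$a*-1}}

    # Print the two instruction sequences for a side-by-side comparison.
    puts "neg:  [instructions neg]"
    puts "mult: [instructions mult]"
    ```

    In the dumps above, the mult variant carries one extra push of the literal -1 before the arithmetic instruction.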

    HTH
    Martin

  • From Uwe Klein@21:1/5 to All on Sat Aug 28 12:20:01 2021
    On 27.08.21 at 17:02, oleg.o....@gmail.com wrote:
    On Friday, August 27, 2021 at 17:43:24 UTC+3, Harald Oehlmann wrote:
    You may also add the detail of whether the snippet is within a proc (e.g. byte
    compiled) or not.
    I'm interested in both cases :-).

    The -$n expr is roughly 2.5 times faster (at level 0, i.e. outside a proc):

    % set n 4543534534
    4543534534
    % time {expr {$n * -1}} 100000
    0.91321 microseconds per iteration
    % time {expr {-$n}} 100000
    0.35942 microseconds per iteration
    % time {expr {$n * -1}} 100000
    0.82213 microseconds per iteration
    % time {expr {-$n}} 100000
    0.33567 microseconds per iteration
    % time {expr {$n * -1}} 100000
    0.87075 microseconds per iteration
    % time {expr {-$n}} 100000
    0.33751 microseconds per iteration


    If you wrap each in a proc:

    % proc a n {expr {-$n}
    }
    % proc b n {expr {$n * -1}
    }

    % time {a $n} 100000
    0.7419 microseconds per iteration
    % time {b $n} 100000
    1.27659 microseconds per iteration

    Uwe

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Mon Aug 30 02:46:55 2021
    On Saturday, August 28, 2021 at 13:20:04 UTC+3, Uwe Klein wrote:
    [CUT]

    I got the same results. Thanks! But for small n (for example, 2 or -3) the difference is very small, although -$n is still faster than $n*-1.

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Mon Aug 30 02:39:34 2021
    On Friday, August 27, 2021 at 21:11:20 UTC+3, Rich wrote:
    oleg.o....@gmail.com <oleg.o....@gmail.com> wrote:
    Hi, all.

    What is faster: [expr {$n * -1}] or [expr {-$n}] ? :-)
    Read up on the manpage of Tcl's 'time' command. Using it, you can
    answer your own question, on your hardware.

    Thanks, but I have already done this step :-). Timing can give different results on my machine than on others, for example (due to various factors).
    I'm interested in the algorithmic difference (if it exists).

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Mon Aug 30 02:47:54 2021
    On Saturday, August 28, 2021 at 00:43:25 UTC+3, heinrichmartin wrote:
    [CUT]
    I'd conclude (from the instructions)
    * no diff between expr and mathop
    * uminus should be slightly faster

    Thanks for this explanation, Martin. Thanks all!

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Mon Aug 30 03:02:07 2021
    Hm. At C level there is no difference between a = a * -1 and a = -a:

    ~$ cat t.c
    void
    main(void)
    {
        int a = 1;
        int i;

        for(i = 0; i < 1000000; i++)
            a = a * -1;
    }

    ~$ cat tt.c
    void
    main(void)
    {
        int a = 1;
        int i;

        for(i = 0; i < 1000000; i++)
            a = -a;
    }

    ~$ gcc -S -c -g t.c
    ~$ gcc t.s
    ~$ time ./a.out

    real 0m0.012s
    user 0m0.010s
    sys 0m0.002s

    ~$ as -alhnd t.s
    [CUT]
    7:t.c **** for(i = 0; i < 1000000; i++)
    18 .loc 1 7 8
    19 000b C745FC00 movl $0, -4(%rbp)
    19 000000
    20 .loc 1 7 2
    21 0012 EB07 jmp .L2
    22 .L3:
    8:t.c **** a = a * -1;
    23 .loc 1 8 5 discriminator 3
    24 0014 F75DF8 negl -8(%rbp)
    [CUT]

    ~$ gcc -S -c -g tt.c
    ~$ gcc tt.s
    ~$ time ./a.out

    real 0m0.012s
    user 0m0.010s
    sys 0m0.002s

    ~$ as -alhnd tt.s
    [CUT]
    7:tt.c **** for(i = 0; i < 1000000; i++)
    18 .loc 1 7 8
    19 000b C745FC00 movl $0, -4(%rbp)
    19 000000
    20 .loc 1 7 2
    21 0012 EB07 jmp .L2
    22 .L3:
    8:tt.c **** a = -a;
    23 .loc 1 8 5 discriminator 3
    24 0014 F75DF8 negl -8(%rbp)
    [CUT]

  • From Robert Heller@21:1/5 to oleg.o....@gmail.com on Mon Aug 30 06:53:24 2021
    At Mon, 30 Aug 2021 03:02:07 -0700 (PDT) "oleg.o....@gmail.com" <oleg.o.nemanov@gmail.com> wrote:


    Hm. At C level there is no difference between a = a * -1 and a = -a:

    The C optimizer is effectively replacing "a * -1" with "-a", because "-1" is a compile-time constant and the compiler knows that the negation instruction is faster than a multiply instruction. If the -1 were instead a variable that happened to contain the value -1, the generated code would be different. The C code for the Tcl math ops is not going to get this optimization, so the Tcl code using expr {$a * -1} is going to be slower than expr {-$a}. Also, the C optimizer is going to recognize that "a = a * -1" is the same as "a *= -1" and will make that additional optimization as well.


    --
    Robert Heller -- Cell: 413-658-7953 GV: 978-633-5364
    Deepwoods Software -- Custom Software Services
    http://www.deepsoft.com/ -- Linux Administration Services
    heller@deepsoft.com -- Webhosting Services

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Mon Aug 30 05:03:15 2021
    On Monday, August 30, 2021 at 14:53:33 UTC+3, Robert Heller wrote:
    The C optimizer is effectively replacing "a * -1" with "-a", because "-1" is a
    compile-time constant and the compiler knows that the negation instruction is
    faster than a multiply instruction.

    Exactly. :-)

    so the Tcl code
    using expr {$a * -1} is going to be slower than expr {-$a}.

    It would be great if this info were in the expr man page.

  • From Harald Oehlmann@21:1/5 to All on Mon Aug 30 14:57:53 2021
    On 30.08.2021 at 14:03, oleg.o....@gmail.com wrote:
    On Monday, August 30, 2021 at 14:53:33 UTC+3, Robert Heller wrote:
    The C optimizer is effectively replacing "a * -1" with "-a", because "-1" is a
    compile-time constant and the compiler knows that the negation instruction is
    faster than a multiply instruction.

    Exactly. :-)

    so the Tcl code
    using expr {$a * -1} is going to be slower than expr {-$a}.

    It's would be great if this info will be in expr man page.


    Oleg,

    I appreciate your effort. Tcl optimization depends heavily on the
    work of people who understand what is happening under the hood.
    If one of those people optimizes one place, it gets faster.
    So, this changes over time.
    You may follow posts from Sergey (who has won some speed-up challenges
    by FlightAware), Donal, Kevin and some others. Speed penalties are
    often in corners where you don't expect them. An example is the call to
    commands written in C (as was mentioned here recently).

    If you make the effort to put speed-related information into the
    man page, nobody will stop you. Nevertheless, we are happy to have/get
    correct man pages on the functional level, which is the main aim.

    Take care and thank you,
    Harald

  • From Robert Heller@21:1/5 to oleg.o....@gmail.com on Mon Aug 30 08:33:15 2021
    At Mon, 30 Aug 2021 05:03:15 -0700 (PDT) "oleg.o....@gmail.com" <oleg.o.nemanov@gmail.com> wrote:


    On Monday, August 30, 2021 at 14:53:33 UTC+3, Robert Heller wrote:
    The C optimizer is effectively replacing "a * -1" with "-a", because "-1" is a
    compile-time constant and the compiler knows that the negation instruction is
    faster than a multiply instruction.

    Exactly. :-)

    so the Tcl code
    using expr {$a * -1} is going to be slower than expr {-$a}.

    It's would be great if this info will be in expr man page.

    The timing difference between a multiplication instruction and a negation instruction should be well known. On a practical level, the choice between "expr {$a * -1}" and "expr {-$a}" should be very obvious when the "-1" is a manifest constant. This is a trivial hand optimization, one the C compiler implements on its own -- good for lazy C programmers -- Tcl programmers need to be a little more proactive, but that is always the case for Tcl programming.

    I would expect that there is little point in ever using something like "expr {$a * -1}" in production code. OTOH, using something like "expr {$a * [signof $foo]}" where proc signof {x} returns -1 or 1 depending on the sign of x makes better sense than something like:

    if {[signof $foo] < 0} {
        set a [expr {-$a}]
    }
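
    For concreteness, the two forms can be put side by side as in the sketch below. The signof proc is hypothetical (it is only named, not defined, above); this version assumes that zero counts as positive, and the sample values are made up:

    ```tcl
    # Hypothetical helper as described above: -1 for negative x, 1 otherwise.
    # Assumption: zero is treated as positive.
    proc signof {x} {
        expr {$x < 0 ? -1 : 1}
    }

    set a 7
    set foo -2.5

    # Single-expression form: multiply by the sign of foo.
    set viaMult [expr {$a * [signof $foo]}]

    # Branching form: negate only when the sign is negative.
    set viaIf $a
    if {[signof $foo] < 0} {
        set viaIf [expr {-$viaIf}]
    }

    puts "$viaMult $viaIf"   ;# -7 -7
    ```

    Here the multiplier is not a manifest constant, so the multiplication cannot simply be rewritten by hand as a negation.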





  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Mon Aug 30 06:33:33 2021
    On Monday, August 30, 2021 at 15:57:57 UTC+3, Harald Oehlmann wrote:
    If you would do the effort to put speed-related information to the
    man-page, nobody will stop you. Nevertheless, we are happy to have/get correct man pages on the functional level which is the main aim.

    Ok. Where to send the patch?

  • From Rich@21:1/5 to oleg.o....@gmail.com on Mon Aug 30 13:55:16 2021
    oleg.o....@gmail.com <oleg.o.nemanov@gmail.com> wrote:
    On Friday, August 27, 2021 at 21:11:20 UTC+3, Rich wrote:
    oleg.o....@gmail.com <oleg.o....@gmail.com> wrote:
    Hi, all.

    What is faster: [expr {$n * -1}] or [expr {-$n}] ? :-)
    Read up on the manpage of Tcl's 'time' command. Using it, you can
    answer your own question, on your hardware.

    Thanks, but i already done this step :-).

    You had not indicated so yet in this thread when I posted my reply.

    Time can give a different results on my machine than others, for
    example(due various factors).

    Of course. You might have a brand new i7, which will probably be
    markedly faster than my 7+ year old Xeon here.

    I'm interested in algorithmic difference(if it exists).

    Which is what time will give you. If you test -$x vs. -1*$x on the
    same machine, then you have kept machine differences constant (to the
    extent you can keep them constant) and so any differences must be due
    to different algorithms used for -$x vs -1*$x.
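
    A rough sketch of such a same-machine comparison, using only the built-in time command. The helper name bestTime, the round count, and the sample values of n are assumptions made for illustration, and the absolute numbers will differ from machine to machine:

    ```tcl
    # Hypothetical helper: run [time] several times on the same script and keep
    # the smallest microseconds-per-iteration figure, to damp run-to-run noise.
    proc bestTime {script {iters 100000} {rounds 5}} {
        set best {}
        for {set i 0} {$i < $rounds} {incr i} {
            # [time] returns a string such as "0.35 microseconds per iteration";
            # the first word is the number. uplevel lets $n resolve in the caller.
            set t [lindex [uplevel 1 [list time $script $iters]] 0]
            if {$best eq {} || $t < $best} { set best $t }
        }
        return $best
    }

    # Compare both spellings for small and large operands on the same machine.
    foreach n {2 -3 4543534534} {
        set neg  [bestTime {expr {-$n}}]
        set mult [bestTime {expr {$n * -1}}]
        puts [format "n=%-12s  -\$n: %.3f us   \$n*-1: %.3f us" $n $neg $mult]
    }
    ```

    Taking the minimum of several rounds is just one way to reduce noise; it does not remove differences that come from the machine architecture itself.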

  • From Harald Oehlmann@21:1/5 to All on Mon Aug 30 15:44:39 2021
    On 30.08.2021 at 15:33, oleg.o....@gmail.com wrote:
    On Monday, August 30, 2021 at 15:57:57 UTC+3, Harald Oehlmann wrote:
    If you would do the effort to put speed-related information to the
    man-page, nobody will stop you. Nevertheless, we are happy to have/get
    correct man pages on the functional level which is the main aim.

    Ok. Where to send the patch?


    Oleg,

    the corresponding man-page is on core.tcl-lang.org/tcl in the folder /doc.

    You may also look at this wiki page to test the formatting:

    https://core.tcl-lang.org/tcl/wiki?name=How+to+edit/test+tcl+man+pages&p

    The formatting is IMHO difficult.

    You may add your changes as a patch to a ticket.
    You may also ask for commit rights and add them to a branch directly.

    Take care,
    Harald

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Mon Aug 30 07:09:52 2021
    On Monday, August 30, 2021 at 16:44:43 UTC+3, Harald Oehlmann wrote:
    Oleg,

    the corresponding man-page is on core.tcl-lang.org/tcl in the folder /doc. You may also look to this wiki page to test the formatting: https://core.tcl-lang.org/tcl/wiki?name=How+to+edit/test+tcl+man+pages&p
    The formatting is IMHO difficult.

    You may add your changes as path to a ticket.

    Done.

  • From Rich@21:1/5 to Robert Heller on Mon Aug 30 14:08:08 2021
    Robert Heller <heller@deepsoft.com> wrote:
    At Mon, 30 Aug 2021 05:03:15 -0700 (PDT) "oleg.o....@gmail.com" <oleg.o.nemanov@gmail.com> wrote:


    On Monday, August 30, 2021 at 14:53:33 UTC+3, Robert Heller wrote:
    The C optimizer is effectively replacing "a * -1" with "-a", because "-1" is a
    compile-time constant and the compiler knows that the negation instruction is
    faster than a multiply instruction.

    Exactly. :-)

    so the Tcl code
    using expr {$a * -1} is going to be slower than expr {-$a}.

    It's would be great if this info will be in expr man page.

    The timing difference between a multiplication instruction and a negation instruction should be well known. On a practical level using "expr {$a * -1}" vs. "expr {-$a}" should be very obvious, when the "-1" is a manifest constant.
    This is a trivial hand optimization, one the C compiler implements on its own -- good for lazy C programmers -- Tcl programmers need to be a little more pro-active, but that is always the case for Tcl programming.

    I agree with Robert. The man page is not the place to state well known performance differences. The manpage should document the
    options/syntax the command accepts, but if it starts also documenting
    obvious performance differences we will have a never ending set of
    "tips" to be added.

    Placing this on the Wiki, as a discussion on the [expr] page would be reasonable.

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Mon Aug 30 07:25:31 2021
    On Monday, August 30, 2021 at 16:33:23 UTC+3, Robert Heller wrote:
    The timing difference between a multiplication instruction and a negation instruction should be well known.

    Hm. Why? For example, I have a C background and for me this is not "well known". Moreover, in other programming languages the situation may differ. And the Tcl docs don't describe this case. So how should it be well known?

    On a practical level using "expr {$a * -1}"
    vs. "expr {-$a}" should be very obvious, when the "-1" is a manifest constant.

    No. This is not obvious. Maybe you have a lot of Tcl experience and that's why you think so.
    But I have no such experience, and I am telling you that this is not obvious.

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Mon Aug 30 07:31:04 2021
    On Monday, August 30, 2021 at 17:08:11 UTC+3, Rich wrote:
    I agree with Robert. The man page is not the place to state well known performance differences.

    This is not well known :-).

    The manpage should document the
    options/syntax the command accepts, but if it starts also documenting obvious performance differences we will have a never ending set of
    "tips" to be added.

    OK. Why is it bad when a man page collects all the info a developer needs?

    Placing this on the Wiki, as a discussion on the [expr] page would be reasonable.

    Maybe you are right. But why should the documentation for a function necessarily
    be spread across different locations? Is this good?

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Mon Aug 30 07:18:08 2021
    On Monday, August 30, 2021 at 16:55:19 UTC+3, Rich wrote:
    Of course. You might have a brand new i7, which will probably be
    markedly faster than my 7+ year old Xeon here.
    I'm interested in algorithmic difference(if it exists).
    Which is what time will give you. If you test -$x vs. -1*$x on the
    same machine, then you have kept machine differences constant (to the
    extent you can keep them constant) and so any differences must be due
    to different algorithms used for -$x vs -1*$x.

    Not really. I was talking about the situation where some operation/construct can be slower or faster
    on different machine architectures.

  • From Robert Heller@21:1/5 to oleg.o....@gmail.com on Mon Aug 30 10:26:54 2021
    At Mon, 30 Aug 2021 07:25:31 -0700 (PDT) "oleg.o....@gmail.com" <oleg.o.nemanov@gmail.com> wrote:


    On Monday, August 30, 2021 at 16:33:23 UTC+3, Robert Heller wrote:
    The timing difference between a multiplication instruction and a negation instruction should be well known.

    Hm. Why? For example, i have C background and for me this is not "well known".
    Moreover in other programming languages this situation may differ. And in tcl docs this case isn't described. So, how it should be well known?

    Modern *compiled* languages do all sorts of "obvious" optimizations, so I
    guess someone who has not written assembly code or otherwise studied machine-level coding or otherwise learned about low-level CPU ALU implementations, might not know about the timing differences between different sorts of machine instructions.


    On a practical level using "expr {$a * -1}"
    vs. "expr {-$a}" should be very obvious, when the "-1" is a manifest constant.

    No. This is not obvious. May be you have big tcl experience and that's why you think so.
    But i have no such experience and i say you that this is not obvious.





  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Mon Aug 30 09:39:03 2021
    On Monday, August 30, 2021 at 18:27:02 UTC+3, Robert Heller wrote:
    Modern *compiled* languages do all sorts of "obvious" optimizations, so I guess someone who has not written assembly code or otherwise studied machine-level coding or otherwise learned about low-level CPU ALU implementations, might not know about the timing differences between different
    sorts of machine instructions.

    Agreed. But this just confirms that it is not obvious or well known, doesn't it :-)?

  • From Robert Heller@21:1/5 to oleg.o....@gmail.com on Mon Aug 30 13:07:13 2021
    At Mon, 30 Aug 2021 09:39:03 -0700 (PDT) "oleg.o....@gmail.com" <oleg.o.nemanov@gmail.com> wrote:


    On Monday, August 30, 2021 at 18:27:02 UTC+3, Robert Heller wrote:
    Modern *compiled* languages do all sorts of "obvious" optimizations, so I guess someone who has not written assembly code or otherwise studied machine-level coding or otherwise learned about low-level CPU ALU implementations, might not know about the timing differences between different
    sorts of machine instructions.

    Agree. But this is just confirming that it is not obvious and well known, isn't it :-)?

    Maybe, but it is the sort of thing that is (should?) be part of a CompSci 101 course. It does not really belong in the man pages. It IS something that belongs in a CompSci textbook and/or course.

    I guess it is something I and probably most programmers of my generation, who started programming with Apple ][s, Commodore 64s, Trash-80s, Kim-1s, etc., learned early on. Even though current generation programmers are learning with modern machines with optimizing compilers on multi-core 64-bit processors, it still makes some sense to teach them about the basics of low-level CPU ALU implementations and instill a sense of what the performance costs of different ways to implement a given computation are.





  • From ted brown@21:1/5 to Robert Heller on Mon Aug 30 13:19:52 2021
    On 8/30/2021 11:07 AM, Robert Heller wrote:
    Maybe, but it is the sort of thing that is (should?) be part of a CompSci 101 course. It does not really belong in the man pages. It IS something that belongs in a CompSci textbook and/or course.

    I guess it is something I and probably most programmers of my generation, who started programming with Apple ][s, Commodore 64s, Trash-80s, Kim-1s, etc., learned early on. Even though current generation programmers are learning with modern machines with optimizing compilers on multi-core 64-bit processors, it still makes some sense to teach them about the basics of low-level CPU ALU implementations and instill a sense of what the performance costs of different ways to implement a given computation are.





    I would think that the section on [expr] performance is better left as
    is, since it discusses the most important thing for a new tcl programmer
    to be aware of, namely braces.

    % set n 123456789
    123456789

    % time {expr -$n} 100000
    0.9811310000000001 microseconds per iteration

    % time {expr {-$n}} 100000
    0.133201 microseconds per iteration

    % time {expr {-1*$n}} 100000
    0.154149 microseconds per iteration
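
    Besides speed, braces also change what gets evaluated. The sketch below is only an illustration with made-up variable names; it shows the double substitution that happens when the expression is not braced:

    ```tcl
    set a 3
    set b {$a}           ;# b holds the literal two-character string "$a"

    # Unbraced: the Tcl parser substitutes $b first, so expr receives the
    # string "$a*2" and then performs a second round of substitution itself.
    puts [expr $b*2]     ;# 6

    # Braced: expr substitutes $b exactly once; its value "$a" is not a number,
    # so the multiplication raises an error instead of being re-evaluated.
    if {[catch {expr {$b*2}} msg]} {
        puts "error: $msg"
    }
    ```

    With braces the expression is a constant string that can be compiled once, which fits the timings above.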

    Maybe the bytecode interpreter overhead is more important than the
    actual arithmetic statement that is eventually executed. Is a uminus
    going to be that much faster than a mult, or is it that there's an extra
    push that's the difference?


    % ::tcl::unsupported::disassemble script {expr {-$n}}
    .. snip ..
    Command 1: "expr {-$n}"
    (0) push1 0 # "n"
    (2) loadStk
    (3) uminus
    (4) done

    % ::tcl::unsupported::disassemble script {expr {-1*$n}}
    .. snip ..
    Command 1: "expr {-1*$n}"
    (0) push1 0 # "-1"
    (2) push1 1 # "n"
    (4) loadStk
    (5) mult
    (6) done
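
    To make the "extra push" question concrete, here is a toy stack machine
    written in Python (not Tcl, and not how Tcl's bytecode engine is actually
    implemented). The opcode names are borrowed from the two disassembly
    listings; the run function and its dispatch counter are invented for
    illustration. It only shows that the second form dispatches one more
    instruction than the first, which is a plausible source of per-iteration
    overhead in an interpreter.

```python
# Toy stack machine: a model only, NOT Tcl's real bytecode engine.
def run(program, variables):
    """Execute (opcode, operand) pairs; return (result, dispatch_count)."""
    stack = []
    dispatched = 0
    for opcode, operand in program:
        dispatched += 1
        if opcode == "push":        # push a literal (a variable name or a number)
            stack.append(operand)
        elif opcode == "loadStk":   # replace a variable name with its value
            stack.append(variables[stack.pop()])
        elif opcode == "uminus":    # unary negation of the top of stack
            stack.append(-stack.pop())
        elif opcode == "mult":      # multiply the two topmost values
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opcode == "done":      # stop and return the top of stack
            return stack.pop(), dispatched
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    raise RuntimeError("program ended without 'done'")


# The two instruction sequences from the listings above.
NEGATE = [("push", "n"), ("loadStk", None), ("uminus", None), ("done", None)]
TIMES_MINUS_ONE = [("push", -1), ("push", "n"), ("loadStk", None),
                   ("mult", None), ("done", None)]

variables = {"n": 123456789}
negate_result = run(NEGATE, variables)
times_result = run(TIMES_MINUS_ONE, variables)
print(negate_result)   # (-123456789, 4)
print(times_result)    # (-123456789, 5)
```

    Both programs produce the same value; the only difference the model can
    see is four dispatches versus five.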


    And, what happens when (and if?) we get the jit machine code that I saw
    a fascinating conference talk about? That would likely change everything
    and the difference might become undetectable, and not worth documenting.

  • From Rich@21:1/5 to oleg.o....@gmail.com on Mon Aug 30 21:55:59 2021
    oleg.o....@gmail.com <oleg.o.nemanov@gmail.com> wrote:
    On Monday, August 30, 2021 at 17:08:11 UTC+3, Rich wrote:
    I agree with Robert. The man page is not the place to state well known
    performance differences.

    This is not well known :-).

    The manpage should document the
    options/syntax the command accepts, but if it starts also documenting
    obvious performance differences we will have a never ending set of
    "tips" to be added.

    OK. Why is this bad (when a man page collects all the info a developer needs)?

    Well, in this case it would be bad because it is an obvious issue that
    should not need mention in the man page. Other than beginners, the
    rest of us should already know that a multiplication operation is more computational effort than a negation operation. The man page should
    document the necessary syntax in order to show how to use the command.
    Adding obvious items merely clutters the man page with information that
    all but true beginners have no need to read through. And true
    beginners should learn that the computational effort of a
    multiplication operation is larger than a negation via other teaching
    texts, not via the man page.

    Placing this on the Wiki, as a discussion on the [expr] page would be
    reasonable.

    Maybe you are right. But why should the documentation for a function
    necessarily be spread across different locations? Is this good?

    You could argue that the man page might include a reference to the wiki
    'expr' page for further information and/or tips and tricks, that way if
    someone is unaware of the wiki they would then be made aware.

  • From Rich@21:1/5 to oleg.o....@gmail.com on Mon Aug 30 22:24:49 2021
    oleg.o....@gmail.com <oleg.o.nemanov@gmail.com> wrote:
    On Monday, August 30, 2021 at 16:33:23 UTC+3, Robert Heller wrote:
    The timing difference between a multiplication instruction and a
    negation instruction should be well known.

    Hm. Why? For example, I have a C background and for me this is not
    "well known".

    If you consider what actually occurs to perform a multiplication (i.e.,
    the implementation of a binary number multiplication algorithm) vs.
    what actually occurs for a negation, the difference will be very
    obvious.

    The common base 10 method most of us were taught (which is termed the
    "shift and add" method in many Computer Science texts) is much more
    work to perform a multiplication (and some early CPUs implemented
    their multiplication instructions using this very method, which is why
    'mul' instructions were 64 clocks while and/or/xor were 1 clock
    instructions).

    The most unoptimized version of this algorithm requires n shifts and n
    add operations, producing a 2n size output, where n is the number of
    bits of the inputs (assuming both inputs are equal size). Computer
    Science texts are also filled with various alternative multiplication
    algorithms that improve upon this basic simple version, but they are
    still "more work" overall than a negation.
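
    As an illustration of the shift-and-add method described above, here is a
    minimal Python sketch for unsigned n-bit operands. The function name and
    the add/shift counters are made up for this example. Unlike the fully
    unoptimized version, it skips the add when the multiplier bit is zero, so
    it performs n shift steps and at most n additions, and the product always
    fits in 2n bits.

```python
def shift_and_add(a, b, n=16):
    """Multiply two unsigned n-bit integers; return (product, adds, shifts)."""
    if not (0 <= a < 1 << n and 0 <= b < 1 << n):
        raise ValueError("operands must fit in n unsigned bits")
    product = 0
    adds = shifts = 0
    for _ in range(n):        # one pass per bit of the multiplier
        if b & 1:             # low bit set: add the shifted multiplicand
            product += a
            adds += 1
        a <<= 1               # multiplicand moves one bit to the left
        b >>= 1               # expose the next multiplier bit
        shifts += 1
    return product, adds, shifts


product, adds, shifts = shift_and_add(1234, 5678)
print(product, adds, shifts)
```

    A negation, by contrast, needs no loop at all, which is the asymmetry the
    paragraph above is pointing at.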

    Depending upon the data type involved, a negation can be as simple as
    an XOR to toggle the state of a single bit.
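
    A small Python sketch of the two cases, with helper names invented for
    illustration. For an IEEE 754 double (assuming, as on practically every
    platform, that Python's float is binary64), negation really is a
    single-bit XOR on the sign bit. For a two's-complement integer it is
    "invert all bits and add one", which is still far less work than a
    multiplication.

```python
import struct

SIGN_BIT = 1 << 63  # top bit of an IEEE 754 binary64 value


def negate_double(x):
    """Negate a float by flipping only its sign bit."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    (result,) = struct.unpack("<d", struct.pack("<Q", bits ^ SIGN_BIT))
    return result


def negate_twos_complement(value, n=32):
    """Negate an n-bit two's-complement bit pattern: invert, then add one."""
    mask = (1 << n) - 1
    return ((value ^ mask) + 1) & mask


print(negate_double(1.5))                 # -1.5
print(hex(negate_twos_complement(5)))     # 0xfffffffb
```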

    For an example, see this PDF (https://edge.edx.org/c4x/BITSPilani/EEE231/asset/8086_family_Users_Manual_1_.pdf)
    (note, I choose the 8086 specifically because I already know it was
    slow for multiply).

    Page 67, Add register to register is 3 clocks, a logical And is also 3
    clocks.

    Page 70, Imul of 8 bit quantity is 80 to 98 clocks, and Imul of a 16
    bit quantity is 128 to 154 clocks.

    That is anywhere from 26 times to 51 times longer for a multiply than
    an add or a logical and.

    And, page 83, Xor register to register is 3 clocks.
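
    The ratios quoted above can be checked with a few lines of Python. The
    clock counts are the ones cited from the 8086 manual in this post; the
    variable names are only for this sketch.

```python
ADD_CLOCKS = 3  # register-to-register ADD, AND and XOR, as cited above
IMUL_CLOCKS = {"8-bit": (80, 98), "16-bit": (128, 154)}

# Cost of each IMUL variant relative to a 3-clock ADD/AND/XOR.
ratios = {width: (low / ADD_CLOCKS, high / ADD_CLOCKS)
          for width, (low, high) in IMUL_CLOCKS.items()}

for width, (low, high) in ratios.items():
    print(f"IMUL {width}: {low:.1f}x to {high:.1f}x the cost of ADD")
```

    The fastest 8-bit case comes out at roughly 26.7 times and the slowest
    16-bit case at roughly 51.3 times, matching the "26 times to 51 times"
    figure.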


    Moreover in other programming languages this situation may differ.

    Very unlikely. The programming language does not change the underlying algorithms necessary to perform base 2 multiplication or base 2
    negation, and those algorithms are more work for multiplication than
    for negation.

    And the Tcl docs don't describe this case. So how should it be well
    known?

    That which is considered well known is often not described.

    On a practical level using "expr {$a * -1}"
    vs. "expr {-$a}" should be very obvious, when the "-1" is a manifest constant.

    No. This is not obvious. Maybe you have a lot of Tcl experience and
    that's why you think so.

    There's no Tcl experience required. Since the dawn of computing the
    algorithm for multiplying binary numbers has required more steps than
    the algorithm for negating a binary number.

    But I have no such experience, and I am telling you that this is not obvious.

    Experience *does* have a direct impact on what any one individual
    considers obvious. In my case, I've known this fact (multiplication is
    more work than negation) for at least 34 years, possibly longer.

  • From Rich@21:1/5 to ted brown on Mon Aug 30 22:27:44 2021
    ted brown <tedbrown888@gmail.com> wrote:
    On 8/30/2021 11:07 AM, Robert Heller wrote:
    At Mon, 30 Aug 2021 09:39:03 -0700 (PDT) "oleg.o....@gmail.com" <oleg.o.nemanov@gmail.com> wrote:


    On Monday, August 30, 2021 at 18:27:02 UTC+3, Robert Heller wrote:
    Modern *compiled* languages do all sorts of "obvious"
    optimizations, so I guess someone who has not written assembly
    code or otherwise studied machine-level coding or otherwise
    learned about low-level CPU ALU implementations, might not know
    about the timing differences between different sorts of machine
    instructions.

    Agree. But this is just confirming that it is not obvious and well
    known, isn't it :-)?

    Maybe, but it is the sort of thing that is (or should be) part of a
    CompSci 101 course. It does not really belong in the man pages. It
    IS something that belongs in a CompSci textbook and/or course.

    I guess it is something I and probably most programmers of my
    generation, who started programming with Apple ][s, Commodore 64s,
    Trash-80s, Kim-1s, etc., learned early on. Even though current
    generation programmers are learning with modern machines with
    optimizing compilers on multi-core 64-bit processors, it still makes
    some sense to teach them about the basics of low-level CPU ALU
    implementations and instill a sense of what the performance costs of
    different ways to implement a given computation are.





    I would think that the section on [expr] performance is better left as
    is, since it discusses the most important thing for a new tcl programmer
    to be aware of, namely braces.

    Indeed, yes, this should remain, because there is no alternative way
    (other than pure trial and error) to learn the brace fact.

    But basic Computer Science 101/201 level material (binary
    multiplication is a complex algorithm) should not be part of the man
    pages.

  • From Rich@21:1/5 to Robert Heller on Mon Aug 30 22:32:15 2021
    Robert Heller <heller@deepsoft.com> wrote:
    At Mon, 30 Aug 2021 07:25:31 -0700 (PDT) "oleg.o....@gmail.com" <oleg.o.nemanov@gmail.com> wrote:


    On Monday, August 30, 2021 at 16:33:23 UTC+3, Robert Heller wrote:
    The timing difference between a multiplication instruction and a
    negation instruction should be well known.

    Hm. Why? For example, I have a C background and for me this is not
    "well known". Moreover, in other programming languages this
    situation may differ. And the Tcl docs don't describe this case.
    So how should it be well known?

    Modern *compiled* languages do all sorts of "obvious" optimizations,

    Yes they do, and some of it seems like 'magic' sometimes.

    so I guess someone who has not written assembly code or otherwise
    studied machine-level coding or otherwise learned about low-level CPU
    ALU implementations, might not know about the timing differences
    between different sorts of machine instructions.

    This is indeed *very* true. A background in C, with no experience at
    assembly, does not directly teach that multiplication operations are
    more expensive time wise than negation operations.

    Those of us who wrote our fair share of assembly years ago, and
    spent hours poring over the instruction clock references for timing
    purposes, learned *very quickly* that multiplication was more
    complicated. We learned even more so on those CPUs that lacked a
    multiply instruction, where *we* had to write the underlying
    multiplication routine ourselves if we wanted to multiply two values.

  • From Rich@21:1/5 to oleg.o....@gmail.com on Mon Aug 30 22:35:54 2021
    oleg.o....@gmail.com <oleg.o.nemanov@gmail.com> wrote:
    On Monday, August 30, 2021 at 16:55:19 UTC+3, Rich wrote:
    Of course. You might have a brand new i7, which will probably be
    markedly faster than my 7+ year old Xeon here.
    I'm interested in the algorithmic difference (if it exists).
    Which is what time will give you. If you test -$x vs. -1*$x on the
    same machine, then you have kept machine differences constant (to the
    extent you can keep them constant) and so any differences must be due
    to different algorithms used for -$x vs -1*$x.

    Not really. I was talking about the situation where some operation/construct
    can be slower/faster on different machine architectures.

    Then you left an awful lot of information up to one's imagination,
    because this:

    "What is faster: [expr {$n * -1}] or [expr {-$n}] ? :-)"

    Does not suggest to me that you are asking which of those are
    faster/slower on different machine architectures.

    And with the exception of your "hi", that quoted line above is all that
    was in your original posting to which I originally replied to create
    this sub-thread.

  • From ted brown@21:1/5 to Rich on Mon Aug 30 18:50:25 2021
    On 8/30/2021 6:55 AM, Rich wrote:
    oleg.o....@gmail.com <oleg.o.nemanov@gmail.com> wrote:
    On Friday, August 27, 2021 at 21:11:20 UTC+3, Rich wrote:
    oleg.o....@gmail.com <oleg.o....@gmail.com> wrote:
    Hi, all.

    What is faster: [expr {$n * -1}] or [expr {-$n}] ? :-)
    Read up on the manpage of Tcl's 'time' command. Using it, you can
    answer your own question, on your hardware.

    Thanks, but I have already done this step :-).

    You had not indicated so yet in this thread when I posted my reply.

    Time can give different results on my machine than on others, for
    example (due to various factors).

    Of course. You might have a brand new i7, which will probably be
    markedly faster than my 7+ year old Xeon here.

    I'm interested in the algorithmic difference (if it exists).

    Which is what time will give you. If you test -$x vs. -1*$x on the
    same machine, then you have kept machine differences constant (to the
    extent you can keep them constant) and so any differences must be due
    to different algorithms used for -$x vs -1*$x.


    And [time] can tell you things that can be a surprise; so after I lopped
    off the right most digit:

    % set n 4543534534
    4543534534
    % time {expr {$n * -1}} 100000
    0.453894 microseconds per iteration

    % set n 454353453 ;# w/o the last digit
    454353453
    % time {expr {$n * -1}} 100000
    0.16878700000000002 microseconds per iteration


    When I ran my tests with a different number than the OP used, I was
    thinking, gee I have a really fast machine!

    I'm guessing that the number the OP chose to test with must be larger
    than my system can handle as a simple machine word and so something is
    going on by tcl behind the scenes that I don't know about. I used 32 bit
    tcl here, but 64 bit tcl got the same timing.

  • From Rich@21:1/5 to ted brown on Tue Aug 31 03:00:28 2021
    ted brown <tedbrown888@gmail.com> wrote:
    On 8/30/2021 6:55 AM, Rich wrote:
    oleg.o....@gmail.com <oleg.o.nemanov@gmail.com> wrote:
    On Friday, August 27, 2021 at 21:11:20 UTC+3, Rich wrote:
    oleg.o....@gmail.com <oleg.o....@gmail.com> wrote:
    Hi, all.

    What is faster: [expr {$n * -1}] or [expr {-$n}] ? :-)
    Read up on the manpage of Tcl's 'time' command. Using it, you can
    answer your own question, on your hardware.

    Thanks, but I have already done this step :-).

    You had not indicated so yet in this thread when I posted my reply.

    Time can give different results on my machine than on others, for
    example (due to various factors).

    Of course. You might have a brand new i7, which will probably be
    markedly faster than my 7+ year old Xeon here.

    I'm interested in the algorithmic difference (if it exists).

    Which is what time will give you. If you test -$x vs. -1*$x on the
    same machine, then you have kept machine differences constant (to the
    extent you can keep them constant) and so any differences must be due
    to different algorithms used for -$x vs -1*$x.


    And [time] can tell you things that can be a surprise; so after I lopped
    off the right most digit:

    % set n 4543534534
    4543534534
    % time {expr {$n * -1}} 100000
    0.453894 microseconds per iteration

    The value 4543534534 is large enough to require a 64-bit value. The
    largest 32-bit signed integer is 2^31-1 or 2147483647.

    % set n 454353453 ;# w/o the last digit
    454353453
    % time {expr {$n * -1}} 100000
    0.16878700000000002 microseconds per iteration

    That value fits into a 32-bit signed integer
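
    The range claim is easy to check. Below is a short Python sketch (the
    helper name fits_int32 is made up for this example) that reports how many
    bits each of the two test values needs and whether it fits in a signed
    32-bit integer. It says nothing about how Tcl itself chooses its internal
    integer representation.

```python
INT32_MAX = 2**31 - 1   # 2147483647, the largest signed 32-bit value
INT32_MIN = -2**31


def fits_int32(value):
    """True if value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


for n in (4543534534, 454353453):
    print(n, n.bit_length(), fits_int32(n))
```

    The first value needs 33 bits and does not fit; the second needs 29 bits
    and does.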

    When I ran my tests with a different number than the OP used, I was
    thinking, gee I have a really fast machine!

    I'm guessing that the number the OP chose to test with must be larger
    than my system can handle as a simple machine word and so something
    is going on by tcl behind the scenes that I don't know about. I used
    32 bit tcl here, but 64 bit tcl got the same timing.

    There you go. For 4543534534 you are computing a 64-bit sum using
    32-bit arithmetic. For 454353453 you are computing a 32-bit sum using
    32-bit arithmetic.

    The difference is being able to perform a single native CPU "add"
    instruction to obtain the result vs. performing effectively this
    algorithm (oversimplified).

    uint32_t hi1, lo1;   /* 64-bit value #1 */
    uint32_t hi2, lo2;   /* 64-bit value #2 */
    uint32_t hir, lor;   /* 64-bit result */

    lor = lo1 + lo2;                 /* low words; wraps modulo 2^32 */
    uint32_t carry = (lor < lo1);    /* 1 if the low-word add overflowed */
    hir = hi1 + hi2 + carry;         /* high words plus the carry */
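
    The same two-step addition can be emulated in Python by masking every
    intermediate value to 32 bits. This is only a sketch of the idea, with
    invented helper names, and it makes no claim about what Tcl does
    internally.

```python
MASK32 = 0xFFFFFFFF


def split64(value):
    """Split an unsigned 64-bit value into (high, low) 32-bit words."""
    return (value >> 32) & MASK32, value & MASK32


def add64_via_32(a, b):
    """Add two unsigned 64-bit values using only 32-bit wide results."""
    hi1, lo1 = split64(a)
    hi2, lo2 = split64(b)
    lo_sum = lo1 + lo2
    lor = lo_sum & MASK32                 # low word of the result
    carry = lo_sum >> 32                  # 1 if the low-word add overflowed
    hir = (hi1 + hi2 + carry) & MASK32    # high word, wrapping at 64 bits
    return (hir << 32) | lor


print(add64_via_32(4543534534, 4543534534))
```

    A value that fits in one machine word needs a single add; here it takes
    two adds plus the carry handling.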

  • From Christian Gollwitzer@21:1/5 to All on Tue Aug 31 08:50:07 2021
    On 30.08.21 at 22:19, ted brown wrote:
    Maybe the bytecode interpreter overhead is more important than the
    actual arithmetic statement that is eventually executed. Is a uminus
    going to be that much faster than a mult, or is it that there's an extra
    push that's the difference?

    It is definitely the bytecode interpreter, with the push being a good
    explanation. You measured 0.1 *microseconds* whereas the CPU takes
    *nanoseconds* to execute the actual multiplication for short integers.

    This only changes when you are using big integers, where this is not a
    single instruction but a complex library function.

    In the end, this is just a very minor change in a larger program, I
    doubt that you can significantly speed up any Tcl program by
    "optimizing" the exprs, except of course for bracing expressions.

    Christian

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Tue Aug 31 02:46:38 2021
    On Monday, August 30, 2021 at 21:07:21 UTC+3, Robert Heller wrote:
    At Mon, 30 Aug 2021 09:39:03 -0700 (PDT) "oleg.o....@gmail.com" <oleg.o....@gmail.com> wrote:


    On Monday, August 30, 2021 at 18:27:02 UTC+3, Robert Heller wrote:
    Modern *compiled* languages do all sorts of "obvious" optimizations, so I
    guess someone who has not written assembly code or otherwise studied
    machine-level coding or otherwise learned about low-level CPU ALU
    implementations, might not know about the timing differences between
    different sorts of machine instructions.

    Agree. But this is just confirming that it is not obvious and well known,
    isn't it :-)?

    Maybe, but it is the sort of thing that is (or should be) part of a CompSci
    101 course. It does not really belong in the man pages. It IS something that
    belongs in a CompSci textbook and/or course.

    I guess it is something I and probably most programmers of my generation, who
    started programming with Apple ][s, Commodore 64s, Trash-80s, Kim-1s, etc.,
    learned early on. Even though current generation programmers are learning
    with modern machines with optimizing compilers on multi-core 64-bit
    processors, it still makes some sense to teach them about the basics of
    low-level CPU ALU implementations and instill a sense of what the
    performance costs of different ways to implement a given computation are.

    Oh Robert :-). This is an idealistic view. I also think programmers should
    learn how the machine they use works, but unfortunately, in real life,
    people who program in Tcl, Perl, or Python don't learn low-level things
    (let's not lie to ourselves). These are high-level languages, and their
    strong point is precisely that they hide low-level things (which we would
    need to know for performance tricks anyway).

    As for low-level things, we both know that even this knowledge is
    insufficient. Because Tcl is a high-level language, to know how fast this or
    that code is you need to know the following:
    1. CPU/MCU architecture and instruction set
    2. Compiler/interpreter internals: how it translates its constructs to the
       specific CPU/MCU instruction set

    So CPU/MCU low-level knowledge gives you nothing without the second item. And
    to know the compiler/interpreter internals, you need to know the language it
    is written in. Thus the list above becomes:
    1. CPU/MCU architecture and instruction set
    2. Compiler/interpreter internals: how it translates its constructs to the
       specific CPU/MCU instruction set
    3. C knowledge
    4. A lot of free time to learn Tcl/Perl/Python internals

    If we are not dreamers, we understand that it is much simpler just to ask or
    to read the docs (if they are available).

    As for documenting this performance case: I think like the others. I already
    know this (thanks to everyone who helped!), and I don't care about other
    newbies.

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Tue Aug 31 02:48:56 2021
    On Monday, August 30, 2021 at 23:19:56 UTC+3, tedbr...@gmail.com wrote:
    % ::tcl::unsupported::disassemble script {expr {-$n}}

    Folks, this is off-topic, but I have to ask: how long have we had this
    ::tcl::unsupported::undocumented::usedbyeverybody::proc ? Maybe these very
    useful routines, like representation and disassemble, should be placed in the
    ::tcl namespace and documented in the tcl(n) man page?

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Tue Aug 31 03:14:14 2021
    On Tuesday, August 31, 2021 at 00:56:02 UTC+3, Rich wrote:
    Well, in this case it would be bad because it is an obvious issue that should not need mention in the man page. Other than beginners, the
    rest of us should already know that a multiplication operation is more computational effort than a negation operation.

    Agree. Who cares about noobs? For these questions we have stackoverflow.

    Rich, I'm amazed. I have programmed general-purpose CPUs in C and in assembler
    on various OSes, and have some experience programming MCUs in C and in
    assembler, but I don't think these high-level things are obvious (because of
    the many layers between machine instructions and a high-level language).
    There are two options here:
    1. I'm stupid, and you and the others know every combination of CPU/MCU
       architecture and Tcl internals, so this is obvious. (Actually, even that
       doesn't make it obvious: the amount of knowledge required to call it
       obvious indicates precisely that it is not obvious to someone who does
       not have this knowledge.)
    2. Or you and the others have known this fact for a long time, and therefore
       it seems obvious to you.

    The man page should
    document the necessary syntax in order to show how to use the command. Adding obvious items merely clutters the man page with information that
    all but true beginners have no need to read through.

    Yes. But you can just press PgDown to scroll past this info.

    You could argue that the man page might include a reference to the wiki 'expr' page for further information and/or tips and tricks, that way if someone is unaware of the wiki they would then be made aware.

    That would be good too.
    But, IMHO, a separate man page for performance info, or a separate section in
    every man page, would be more useful.

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Tue Aug 31 03:18:34 2021
    On Tuesday, August 31, 2021 at 01:35:58 UTC+3, Rich wrote:
    Then you left an awful lot of information up to one's imagination,
    because this:
    "What is faster: [expr {$n * -1}] or [expr {-$n}] ? :-)"
    Does not suggest to me that you are asking which of those are
    faster/slower on different machine architectures.

    And with the exception of your "hi", that quoted line above is all that
    was in your original posting to which I originally replied to create
    this sub-thread.

    Sorry, but i think this is obvious ;-).

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Tue Aug 31 03:39:11 2021
    Look, colleagues, about "obvious" things: just the fact that C translates -N
    and N * -1 to the *same* construction already makes this not obvious. We
    already have one real-life example where your assumptions about the
    difference do not hold.

  • From oleg.o.nemanov@gmail.com@21:1/5 to All on Tue Aug 31 03:24:55 2021
    On Tuesday, August 31, 2021 at 01:24:52 UTC+3, Rich wrote:
    Experience *does* have a direct impact on what any one individual
    considers obvious. In my case, I've known this fact (multiplication is
    more work than negation) for at least 34 years, possibly longer.

    People often confuse "obvious" and "knowledge based on my experience".
    For those who didn't learn assembler this is not obvious. And this fact is obvious :-).
