• Itanium support is back in GCC 15

    From Simon Clubley@21:1/5 to All on Mon Nov 4 18:26:20 2024
    Itanium support will no longer be removed from GCC and Itanium will
    instead continue as a supported architecture (at least for Linux).

    https://www.theregister.com/2024/11/01/gcc_15_keep_itanium_support/

    There's a call in that article for an open source full-system emulator.
    Good luck with that one, especially for one that would run VMS as well. :-)

    One question: Why ? :-)

    Simon.

    --
    Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
    Walking destinations on a map are further away than they appear.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Arne Vajhøj@21:1/5 to Simon Clubley on Mon Nov 4 15:16:09 2024
    On 11/4/2024 1:26 PM, Simon Clubley wrote:
    Itanium support will no longer be removed from GCC and Itanium will
    instead continue as a supported architecture (at least for Linux).

    https://www.theregister.com/2024/11/01/gcc_15_keep_itanium_support/

    There's a call in that article for an open source full-system emulator.
    Good luck with that one, especially for one that would run VMS as well. :-)

    One question: Why ? :-)

    As for why: it seems obvious that there is no
    good commercial reason for GCC to support Itanium, but
    apparently someone is willing to do the work just for fun.

    And in the open source world if someone is willing
    to do the work for fun then it (usually) does happen.

    And Itanium is rather different from most other
    architectures, so from an academic perspective it
    may be interesting.

    I wish someone would volunteer to create VMS support
    in GCC 16 or whatever!

    Arne

  • From Waldek Hebisch@21:1/5 to gcalliet on Mon Nov 11 01:03:29 2024
    gcalliet <gerard.calliet@pia-sofer.fr> wrote:

    On my side I have always thought the failure of Itanium - they said
    Itanic - have been just the bad meeting between the conservatism of
    geeks and the inchoate laws of the market. Our hatred of Itanium
    contributed to the long life of the very archaic x86 to which the very
    wise Intel returned, for its greater good.

    The failure of Itanic was extensively discussed in comp.arch. There
    were fundamental issues: the EPIC concept required the compiler to arrange
    code in clever ways to gain good performance. Hand-coding small
    examples suggested that it is possible to write fast code for
    EPIC, but both when the Itanic project started and now, nobody knows
    how to do this in a compiler. There is a related issue: when
    Itanic started, it was not known how to get good instruction-level
    parallelism on conventional architectures. But then branch predictors
    happened, and Intel and AMD were able to get good ILP from x86
    (the same could be done with many other architectures, but it is
    incompatible with Itanic principles).

    Besides the fundamental problems, there were several specific blunders.

    Anyway, Itanic was late, expensive and had unimpressive performance.
    Some people were waiting for it, but what was promised (top
    performance) never appeared.

    --
    Waldek Hebisch

  • From gcalliet@21:1/5 to All on Thu Nov 7 18:33:57 2024
    On 04/11/2024 at 21:16, Arne Vajhøj wrote:
    On 11/4/2024 1:26 PM, Simon Clubley wrote:
    Itanium support will no longer be removed from GCC and Itanium will
    instead continue as a supported architecture (at least for Linux).

    https://www.theregister.com/2024/11/01/gcc_15_keep_itanium_support/

    There's a call in that article for an open source full-system emulator.
    Good luck with that one, especially for one that would run VMS as
    well. :-)

    One question: Why ? :-)

    Regarding why, then it seems obvious that there are no
    good commercial reason for GCC to support Itanium, but
    apparently someone is willing to do the work just for fun.

    And in the open source world if someone is willing
    to do the work for fun then it (usually) does happen.

    And Itanium is rather different from most other
    architectures, so from an academic perspective it
    may be interesting.

    I wish someone would volunteer to create VMS support
    in GCC 16 or whatever!

    Arne

    Since I built GNAT Ada (on GCC) for VMS Itanium with a Canadian cross,
    and since we were on GCC 4.7, there is some work ahead, but why not :)

    The big issue is the step to GCC 5, where they moved to C++ mode. It
    is one of the reasons why AdaCore didn't continue support of GNAT Ada on
    VMS in 2015.

    I have to know who likes Itanium so much :)

    gcalliet

  • From Arne Vajhøj@21:1/5 to gcalliet on Thu Nov 7 16:48:49 2024
    On 11/7/2024 12:33 PM, gcalliet wrote:
    On 04/11/2024 at 21:16, Arne Vajhøj wrote:
    I wish someone would volunteer to create VMS support
    in GCC 16 or whatever!

    Because I created (canadian method) Gnat Ada (on gcc) for VMS Itanium,
    and because we were on gcc 4.7, there is some work ahead, but why not :)

    The big issue is the step to gcc 5, where they upgraded to c++ mode. It
    is one of the reasons why Adacore didn't continue support of gnat ada on
    VMS in 2015.

    VMS x86-64 has a better C++ compiler than VMS Itanium.

    But I have no idea which is best for bootstrapping:

    g++/Linux -> GXX/VMS

    clang/VMS -> GXX/VMS

    I assume that if a recent GXX/VMS is working then getting
    GFortran and Gnat working would become a lot easier.

    But obviously a lot of work. And I do not expect it to happen. Just
    a thought given that someone wanted to support GCC/Itanium.

    Arne

  • From gcalliet@21:1/5 to All on Fri Nov 8 09:59:50 2024
    On 07/11/2024 at 22:48, Arne Vajhøj wrote:
    But obviously a lot of work. And I do not expect it to happen. Just
    a thought given that someone wanted to support GCC/Itanium.

    You are right, a lot of work. But perhaps a lot of fun :)

    About Itanium, who knows? I have heard of some specific uses of Itanium,
    so perhaps a very small Itanium business could exist at some point.

    For my part, I have always thought the failure of Itanium (they called it
    Itanic) was just an unfortunate meeting between the conservatism of
    geeks and the inchoate laws of the market. Our hatred of Itanium
    contributed to the long life of the very archaic x86, to which the very
    wise Intel returned, for its greater good.

    Just to initiate a great controversy :)

    gcalliet

  • From Arne Vajhøj@21:1/5 to gcalliet on Fri Nov 8 09:04:05 2024
    On 11/8/2024 3:59 AM, gcalliet wrote:
    About Itanium, who knows? I heard about some specific uses of Itanium.
    So perhaps a very little business with Itanium could exist sometime.

    On my side I have always thought the failure of Itanium - they said
    Itanic - have been just the bad meeting between the conservatism of
    geeks and the inchoate laws of the market. Our hatred of Itanium
    contributed to the long life of the very archaic x86 to which the very
    wise Intel returned, for its greater good.

    VMS people never liked Itanium. We loved VAX and Alpha, we are OK with
    x86-64, but Itanium was only bought because for almost 2 decades it
    was the only option for a new VMS box.

    Itanium never had a chance. But it was due to money.

    The CPU cost structure (huge fixed costs for design and fab construction
    versus relatively small variable costs) means that only CPUs selling
    in the hundreds of millions can compete on cost. So Itanium fell
    behind in clock speed, number of cores and energy efficiency.

    The EPIC concept has been summarized as "leave the real work to the
    compiler", and for that to succeed, huge investments in
    compiler technology would have been needed: hundreds, maybe
    thousands, of engineers working on compiler backends. That did not
    happen, not at HP, not at Intel, not anywhere. So on VMS Itanium
    the generated "bundles" have a huge percentage of NOPs.
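
    To make the NOP point concrete, here is a minimal Python sketch of a
    greedy bundle packer. It is only a toy model under stated assumptions:
    it keeps the three slots per bundle of IA-64 but ignores templates,
    slot types and mid-bundle stops, and it assumes an instruction can
    never share a bundle with something it depends on. The names
    pack_bundles and nop_fraction are made up for this illustration and
    have nothing to do with GEM or any real backend.

```python
from typing import Dict, List, Set

SLOTS_PER_BUNDLE = 3  # an IA-64 bundle holds three instruction slots


def pack_bundles(deps: Dict[str, Set[str]], order: List[str]) -> List[List[str]]:
    """Greedily pack instructions into 3-slot bundles, in program order.

    Simplifying assumption: an instruction may not share a bundle with
    anything it depends on, so a dependency forces a new bundle and the
    unused slots of the old one are padded with NOPs.
    """
    bundles: List[List[str]] = []
    current: List[str] = []
    for insn in order:
        conflicts = deps.get(insn, set()) & set(current)
        if conflicts or len(current) == SLOTS_PER_BUNDLE:
            # Close the current bundle, padding any empty slots.
            current += ["nop"] * (SLOTS_PER_BUNDLE - len(current))
            bundles.append(current)
            current = []
        current.append(insn)
    if current:
        current += ["nop"] * (SLOTS_PER_BUNDLE - len(current))
        bundles.append(current)
    return bundles


def nop_fraction(bundles: List[List[str]]) -> float:
    """Share of all slots that ended up holding a NOP."""
    slots = [s for b in bundles for s in b]
    return slots.count("nop") / len(slots)


# A serial chain: each instruction needs the result of the previous one.
chain = ["a", "b", "c", "d"]
chain_deps = {"b": {"a"}, "c": {"b"}, "d": {"c"}}

# Four instructions with no dependencies between them.
independent = ["w", "x", "y", "z"]

chain_bundles = pack_bundles(chain_deps, chain)
indep_bundles = pack_bundles({}, independent)
print(chain_bundles, nop_fraction(chain_bundles))
print(indep_bundles, nop_fraction(indep_bundles))
```

    In this model the dependent chain wastes two of every three slots,
    while the independent instructions waste one slot in three. A real
    compiler closes that gap by finding independent work to interleave,
    which is where the engineering effort goes.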

    Could Itanium design have worked out if by magic the necessary
    money for CPU development and compiler backend development had
    been there? That is an academic question with no practical
    impact - it did not happen and it could never have happened.

    But from a technical perspective I do see some
    benefits in the Itanium design. CPUs have hit the GHz
    cap: simply doubling the clock speed every generation
    is not physically possible. x86-64 has worked around that
    mostly by increasing the number of cores. 1->2->4->8->16->24->32 cores
    worked pretty well, as both servers and desktop computers run
    a lot of processes and/or threads in parallel. But 64, 128,
    192 and 256 cores? If you are running a hypervisor and 10 VMs then all
    is good, but what if that is not the case? The Itanium bundles
    offer a way to parallelize hardware usage for single
    threads.

    Modern x86-64 does a lot of advanced stuff under the hood to
    do similar things. But it is limited by the instruction set
    and the memory model. With the same level of investment,
    I believe Itanium would do better.

    But it is all pretty pointless. It is like asking: what if the speed
    of light were 20 MPH instead of 186,000 MPS?

    Arne

  • From John Dallman@21:1/5 to gcalliet on Fri Nov 8 22:17:00 2024
    In article <lp6284Fll3cU1@mid.individual.net>,
    gerard.calliet@pia-sofer.fr (gcalliet) wrote:

    About Itanium, who knows? I heard about some specific uses of
    Itanium. So perhaps a very little business with Itanium could exist
    sometime.

    It can't last now. There is a finite supply of Itanium CPUs and no more
    are being made.

    On my side I have always thought the failure of Itanium - they said
    Itanic - have been just the bad meeting between the conservatism of
    geeks and the inchoate laws of the market.

    It also had fundamental technical flaws. The basic idea of EPIC, that a
    compiler with plenty of time to plan can schedule advance loads from
    memory well enough to make out-of-order execution unnecessary, is wrong.

    It would be possible to do that in a single-core system with no processor
    caches, a single-tasking operating system, and few interrupts going off.
    In a multi-processor, multi-tasking system which is taking interrupts, it
    is impossible to know in advance what data will be in which cache levels,
    and hence to optimise memory access in advance.

    John

  • From Scott Dorsey@21:1/5 to arne@vajhoej.dk on Sun Feb 23 18:11:19 2025
    Arne Vajhøj <arne@vajhoej.dk> wrote:

    VMS x86-64 has a better C++ compiler than VMS Itanium.

    It is MUCH harder to write an efficient VLIW compiler than an efficient
    compiler for a traditional architecture. The need to keep as many
    parts of the processor as possible working at the same time for optimal
    performance makes for a lot of added work by the compiler back end.

    The whole idea of the VLIW system is that the compiler will be able to
    optimize the code to gain parallelism of units inside the single
    processor. This is a very, very ingenious idea, but nobody has yet
    been able to make a compiler that could do it well enough for it to be
    a real win.

    It is a very difficult job. A lot of work was put into it. The
    available resources for that work have all evaporated now, gone
    elsewhere to other better-performing projects. The chance of the
    fundamental problems ever getting solved at this point is slim.
    --scott

    --
    "C'est un Nagra. C'est suisse, et tres, tres precis."

  • From John Dallman@21:1/5 to Dorsey on Sun Feb 23 21:29:00 2025
    In article <vpfoc7$2o5$1@panix2.panix.com>, kludge@panix.com (Scott
    Dorsey) wrote:

    The whole idea of the VLIW system is that the compiler will be able
    to optimize the code to gain parallelism of units inside the single
    processor. This is a very very ingenious idea but nobody has yet
    been able to make a compiler that could do it well enough for it to
    be a real win.

    Sadly, the job is *impossible*.

    The fundamental problem in optimisation for modern computers is the
    slowness of main RAM, which isn't currently solvable at a reasonable cost.
    We use caches to mitigate it.

    Out-of-order execution addresses this problem by tracking the data
    dependencies on memory and registers in real time and executing
    instructions when their data is available. This has worked pretty well
    for almost thirty years for x86 and the other architectures that are
    still competing on performance.

    Itanium/EPIC was an alternative to this. The management of data
    dependencies wasn't to be done dynamically by hardware, but in advance by
    the compiler. This requires the compiler to track what data is in cache
    so that advance loads can be scheduled correctly to have data available
    in time. Unfortunately, in a multi-core system with a multi-tasking
    operating system, it's impossible to know in advance what data will be in
    cache, because that depends on what else is running.
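
    A toy Python model of that argument, offered as a sketch rather than a
    description of any real pipeline: the compiler issues each load a fixed
    number of cycles ahead of its use, and any latency beyond that distance
    is a stall. The function name stall_cycles, the ooo_window parameter and
    the latency figures are all assumptions chosen for illustration, not
    measurements.

```python
from typing import List


def stall_cycles(latencies: List[int], hoist_distance: int, ooo_window: int = 0) -> int:
    """Total cycles lost waiting for loads.

    latencies      -- actual latency of each load, in cycles
    hoist_distance -- how far ahead of its use the compiler issued each
                      load (the static schedule, fixed at compile time)
    ooo_window     -- extra independent work, in cycles, that hardware can
                      find at run time; 0 models a purely in-order core
    """
    covered = hoist_distance + ooo_window
    return sum(max(0, latency - covered) for latency in latencies)


L1_HIT, MISS = 4, 200  # illustrative latencies only

quiet = [L1_HIT] * 10              # every load hits, as the compiler assumed
noisy = [L1_HIT] * 8 + [MISS] * 2  # another task evicted two cache lines

in_order_quiet = stall_cycles(quiet, hoist_distance=4)
in_order_noisy = stall_cycles(noisy, hoist_distance=4)
dynamic_noisy = stall_cycles(noisy, hoist_distance=4, ooo_window=60)
print(in_order_quiet, in_order_noisy, dynamic_noisy)
```

    The static schedule is perfect when the cache behaves as predicted and
    pays the full miss penalty when it does not. The dynamic case still
    stalls, just less, because it can react to the miss when it happens.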

    Other flaws of Itanium include the bulky instruction set, which needs
    more memory bandwidth and larger caches than other architectures, and an
    architectural misfeature which means floating-point advance loads that
    are outstanding across subroutine calls can fail silently.

    If anyone tries to re-use ideas from Itanium, they'd be well-advised to
    keep quiet about where they got them. There is remaining prejudice
    against it, which is well-justified.

    John

  • From Michael S@21:1/5 to arne@vajhoej.dk on Mon Feb 24 19:42:39 2025
    On Thu, 7 Nov 2024 16:48:49 -0500
    Arne Vajhøj <arne@vajhoej.dk> wrote:

    On 11/7/2024 12:33 PM, gcalliet wrote:
    On 04/11/2024 at 21:16, Arne Vajhøj wrote:
    I wish someone would volunteer to create VMS support
    in GCC 16 or whatever!

    Because I created (canadian method) Gnat Ada (on gcc) for VMS
    Itanium, and because we were on gcc 4.7, there is some work ahead,
    but why not :)

    The big issue is the step to gcc 5, where they upgraded to c++
    mode. It is one of the reasons why Adacore didn't continue support
    of gnat ada on VMS in 2015.

    VMS x86-64 has a better C++ compiler than VMS Itanium.


    According to the benchmarks that you posted here several months (a
    year?) ago, VMS x86-64 compilers are quite awful compared to
    x86-64 compilers available on Windows/Linux/BSD.
    Are you saying that VMS Itanium compilers are worse?

    But I have no idea which is best for boot strapping:

    g++/Linux -> GXX/VMS

    clang/VMS -> GXX/VMS

    I assume that if a recent GXX/VMS is working then getting
    GFortran and Gnat working would become a lot easier.

    But obviously a lot of work. And I do not expect it to happen. Just
    a thought given that someone wanted to support GCC/Itanium.

    Arne


  • From Stephen Hoffman@21:1/5 to John Dallman on Mon Feb 24 12:22:35 2025
    On 2025-02-23 21:29:00 +0000, John Dallman said:

    In article <vpfoc7$2o5$1@panix2.panix.com>, kludge@panix.com (Scott
    Dorsey) wrote:

    The whole idea of the VLIW system is that the compiler will be able to
    optimize the code to gain parallelism of units inside the single
    processor. This is a very very ingenious idea but nobody has yet been
    able to make a compiler that could do it well enough for it to be a
    real win.

    Sadly, the job is *impossible*.

    The fundamental problem in optimisation for modern computers is the
    slowness of main RAM, which isn't currently solvable at a reasonable
    cost. We use caches to mitigate it.

    Out-of-order execution addresses this problem by tracking the data
    dependencies on memory and registers in real time and executing
    instructions when their data is available....

    The Itanium compiler optimizer just doesn't (and can't) know enough
    about the system memory state, yes. Among other (no pun intended)
    issues.

    The attempt to address that included providing run-time feedback into
    the executables; providing post-link, post-execution tuning. (Caliper /
    Atom / OM / etc.)

    https://www.cs.tufts.edu/comp/150PAT/tools/caliper/wiess-rev-4.pdf

    This Alpha versus IA-64 Itanium paper from 1999 describes the issues
    with Itanium quite well too, for those interested:

    https://web.archive.org/web/20010611202933/http://www.compaq.com/hpc/ref/ref_alpha_ia64.doc



    --
    Pure Personal Opinion | HoffmanLabs LLC

  • From Robert A. Brooks@21:1/5 to Michael S on Mon Feb 24 13:10:34 2025
    On 2/24/2025 12:42, Michael S wrote:
    On Thu, 7 Nov 2024 16:48:49 -0500
    Arne Vajhøj <arne@vajhoej.dk> wrote:

    On 11/7/2024 12:33 PM, gcalliet wrote:
    On 04/11/2024 at 21:16, Arne Vajhøj wrote:
    I wish someone would volunteer to create VMS support
    in GCC 16 or whatever!

    Because I created (canadian method) Gnat Ada (on gcc) for VMS
    Itanium, and because we were on gcc 4.7, there is some work ahead,
    but why not :)

    The big issue is the step to gcc 5, where they upgraded to c++
    mode. It is one of the reasons why Adacore didn't continue support
    of gnat ada on VMS in 2015.

    VMS x86-64 has a better C++ compiler than VMS Itanium.


    According to the benchmarks that you posted here several months (a
    year?) ago, VMS x86-64 compilers are quite awful comparatively to
    x86-64 compilers available on Windows/Linux/BSD.
    Do you want to say that VMS Itanium compilers are worse?

    Without knowing the compiler version, it's impossible to comment.
    If they were cross-compilers, there was no optimization at all.

    --
    -- Rob

  • From Simon Clubley@21:1/5 to Michael S on Mon Feb 24 18:57:55 2025
    On 2025-02-24, Michael S <already5chosen@yahoo.com> wrote:
    On Thu, 7 Nov 2024 16:48:49 -0500
    Arne Vajhøj <arne@vajhoej.dk> wrote:

    VMS x86-64 has a better C++ compiler than VMS Itanium.


    According to the benchmarks that you posted here several months (a
    year?) ago, VMS x86-64 compilers are quite awful comparatively to
    x86-64 compilers available on Windows/Linux/BSD.

    Given all the various bits of movement in multiple areas over the last
    year or so, it might be time for those same tests to be run again against
    current compiler versions.

    Simon.

    --
    Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
    Walking destinations on a map are further away than they appear.

  • From John Reagan@21:1/5 to Stephen Hoffman on Mon Feb 24 14:00:44 2025
    On 2/24/2025 12:22 PM, Stephen Hoffman wrote:
    On 2025-02-23 21:29:00 +0000, John Dallman said:

    In article <vpfoc7$2o5$1@panix2.panix.com>, kludge@panix.com (Scott
    Dorsey) wrote:

    The whole idea of the VLIW system is that the compiler will be able
    to optimize the code to gain parallelism of units inside the single
    processor. This is a very very ingenious idea but nobody has yet been
    able to make a compiler that could do it well enough for it to be a
    real win.

    Sadly, the job is *impossible*.

    The fundamental problem in optimisation for modern computers is the
    slowness of main RAM, which isn't currently solvable at a reasonable
    cost. We use caches to mitigate it.

    Out-of-order execution addresses this problem by tracking the data
    dependencies on memory and registers in real time and executing
    instructions when their data is available....

    The Itanium compiler optimizer just doesn't (and can't) know enough
    about the system memory state, yes. Among other (no pun intended) issues.

    The attempt to address that included providing run-time feedback into
    the executables; providing post-link, post-execution tuning. (Caliper /
    Atom / OM / etc.)

    https://www.cs.tufts.edu/comp/150PAT/tools/caliper/wiess-rev-4.pdf

    This Alpha versus IA-64 Itanium paper from 1999 describes the issues
    with Itanium quite well too, for those interested:

    https://web.archive.org/web/20010611202933/http://www.compaq.com/hpc/ref/ref_alpha_ia64.doc



    Clearly that old Alpha/IA64 comparison was written with an agenda.
    There is no clear attribution in the document but all the "we did" and
    "we designed" clearly indicates authorship in the Alpha hardware group.

    Some of their assumptions like it will be impossible to do out-of-order
    on IA64 are wrong since the last Itaniums actually implemented OOO and
    existing images saw an immediate benefit.

    They were comparing the Itanium of the day to what they thought Alpha
    could someday do. The Itanium of the day was pretty bad compared to the
    Alpha of the day (or of the next 2 years). And it is more than just the
    architecture. It is the chip, the process, the interface chips, etc.

    And yes, it was a challenge for compilers. The GEM implementation is a
    good V1 but is lacking; GEM wasn't designed around such a hardware
    model. I'm sure that with additional time/money/people, subsequent
    versions would have been better. Of all the backends I've seen, the HP-UX
    one is the best. During the Itanium port, I had some of the COBOL RTL
    routines for datatype conversion. We had C code, and the performance out
    of GEM was horrible. We were considering our own assembly versions, but
    I was directed to some of the HP-UX compiler folks. I gave them the C
    code and in a few weeks I had Itanium assembly code that I could not
    recognize. It used all sorts of Itanium features. It was several times
    faster (I'm thinking 10x, but I don't remember). That code is in the
    COBOL RTL today. That was on those early Itaniums without OOO. How
    good would the GEM code be on a "modern" Itanium? Don't know. Never
    tried. Doesn't matter.

    As you say, cache is king. Intel doesn't price their chips based on
    clock speed. They price them based on cache size.

    I'll agree that Alpha was the better floating-point system. The weird
    bundling rules in the Itanium architecture make things difficult for a
    floating-point application.

    Not to relitigate the argument (though that is what c.o.v does best), but
    it was clear to many that upper Digital management didn't want to hear
    technical arguments about the decision. Turning around to ask your
    choir doesn't give you any information about a transformational change
    in the underlying technology.

  • From Arne Vajhøj@21:1/5 to Michael S on Mon Feb 24 15:08:57 2025
    On 2/24/2025 12:42 PM, Michael S wrote:
    On Thu, 7 Nov 2024 16:48:49 -0500
    Arne Vajhøj <arne@vajhoej.dk> wrote:
    On 11/7/2024 12:33 PM, gcalliet wrote:
    On 04/11/2024 at 21:16, Arne Vajhøj wrote:
    I wish someone would volunteer to create VMS support
    in GCC 16 or whatever!

    Because I created (canadian method) Gnat Ada (on gcc) for VMS
    Itanium, and because we were on gcc 4.7, there is some work ahead,
    but why not :)

    The big issue is the step to gcc 5, where they upgraded to c++
    mode. It is one of the reasons why Adacore didn't continue support
    of gnat ada on VMS in 2015.

    VMS x86-64 has a better C++ compiler than VMS Itanium.

    That comment was about C++ standard compliance not performance.

    C++ VMS x86-64 is clang which in the (older) clang version used
    should mean C++14 while C++ VMS Itanium is very very old (like
    C++ 98 old).

    According to the benchmarks that you posted here several months (a
    year?) ago, VMS x86-64 compilers are quite awful comparatively to
    x86-64 compilers available on Windows/Linux/BSD.
    Do you want to say that VMS Itanium compilers are worse?

    I believe the conclusion was that the VMS x86-64 compilers other than C++
    were slower than C/C++ on other OSes and C++ on VMS.

    My guess is that it is a combination of the GEM to LLVM translation
    and a desire from VSI to be a little conservative (prioritizing
    correctness over speed).

    Arne

  • From Arne Vajhøj@21:1/5 to Simon Clubley on Mon Feb 24 15:11:07 2025
    On 2/24/2025 1:57 PM, Simon Clubley wrote:
    On 2025-02-24, Michael S <already5chosen@yahoo.com> wrote:
    On Thu, 7 Nov 2024 16:48:49 -0500
    Arne Vajhøj <arne@vajhoej.dk> wrote:
    VMS x86-64 has a better C++ compiler than VMS Itanium.

    According to the benchmarks that you posted here several months (a
    year?) ago, VMS x86-64 compilers are quite awful comparatively to
    x86-64 compilers available on Windows/Linux/BSD.

    Given all the various bits of movement in multiple areas over the last
    year or so, it might be time for those same tests to be run again against
    current compiler versions.

    I have updated the VMS numbers with new compiler versions.

    The traditional languages are still behind C++.

    Arne

  • From Michael S@21:1/5 to arne@vajhoej.dk on Mon Feb 24 23:22:22 2025
    On Mon, 24 Feb 2025 15:08:57 -0500
    Arne Vajhøj <arne@vajhoej.dk> wrote:

    On 2/24/2025 12:42 PM, Michael S wrote:
    On Thu, 7 Nov 2024 16:48:49 -0500
    Arne Vajhøj <arne@vajhoej.dk> wrote:
    On 11/7/2024 12:33 PM, gcalliet wrote:
    On 04/11/2024 at 21:16, Arne Vajhøj wrote:
    I wish someone would volunteer to create VMS support
    in GCC 16 or whatever!

    Because I created (canadian method) Gnat Ada (on gcc) for VMS
    Itanium, and because we were on gcc 4.7, there is some work ahead,
    but why not :)

    The big issue is the step to gcc 5, where they upgraded to c++
    mode. It is one of the reasons why Adacore didn't continue support
    of gnat ada on VMS in 2015.

    VMS x86-64 has a better C++ compiler than VMS Itanium.

    That comment was about C++ standard compliance not performance.


    Ok

    C++ VMS x86-64 is clang which in the (older) clang version used
    should mean C++14 while C++ VMS Itanium is very very old (like
    C++ 98 old).

    According to the benchmarks that you posted here several months (a
    year?) ago, VMS x86-64 compilers are quite awful comparatively to
    x86-64 compilers available on Windows/Linux/BSD.
    Do you want to say that VMS Itanium compilers are worse?

    I believe the conclusion was that the VMS x86-64 compilers except C++
    was slower than C/C++ on other OS and C++ on VMS.


    Somehow I got the impression that the VMS C++ compilers were also
    significantly slower than C++ compilers on other platforms.
    Do I misremember?

    My guess is that it is a combination of the GEM to LLVM translation
    and a desire from VSI to be a little conservative (prioritizing
    correctness over speed).

    Arne


  • From John Dallman@21:1/5 to Stephen Hoffman on Mon Feb 24 21:27:00 2025
    In article <vpi9sr$19atf$1@dont-email.me>, seaohveh@hoffmanlabs.invalid
    (Stephen Hoffman) wrote:

    The Itanium compiler optimizer just doesn't (and can't) know enough
    about the system memory state, yes. Among other (no pun intended)
    issues.

    The attempt to address that included providing run-time feedback
    into the executables; providing post-link, post-execution tuning.
    (Caliper / Atom / OM / etc.)

    "Attempt" is about right.

    I did several years of porting work to Itanium. I tried run-time feedback
    zero times: doing the link of the instrumented build took over an hour,
    up from about a minute, because it was doing all the code generation at
    link time.

    The claim was "you only do this for the build you'll ship." My response
    was "The compiler is so immature that I'm reporting new bugs every week,
    and you want me to give the compiler new and difficult challenges?"

    I never heard of anyone who got anywhere with profile-guided optimisation
    on Itanium. Have you?

    John

  • From Arne Vajhøj@21:1/5 to Michael S on Mon Feb 24 16:43:29 2025
    On 2/24/2025 4:22 PM, Michael S wrote:
    On Mon, 24 Feb 2025 15:08:57 -0500
    Arne Vajhøj <arne@vajhoej.dk> wrote:
    On 2/24/2025 12:42 PM, Michael S wrote:

    C++ VMS x86-64 is clang which in the (older) clang version used
    should mean C++14 while C++ VMS Itanium is very very old (like
    C++ 98 old).

    According to the benchmarks that you posted here several months (a
    year?) ago, VMS x86-64 compilers are quite awful comparatively to
    x86-64 compilers available on Windows/Linux/BSD.
    Do you want to say that VMS Itanium compilers are worse?

    I believe the conclusion was that the VMS x86-64 compilers except C++
    was slower than C/C++ on other OS and C++ on VMS.

    Somehow I got an impression that C++ compilers were also significantly
    slower than C++ compilers on other platforms.
    Do I misremember?

    I don't even remember that I posted non-VMS numbers here. Age! :-)

    But I just checked VMS C++ latest (CXX/OPT=LEVEL:5 and clang -O3) vs a
    random Windows GCC 14.1 (g++ -O3):

    VMS is a little faster for integer
    they are about the same for floating point
    Windows is a lot faster for string

    And given that this is a micro-benchmark that in reality is just an inner
    loop evaluating a single expression, which means huge uncertainty,
    I don't see this as proof of a significant difference.
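
    For anyone who wants to see that uncertainty for themselves, a minimal
    Python sketch follows. It is not the benchmark discussed above: the
    function inner_loop and its expression are a made-up stand-in, and the
    repeat counts are arbitrary. It only shows the habit of repeating a
    measurement and looking at the spread before comparing two compilers.

```python
import statistics
import timeit


def inner_loop(n: int = 10_000) -> int:
    """Stand-in workload: one expression evaluated in a tight loop."""
    total = 0
    for i in range(n):
        total += (i * 3 + 1) % 7
    return total


# Repeat the same measurement several times instead of trusting one run.
samples = timeit.repeat(inner_loop, number=20, repeat=7)

best, worst = min(samples), max(samples)
spread = (worst - best) / best
print(f"best {best:.4f}s  worst {worst:.4f}s  spread {spread:.1%}")
print(f"median {statistics.median(samples):.4f}s")
```

    If the spread between identical runs is of the same order as the
    difference between two platforms, the difference is not telling you
    much.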

    Arne

  • From John Reagan@21:1/5 to John Dallman on Mon Feb 24 16:55:32 2025
    On 2/24/2025 4:27 PM, John Dallman wrote:
    In article <vpi9sr$19atf$1@dont-email.me>, seaohveh@hoffmanlabs.invalid
    (Stephen Hoffman) wrote:

    The Itanium compiler optimizer just doesn't (and can't) know enough
    about the system memory state, yes. Among other (no pun intended)
    issues.

    The attempt to address that included providing run-time feedback
    into the executables; providing post-link, post-execution tuning.
    (Caliper / Atom / OM / etc.)

    "Attempt" is about right.

    I did several years porting work to Itanium. I tried run-time feedback
    zero times: doing the link of the instrumented build took over an hour,
    up from about a minute, because it was doing all the code generation at
    link time.

    The claim was "you only do this for the build you'll ship." My response
    was "The compiler is so immature that I'm reporting new bugs every week,
    and you want me to give the compiler new and difficult challenges?"

    I never heard of anyone who got anywhere with profile-guided optimisation
    on Itanium. Have you?

    John

    Actually, the NonStop Itanium kernel is built using PGO. Apparently,
    for their test workload of transactions, the savings were a measurable
    "few" percent. [I think in the 3-5% range, but I'm not sure anymore.]

  • From John Reagan@21:1/5 to All on Mon Feb 24 17:02:08 2025
    On 2/24/2025 4:43 PM, Arne Vajhøj wrote:
    On 2/24/2025 4:22 PM, Michael S wrote:
    On Mon, 24 Feb 2025 15:08:57 -0500
    Arne Vajhøj <arne@vajhoej.dk> wrote:
    On 2/24/2025 12:42 PM, Michael S wrote:

    C++ VMS x86-64 is clang which in the (older) clang version used
    should mean C++14 while C++ VMS Itanium is very very old (like
    C++ 98 old).

    According to the benchmarks that you posted here several months (a
    year?) ago, VMS x86-64 compilers are quite awful compared to
    x86-64 compilers available on Windows/Linux/BSD.
    Do you want to say that VMS Itanium compilers are worse?

    I believe the conclusion was that the VMS x86-64 compilers, except C++,
    were slower than C/C++ on other OSes and than C++ on VMS.

    Somehow I got an impression that VMS C++ compilers were also significantly
    slower than C++ compilers on other platforms.
    Do I misremember?

    I don't even remember that I posted non-VMS numbers here. Age! :-)

    But I just checked VMS C++ latest (CXX/OPT=LEVEL:5 and clang -O3) vs a
    random Windows GCC 14.1 (g++ -O3):

    VMS is a little faster for integer
    they are about the same for floating point
    Windows is a lot faster for string

    And given that this is a micro-benchmark with in reality just an inner
    loop evaluating a single expression, which means huge uncertainty, then
    I don't see this as proof of a significant difference.

    Arne

    We are aware of the string/char performance issues.

    On Alpha and Itanium, the low-level routines inside of LIBOTS for things
    like OTS$MOVE, string compare, memmove, etc. are all written in
    hand-crafted assembly. For x86, we are still using a set of BLISS
    reference code that is simple. Plus the LIBOTS we all have on our
    systems was compiled with a non-optimizing BLISS cross-compiler.

    We are currently playing with native compiled LIBOTS code and doing some
    benchmarks. Besides the brain-dead BLISS code, we have versions that
    we've seen so far is a native assembly version that uses the REP
    instruction prefix on the MOVSB. That version didn't check for
    overlapping source/dest however so any real version gets a little
    slower. I'm not sure when we can incorporate these, but I'm trying to
    push them as soon as possible.
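
To illustrate the overlap check and the "larger chunks" idea mentioned above, here is a minimal portable C sketch. It is not the LIBOTS code and does not use REP MOVSB; `sketch_move` is a hypothetical name, and the 8-byte chunk size is an assumption chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical routine, not the LIBOTS implementation: copies n bytes
   from src to dst and tolerates overlapping regions, as memmove does. */
void *sketch_move(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    /* Compare addresses as integers. */
    uintptr_t da = (uintptr_t)d;
    uintptr_t sa = (uintptr_t)s;

    if (da == sa || n == 0)
        return dst;

    if (da < sa || da - sa >= n) {
        /* Forward copy is safe: dst is below src, or no overlap at all. */
        while (n >= sizeof(uint64_t)) {
            uint64_t w;
            memcpy(&w, s, sizeof w); /* load a whole chunk first */
            memcpy(d, &w, sizeof w); /* then store it */
            s += sizeof w;
            d += sizeof w;
            n -= sizeof w;
        }
        while (n--)
            *d++ = *s++;
    } else {
        /* dst starts inside the source region: copy from the end. */
        d += n;
        s += n;
        while (n >= sizeof(uint64_t)) {
            uint64_t w;
            d -= sizeof w;
            s -= sizeof w;
            memcpy(&w, s, sizeof w);
            memcpy(d, &w, sizeof w);
            n -= sizeof w;
        }
        while (n--)
            *--d = *--s;
    }
    return dst;
}
```

The direction test at the top is the extra work that a version without the overlap check skips, which is why a correct version ends up a little slower than the raw copy loop.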

    A fun reference to read is

    https://cdrdv2-public.intel.com/814198/248966-Optimization-Reference-Manual-V1-050.pdf

  • From Stephen Hoffman@21:1/5 to John Dallman on Mon Feb 24 17:41:15 2025
    On 2025-02-24 21:27:00 +0000, John Dallman said:

    I never heard of anyone who got anywhere with profile-guided
    optimisation on Itanium. Have you?

    While there were whitepapers and related, AFAIK, the OM tools were
    never released for OpenVMS.

    Various developers I've chatted with were skeptical about supporting
    and debugging post-link-optimized executables.

    And then there was The Graph: https://commons.wikimedia.org/wiki/File:Itanium_Sales_Forecasts_edit.svg


    --
    Pure Personal Opinion | HoffmanLabs LLC

  • From John Dallman@21:1/5 to John Reagan on Tue Feb 25 13:05:00 2025
    In article <d014a8f62fb90475bfdbf733f0fcbe40d79ca8a5@i2pn2.org>, johnrreagan@earthlink.net (John Reagan) wrote:

    Actually, the NonStop Itanium kernel is built using PGO.
    Apparently, for their test workload of transactions, the savings
    was a measurable "few" percent. [I think in the 3-5% range but I'm
    not sure anymore]

    OK, I'll believe that.

    John

  • From Dan Cross@21:1/5 to johnrreagan@earthlink.net on Tue Feb 25 21:35:46 2025
    In article <c662907ab88c91b96d12e78e0444d8626f421ebf@i2pn2.org>,
    John Reagan <johnrreagan@earthlink.net> wrote:
    On 2/24/2025 4:43 PM, Arne Vajhøj wrote:
    On 2/24/2025 4:22 PM, Michael S wrote:
    On Mon, 24 Feb 2025 15:08:57 -0500
    Arne Vajhøj <arne@vajhoej.dk> wrote:
    On 2/24/2025 12:42 PM, Michael S wrote:

    C++ VMS x86-64 is clang which in the (older) clang version used
    should mean C++14 while C++ VMS Itanium is very very old (like
    C++ 98 old).

    According to the benchmarks that you posted here several months (a
    year?) ago, VMS x86-64 compilers are quite awful compared to
    x86-64 compilers available on Windows/Linux/BSD.
    Do you want to say that VMS Itanium compilers are worse?

    I believe the conclusion was that the VMS x86-64 compilers, except C++,
    were slower than C/C++ on other OSes and than C++ on VMS.

    Somehow I got an impression that VMS C++ compilers were also significantly
    slower than C++ compilers on other platforms.
    Do I misremember?

    I don't even remember that I posted non-VMS numbers here. Age! :-)

    But I just checked VMS C++ latest (CXX/OPT=LEVEL:5 and clang -O3) vs a
    random Windows GCC 14.1 (g++ -O3):

    VMS is a little faster for integer
    they are about the same for floating point
    Windows is a lot faster for string

    And given that this is a micro-benchmark with in reality just an inner
    loop evaluating a single expression, which means huge uncertainty, then
    I don't see this as proof of a significant difference.

    Arne

    We are aware of the string/char performance issues.

    On Alpha and Itanium, the low-level routines inside of LIBOTS for things
    like OTS$MOVE, string compare, memmove, etc. are all written in
    hand-crafted assembly. For x86, we are still using a set of BLISS
    reference code that is simple. Plus the LIBOTS we all have on our
    systems was compiled with a non-optimizing BLISS cross-compiler.

    Hmm. It strikes me that LLVM has intrinsics for `memmove` that
    would also work for OTS$MOVE3; I would think that that would be
    most efficient, as for small moves, this could lower directly
    to a couple of loads and/or stores?
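
A small C sketch of the kind of call this refers to, assuming an optimizing compiler that expands constant-size `memmove` calls inline. `struct pair16` and `copy_pair16` are hypothetical names, and whether the call actually becomes plain loads and stores depends on the compiler and its flags.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte record used only for illustration. */
struct pair16 {
    uint64_t lo;
    uint64_t hi;
};

/* The size is a compile-time constant, so an optimizing compiler can
   expand this inline instead of calling a library routine. memmove
   semantics mean overlapping source and destination stay correct. */
static inline void copy_pair16(void *dst, const void *src)
{
    memmove(dst, src, sizeof(struct pair16));
}
```

Inspecting the generated assembly (for example with `-S`) is the way to confirm what a given compiler does with it.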

    We are currently playing with native compiled LIBOTS code and doing some
    benchmarks. Besides the brain-dead BLISS code, we have versions that
    loop with larger chunks of data which are even faster. The fastest
    we've seen so far is a native assembly version that uses the REP
    instruction prefix on the MOVSB. That version didn't check for
    overlapping source/dest however so any real version gets a little
    slower. I'm not sure when we can incorporate these, but I'm trying to
    push them as soon as possible.

    Yeah, Intel made `REP MOVSB`/`REP STOSB` actually fast a few
    uarchs ago. Good stuff, though startup overhead still dominates
    for <128 bytes or something like that, and having to muck with
    the DF flag remains a bummer.

    A fun reference to read is

    https://cdrdv2-public.intel.com/814198/248966-Optimization-Reference-Manual-V1-050.pdf

    Agner Fog's optimization guides can also be a useful resource
    for things like this: https://www.agner.org/optimize/

    - Dan C.

  • From Simon Clubley@21:1/5 to arne@vajhoej.dk on Wed Feb 26 18:32:38 2025
    On 2025-02-24, Arne Vajhøj <arne@vajhoej.dk> wrote:
    On 2/24/2025 1:57 PM, Simon Clubley wrote:
    On 2025-02-24, Michael S <already5chosen@yahoo.com> wrote:
    On Thu, 7 Nov 2024 16:48:49 -0500
    Arne Vajhøj <arne@vajhoej.dk> wrote:
    VMS x86-64 has a better C++ compiler than VMS Itanium.

    According to the benchmarks that you posted here several months (a
    year?) ago, VMS x86-64 compilers are quite awful compared to
    x86-64 compilers available on Windows/Linux/BSD.

    Given all the various bits of movement in multiple areas over the last
    year or so, it might be time for those same tests to be run again against
    current compiler versions.

    I have updated the VMS numbers with new compiler versions.

    The traditional languages are still behind C++.


    Thanks Arne,

    Simon.

    --
    Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
    Walking destinations on a map are further away than they appear.
