• fledgling assembler programmer

    From Alan Beck@21:1/5 to All on Tue Mar 21 17:40:18 2023
    //Hello all,//

    Hi,

    I have started to learn Assembler out of an old book.

    It is ancient (2003) but I don't think 8086 programming has changed
    much. But the tools have.

    I took assembly language in school but dropped out. Now I want another
    go at it.

    Would someone be my Mentor and answer a ton of questions that would
    dwindle out as time went on?

    If it's OK, we could do it here, or by netmail.

    Books are from a bookstore.


    Book 1
    Assembly Language for the PC 3rd edition, John Socha and Peter Norton.

    Book 2
    Assembly Language (step by step) Jeff Duntemann. Too Chatty.

    I cannot afford a modern book at this time.

    That's what I picked up from the thrift store.

    These books are dated around the time I was taking machine code in
    school and I find it interesting now.

    I hope someone picks me up.

    I am running Linux and using DOSemu.

    Also a modern DEBUG and a modern Vi.

    I am also a Ham Radio Operator (45 years)

    1:229/426.36

    Regards,
    Alan Beck
    VY2XU
    [Please reply directly unless the response is related to compilers. -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to Alan Beck on Tue Mar 21 17:23:25 2023
    On Tuesday, March 21, 2023 at 2:40:22 PM UTC-7, Alan Beck wrote:

    I have started to learn Assembler out of an old book.

    (Hopefully enough related to compilers.)

    Not so long after I started learning OS/360 Fortran and PL/I, I found
    the compiler option for printing out the generated code in sort-of
    assembly language. (Not actually assembleable, though.)

    About that time, I also had source listings on microfilm of
    the OS/360 Fortran library, and some other Fortran callable
    assembly programs. And also, the IBM S/370 Principles
    of Operation.

    With those, and no actual book meant to teach assembly
    programming, I figured it out, and started writing my own
    programs, though mostly callable from Fortran or PL/I.

    Compilers today don't write out the generated code in the same way,
    and there aren't so many libraries around to read. And, personally,
    8086 is my least favorite to write assembly code in.

    Learning C, and thinking about pointers and addresses, is a good start
    toward assembly programming.

    In any case, I don't think I have any idea how others learn
    programming for any language, and especially not for assembly
    programming. I used to read IBM reference manuals, cover to cover.
    That was mostly high school years. After that, I figured out how to
    use them as reference manuals.

    Most of my 80x86 assembly programming in the last
    20 years is (re)writing this one program:

    rdtsc:  rdtsc
            ret

    When called from C, and returning a 64-bit integer, it returns the Time
    Stamp Counter. (Works for 32-bit code, returning in EDX:EAX; 64-bit code is different.)
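
    To show how that is used (a minimal sketch of my own, not from the
    original post; the routine is assumed to be assembled separately and
    linked in):

    #include <stdio.h>
    #include <stdint.h>

    /* The two-instruction routine above.  A 64-bit version would also
       have to combine EDX:EAX into RAX (shift and OR) before returning. */
    extern uint64_t rdtsc(void);

    int main(void)
    {
        uint64_t before = rdtsc();
        /* ... code being timed ... */
        uint64_t after = rdtsc();
        printf("elapsed cycles: %llu\n",
               (unsigned long long)(after - before));
        return 0;
    }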

    C programming works so well, that there are only a few
    things you can't do in C, and so need assembly programs.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to gah4@u.washington.edu on Wed Mar 22 10:02:15 2023
    gah4 <gah4@u.washington.edu> writes:
    Not so long after I started learning OS/360 Fortran and PL/I, I found
    the compiler option for printing out the generated code in sort-of
    assembly language. (Not actually assembleable, though.)
    ...
    Compilers today don't write out the generated code in the same way,

    Unix (Linux) compilers like gcc usually write assembly-language code
    if you use the option -S. This code can be assembled, because AFAIK
    that's the way these compilers produce object code.
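
    As a small illustration (file name made up): given a file square.c
    containing

    /* square.c */
    int square(int x)
    {
        return x * x;
    }

    "gcc -S -O2 square.c" writes the generated assembly to square.s
    instead of producing an object file, and that square.s can then be
    fed to the assembler in the usual way.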

    - anton
    --
    M. Anton Ertl
    anton@mips.complang.tuwien.ac.at
    http://www.complang.tuwien.ac.at/anton/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Koenig@21:1/5 to gah4@u.washington.edu on Wed Mar 22 06:49:31 2023
    gah4 <gah4@u.washington.edu> schrieb:
    On Tuesday, March 21, 2023 at 2:40:22 PM UTC-7, Alan Beck wrote:

    I have started to learn Assembler out of an old book.

    At the risk of stating the blindingly obvious: There is more
    than one assembler language; each computer architecture has its
    own (with extensions over time, too). There are also sometimes
    different syntax variants, for example AT&T vs. Intel.

    [...]

    Compilers today don't write out the generated code in the same way,

    Quite the opposite.

    The standard on UNIXy systems is to write out assembly language to
    a file, which is then further processed with the actual assembler.
    Use "-S" to just generate the foo.s file from foo.c.

    Plus, you can disassemble object files and programs with "objdump -d".

    and there aren't so many libraries around to read.

    Not ones written in assembler. But it is possible to download
    the source code to many libraries, for example glibc, and then
    examine what it is compiled to.

    Another possibility would be to use http://godbolt.org, which shows
    you the assembler generated for different systems with different options.
    (Really making sense of it for architectures you are not familiar
    with may be difficult, though.) Or build clang/LLVM
    yourself and set different options for the architecture.

    And, personally,
    8086 is my least favorite to write assembly code in.

    I like 6502 even less :-)

    Learning C, and thinking about pointers and addresses, is a good start
    toward assembly programming.

    That, I agree with. And it helps a lot to also look at the
    generated code.

    [...]

    C programming works so well, that there are only a few
    things you can't do in C, and so need assembly programs.

    Bringing it back a bit towards compilers: Reading assembler code is
    a good way to see where they generate inefficient or (more rarely)
    incorrect code. In some special cases, writing in assembler can
    bring benefits of a factor of 2 or even 4 over compiler-generated
    code, usually when SIMD is involved.

    Assembler is a bit like Latin: For most people, there is no need
    to speak or write, but one should be able to read it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to Alan Beck on Wed Mar 22 14:39:59 2023
    On 21/03/2023 22:40, Alan Beck wrote:
    //Hello all,//

    Hi,

    I have started to learn Assembler out of an old book.

    It is ancient (2003) but I don't think 8086 programming has changed
    much. But the tools have.

    I took assembly language in school but dropped out. Now I want another
    go at it.

    Would someone be my Mentor and answer a ton of questions that would
    dwindle out as time went on?

    If it's OK, we could do it here. Or netmail

    Books are from a bookstore.

    I have both these books on my bookshelf - but it was a /long/ time ago
    that I read them.

    The big question here is /why/ you are doing this. The 8086 is ancient
    history - totally irrelevant for a couple of decades at least. Modern
    PC's use x86-64, which is a very different thing. You don't learn
    modern Spanish by reading an old Latin grammar book, even though
    Spanish is a Latin language.

    There are, perhaps, four main reasons for being interested in learning
    to write assembly:

    1. You need some very niche parts of a program or library to run as
    fast as feasible. Then you want to study the details of your target
    processor (it won't be an 8086) and its instruction set - typically
    focusing on SIMD and caching. Done well, this can lead to an order of
    magnitude improvement for very specific tasks - done badly, your
    results will be a lot worse than you'd get from a good compiler with
    the right options. The "comp.arch" newsgroup is your first point of
    call on Usenet for this.

    2. You need some very low-level code for things that can't be
    expressed in a compiled language, such as task switching in an OS.
    Again, you need to focus on the right target. "comp.arch" could be a
    good starting point here too.

    3. You are working on a compiler. This requires a deep understanding of
    the target processor, but you've come to the right newsgroup.

    4. You are doing this for fun (the best reason for doing anything) and
    for learning. You can get a long way towards understanding (but not
    writing) assembly by looking at the output of
    your favourite compilers for your favourite targets and favourite
    programming languages on <https://godbolt.org>. Here I would pick an
    assembly that is simple and pleasant - 8086 is neither.

    I would recommend starting small, such as the AVR microcontroller
    family. The instruction set is limited, but fairly consistent and easy
    to understand. There are vast amounts of learning resources in the
    Arduino community (though most Arduino development is in C or C++), and
    you can buy an Arduino kit cheaply. Here you can write assembly code
    that actually does something, and the processor ISA is small enough that
    you can learn it /all/.


    If none of that covers your motivation, then give some more details of
    what you want to achieve, and you can probably get better help.
    (comp.arch might be better than comp.compilers if you are not interested
    in compilers.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From George Neuner@21:1/5 to Alan Beck on Wed Mar 22 14:54:49 2023
    On Tue, 21 Mar 2023 17:40:18 -0400 (EDT), Alan.Beck@darkrealms.ca
    (Alan Beck) wrote:

    ... I don't think 8086 programming has changed
    much. But the tools have. ...
    Would someone be my Mentor and answer a ton of questions that would
    dwindle out as time went on?

    Assembler mostly is off-topic here in comp.compilers, but
    comp.lang.asm.x86 will be open to pretty much any question regarding
    80x86 assembler.

    [Please reply directly unless the response is related to compilers. -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to Thomas Koenig on Wed Mar 22 13:31:41 2023
    On Wednesday, March 22, 2023 at 12:06:05 PM UTC-7, Thomas Koenig wrote:
    gah4 <ga...@u.washington.edu> schrieb:

    (snip)

    Compilers today don't write out the generated code in the same way,

    Quite the opposite.

    The standard on UNIXy systems is to write out assembly language to
    a file, which is then further processed with the actual assembler.

    Yes, not the same way.

    Well, to be sure that this is about compilers, my favorite complaint
    is the lost art of small memory compilers. That is, ones that can
    run in kilobytes instead of megabytes.

    In any case, the OS/360 compilers don't write out assembly code
    that an assembler would recognize. It is meant for people.

    Some write it out in two columns to save paper.
    Labels don't have to agree with what assemblers recognize.
    They don't have to be in the same order that they would be for
    an assembler, though OS/360 object programs don't have to be
    in order, either.

    Having not thought about this for a while, I believe they put
    in some comments that help human readers, though likely not
    what an assembly programmer would say.

    Unixy systems might put in some comments, but mostly don't.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Koenig@21:1/5 to gah4@u.washington.edu on Thu Mar 23 11:26:50 2023
    gah4 <gah4@u.washington.edu> schrieb:

    [...]

    Well, to be sure that this is about compilers, my favorite complaint
    is the lost art of small memory compilers. That is, ones that can
    run in kilobytes instead of megabytes.

    On the Internet, there is a project for almost everything - in this
    case Tiny C, which still seems to be under active development. Or
    at least there are still commits at https://repo.or.cz/w/tinycc.git .

    However, there is a reason why compilers got so big - there is
    always a balance to be struck between compilation speed, compiler
    size and optimization.

    An extreme example: According to "Abstracting Away the Machine", the
    very first FORTRAN compiler was so slow that the size of programs
    it could compile was limited by the MTBF of the IBM 704 of around
    eight hours.

    The balance has shifted over time, because of increasing computing
    power and available memory that can be applied to compilation,
    and because, relative to the people who use compilers, more people
    use programs than ever before. So, in today's environment, there is little
    incentive for writing small compilers.

    Also, languages have become bigger, more expressive, more powerful,
    more bloated (take your pick), which also increases the size
    of compilers.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Aharon Robbins@21:1/5 to tkoenig@netcologne.de on Thu Mar 23 13:56:23 2023
    In article <23-03-003@comp.compilers>,
    Thomas Koenig <tkoenig@netcologne.de> wrote:
    Not ones written in assembler. But it is possible to download
    the source code to many libraries, for example glibc, and then
    examine what it is compiled to.

    Getting more and more off topic, but I can't let this go.

    Glibc is a S W A M P. A newbie who wanders in will drown and never
    come out. Even if you are a very experienced C programmer, you don't
    want to go there.

    Learning assembler in order to understand how machines work is valuable.
    Long ago I learned PDP-11 assembler, which is still one of the cleanest architectures ever designed. I was taking a data structures course at
    the same time, and recursion didn't click with me until I saw how it
    was done in assembler.

    My two cents,

    Arnold
    --
    Aharon (Arnold) Robbins arnold AT skeeve DOT com
    [I must admit that when I write C code I still imagine there's a
    PDP-11 underneath. -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to Thomas Koenig on Fri Mar 24 14:17:44 2023
    On Friday, March 24, 2023 at 7:10:00 AM UTC-7, Thomas Koenig wrote:

    (snip about the lost art of small memory compilers.)

    On the Internet, there is a project for almost everything - in this
    case Tiny C, which still seems to be under active development. Or
    at least there are still commits at https://repo.or.cz/w/tinycc.git .

    However, there is a reason why compilers got so big - there is
    always a balance to be struck between compilation speed, compiler
    size and optimization.

    When I was writing the above, I was looking at the Program Logic
    Manual for the OS/360 Fortran G compiler.
    (G means it is supposed to run in 128K.)

    Fortran G was not written by IBM, but contracted out. And is not
    (mostly) in assembler, but in something called POP. That is, it
    is interpreted by the POP interpreter, with POPcode written using
    assembler macros. Doing that, for one, allows reusing the code
    for other machines, though you still need to rewrite the code
    generator. But also, at least likely, it decreases the size of
    the compiler. POP instructions are optimized for things that
    compilers need to do.
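
    To illustrate the general idea (a made-up sketch, not actual POP;
    the opcodes and names are invented): a compiler written as
    pseudo-code for a small interpreter rests on a dispatch loop along
    these lines, and only the loop plus the native helpers it calls are
    machine-specific:

    #include <stdio.h>

    /* Hypothetical compiler-oriented virtual instructions. */
    enum op { OP_EMIT, OP_JUMP, OP_CALL_NATIVE, OP_HALT };

    struct insn { enum op op; int arg; };

    static void interpret(const struct insn *prog,
                          void (*const *natives)(void))
    {
        for (int pc = 0; ; pc++) {
            switch (prog[pc].op) {
            case OP_EMIT:        putchar(prog[pc].arg);   break;
            case OP_JUMP:        pc = prog[pc].arg - 1;   break;
            case OP_CALL_NATIVE: natives[prog[pc].arg](); break;
            case OP_HALT:        return;
            }
        }
    }

    The pseudo-instructions can be denser than native code, which is
    where the size saving comes from.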

    I also had the source to that so many years ago, but not the
    manual describing it.

    An extreme example: According to "Abstracting Away the Machine", the
    very first FORTRAN compiler was so slow that the size of programs
    it could compile was limited by the MTBF of the IBM 704 of around
    eight hours.

    I remember stories about how well its optimizer worked, when
    it was believed that they had to compete in code speed with
    experienced assembly programmers. I don't remember anything
    about how fast it was.


    The balance has shifted over time, because of increasing computing
    power and available memory that can be applied to compilation,
    and because, relative to the people who use compilers, more people
    use programs than ever before. So, in today's environment, there is little
    incentive for writing small compilers.

    I first thought about this when reading about the Hercules project,
    an IBM S/370 emulator, which couldn't run gcc in 16MB.
    (Well, subtract some for the OS, but it still wouldn't fit.)

    Also, languages have become bigger, more expressive, more powerful,
    more bloated (take your pick), which also increases the size
    of compilers.

    OK, the IBM PL/I (F) compiler, for what many consider a bloated
    language, is designed to run (maybe not well) in 64K.
    At the end of every compilation it tells how much memory was
    used, how much available, and how much to keep the symbol table
    in memory.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dennis Boone@21:1/5 to All on Fri Mar 24 22:51:32 2023
    OK, the IBM PL/I (F) compiler, for what many consider a bloated
    language, is designed to run (maybe not well) in 64K.
    At the end of every compilation it tells how much memory was
    used, how much available, and how much to keep the symbol table
    in memory.

    It's... 30-some passes, iirc?

    De
    [Well, phases or overlays but yes, IBM was really good at slicing compilers into pieces they could overlay. -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to after I on Fri Mar 24 22:44:49 2023
    On Friday, March 24, 2023 at 9:13:05 PM UTC-7, Dennis Boone wrote:

    (after I wrote)
    OK, the IBM PL/I (F) compiler, for what many consider a bloated
    language, is designed to run (maybe not well) in 64K.
    At the end of every compilation it tells how much memory was
    used, how much available, and how much to keep the symbol table
    in memory.

    It's... 30-some passes, iirc?

    [Well, phases or overlays but yes, IBM was really good at slicing compilers into pieces they could overlay. -John]

    It is what IBM calls, I believe, dynamic overlay. Each module specifically requests others to be loaded into memory. If there is enough memory,
    they can stay, otherwise they are removed.

    And there are a few disk files to be used when it is actually
    a separate pass. The only one I actually know of is that, if the preprocessor
    is used, it writes a disk file with the preprocessor output.

    And as noted, if it is really short on memory, the symbol table
    goes out to disk.

    Fortran H, on the other hand, uses the overlay system generated
    by the linkage editor. When running on a virtual storage system, it is
    usual to run the compiler through the linkage editor to remove
    the overlay structure. (One of the few linkers that knows how
    to read its own output.) Normally it is about 300K, without
    overlay closer to 450K.
    [Never heard of dynamic overlays on S/360. -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to All on Sat Mar 25 01:27:18 2023
    On Saturday, March 25, 2023 at 12:09:30 AM UTC-7, gah4 wrote:

    (snip)

    It is what IBM calls, I believe, dynamic overlay. Each module specifically requests others to be loaded into memory. If there is enough memory,
    they can stay, otherwise they are removed.

    Traditional overlays are generated by the linkage editor, and have
    static offsets determined at link time.

    PL/I (F) uses OS/360 LINK, LOAD, and DELETE macros to dynamically
    load and unload modules. The addresses are not static. IBM says:

    "The compiler consists of a number of phases
    under the supervision of compiler control
    routines. The compiler communicates with
    the control program of the operating
    system, for input/output and other
    services, through the control routines."

    All described in:

    http://bitsavers.trailing-edge.com/pdf/ibm/360/pli/GY28-6800-5_PL1_F_Program_Logic_Manual_197112.pdf

    They do seem to be called phases, but there are both physical and
    logical phases, where physical phases are what are more commonly
    called phases. There are way more than 100 modules, but I stopped
    counting.

    (snip)
    [Never heard of dynamic overlays on S/360. -John]

    It seems not to actually have a name.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Hans-Peter Diettrich@21:1/5 to All on Sat Mar 25 13:07:57 2023
    On 3/24/23 10:17 PM, gah4 wrote:

    Fortran G was not written by IBM, but contracted out. And is not
    (mostly) in assembler, but in something called POP. That is, it
    is interpreted by the POP interpreter, with POPcode written using
    assembler macros. Doing that, for one, allows reusing the code
    for other machines, though you still need to rewrite the code
    generator. But also, at least likely, it decreases the size of
    the compiler. POP instructions are optimized for things that
    compilers need to do.

    After a look at "open software" I was astonished by the number of
    languages and steps involved in writing portable C code. Also updates of popular programs (Firefox...) are delayed by months on some platforms,
    IMO due to missing manpower on the target systems for checks and the
    adaptation of "configure". Now I understand why many people prefer
    interpreted languages (Java, JavaScript, Python, .NET...) for a
    simplification of their software products and spreading.

    What's the actual ranking of programming languages? A JetBrains study
    does not list any compiled language in their first 7 ranks in 2022. C++
    follows on rank 8.

    What does that trend mean to a compiler group? Interpreted languages
    still need a front-end (parser) and back-end (interpreter), but don't
    these tasks differ between languages compiled to hardware and languages
    that are interpreted?

    DoDi

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From George Neuner@21:1/5 to DrDiettrich1@netscape.net on Sat Mar 25 20:54:26 2023
    On Sat, 25 Mar 2023 13:07:57 +0100, Hans-Peter Diettrich <DrDiettrich1@netscape.net> wrote:

    After a look at "open software" I was astonished by the number of
    languages and steps involved in writing portable C code. Also updates of
    popular programs (Firefox...) are delayed by months on some platforms,
    IMO due to missing manpower on the target systems for checks and the
    adaptation of "configure". Now I understand why many people prefer
    interpreted languages (Java, JavaScript, Python, .NET...) for a
    simplification of their software products and spreading.


    Actually Python is the /only/ one of those that normally is
    interpreted. And the interpreter is so slow the language would be
    unusable were it not for the fact that all of its standard library
    functions and most of its useful extensions are written in C.

    In practice Java and Javascript almost always are JIT compiled to
    native code rather than interpreted. There also exist offline (AOT)
    compilers for both.

    Many JIT runtimes do let you choose to have programs interpreted
    rather than compiled, but running interpreted reduces performance so
    much that it is rarely done unless memory is very tight.


    .NET is not a language itself but rather a runtime system like the
    Java Platform. .NET consists of a virtual machine: the Common
    Language Runtime (CLR); and a set of standard libraries. Similarly
    the Java Platform consists of a virtual machine: the Java Virtual
    Machine (JVM); and a set of standard libraries. Compilers target
    these runtime systems.

    The .NET CLR does not include an interpreter ... I'm not aware that
    there even is one for .NET. There is an offline (AOT) compiler that
    can be used instead of the JIT.



    What's the actual ranking of programming languages? A JetBrains study
    does not list any compiled language in their first 7 ranks in 2022. C++
    follows on rank 8.

    What does that trend mean to a compiler group? Interpreted languages
    still need a front-end (parser) and back-end (interpreter), but don't
    these tasks differ between languages compiled to hardware and languages
    that are interpreted?

    The trend is toward "managed" environments which offer niceties like
    GC, objects with automagic serialized access, etc., all to help
    protect average programmers from themselves ... err, um, from being
    unable to produce working software.


    DoDi
    George

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Hans-Peter Diettrich@21:1/5 to George Neuner on Tue Mar 28 09:21:50 2023
    On 3/26/23 1:54 AM, George Neuner wrote:
    On Sat, 25 Mar 2023 13:07:57 +0100, Hans-Peter Diettrich <DrDiettrich1@netscape.net> wrote:

    After a look at "open software" I was astonished by the number of
    languages and steps involved in writing portable C code. Also updates of
    popular programs (Firefox...) are delayed by months on some platforms,
    IMO due to missing manpower on the target systems for checks and the
    adaptation of "configure". Now I understand why many people prefer
    interpreted languages (Java, JavaScript, Python, .NET...) for a
    simplification of their software products and spreading.

    Actually Python is the /only/ one of those that normally is
    interpreted. And the interpreter is so slow the language would be
    unusable were it not for the fact that all of its standard library
    functions and most of its useful extensions are written in C.

    My impression of "interpretation" was aimed at the back-end, where
    tokenized (virtual machine...) code has to be brought to a physical
    machine, with a specific firmware (OS). Then the real back-end has to
    reside on the target machine and OS, fully detached from the preceding
    compiler stages.

    Then, from the compiler writer viewpoint, it's not sufficient to define
    a new language and a compiler for it; instead it must be placed on top of
    some popular "firmware" like Java VM, CLR or C/C++ standard libraries,
    or else a dedicated back-end and libraries have to be implemented on
    each supported platform.

    My impression was that the FSF favors C and ./configure for "portable"
    code. That's why I understand that any other way is easier for the implementation of really portable software, that deserves no extra
    tweaks for each supported target platform, for every single program. Can somebody shed some light on the current practice of writing portable
    C/C++ software, or any other compiled language, that (hopefully) does
    not require additional human work before or after compilation for a
    specific target platform?

    DoDi

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Aharon Robbins@21:1/5 to DrDiettrich1@netscape.net on Tue Mar 28 14:42:18 2023
    In article <23-03-029@comp.compilers>,
    Hans-Peter Diettrich <DrDiettrich1@netscape.net> wrote:
    My impression was that the FSF favors C and ./configure for "portable"
    code.

    Like many things, this is the result of evolution. Autoconf is well
    over 20 years old, and when it was created the ISO C and POSIX standards
    had not yet spread throughout the Unix/Windows/macOS world. It and the
    rest of the autotools solved a real problem.

    Today, the C and C++ worlds are easier to program in, but it's still
    not perfect and I don't think I'd want to do without the autotools. Particularly for the less POSIX-y systems, like MinGW and OpenVMS.

    Can somebody shed some light on the current practice of writing portable
    C/C++ software, or any other compiled language, that (hopefully) does
    not require additional human work before or after compilation for a
    specific target platform?

    Well, take a look at Go. The trend there (as in the Python, Java and
    C# worlds) is to significantly beef up the standard libraries. Go
    has regular expressions, networking, file system, process and all kinds
    of other stuff in its libraries, all things that regular old C and C++ code often has to (or had to) hand-roll. That makes it a lot easier for
    someone to just write the code to get their job done, as well as
    providing for uniformity across both operating systems and applications
    written in Go.

    Go goes one step further, even. Following the Plan 9 example, the
    golang.org Go compilers are also cross compilers. I can build a Linux
    x86_64 executable on my macOS system just by setting some environment
    variables when running 'go build'. Really nice.

    The "go" tool itself also takes over a lot of the manual labor, such
    as downloading libraries from the internet, managing build dependencies
    (no need for "make") and much more. I suspect that that is also a
    trend.

    Does that answer your question?

    Arnold

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to Hans-Peter Diettrich on Tue Mar 28 14:21:05 2023
    On Tuesday, March 28, 2023 at 1:14:29 AM UTC-7, Hans-Peter Diettrich wrote:

    (snip)
    Then, from the compiler writer viewpoint, it's not sufficient to define
    a new language and a compiler for it; instead it must be placed on top of
    some popular "firmware" like Java VM, CLR or C/C++ standard libraries,
    or else a dedicated back-end and libraries have to be implemented on
    each supported platform.

    From an announcement today here on an ACM organized conference:


    "We encourage authors to prepare their artifacts for submission
    and make them more portable, reusable and customizable using
    open-source frameworks including Docker, OCCAM, reprozip,
    CodeOcean and CK."

    I hadn't heard about those until I read that one, but it does sound interesting.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From George Neuner@21:1/5 to DrDiettrich1@netscape.net on Tue Mar 28 17:26:45 2023
    On Tue, 28 Mar 2023 09:21:50 +0200, Hans-Peter Diettrich <DrDiettrich1@netscape.net> wrote:

    On 3/26/23 1:54 AM, George Neuner wrote:
    On Sat, 25 Mar 2023 13:07:57 +0100, Hans-Peter Diettrich
    <DrDiettrich1@netscape.net> wrote:

    After a look at "open software" I was astonished by the number of
    languages and steps involved in writing portable C code. Also updates of
    popular programs (Firefox...) are delayed by months on some platforms,
    IMO due to missing manpower on the target systems for checks and the
    adaptation of "configure". Now I understand why many people prefer
    interpreted languages (Java, JavaScript, Python, .NET...) for a
    simplification of their software products and spreading.

    Actually Python is the /only/ one of those that normally is
    interpreted. And the interpreter is so slow the language would be
    unusable were it not for the fact that all of its standard library
    functions and most of its useful extensions are written in C.

    My impression of "interpretation" was aimed at the back-end, where
    tokenized (virtual machine...) code has to be brought to a physical
    machine, with a specific firmware (OS). Then the real back-end has to
    reside on the target machine and OS, fully detached from the preceding
    compiler stages.

    That is exactly as I meant it.

    Python and Java both initially are compiled to bytecode. But at
    runtime Python bytecode is interpreted: the Python VM examines each
    bytecode instruction, one by one, and executes an associated native
    code subroutine that implements that operation.

    In contrast, at runtime Java bytecode is JIT compiled to equivalent
    native code - which includes calls to native subroutines to implement
    complex operations like "new", etc. The JVM JIT compiles function by
    function as the program executes ... so it takes some time before the
    whole program exists as native code ... but once a whole load module
    has been JIT compiled, the JVM can completely ignore and even unload
    the bytecode from memory.


    Then, from the compiler writer viewpoint, it's not sufficient to define
    a new language and a compiler for it; instead it must be placed on top of
    some popular "firmware" like Java VM, CLR or C/C++ standard libraries,
    or else a dedicated back-end and libraries have to be implemented on
    each supported platform.

    Actually it simplifies the compiler writer's job because the
    instruction set for the platform VM tends not to change much over
    time. A compiler targeting the VM doesn't have to scramble to support
    features of every new CPU - in many cases that can be left to the
    platform's JIT compiler.


    My impression was that the FSF favors C and ./configure for "portable"
    code. That's why I understand that any other way is easier for the
    implementation of really portable software, that deserves no extra
    tweaks for each supported target platform, for every single program. Can
    somebody shed some light on the current practice of writing portable
    C/C++ software, or any other compiled language, that (hopefully) does
    not require additional human work before or after compilation for a
    specific target platform?

    Right. When you work on a popular "managed" platform (e.g., JVM or
    CLR), then its JIT compiler and CPU specific libraries gain you any
    CPU specific optimizations that may be available, essentially for
    free.

    OTOH, when you work in C (or other independent language), to gain CPU
    specific optimizations you have to write model specific code and/or
    obtain model specific libraries, you have to maintain different
    versions of your compiled executables (and maybe also your sources),
    and you need to be able to identify the CPU so as to install or use
    model specific code.


    For most developers, targeting a managed platform tends to reduce the
    effort needed to achieve an equivalent result.


    DoDi
    George
    [The usual python implementation interprets bytecodes, but there are
    also versions for .NET, the Java VM, and a JIT compiler. -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to George Neuner on Wed Mar 29 11:27:49 2023
    On Wednesday, March 29, 2023 at 1:52:41 AM UTC-7, George Neuner wrote:

    Right. When you work on a popular "managed" platform (e.g., JVM or
    CLR), then its JIT compiler and CPU specific libraries gain you any
    CPU specific optimizations that may be available, essentially for
    free.

    For systems like Matlab and Octave, and I believe also for Python,
    or one of many higher math languages, programs should spend
    most of the time in the internal compiled library routines.

    You could write a whole matrix inversion algorithm in Matlab
    or Python, but there is no reason to do that. That is the convenience
    of matrix operations, and it gets better as the matrices get bigger.

    In earlier days, there were Linpack and Eispack, and other
    Fortran callable math libraries. And one could write a
    small Fortran program to call them.

    But now we have so many different (more or less) interpreted
    math oriented languages, that it is hard to keep track of them,
    and hard to know which one to use.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From George Neuner@21:1/5 to All on Wed Mar 29 13:50:45 2023
    [The usual python implementation interprets bytecodes, but there are
    also versions for .NET, the Java VM, and a JIT compiler. -John]

    Thanks John. I knew about the reference implementation, but I was not
    aware of the others.
    George

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Aharon Robbins on Wed Mar 29 18:33:12 2023
    On 2023-03-28, Aharon Robbins <arnold@freefriends.org> wrote:

    In article <23-03-029@comp.compilers>,
    Hans-Peter Diettrich <DrDiettrich1@netscape.net> wrote:
    My impression was that the FSF favors C and ./configure for "portable" code.

    Like many things, this is the result of evolution. Autoconf is well
    over 20 years old, and when it was created the ISO C and POSIX standards
    had not yet spread throughout the Unix/Windows/macOS world. It and the
    rest of the autotools solved a real problem.

    Today, the C and C++ worlds are easier to program in, but it's still
    not perfect and I don't think I'd want to do without the autotools. Particularly for the less POSIX-y systems, like MinGW and OpenVMS.

    Counterpoint: Autotools are a real detriment to GNU project programs.

    When a release is cut of a typical GNU program, special steps
    are executed to prepare a tarball which has a compiled configure
    script.

    You cannot just do a "git clone" of a GNU program, and then run
    ./configure and build. You must run some "make bootstrap" nonsense, and
    that requires you to have various Autotools installed, and in specific versions!

    In the past what I have done to build a GNU program from version
    control, as a quick and dirty shortcut, was to find the tarball
    which is the closest match to the baseline that I'm trying to build
    (e.g. of GNU Make or GNU Awk or whatever). Unpack the tarball over the
    repository and run ./configure. Then "git reset --hard" the changes and
    rebuild.

    Most Autotools programs will not cleanly cross-compile. Autotools is the
    main reason why distro build systems use QEMU to create a virtual target environment with native tools and libraries, and then build the "cross-compiled" program as if it were native.

    Some of the problems are in Autoconf itself. If it knows the program
    is being cross-compiled, any test which requires a test program to be
    compiled and executed is disabled. Since the output of that configure
    test is needed, bad defaults are substituted.
    For instance, about a decade and a half ago I helped a company
    replace Windriver cruft with an in-house distribution. Windriver's cross-compiled Bash didn't have job control! Ctrl-Z, fg, bg stuff no
    workie. The reason was that it was just cross-compiled straight, on an
    x86 build box. It couldn't run the test to detect job control support,
    and so it defaulted it off, even though the target machine had
    "gnu-linux" in its string. In the in-house distro, my build steps for
    bash exported numerous ac_cv_... internal variables to override the bad defaults.

    My TXR language project has a hand-written, not generated, ./configure
    script. What you get in a txr-285.tar.gz tarball is exactly what you
    get if you do a "git clone" and "git checkout txr-285", modulo
    the presence of a .git directory and differing timestamps.

    You just ./configure and make.

    I have a "./configure --maintainer" mode which will require flex and bison instead of using the shipped parser stuff, and that's about it.
    You don't have to use that to do development work.

    There is no incomprehensible nonsense in the build system at all.

    None of my configure-time tests require the execution of a program;
    For some situations, I have developed clever tricks to avoid it. For
    instance, if you want to know the size of a data type, here
    is a fragment:

    printf "Checking what C integer type can hold a pointer ... "

    if [ -z "$intptr" ] ; then
    cat > conftest.c <<!
    #include <stddef.h>
    #include <limits.h>
    #include "config.h"

    #define D(N, Z) ((N) ? (N) + '0' : Z)
    #define UD(S) D((S) / 10, ' ')
    #define LD(S) D((S) % 10, '0')
    #define DEC(S) { UD(S), LD(S) }

    struct sizes {
    char h_BYTE[32], s_BYTE[2];
    #if HAVE_SUPERLONG_T
    char h_SUPERLONG[32], s_SUPERLONG[2];
    #endif
    #if HAVE_LONGLONG_T
    char h_LONGLONG[32], s_LONGLONG[2];
    #endif
    char h_PTR[32], s_PTR[2];
    char h_LONG[32], s_LONG[2];
    char h_INT[32], s_INT[2];
    char h_SHORT[32], s_SHORT[2];
    char h_WCHAR[32], s_WCHAR[2];
    char nl[2];
    } foo = {
    "\nSIZEOF_BYTE=", DEC(CHAR_BIT),
    #if HAVE_SUPERLONG_T
    "\nSIZEOF_SUPERLONG_T=", DEC(sizeof (superlong_t)),
    #endif
    #if HAVE_LONGLONG_T
    "\nSIZEOF_LONGLONG_T=", DEC(sizeof (longlong_t)),
    #endif
    "\nSIZEOF_PTR=", DEC(sizeof (char *)),
    "\nSIZEOF_LONG=", DEC(sizeof (long)),
    "\nSIZEOF_INT=", DEC(sizeof (int)),
    "\nSIZEOF_SHORT=", DEC(sizeof (short)),
    "\nSIZEOF_WCHAR_T=", DEC(sizeof (wchar_t)),
    "\n"
    };
    !

    In this generated program the sizes are encoded as two-digit decimal
    strings, at compile time. So the compiled object file will contain
    something like "SIZEOF_PTR= 8" surrounded by newlines. The configure
    script can look for these strings and get the values out:

    if ! conftest_o ; then   # conftest_o is a function to build the .o
      printf "failed\n\n"

      printf "Errors from compilation: \n\n"
      cat conftest.err
      exit 1
    fi

    The script gets the SIZEOF lines out and evals them as shell
    assignments. That's why we avoided SIZEOF_PTR=08; that would become
    octal in the shell:

    eval $(tr '\0' ' ' < conftest.o | grep SIZEOF | sed -e 's/ *//')

    It also massages these SIZEOFs into header file material:

    tr '\0' ' ' < conftest.o | grep SIZEOF | sed -e 's/= */ /' -e 's/^/#define /' >> config.h

    if [ $SIZEOF_PTR -eq 0 -o $SIZEOF_BYTE -eq 0 ] ; then
      printf "failed\n"
      exit 1
    fi

    Here is how it then looks like in config.h:

    #define SIZEOF_BYTE 8
    #define SIZEOF_LONGLONG_T 8
    #define SIZEOF_PTR 4
    #define SIZEOF_LONG 4
    #define SIZEOF_INT 4
    #define SIZEOF_SHORT 2
    #define SIZEOF_WCHAR_T 4

    There is a minor cross-compiling complication in txr in that you need
    txr to compile the standard library. So you must build a native txr
    first and then specify TXR=/path/to/native/txr to use that one for
    building the standard lib. Downstream distro people have figured this
    out on their own.


    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to gah4@u.washington.edu on Wed Mar 29 18:34:53 2023
    On 2023-03-28, gah4 <gah4@u.washington.edu> wrote:
    On Tuesday, March 28, 2023 at 1:14:29 AM UTC-7, Hans-Peter Diettrich wrote:

    (snip)
    Then, from the compiler writer viewpoint, it's not sufficient to define
    a new language and a compiler for it; instead it must be placed on top of
    some popular "firmware" like Java VM, CLR or C/C++ standard libraries,
    or else a dedicated back-end and libraries have to be implemented on
    each supported platform.

    From an announcement today here on an ACM organized conference:


    "We encourage authors to prepare their artifacts for submission
    and make them more portable, reusable and customizable using
    open-source frameworks including Docker, OCCAM, reprozip,
    CodeOcean and CK."

    "We encourage authors to lock their software to third party boat
    anchors, such as ..."


    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    [If you are telling people not to use Docker, that whale sailed a long time ago. -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Hans-Peter Diettrich@21:1/5 to Aharon Robbins on Fri Mar 31 07:49:46 2023
    On 3/28/23 4:42 PM, Aharon Robbins wrote:
    In article <23-03-029@comp.compilers>,
    Hans-Peter Diettrich <DrDiettrich1@netscape.net> wrote:
    My impression was that the FSF favors C and ./configure for "portable"
    code.

    Like many things, this is the result of evolution. Autoconf is well
    over 20 years old, and when it was created the ISO C and POSIX standards
    had not yet spread throughout the Unix/Windows/macOS world. It and the
    rest of the autotools solved a real problem.

    About 20 years ago I could not build any open source program on Windows. Messages like "Compiler can not build executables" popped up when using
    MinGW or Cygwin. I ended up running ./configure in a Linux VM and fixing the
    resulting compiler errors manually on Windows. Without that trick I had
    no chance to load the "portable" source code into any development
    environment for inspection in readable (compilable) form. Often I had
    the impression that the author wanted the program not for use on Windows machines. Kind of "source open for specific OS only" :-(

    DoDi

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Koenig@21:1/5 to gah4@u.washington.edu on Fri Mar 31 05:19:14 2023
    gah4 <gah4@u.washington.edu> schrieb:

    For system like Matlab and Octave, and I believe also for Python,
    or one of many higher math languages, programs should spend
    most of the time in the internal compiled library routines.

    They should, but sometimes they don't.

    If you run into things not covered by compiled libraries, but which
    are compute-intensive, then Matlab and (interpreted) Python run
    as slow as molasses, orders of magnitude slower than compiled code.

    As far as the projects to create compiled versions of Python
    go, one of the problems is that Python is a constantly evolving
    target, which can lead to real problems, especially in long-term
    program maintenance. As Konrad Hinsen reported, results in
    published science papers have changed due to changes in the Python infrastructure:

    http://blog.khinsen.net/posts/2017/11/16/a-plea-for-stability-in-the-scipy-ecosystem/

    At the company I work for, I'm told each Python project will only
    use a certain specified version of Python, which will never be changed for
    fear of incompatibilities - they treat each version as a new
    programming language :-|

    To bring this back a bit towards compilers - a language definition
    is an integral part of compiler writing. If

    - the specification to be implemented is unclear or "whatever
    the reference implementation does"

    - the compiler writers always reserve the right for a better,
    incompatible idea

    - the compiler writers do not pay careful attention to
    existing specifications

    then the resulting compiler will be of poor quality, regardless of
    the cool parsing or code generation that go into it.

    And I know very well that reading and understanding language
    standards is no fun, but I'm told that writing them is even
    less fun.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Aharon Robbins@21:1/5 to 864-117-4973@kylheku.com on Fri Mar 31 07:10:46 2023
    In article <23-03-037@comp.compilers>,
    Kaz Kylheku <864-117-4973@kylheku.com> wrote:
    On 2023-03-28, Aharon Robbins <arnold@freefriends.org> wrote:
    Today, the C and C++ worlds are easier to program in, but it's still
    not perfect and I don't think I'd want to do without the autotools.
    Particularly for the less POSIX-y systems, like MinGW and OpenVMS.

    Counterpoint: Autotools are a real detriment to GNU project programs.

    When a release is cut of a typical GNU program, special steps
    are executed to prepare a tarball which has a compiled configure
    script.

    You cannot just do a "git clone" of a GNU program, and then run
    ./configure and build. You must run some "make bootstrap" nonsense, and
    that requires you to have various Autotools installed, and in specific
    versions!

    This is not inherent in the autotools; it's laziness on the part of the maintainers. For exactly this reason gawk has a very simple bootstrap.sh program that simply does a touch on various files so that configure will
    run without wanting to run the autotools.

    Most Autotools programs will not cleanly cross-compile. Autotools is the
    main reason why distro build systems use QEMU to create a virtual target
    environment with native tools and libraries, and then build the
    "cross-compiled" program as if it were native.

    QEMU wasn't around when the Autotools were first designed and
    implemented. Most end users don't need to cross compile either, and it
    is for them that I (and other GNU maintainers, I suppose) build my
    configure scripts.

    Yes, the world is different today than when the autotools were
    designed. No, the autotools are not perfect. I don't know of a better alternative though. And don't tell me CMake. CMake is an abomination, interweaving configuration with building instead of cleanly separating
    the jobs. Not to mention its stupid caching which keeps you from
    running a simple "make" after you've changed a single file.

    My TXR language project has a hand-written, not generated, ./configure
    script. What you get in a txr-285.tar.gz tarball is exactly what you
    get if you do a "git clone" and "git checkout txr-285", modulo
    the presence of a .git directory and differing timestamps.

    You just ./configure and make.

    And for gawk it's ./bootstrap.sh && ./configure && make
    where bootstrap.sh only takes a few seconds.

    None of my configure-time tests require the execution of a program;
    For some situations, I have developed clever tricks to avoid it.

    And why should you, or anyone, be forced to develop such clever tricks?

    All of this simply justifies more the approach taken by newer languages,
    which is to move all the hard crap into the libraries. The language
    developers do all the hard work, instead of the application developers
    having to do it. This is great for people who want to just get their
    job done, which includes me most of the time. However, and this is a
    different discussion, it does lead to a generation of programmers who
    have *no clue* as to how to do the hard stuff should they ever need to.

    My opinion, of course.

    Arnold
    --
    Aharon (Arnold) Robbins arnold AT skeeve DOT com

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to Hans-Peter Diettrich on Fri Mar 31 16:34:24 2023
    Hans-Peter Diettrich <DrDiettrich1@netscape.net> writes:
    My impression was that the FSF favors C and ./configure for "portable"
    code. That's why I understand that any other way is easier for the
    implementation of really portable software, that deserves no extra
    tweaks for each supported target platform, for every single program.

    I have not noticed that the FSF has any preference for C, apart from C
    being the lingua franca in the late 1980s and the 1990s, and arguably
    for certain requirements it still is.

    Now C on Unix has to fight with certain portability issues. In early
    times C programs contained a config.h that the sysadmin installing a
    program had to edit by hand before running make. Then came autoconf,
    which generates configure files that run certain checks on the system
    and fill in config.h for you; and of course, once the mechanism is
    there, stuff in other files is filled in with configure, too.

    It's unclear to me what you mean with "any other way is easier". The
    way of manually editing config.h certainly was not easier for the
    sysadmins. Not sure if it was easier for the maintainer of the
    programs.

    Can somebody shed some light on the current practice of writing portable
    C/C++ software, or any other compiled language, that (hopefully) does
    not require additional human work before or after compilation for a
    specific target platform?

    There are other tools like CMake that claim to make autoconf
    unnecessary, but when I looked at it, I did not find it useful for my
    needs (but I forgot why).

    So I'll tell you here some of what autoconf does for Gforth: Gforth is
    a Forth system mostly written in Forth, but using a C substrate. Many
    system differences are dealt with in the C substrate, often with the
    help of autoconf. The configure.ac file describes what autoconf
    should do for Gforth; it has grown to 1886 lines.

    * It determines the CPU architecture and OS where the configure script
    is running at, and uses that to configure some architecture-specific
    stuff for Gforth, in particular how to synchronize the data and
    instruction caches; later gcc acquired __builtin___clear_cache() to
    do that, but at least on some platforms that builtin is broken
    <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93811>.

    * It checks the sizes of the C integer types in order to determine the
    C type for Forth's cell and double-cell types (a rough sketch of how
    such a check typically gets used follows after this list).

    * It uses the OS information to configure things like the newline
    sequence, the directory and path separators.

    * It deals with differences between OSs, such as large (>4GB) file
    support, an issue relevant in the 1990s.

    * It checks for the chcon program, and, if present, uses it to "work
    around SELinux brain damage"; if not present, the brain is probably
    undamaged.

    * It tests which of several ways is accepted by the assembler to skip
    code space (needed for implementing Gforth's dynamic
    superinstructions).

    * It checks for the presence of various programs and library functions
    needed for building Gforth, e.g. mmap() (yes, there used to be
    systems that do not have mmap()). In some cases it works around the
    absence, sometimes with degraded functionality; in other cases it
    just reports the absence, so the sysadmin knows what to install.
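
    As a rough illustration of that size check (the macro and type names
    here are invented, not Gforth's actual ones), the configure result
    typically ends up driving a typedef like this:

    /* config.h is assumed to provide SIZEOF_CHAR_P, SIZEOF_LONG and
       SIZEOF_LONG_LONG from the configure-time size checks. */
    #include "config.h"

    #if SIZEOF_LONG >= SIZEOF_CHAR_P
    typedef long cell;                    /* one cell per machine word */
    typedef unsigned long ucell;
    #elif SIZEOF_LONG_LONG >= SIZEOF_CHAR_P
    typedef long long cell;
    typedef unsigned long long ucell;
    #else
    #error "no integer type is wide enough to hold a pointer"
    #endif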

    That's just some of the things I see in configure.ac; there are many
    bits and pieces that are too involved and/or too minor to report here.

    Our portability stuff does not catch everything. E.g., MacOS on Apple
    Silicon has a broken mmap() (broken as far as Gforth is concerned;
    looking at POSIX, it's compliant with that, but that does not justify
    this breakage; MacOS on Intel works fine, as does Linux on Apple
    Silicon), an issue that's new to us; I have not yet devised a
    workaround for that, but when I do, a part of the solution may use
    autoconf.

    Now when you write Forth code in Gforth, it tends to be quite portable
    across platforms (despite Forth being a low-level language where, if
    you want to see them, it's easy to see differences between 32-bit and
    64-bit systems, and between different byte orders). One reason for
    that is that Gforth papers over system differences (with the help of
    autoconf among other things); another reason is that Gforth does not
    expose many of the things where the systems are different, at least
    not at the Forth level. You can use the C interface and then access
    all the things that C gives access to, many of which are
    system-specific, and for which tools like autoconf exist.

    The story is probably similar for other languages.

    - anton
    --
    M. Anton Ertl
    anton@mips.complang.tuwien.ac.at
    http://www.complang.tuwien.ac.at/anton/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to Thomas Koenig on Fri Mar 31 12:41:32 2023
    On Friday, March 31, 2023 at 4:42:14 AM UTC-7, Thomas Koenig wrote:
    gah4 <ga...@u.washington.edu> schrieb:
    For systems like Matlab and Octave, and I believe also for Python,
    or one of many higher math languages, programs should spend
    most of the time in the internal compiled library routines.

    They should, but sometimes they don't.

    If you run into things not covered by compiled libraries, but which
    are compute-intensive, then Matlab and (interpreted) Python run
    as slow as molasses, orders of magnitude slower than compiled code.

    But then there is dynamic linking.

    I have done it in R, but I believe it also works for Matlab and
    Python, and is the way many packages are implemented. You write a
    small C or Fortran program that does the slow part, and call it from interpreted code.
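
    As an illustration only (the names are invented, and the glue needed
    to register it with R, Matlab or Python is omitted), the slow part
    might be no more than

    /* Sketch of a small compiled helper called from an interpreter.
       Everything is assumed to be allocated by the caller already. */
    void scale_add(double *dst, const double *src, double factor, int n)
    {
        for (int i = 0; i < n; i++)
            dst[i] += factor * src[i];
    }

    compiled into a shared library and loaded by the interpreter.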

    And back to my favorite x86 assembler program:

    rdtsc:  rdtsc
            ret

    which allows high-resolution timing, to find where the program
    is spending too much time. Some years ago, I did this on a program
    written by someone else, so I mostly didn't know its structure.
    I tracked down which subroutines used too much time, and fixed
    just those.
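
    For anyone who wants to reproduce this kind of measurement from C
    rather than assembly, here is a minimal sketch (assuming GCC or Clang
    on x86, and using the __rdtsc() intrinsic from <x86intrin.h> in place
    of the two-instruction routine above):

        #include <stdio.h>
        #include <x86intrin.h>    /* __rdtsc() intrinsic (GCC/Clang) */

        int
        main (void)
        {
          unsigned long long start = __rdtsc ();

          volatile double x = 0.0;             /* region being timed */
          for (int i = 0; i < 1000000; i++)
            x += i * 0.5;

          unsigned long long end = __rdtsc ();
          printf ("approx. cycles: %llu\n", end - start);
          return 0;
        }

    The counts are only approximate on modern CPUs (frequency scaling,
    out-of-order execution), but they are good enough for finding the
    subroutines that eat most of the time.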

    In that case, one big time sink was building up a large matrix one
    row or one column at a time, which requires a new allocation and a
    copy each time. Preallocating to the final (if known) size fixes
    that.

    But then there were some very simple operations that, as you note,
    are not covered by the compiled libraries and are slow. Small C
    programs fixed those. There are complications around memory
    allocation, which I avoid by writing my routines to assume (require)
    that everything is already allocated.
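
    As an illustration of that style, here is a minimal sketch
    (hypothetical names, using R's .C() calling convention) of a small C
    helper that assumes the caller has already allocated everything,
    including the result vector:

        /* colsum.c -- column sums for an R matrix.  All buffers,
           including the output, are allocated by the interpreted caller;
           the C side allocates nothing. */
        void
        colsum (double *x, int *nrow, int *ncol, double *out)
        {
          for (int j = 0; j < *ncol; j++) {
            double s = 0.0;
            for (int i = 0; i < *nrow; i++)
              s += x[j * *nrow + i];   /* R stores matrices column-major */
            out[j] = s;
          }
        }

    Compiled with "R CMD SHLIB colsum.c", loaded with dyn.load(), it can
    be called as .C("colsum", as.double(x), nrow(x), ncol(x),
    out = numeric(ncol(x)))$out, where the preallocated numeric(ncol(x))
    receives the result.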

    (snip)

    At the company I work for, I'm told, each Python project will only
    use a certain specified version of Python, which will never be
    changed for fear of incompatibilities - they treat each version as a
    new programming language :-|

    To bring this back a bit towards compilers - a language definition
    is an integral part of compiler writing. If

    I have heard about that one.

    It seems that there are non-backward compatible changes
    from Python 2.x to 3.x. That is, they pretty much are different
    languages.

    The tradition when updating a language standard is to maintain
    backward compatibility as much as possible. It isn't always 100%, but
    it is often close enough. You can run a Fortran 66 program on new
    Fortran 2018 compilers without all that much trouble. (Much of the
    actual trouble comes from extensions used by the old programs.)
    [Python's rapid development cycle definitely has its drawbacks. Python 3
    is not backward compatible with Python 2 (that's why they bumped the
    major version number) and they ended support for Python 2 way too
    soon. -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to Hans-Peter Diettrich on Sun Apr 2 10:04:31 2023
    Hans-Peter Diettrich <DrDiettrich1@netscape.net> writes:
    Often I had the impression that the author wanted the program not for
    use on Windows machines. Kind of "source open for specific OS
    only" :-(

    Whatever we want, it's also a question of what the OS vendor wants.

    For a Unix, there were a few hoops we had to jump through to make
    Gforth work: e.g., IRIX 6.5 had a bug in sigaltstack, so we put in a
    workaround for that; HP/UX's make dealt with files with the same mtime differently from other makes, so we put in a workaround for that.
    Windows, even with Cygwin, puts up many more hoops to jump through;
    Bernd Paysan actually jumped through them for Gforth, but a Windows
    build is still quite a bit of work, so he does that only occasionally.

    It's no surprise to me that other developers don't jump through these
    hoops; maybe if someone paid them for it, but why should they do it
    on their own time?

    As a recent example of another OS, Apple has intentionally reduced the functionality of mmap() on MacOS on Apple silicon compared to MacOS on
    Intel. As a result, the development version of Gforth does not work
    on MacOS on Apple Silicon (it works fine on Linux on Apple Silicon).
    I spent a day last summer on the MacOS laptop of a friend (an
    extremely unpleasant experience) trying to find the problem and fix
    it, and I found the problem, but time ran out before I had a working
    fix (it did not help that I had to spend a lot of time on working
    around things that I missed in MacOS). Since then this problem has
    not reached the top of my ToDo list; and when it does, I will go for
    the minimal fix, with the result that Gforth on MacOS will run without
    dynamic native-code generation, i.e., slower than on Linux.

    - anton
    --
    M. Anton Ertl
    anton@mips.complang.tuwien.ac.at
    http://www.complang.tuwien.ac.at/anton/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to Kaz Kylheku on Sun Apr 2 08:56:48 2023
    Kaz Kylheku <864-117-4973@kylheku.com> writes:
    When a release is cut of a typical GNU program, special steps
    are executed to prepare a tarball which has a compiled configure
    script.

    You cannot just do a "git clone" of a GNU program, and then run
    ./configure and build. You must run some "make bootstrap" nonsense, and
    that requires you to have various Autotools installed, and in specific
    versions!

    And the problem is?

    The git repo contains only the source code, useful for developers.
    The developers have stuff installed that someone who just wants to
    install the program does not necessarily want to install. E.g., in
    the case of Gforth, you need an older Gforth to build the kernel
    images that contain Forth code compiled to an intermediate
    representation. Therefore the tarballs contain a number of generated
    (or, as you say, "compiled") files, e.g., the configure script, the
    kernel images in case of Gforth, or the C files generated by Bison in
    case of some other compilers.

    If you go for the "git clone" route rather than building from the
    tarball, you don't get these amenities, but have to install all the
    tools that the developers use, and have to perform an additional step
    (usually ./autogen.sh) to produce the configure file. "make
    bootstrap" is unlikely to work, because at that stage you don't have a Makefile.

    I remember "make bootstrap" from gcc, where IIRC it compiles gcc first
    (stage1) with the pre-installed C compiler, then (stage2) with the
    result of stage1, and finally (stage3) again with the result of
    stage2; if there is a difference between stage2 and stage3, something
    is amiss.

    Anyway, tl;dr: If you just want to do "./configure; make", use the
    tarball.

    Most Autotools programs will not cleanly cross-compile. Autotools is the
    main reason why distro build systems use QEMU to create a virtual
    target environment with native tools and libraries, and then build
    the "cross-compiled" program as if it were native.

    Clever! Let the machine do the work, rather than having to do manual
    work for each package.

    For instance, about a decade and a half ago I helped a company
    replace Windriver cruft with an in-house distribution. Windriver's
    cross-compiled Bash didn't have job control! Ctrl-Z, fg, bg stuff no
    workie. The reason was that it was just cross-compiled straight, on an
    x86 build box. It couldn't run the test to detect job control support,
    and so it defaulted to off, even though the target machine had
    "gnu-linux" in its string. In the in-house distro, my build steps for
    bash exported numerous ac_cv_... internal variables to override the bad
    defaults.

    That's the way to do it.

    Your idea seems to be that, when the value is not supplied, instead of
    a safe default (typically resulting in not using a feature), one
    should base the values on the configuration name of the system. I
    think the main problem with that is that for those systems most in
    need of cross-compiling the authors of the tests don't know good
    values for the configuration variables; for linux-gnu systems I
    usually configure and compile on the system.

    For some situations, I have developed clever tricks to avoid it. For
    instance, if you want to know the size of a data type, here
    is a fragment:

    Great! Now we need someone who has enough time to replace the
    AC_CHECK_SIZEOF autoconf macro with your technique; then a
    significant part of the configuration variables that currently have
    to be supplied manually when cross-configuring Gforth would become
    fully automatic.
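
    The fragment itself was snipped above, but one well-known
    compile-only trick of this kind (sketched here purely as an
    illustration; it may differ from Kaz's) makes compilation fail unless
    a guessed size is correct, so configure can try guesses without
    running anything on the target:

        /* Compiles only if sizeof (long) equals the guessed value.  The
           configure script substitutes guesses (1, 2, 4, 8, ...) and
           keeps the one for which this compiles. */
        #define GUESSED_SIZE 8

        typedef char size_guess_is_right
          [(sizeof (long) == GUESSED_SIZE) ? 1 : -1];

        int
        main (void)
        {
          return 0;
        }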

    - anton
    --
    M. Anton Ertl
    anton@mips.complang.tuwien.ac.at
    http://www.complang.tuwien.ac.at/anton/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Hans-Peter Diettrich@21:1/5 to Anton Ertl on Wed Apr 5 11:23:39 2023
    On 4/2/23 12:04 PM, Anton Ertl wrote:

    For a Unix, there were a few hoops we had to jump through to make
    Gforth work: e.g., IRIX 6.5 had a bug in sigaltstack, so we put in a workaround for that; HP/UX's make dealt with files with the same mtime differently from other makes, so we put in a workaround for that.
    Windows, even with Cygwin, puts up many more hoops to jump through;
    Bernd Paysan actually jumped through them for Gforth, but a Windows
    build is still quite a bit of work, so he does that only occasionally.

    Too bad that not all existing OSs are POSIX compatible? ;-)

    So my impression still is: have a language (plus library) and an
    interpreter (VM, browser, compiler...) on each target system. Then
    adaptations to a target system have to be made only once, for each
    target, not for every single program.

    Even for programs with extreme speed requirements, development can be
    done on the general implementation first, for tests etc., with a
    version tweaked for a very specific target system added later, rather
    than writing the single-target version first and then attempting
    problematic ports to many other platforms.

    Of course it's up to the software developer or principal to order or
    build software for a (more or less) specific target system only, or
    software that is not bound to any particular platform.

    (G)FORTH IMO is a special case because it's (also) a development
    system. Building (bootstrapping) a new FORTH system written in FORTH
    is quite complicated, in contrast to languages with stand-alone tools
    like compiler, linker etc. Some newer (umbilical?) FORTH versions
    also compile to native code.

    DoDi

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to Hans-Peter Diettrich on Wed Apr 5 16:30:31 2023
    Hans-Peter Diettrich <DrDiettrich1@netscape.net> writes:
    On 4/2/23 12:04 PM, Anton Ertl wrote:

    For a Unix, there were a few hoops we had to jump through to make
    Gforth work: e.g., IRIX 6.5 had a bug in sigaltstack, so we put in a
    workaround for that; HP/UX's make dealt with files with the same mtime
    differently from other makes, so we put in a workaround for that.
    Windows, even with Cygwin, puts up many more hoops to jump through;
    Bernd Paysan actually jumped through them for Gforth, but a Windows
    build is still quite a bit of work, so he does that only occasionally.

    Too bad that not all existing OS are POSIX compatible? ;-)

    Like many standards, POSIX is a subset of the functionality that
    programs use. Windows NT used to have a POSIX subsystem in order to
    make WNT comply with FIPS 151-2, which was needed to make WNT eligible for
    certain USA government purchases. From what I read, it was useful for
    that, but not much else.

    So my impression still is: have a language (plus library) and an
    interpreter (VM, browser, compiler...) on each target system. Then
    adaptations to a target system have to be made only once, for each
    target, not for every single program.

    You mean: Write your program in Java, Python, Gforth, or the like?
    Sure, they deal with compatibility problems for you, but you may want
    to do things (or have performance) that they do not offer, or only
    offer through a C interface (and in the latter case you run into the
    C-level compatibility again).

    Even for programs with extreme speed requirements, development can be
    done on the general implementation first, for tests etc., with a
    version tweaked for a very specific target system added later, rather
    than writing the single-target version first and then attempting
    problematic ports to many other platforms.

    Well, if you go that route, the result can easily be that your program
    does not run on Windows. Especially for GNU programs: The primary
    goal is that they run on GNU. Any effort spent on a Windows port is
    extra effort that not everybody has time for.

    (G)FORTH IMO is a special case because it's (also) a development
    system. Building (bootstrapping) a new FORTH system written in FORTH
    is quite complicated, in contrast to languages with stand-alone tools
    like compiler, linker etc.

    Not really. Most self-respecting languages have their compiler(s)
    implemented in the language itself, resulting in having to bootstrap.
    AFAIK the problem Gforth has with Windows is not the bootstrapping;
    it's that packaging and installation are different than for Unix.

    - anton
    --
    M. Anton Ertl
    anton@mips.complang.tuwien.ac.at
    http://www.complang.tuwien.ac.at/anton/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Hans-Peter Diettrich@21:1/5 to Anton Ertl on Fri Apr 7 15:35:32 2023
    On 4/5/23 6:30 PM, Anton Ertl wrote:
    Hans-Peter Diettrich <DrDiettrich1@netscape.net> writes:

    You mean: Write your program in Java, Python, Gforth, or the like?
    Sure, they deal with compatibility problems for you, but you may want
    to do things (or have performance) that they do not offer, or only
    offer through a C interface (and in the latter case you run into the
    C-level compatibility again).

    Except when the library also is portable ;-)

    Else you end up with:
    Program runs only on systems with libraries X, Y, Z installed.


    (G)FORTH IMO is a special case because it's (also) a development system.
    Building (bootstrapping) a new FORTH system written in FORTH is quite
    complicated, in contrast to languages with stand-alone tools like
    compiler, linker etc.

    Not really. Most self-respecting languages have their compiler(s) implemented in the language itself, resulting in having to bootstrap.

    The FORTH compiler also is part of the current monolithic framework.
    Replacing a WORD has immediate impact on the currently running
    compiler and everything else. A bug can make the running system crash
    immediately, without diagnostics. Otherwise the current WORDs cannot
    be replaced immediately, only after a full compilation, and only by
    code that depends on neither the old nor the new framework.


    AFAIK the problem Gforth has with Windows is not the bootstrapping;
    packaging and installation are different than for Unix.

    Isn't that the same problem with every language?

    DoDi

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Kaz Kylheku@21:1/5 to Anton Ertl on Thu Apr 6 08:35:12 2023
    On 2023-04-05, Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
    Hans-Peter Diettrich <DrDiettrich1@netscape.net> writes:
    On 4/2/23 12:04 PM, Anton Ertl wrote:

    For a Unix, there were a few hoops we had to jump through to make
    Gforth work: e.g., IRIX 6.5 had a bug in sigaltstack, so we put in a
    workaround for that; HP/UX's make dealt with files with the same mtime
    differently from other makes, so we put in a workaround for that.
    Windows, even with Cygwin, puts up many more hoops to jump through;
    Bernd Paysan actually jumped through them for Gforth, but a Windows
    build is still quite a bit of work, so he does that only occasionally.

    Too bad that not all existing OS are POSIX compatible? ;-)

    Like many standards, POSIX is a subset of the functionality that
    programs use. Windows NT used to have a POSIX subsystem in order to
    make WNT comply with FIPS 151-2 needed to make WNT eligible for
    certain USA government purchases. From what I read, it was useful for
    that, but not much else.

    The best POSIX subsystem for Windows is arguably Cygwin. It has
    quite rich POSIX functionality. Not only that, but also ANSI terminal
    functionality: its I/O system contains a layer which translates
    ANSI escape sequences into Windows Console API calls.

    You can take a program written on Linux which uses termios to put the
    TTY in raw mode, and ANSI escapes to control the screen, and it will
    work on Cygwin.
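
    A minimal sketch of that kind of program (an illustration of standard
    termios and ANSI-escape usage, not taken from any particular project)
    is:

        #include <stdio.h>
        #include <unistd.h>
        #include <termios.h>

        int
        main (void)
        {
          struct termios saved, raw;

          tcgetattr (STDIN_FILENO, &saved);    /* remember settings */
          raw = saved;
          raw.c_lflag &= ~(ICANON | ECHO);     /* no line buffering, no echo */
          tcsetattr (STDIN_FILENO, TCSANOW, &raw);

          printf ("\033[2J\033[H");            /* ANSI: clear screen, home cursor */
          printf ("press a key to restore the terminal\r\n");
          fflush (stdout);
          getchar ();

          tcsetattr (STDIN_FILENO, TCSANOW, &saved);   /* restore */
          return 0;
        }

    Built unmodified on Linux or under Cygwin, it should behave the same
    in both places, which is the point being made here.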

    One of its downsides is that Cygwin has poor performance
    (mainly in the area of file access).

    The other downside of Cygwin is that it implements certain conventions
    that are at odds with "native" Windows.

    In 2016 I started a small project called Cygnal (Cygwin Native
    Application Library) to fix problems in this second category,
    creating a fork of the Cygwin DLL.

    https://www.kylheku.com/cygnal

    (G)FORTH IMO is a special case because it's (also) a development
    system. Building (bootstrapping) a new FORTH system written in FORTH
    is quite complicated, in contrast to languages with stand-alone tools
    like compiler, linker etc.

    Not really. Most self-respecting languages have their compiler(s) implemented in the language itself, resulting in having to bootstrap.

    You can avoid the chicken-and-egg problem that requires bootstrapping by
    using a host language to implement an interpreter for the target
    language. That interpreter can then directly execute the compiler, which
    can compile itself and other parts of the run-time, as needed.

    It's still a kind of bootstrapping, but at no point do you need a
    pre-built binary of the target language compiler to build that compiler;
    you just need an implementation of a host language.

    This works quite well when the host language is good for writing
    interpreters, and the target one for compiler work, and also when it's
    useful to have an interpreter even when compilation is available.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Koenig@21:1/5 to Anton Ertl on Sat Apr 8 18:25:06 2023
    Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:
    Most self-respecting languages have their compiler(s)
    implemented in the language itself, resulting in having to bootstrap.

    This is a bit complicated for GCC and LLVM.

    For both, the middle end (and back end) is implemented in C++,
    so a C++ interface at class level is required, and that is a
    bit daunting.

    Examples: Gnat (GCC's Ada front end) is written in Ada, and its
    Modula-2 front end is written in Modula-2. On the other hand,
    the Fortran front end is written in C++ (well, mostly C with
    C++ features hidden behind macros).

    The very first Fortran compiler, of course, was written in
    assembler.
    [It was, but Fortran H, the 1960s optimizing compiler for S/360 was
    written in Fortran with a few data structure extensions. -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)