• Are transpiling techniques different than compiling techniques?

    From Roger L Costello@21:1/5 to All on Mon Oct 11 13:26:01 2021
    Hi Folks,

    Today I learned a new word: transpiling

    I looked it up and learned that it is converting one source code to another. See below.

    "Neat!" I thought. "I am converting a military air navigation data format to a civilian air navigation data format, which is a kind of transpiling, I think.
    I wonder if there are techniques specific to transpiling?"

    Is there a book or tutorial on how to build a transpiler? Are there techniques unique to transpilers?

    /Roger

    -----------------------------------------------

    Compiler: an umbrella term for a program that takes source code written in one language and produces one or more output files in some other language. In practice we mostly use this term for a compiler such as gcc, which takes C code as input and produces a binary executable (machine code) as output.

    Transpilers are also known as source-to-source compilers. So in essence they are a subset of compilers which take in a source code file and convert it to another source code file in some other language or a different version of the same language. The output is generally understandable by a human. This output still has to go through a compiler or interpreter to be able to run on the machine.

    Some examples of transpilers:
    1. Emscripten <https://kripken.github.io/emscripten-site/>: Transpiles C/C++ to JavaScript
    2. Babel <https://babeljs.io/>: Transpiles ES6+ code to ES5 (ES6 and ES5 are different versions or generations of the JavaScript language)

    https://stackoverflow.com/questions/44931479/compiling-vs-transpiling

    [Back in the day, the term was "sift", from a translator from Fortran
    II to Fortran IV written in 1962. In the late 1960s IBM had a Fortran
    to PL/I translator which worked (I used it) but generated ugly code
    due to all the places where the semantics of PL/I were almost but not
    quite the same as similar looking Fortran constructs:

    http://bitsavers.org/pdf/ibm/360/fortran/GC33-2002-2_FORTRAN_To_PL1_Translator_Jan73.pdf

    I think you will find two approaches. There's the half-hearted one in which
    it translates constructs into corresponding ones and hopes the differences
    don't matter, and the full one that is a real compiler with all of the
    usual analyses and a code generator that happens to generate another high
    level language. The f2c Fortran to C translator is an example:

    https://www.netlib.org/f2c/f2c.pdf

    -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Detlef Meyer-Eltz@21:1/5 to All on Tue Oct 12 11:34:04 2021
    I have been working for years on the Delphi to C++ translator "Delphi2Cpp"
    without being aware that this kind of software is called a "transpiler".

    https://www.texttransformer.com/Delphi2Cpp_en.html

    What might come close to a special transpiler technique are "rewrite
    rules" on syntax trees. But I use a naive approach with no mysterious
    transpiler theory in the background. I will briefly describe the steps
    performed during conversion:

    1. the Delphi source code is pre-processed according to the set conditions
    2. the resulting reduced code is parsed to build a syntax tree
    3. the syntax tree is pre-processed to calculate some information needed
    for the output.
    4. the syntax tree is output as C++ code
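
    A minimal sketch of how the four steps above could be wired together, written in Python for a toy Pascal-like subset. This is not Delphi2Cpp's code: the directive handling (un-nested {$IFDEF}), the tuple-based tree, the "every variable is an int" rule and all function names are assumptions made purely for illustration.

```python
import re

def preprocess(source, defined):
    """Step 1: keep or drop {$IFDEF NAME} ... {$ENDIF} regions (no nesting)."""
    out, keep = [], True
    for line in source.splitlines():
        stripped = line.strip()
        m = re.fullmatch(r"\{\$IFDEF (\w+)\}", stripped)
        if m:
            keep = m.group(1) in defined
        elif stripped == "{$ENDIF}":
            keep = True
        elif keep and stripped:
            out.append(stripped)
    return out

def parse(lines):
    """Step 2: build a tiny syntax tree; only 'name := expr;' is supported."""
    tree = []
    for line in lines:
        m = re.fullmatch(r"(\w+)\s*:=\s*(.+);", line)
        if m is None:
            raise SyntaxError(f"unsupported statement: {line!r}")
        tree.append(("assign", m.group(1), m.group(2)))
    return tree

def annotate(tree):
    """Step 3: compute information the output needs (variables to declare)."""
    declared = []
    for _, target, _ in tree:
        if target not in declared:
            declared.append(target)
    return declared

def emit(tree, declared):
    """Step 4: write the tree out as C++ text (all variables assumed int)."""
    lines = [f"int {name};" for name in declared]
    lines += [f"{target} = {expr};" for _, target, expr in tree]
    return "\n".join(lines)

def translate(source, defined=()):
    lines = preprocess(source, set(defined))
    tree = parse(lines)
    return emit(tree, annotate(tree))
```

    Expressions are copied through verbatim here, which only works where the two languages happen to agree; that is exactly where the hard-coded special cases of a real converter would go.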

    For the first two steps I use my own parser generator, "TextTransformer".
    The first step can be regarded as a kind of compilation/"transpilation" of
    its own. An example of the third step is the calculation of the variables
    that have to be passed to sub-functions when nested functions are
    unbundled. A lot of manual work has to be done for the fourth step.
    Numerous special cases have to be hard-coded there, as there is no simple
    deduction relationship between the source language and the target
    language. Some Delphi constructs cannot be converted at all. But C++ is
    more powerful than Delphi, so many Delphi constructs can be
    reconstructed or simulated in C++. A converter the other way round would
    be quite poor. The relative power of the languages could be part of a
    transpiler theory.
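
    The third-step example (finding the variables a nested function needs from its enclosing function, so they can be passed explicitly once the function is unbundled) can be sketched with Python's standard symtable module. Python source stands in for Delphi here, and captured_variables and EXAMPLE are hypothetical names; a real converter would run the same analysis on its own syntax tree.

```python
import symtable

# A nested function that uses two variables of its enclosing function.
EXAMPLE = '''
def outer(a, b):
    total = 0
    def inner(k):
        return a * k + total
    return inner(b)
'''

def captured_variables(source, outer_name, inner_name):
    """Return the names that inner_name takes from the scope of outer_name.

    These are the variables that would become extra parameters if the
    nested function were moved out to the top level.
    """
    top = symtable.symtable(source, "<example>", "exec")
    outer = next(t for t in top.get_children() if t.get_name() == outer_name)
    inner = next(t for t in outer.get_children() if t.get_name() == inner_name)
    return sorted(s.get_name() for s in inner.get_symbols() if s.is_free())

print(captured_variables(EXAMPLE, "outer", "inner"))  # ['a', 'total']
```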

    In contrast to a compiler, which has to be fast because it is used over
    and over again in the development of software, the speed of the
    transpiler does not matter: ideally, it only has to be used once to do
    its job.


    Detlef


    On 11.10.2021 at 15:26, Roger L Costello wrote:
    Hi Folks,

    Today I learned a new word: transpiling

  • From Kartik Agaram@21:1/5 to All on Mon Oct 11 11:23:33 2021
    On a slight tangent, I've never liked the term "compiler". I prefer "translator". "Translator" maps well with "interpreter" when talking about natural languages. That seems like a good reason to also use it for
    computer languages.

    Bringing it back to this thread, I think the difference between compilers
    and transpilers is largely meaningless. They're both just translators.

    [It is about 65 years too late to change "compiler". On the other
    hand, approximately nobody uses "transpiler" and we can use something
    less cute like translator, or the classic SIFT. -John]

  • From Hans-Peter Diettrich@21:1/5 to Kartik Agaram on Tue Oct 12 20:05:34 2021
    On 10/11/21 8:23 PM, Kartik Agaram wrote:
    On a slight tangent, I've never liked the term "compiler". I prefer "translator". "Translator" maps well with "interpreter" when talking about natural languages. That seems like a good reason to also use it for
    computer languages.

    Bringing it back to this thread, I think the difference between compilers
    and transpilers is largely meaningless. They're both just translators.

    I'd classify both by I/O type, as with lexers and parsers: a compiler
    translates source text into *binary* code, a transpiler into
    another source *text*.

    The term "transpiler" is IMO a relic of the time when translating human
    language was the domain of humans, used to deprecate the output of
    translation programs. While automated translation really sucked for
    decades, in recent years I have often found human translations and
    presentations less precise or meaningful than automated translation.

    DoDi

  • From Hans-Peter Diettrich@21:1/5 to Detlef Meyer-Eltz on Tue Oct 12 20:19:20 2021
    On 10/12/21 11:34 AM, Detlef Meyer-Eltz wrote:
    I have been working for years on the Delphi to C++ translator "Delphi2Cpp"
    without being aware that this kind of software is called a "transpiler".

    https://www.texttransformer.com/Delphi2Cpp_en.html

    Hi Detlef, I find your "TextTransformer" quite a good name :-)

    In contrast to a compiler, which has to be fast because it is used over
    and over again in the development of software, the speed of the
    transpiler does not matter: ideally, it only has to be used once to do
    its job.

    Depending on the project type, every (daily...) update of the original has
    to be translated anew, with the risk of newly introduced bugs that make it
    necessary to verify each translation.

    DoDi

  • From jan van katwijk@21:1/5 to All on Tue Oct 12 17:59:50 2021
    A long time ago I wrote an Algol 60 to C translator.
    Not one where "some intermediate machine" is defined and implemented in C;
    instead, each Algol 60 construct is mapped onto a (hopefully) semantically
    equivalent C construct.

    Looking at the translation process, it is just a simplified compiler,
    with a parser, a scan for name resolution, a scan to generate an
    include file and a scan to map Algol procedures to C procedures.

    Apart from handling by-name parameters (to be mapped into (almost)
    parameterless procedures), function parameters (in Algol one does
    not specify the parameter profile of a formal procedure parameter) and,
    to a certain extent, switches and labels as parameters, it is fairly
    straightforward (an extensive description is available, see
    https://github.com/JvanKatwijk/algol-60-compiler).
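
    As an illustration of mapping a by-name parameter onto a parameterless procedure, here is a small Python sketch of the classic "Jensen's device". It is not taken from the translator linked above; Ref and algol_sum are hypothetical names, and passing the loop variable as a mutable cell is a simplification of a full by-name thunk.

```python
class Ref:
    """A mutable cell standing in for an Algol variable passed by name."""
    def __init__(self, value=0):
        self.value = value

def algol_sum(i, lo, hi, term):
    """Stand-in for: real procedure sum(i, lo, hi, term), i and term by name.

    i    is a Ref, so the procedure can assign to the caller's variable.
    term is a parameterless function, re-evaluated each time it is used,
         so it sees the current value of i.
    """
    total = 0
    i.value = lo
    while i.value <= hi:
        total += term()   # evaluate the by-name argument afresh
        i.value += 1
    return total

i = Ref()
# Corresponds to the Algol call sum(i, 1, 4, i * i).
squares = algol_sum(i, 1, 4, lambda: i.value * i.value)
print(squares)  # 30
```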

    I would not give it another name than translator or compiler.

    Of course, mapping any language to any other language may cause
    problems. In the 1980s we made a subset A60 to Ada translator, and
    direct mapping of by-name parameters and things like non-local gotos
    is not really possible (but then, the programs that needed to be
    translated were simply structured; apart from a few gotos there were
    no big problems).

    jan



    On Tue 12 Oct 2021 at 17:18, Detlef Meyer-Eltz <Meyer-Eltz@t-online.de> wrote:

    I have been working for years on the Delphi to C++ translator "Delphi2Cpp"
    without being aware that this kind of software is called a "transpiler".
    [ by some people ]

  • From Christopher F Clark@21:1/5 to Detlef on Thu Oct 14 00:33:44 2021
    In this interesting thread, Detlef wrote:
    In contrast to a compiler, which has to be fast because it is used over
    and over again in the development of software, the speed of the
    transpiler does not matter: ideally, it only has to be used once to do
    its job.

    Sometimes, this is true, sometimes not.

    Twice in my career I worked on projects which developed a transpiler.

    In the first case, it was true.

    My mentor on that project developed a Jovial to PL/I transpiler using PL/I's macro facility. We only used it to bootstrap the "real" Jovial compiler
    to Multics, which was also only a bootstrap to get the Interdata 8/32
    Jovial compiler to work. And, the Jovial to PL/I transpiler didn't have to
    be real accurate nor fast nor deal with the entire language, just good
    enough to get the compiler bootstrapped. We may have done the
    bootstrap several dozen times (but probably not several hundred)
    during the development, but once the Multics Jovial compiler was
    working, we never did it again. It was a throw away transpiler.

    In the second case, it was not.

    At Intel I was part of a CAD team for chip design tools. The tool we
    built was called "VMOD" (I don't remember what that stood for). Anyway,
    it had one part that allowed designers to draw gates and wires graphically.
    That was the initial impetus to the project. However, it was known that
    there were places where the design would be better expressed in Verilog
    and we allowed the user to drop "combo" boxes (short for combinatorial
    logic boxes) that looked vaguely like "chips" (i.e. they were rectangular)
    and had "pins" around the edges for connecting to graphical wires.
    But in those combo boxes you could use a text editor to input Verilog code, which could refer to the pins to communicate with the graphical model.

    Anyway, I did the Verilog compiler (and also the "compiler" for the graphical gates--they were actually the same compiler and used the same IR).
    But, the technology for both was transpilation. For simulation of the chip,
    we transpiled to C++ code and used Visual C++ as our "backend". Our
    runtime library was also written in C++. So, we got out a C++ version
    of the "chip" that did a cycle accurate model of what the real chip would
    do and it hooked back to the graphic model for certain aspects of debugging, but it also generated log files and allowed debugging of the C++ code.

    However, for that code, we didn't just translate once. The design
    work was ongoing, and simulating the design so that the developers
    could get the chip(s) right (it was used to design a dozen or two
    chips over its useful life) meant it needed to run at something
    approximating C speed. We managed that, and it thus had about 1000x the
    performance of the previous simulators that Intel had been using. In fact,
    it was fast enough that we were asked to do a pure Verilog version (no
    graphic support) for teams that were coding in Verilog and didn't buy into
    the development model that the tool was designed to promote. So, it became
    a Verilog to C++ transpiler.

    Of course, being for chip design, the other important aspect was
    synthesizing real gates. For synthesis, we transpiled the graphic model
    into Verilog, and that included transpiling the Verilog combo box code
    into Verilog. Now, that was mostly an identity transform except for
    hooking up the pins, dealing with name collisions (e.g. renaming multiple
    copies of the same box to unique names) and a few clock related portions.
    When it was acting as a Verilog compiler, only the name collision and
    clock related portions were relevant, as there were no graphic gates and
    no pins.

    And, by the way, there was no big secret to achieving approximate C speed.
    We got it because we let Visual Studio do all the heavy lifting of
    optimization and code generation. There was no way I was going to compete
    with them on that aspect and no need to. Thus, transpiling gave us a
    reasonably good compiler for a fraction of the effort.

    The main thing we had to do was deal with the fact that in Verilog
    each "bit" has 4 states: 0, 1, x and z. And the x and z states of a bit
    are used in stylized ways (x means invalid and z means don't care).
    So, we did a small amount of analysis to detect if the gates and wires
    under consideration could have x or z values and, if so, used the more
    complex logic that got those values correct. We mapped each bit to a
    2 bit pair, so that we had 4 states to use, and we did it FORTRAN
    "column major" style: the bits for 0,1 were in one contiguous
    array for the width of the wire/bus they were representing, and the
    bits indicating that 0,1 was really x, z were in a parallel array.
    A quick check of the 2nd array for all 0s often allowed us to not deal
    with it at all for some wire/bus. And, if we statically determined
    that none of the bits on that bus would ever be x or z, we didn't need
    that 2nd array at all. So, things like adders could then use the
    normal simple C/C++ logic for addition and not have to do a bit-by-bit
    version of it.
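
    A rough sketch of the two-parallel-array idea, in Python rather than the C++ the tool generated. It is a reconstruction from the description above, not VMOD's code: the Bus type, the function names, the exact bit encoding and the rule that an addition with any x/z input yields all x are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Bus:
    """A wire/bus stored as two parallel bit planes packed into integers.

    Assumed encoding per bit (val, xz): (0,0)=0, (1,0)=1, (0,1)=x, (1,1)=z.
    """
    width: int
    val: int = 0
    xz: int = 0

    @property
    def mask(self):
        return (1 << self.width) - 1

def add(a, b):
    """Add two buses of equal width."""
    if a.xz == 0 and b.xz == 0:
        # Fast path: the second plane is all zeros, so plain integer
        # addition is enough and no bit-by-bit work is needed.
        return Bus(a.width, (a.val + b.val) & a.mask)
    # Slow path (simplified): any unknown input bit makes the result all x.
    return Bus(a.width, 0, a.mask)

def bit_and(a, b):
    """Bitwise AND with 4-state handling; z inputs are treated like x."""
    mask = a.mask
    a0 = ~a.val & ~a.xz & mask   # bits of a known to be 0
    b0 = ~b.val & ~b.xz & mask
    a1 = a.val & ~a.xz & mask    # bits of a known to be 1
    b1 = b.val & ~b.xz & mask
    one = a1 & b1                # 1 only where both inputs are known 1
    zero = a0 | b0               # 0 wherever either input is known 0
    unknown = mask & ~(one | zero)
    return Bus(a.width, one, unknown)
```

    When static analysis proves a bus can never carry x or z, the xz plane and the slow paths could be dropped from the generated code altogether, leaving only the ordinary integer operation.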

    We also had special code for clocks, because they couldn't be x or z, but most flip-flops are edge triggered, so you want to distinguish rising edge from high (or low) and the same for falling edge. And all the gates were partitioned into what edge or level they were sensitive to and we only ran the code when the relevant clock was in that state.
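
    The partitioning by clock sensitivity described above might look roughly like this in Python. This is a guess at the general shape, not the actual runtime library: Scheduler, on and tick are hypothetical names, and running level-sensitive blocks on every step at that level is an assumption.

```python
class Scheduler:
    """Runs only the blocks sensitive to the clock's current edge or level."""

    def __init__(self):
        self.blocks = {"rising": [], "falling": [], "high": [], "low": []}
        self.prev = 0  # the clock is a plain 0/1 value, never x or z

    def on(self, state, fn):
        """Register fn in the partition for one edge or level."""
        self.blocks[state].append(fn)

    def tick(self, clk):
        """Advance one step with the new clock value; return what was run."""
        states = []
        if clk != self.prev:
            # An edge is a change of value, distinct from the level itself.
            states.append("rising" if clk == 1 else "falling")
        states.append("high" if clk == 1 else "low")
        self.prev = clk
        for state in states:
            for fn in self.blocks[state]:
                fn()
        return states

log = []
sched = Scheduler()
sched.on("rising", lambda: log.append("flip-flop"))
sched.on("high", lambda: log.append("latch"))
for clk in [0, 1, 1, 0]:
    sched.tick(clk)
print(log)  # ['flip-flop', 'latch', 'latch']
```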
    --
    ******************************************************************************
    Chris Clark                  email: christopher.f.clark@compiler-resources.com
    Compiler Resources, Inc.     Web Site: http://world.std.com/~compres
    23 Bailey Rd                 voice: (508) 435-5016
    Berlin, MA 01503 USA         twitter: @intel_chris
    ------------------------------------------------------------------------------

  • From Kaz Kylheku@21:1/5 to Detlef Meyer-Eltz on Sat Oct 16 17:26:45 2021
    On 2021-10-12, Detlef Meyer-Eltz <Meyer-Eltz@t-online.de> wrote:
    I have been working for years on the Delphi to C++ translator "Delphi2Cpp"
    without being aware that this kind of software is called a "transpiler".

    It isn't; that's just a word used by some web programming hipsters.
    Transpilers are everywhere, because browsers are stuck with Javascript
    as their lowest-level target language*, and it sucks so terribly that
    people want to use almost anything else. The bar is quite low; it's easy
    to write toy languages that spit out Javascript, so it has become a kind
    of popular sport, and from there came "transpiling".

    ---
    * I know what Webassembly is; it's a gadget for expressing lower-level computations with machine-oriented types, to complement and accompany Javascript; it is not a replacement for Javascript.

  • From Kaz Kylheku@21:1/5 to Kartik Agaram on Sat Oct 16 17:16:05 2021
    On 2021-10-11, Kartik Agaram <ak@akkartik.com> wrote:
    On a slight tangent, I've never liked the term "compiler". I prefer "translator". "Translator" maps well with "interpreter" when talking about natural languages. That seems like a good reason to also use it for
    computer languages.

    Back in the day of Grace Hopper working on Fortran, the terms were
    different from today. The "tran" in Fortran of course stands for
    translation.

    Back then, the word "coding" stood for taking a program (e.g. written by
    hand on paper in pseudo-code) and turning it into a machine-language
    computer program: among the last steps of programming. Today, we have
    "source code" and producing it is coding.

    The term "automatic coding" denoted the situation in which a computer
    was programmed to do the coding: taking a higher level description of the
    program and translating it to machine language.

    "Compiling" existed; that referred to something that is more like
    "linking" or "loading" today, or perhaps the preparation of an archive containing object files. It had the obvious meaning: sticking together
    routines to create a collection.

    Somehow "compile" came to have the meaning to include the translation
    step too. Perhaps because some of the steps came to be combined into one
    tool invocation.

    "To compile" is an attractive word in that it means putting stuff
    together, but is only used in specialized circumstances. You don't
    usually say that you compiled the clothes after taking them out of the
    dryer, or that you compiled the toppings onto the sandwich, or that many responsibilities have been compiled upon your shoulders. It's not a
    commonly used word. It is mostly used in the context of combining
    multiple published works, which is a very specific meaning.

    That's the big reason why it was possible to give the word a technical
    meaning that is clear to the point that we can use "compile" almost entirely
    out of context (other than it being clear it's a computing context) and
    we know what kind of activity it refers to.

    "To translate" is not so: do you mean C++ to assembly, or English to
    German? Translating what: people translating user interfaces or
    documentation to another language? Or the machine translating something? Translate is also a term in English-language mathematics: to displace coordinates. This happens in computing: logical window-relative
    coordinates get translated to a pixel coordinate in the display buffer.
    In memory management, virtual addresses get translated to physical
    addresses.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

  • From Thomas Koenig@21:1/5 to Kaz Kylheku on Sat Oct 16 20:22:05 2021
    Kaz Kylheku <480-992-1380@kylheku.com> wrote:

    Back in the day of Grace Hopper working on Fortran, the terms were
    different from today. The "tran" in Fortran of course stands for translation.

    Grace Hopper working on Fortran? Hardly, you probably meant John Backus.

  • From Hans-Peter Diettrich@21:1/5 to Kaz Kylheku on Sat Oct 16 23:55:12 2021
    On 10/16/21 7:16 PM, Kaz Kylheku wrote:

    "To compile" is an attractive word in that it means putting stuff
    together

    The same applies to "assemble" at machine level.

    I could imagine that at that time the result was more important than
    sophisticated handling of source code. A portable C compiler is also
    assumed to output executable modules, where other compilers rely on a linker.

    DoDi
    [The Bell Labs portable C compiler output assembler source code, although
    most people didn't notice since it normally assembled it and threw the assembler code away. Last time I checked gcc and clang do the same. -John]

  • From gah4@21:1/5 to Kaz Kylheku on Sun Oct 17 08:37:55 2021
    On Saturday, October 16, 2021 at 10:48:14 AM UTC-7, Kaz Kylheku wrote:

    (snip on the word transpiler)

    It isn't; that's just a word used by some web programming hipsters. Transpilers are everywhere, because browsers are stuck with Javascript
    as their lowest-level target language*, and it sucks so terribly that
    people want to use almost anything else. The bar is quite low; it's easy
    to write toy languages that spit out Javascript, so it has become a kind
    of popular sport, and from there came "transpiling".

    In the 1970s, programs to improve Fortran were common,
    with Ratfor and Mortran as two examples.
    (That is, Fortran IV or Fortran 66.)

    The ones I know were written as macro processors, where macros
    match some strings in the input data, along with arguments, and replace
    them with new strings. At least for the Mortran processor, macros can
    create or modify macros. A fairly simple processor, then, allows for a somewhat complicated language.

    One problem, though, is that such processors don't fully parse
    the input. Syntax errors in the input produce some strange output,
    and strange errors from the final compiler.
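
    To make the pattern-and-replace idea concrete, here is a toy macro processor in Python. It is only an illustration: the two macros (incr and +=) are invented for this sketch and are not Ratfor or Mortran syntax, and real processors of that era were far more capable. It does show the weakness mentioned above, since any line that matches no macro is passed through unchecked.

```python
import re

# Each macro is a (pattern, replacement) pair; \1 and \2 are the arguments.
MACROS = [
    (re.compile(r"^\s*incr\s+(\w+)\s*$"), r"      \1 = \1 + 1"),
    (re.compile(r"^\s*(\w+)\s*\+=\s*(.+)$"), r"      \1 = \1 + (\2)"),
]

def expand(line):
    """Rewrite one line with the first macro that matches it."""
    for pattern, template in MACROS:
        if pattern.match(line):
            return pattern.sub(template, line)
    # Nothing matched: the line goes through untouched, valid or not,
    # and any error only shows up later in the Fortran compiler.
    return line

def process(source):
    return "\n".join(expand(line) for line in source.splitlines())

print(process("incr n\nx += 2*y\nincr n, m"))
```

    The third input line is a malformed use of incr; it reaches the output unchanged, so the complaint comes from the final compiler and refers to the generated text rather than to the macro.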

    It does seem that there are some macro processors for use with Javascript.
    [Ratfor used a yacc grammar, which is why early versions of yacc could
    produce ratfor output. As you note, it didn't understand all of Fortran so
    it let syntax errors through, which is why I did my PDP-10 hack to put the source
    line numbers in the Fortran output, to help figure out where the bug is.
    I later wrote a full Fortran 77 parser which was awful. No wonder they
    didn't try to do it in ratfor. -John]

  • From Hans-Peter Diettrich@21:1/5 to Hans-Peter Diettrich on Sun Oct 17 07:02:30 2021
    On 10/16/21 11:55 PM, Hans-Peter Diettrich wrote:

    [The Bell Labs portable C compiler output assembler source code, although most people didn't notice since it normally assembled it and threw the assembler code away.  Last time I checked gcc and clang do the same. -John]

    I meant the final executable result is (can be) generated from source
    code by a single C compiler invocation. How this result is obtained in
    detail, in how many passes, by how many related tools, is not so obvious
    and of less interest to the user.

    Nowadays dedicated managing tools are available, starting with (batch)
    Make and a number of (interactive) Integrated Development Environments.
    Here the compiler can be recognized as a source code translation part of
    the system, not as the all-embracing process.

    DoDi

  • From gah4@21:1/5 to Hans-Peter Diettrich on Sun Oct 17 15:01:02 2021
    On Sunday, October 17, 2021 at 11:27:23 AM UTC-7, Hans-Peter Diettrich wrote:

    (snip on compilers generating assembly source code.)

    I meant the final executable result is (can be) generated from source
    code by a single C compiler invocation. How this result is obtained in detail, in how many passes, by how many related tools, is not so obvious
    and of less interest to the user.

    Unix tradition, and still supported by gcc, is to stop after generating the assembly source file, with the -S option.

    Some compilers allow mixing assembly code in with the source language.
    Seeing the combined result makes it easier to debug. (Or edit the
    generated file before sending it to the assembler.)

    Many compilers that don't write an assemblable output file will generate
    a pseudo-assembly listing. Enough to figure out what the generated code
    does, but usually nowhere close to input to an assembler.

    I mostly learned OS/360 assembly language reading the generated
    code listings from the Fortran compilers.
    [The code from Fortran G was putrid, but from Fortran H and its successors pretty impressive. -John]
