• Internal representation of algebraic types

    From luserdroog@21:1/5 to All on Mon Jun 7 15:53:58 2021
    What is the common internal representation for algebraic types?
    Say I wanted to add a complex typing system to a simple homebrew
    Lisp interpreter, how would I go about doing it? I suppose the contrite
    answer is "lists", but does anyone have any more detail or references
    to where I can learn about how to build something like this?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Paul Rubin@21:1/5 to luserdroog on Mon Jun 7 18:03:34 2021
    luserdroog <mijoryx@yahoo.com> writes:
    What is the common internal representation for algebraic types?

    For a product type you just have the components, and for a sum type you
    have a cell saying which alternative has been chosen, plus that value.

    The standard (though now old) book about FPL implementation is Simon
    Peyton Jones "The Implementation of Functional Programming Languages":

    https://www.microsoft.com/en-us/research/publication/the-implementation-of-functional-programming-languages/

    R. W. Harper's PFPL is newer and (despite having "Practical" in its
    title) somewhat more theoretical:

    https://www.cs.cmu.edu/~rwh/pfpl/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Luca Saiu@21:1/5 to luserdroog on Sun Jun 20 21:25:16 2021
    On 2021-06-07 at 15:53 -0700, luserdroog wrote:

    What is the common internal representation for algebraic types?
    Say I wanted to add a complex typing system to a simple homebrew
    Lisp interpreter, how would I go about doing it? I suppose the contrite answer is "lists", but does anyone have any more detail or references
    to where I can learn about how to build something like this?

    About the general problem of data representation (assuming you care
    about efficiency) I like this old survey, which is still the best I
    know:
    David Gudeman.
    ``Representing Type Information in Dynamically Typed Languages'',
    technical report,
    Department of Computer Science, The University of Arizona, 1993.

    Nowadays some solutions are more convenient than others; for example
    tagging in the low bits is compatible with conservative-pointer-finding
    garbage collection, while tagging in the high bits is not. Some people
    like NaN-coding.

    Algebraic types, as Paul Rubin hints at, are sums of products. The
    general complex case will always require boxed values; but the simple
    case (equivalent to C enumerates) is important in practice and worth optimising: it is possible to arrange for a practically unbounded amount
    of distinct unique objects. In Lisp you should have already built the
    required machinery when implementing Booleans and the empty list, if it
    is a different object.

    Regards,

    --
    Luca Saiu http://ageinghacker.net
    I support everyone's freedom of mocking any opinion or belief, no
    matter how deeply held, with open disrespect and the same unrelented
    enthusiasm of a toddler who has just learned the word "poo".

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From luserdroog@21:1/5 to Luca Saiu on Sat Jun 26 16:45:53 2021
    On Sunday, June 20, 2021 at 2:25:18 PM UTC-5, Luca Saiu wrote:
    On 2021-06-07 at 15:53 -0700, luserdroog wrote:

    What is the common internal representation for algebraic types?
    Say I wanted to add a complex typing system to a simple homebrew
    Lisp interpreter, how would I go about doing it? I suppose the contrite answer is "lists", but does anyone have any more detail or references
    to where I can learn about how to build something like this?
    About the general problem of data representation (assuming you care
    about efficiency) I like this old survey, which is still the best I
    know:
    David Gudeman.
    ``Representing Type Information in Dynamically Typed Languages'',
    technical report,
    Department of Computer Science, The University of Arizona, 1993.

    Nowadays some solutions are more convenient than others; for example
    tagging in the low bits is compatible with conservative-pointer-finding garbage collection, while tagging in the high bits is not. Some people
    like NaN-coding.

    Algebraic types, as Paul Rubin hints at, are sums of products. The
    general complex case will always require boxed values; but the simple
    case (equivalent to C enumerates) is important in practice and worth optimising: it is possible to arrange for a practically unbounded amount
    of distinct unique objects. In Lisp you should have already built the required machinery when implementing Booleans and the empty list, if it
    is a different object.

    Regards,


    Thanks a bunch! Paul Rubin's links were closer to what I was asking for,
    but this paper is what I've always wanted but didn't know how to ask about! I've stumbled across, read about, and reinvented many of those techniques
    but without any of the careful analysis of performance costs.

    I've used large wrappers in my PostScript interpreter (trying to take care
    that they're not /too/ large), tagging the low bits in my Lisp interpreter,
    and tagging the high bits in my APL interpreter. But my goals have always
    been just to be cool or clever, with the metric usually being how compact
    the source code could be.

    Probably I spent too much time doing code golf.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)