• The Edison language and "personal software system" (1982)

    From Lasse =?iso-8859-1?q?Hiller=F8e?= P@21:1/5 to All on Mon Nov 15 12:37:38 2021
    One book that has inspired me more than any other books on the topic of programming languages, is _Programming A Personal Computer_ by Per Brinch- Hansen. This book is a gem. In 400 pages, Brinch-Hansen
    - documents a complete OS for the PDP-11 (also ported to the IBM PC).
    - documents the language Edison, in which the OS is written.
    - discusses all phases of the compiler at length, with full source code.
    - provides an extensive and very convincing argument for his style of
    language design, and the particular choices taken in designing Edison.
    - provides a compelling guide to _writing_ such a documentation, with
    much good advice from Peter Naur.

    I was very glad when I recently found out that the book was available as
    PDF here:
    http://pascal.hansotten.com/per-brinch-hansen/
    (The website also has a lot of other interesting information.)

    I read excerpts of his earlier work on Concurrent Pascal for my second
    year (fourth semester) course on concurrent programming, in 1988. And
    years before that, his book _On Pascal Compilers_ was the second book I
    read about writing compilers (the first was Wirth's _Algorithms+Data Structures=Programs_, which I read when in 9. grade in 1983-84.)

    Edison as a language is strongly guided by a number of principles. One is
    to design a language by taking a good existing one, and remove as much as possible, then add what you need extra. Edison is based on Pascal, like
    so many other languages, but it differs quite a lot (from original
    Pascal).
    - It has modules.
    - It uses bracketed control syntax (Algol68/Modula-2/Ada/...) instead of
    begin ... end compound statements. (This fixes Pascal's "dangling else" problem.)
    - It has the basic scalar types of Pascal: bool(ean), char, int(eger) and Enumerated (being a systems language, Reals have been omitted),
    composable using sets, records and arrays.
    - It has statements for concurrent execution and for critical regions.
    - It simplifies type equivalency (a confusing part of Pascal) by
    excluding integer ranges and requiring all types to be named, resulting
    in types being equivalent when having the same name and scope.
    - Pointers are deemed unnecessary and omitted! There is no dynamic memory allocation!
    - Same goes for unions (he achieves the same with an enumerated field in
    a record indicating the actual type, and typecasting record variables),
    case statement and for loops.
    - Functions can return composite types.

    Fun fact (fun pun intended): He argues: "In spite of these refinements, I
    would omit functions if I had to design another language", using the same motivation as functional purists, but with the radically opposite
    conclusion, keeping the assignment statement instead. (Procedure
    parameters are a vital part of the system, but by not having procedures
    as first class values that can be assigned or returned, the problem of
    upward funargs is avoided.)

    Using simple symbolic intermediate languages (ordinal values of
    enumerated types IIRC) between all the phases of compilation, a program compiles into "Edison code", which is then interpreted by a threaded interpreter in the kernel, a small piece of PDP-11 code written in Alva,
    an alternative assembler notation for PDP-11 instructions.

    Edison is definitely minimalistic and probably not very efficient, and
    the "operating system" is very primitive, including just a very simple
    editor and some tools and utilities for formatting, prettyprinting, and "housekeeping" like disk formatting, backup, file copy, disk catalog etc.
    Yet it _is_ a complete system, what Brinch-Hansen refers to as a
    "personal software system", which can be studied and fully understood in
    all details by one person in a reasonable amount of time, and also ported
    to another machine type very quickly.

    We talked about declaration endianness in the PLZ thread, curiously, Brinch-Hansen makes an argument about type declarations that Pascal needs
    to read two symbols (the type name being declared, and an equals symbol)
    before knowing what kind of type is being declared, and opts for a left-
    ended notation, not unlike how variables are declared in non-Pascal Algol descendants:
    enum fruit (apple, banana, canteloupe);
    set fruitsalad (fruit)
    array line [1:80] (char)

    Why he insists on the completely redundant parentheses around the element
    type of an array I donẗ know. It seems a "glitch" somehow, or perhaps is
    is to emphasise the notion of containment and make it similar to the
    other forms. The reason he keeps Pascal variable declarations using var
    is probably because it makes LL(1) recursive descent parsing simpler. The Edison compiler, which uses a separate pass for lexing, obviously cannot
    have any interaction between the lexer and the symbol table. He _does_
    cheat once though, in the syntax checking phase 2, where he uses the
    symbol table to discern between a name (identifier) starting a procedure
    call statement or a variable assignment statement... nobody's perfect. :-)

    All in all an astounding accomplishment, with well argued design choices,
    and a very good read. His style is very dry, but the clarity and
    conciseness makes it easy reading.

    /Lasse

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)