• In defense of phi nodes

    From Elijah Stone@21:1/5 to All on Mon Apr 26 16:35:25 2021
    A conflict, in compiler IL design:

    1. We would like to use SSA, to enable fancy optimizations.
    2. We would like to allow the value of a register to depend on a branch.

    The latter constraint necessitates the existence of multiple assignments,
    while the former necessitates that each register be assigned to only once.
    How to resolve this?

    (Note: the syntax a { b... } [c] creates a block named a, whose body
    comprises the instructions b, and which is terminated by the instruction
    c.)


    Solution the first: memory locations.

    Consider this (somewhat contrived) implementation of the logical NOT:

    first {
    result_location ← alloca 1
    } [branch input=0, was_zero, was_one]
    was_zero {
    store result_location, 1
    } [jmp end]
    was_one {
    store result_location, 0
    } [jmp end]
    end {
    result ← load result_location
    } [ret result]

    This is the most horrible, naive possible implementation. An IR using SSA
    but requiring this implementation would probably be slower than one which simply allowed mutable registers. Not much more to say here.


    Solution the second: optionally mutable registers.

    I don't know of any compiler which actually does this, and it doesn't seem particularly worthwhile, but it is something I thought of. (Though I do
    think one of gcc and llvm were experimenting with mutable registers
    recently, maybe?)

    We could allow some registers to be mutable, and some to be declared
    statically immutable, as in:

    first {
    result ← mutable byte
    } [branch input=0, was_zero, was_one]
    was_zero {
    result ← 1
    } [jmp end]
    was_one {
    result ← 0
    } [jmp end]
    end {} [ret result]

    This solves _some_ of the problems with the first implementation (and
    indeed would likely produce optimal code for this example), but the
    possibility of a register's being mutable still inhibits optimizations.
    On to the solutions people actually use:


    Solution the third: block parameters.

    Here, we allow blocks to pass each other arguments, as when calling
    functions. The values of these arguments can be different in different invocations of the same block, but registers still cannot be mutated.
    Example:

    first {} [branch input=0, was_zero, was_one]
    was_zero {
    tmp0 ← 1
    } [jmp end, tmp0]
    was_one {
    tmp1 ← 0
    } [jmp end, tmp1]
    end(value) {} [ret value]


    Solution the fourth: phi nodes.

    A 'phi' is a special instruction whose value is that of _whichever of its operands was assigned to most recently_. An example is worth a thousand abstract pontifications, so:

    first {} [branch input=0, was_zero, was_one]
    was_zero {
    result0 ← 1
    } [jmp end]
    was_one {
    result1 ← 0
    } [jmp end]
    end {
    result ← phi result0, result1
    } [ret result]

    If the input was 0, then control flow goes through was_zero and not
    was_one, so result0 will have been assigned to recently (and result1 not
    at all), so the final result will be result0--that is, 1.


    Block parameters and phi nodes are equivalent; it's always possible to losslessly translate between the two. Phi nodes were discovered first, in
    the 1980s, and remain in broader use, while block parameters are a newer
    (and somewhat more fashionable) innovation.

    The common arguments in favour of block parameters are that they are
    easier to reason about (I didn't need nearly so much explanation for the
    block parameter example than for the phi node example!) and that they
    improve locality.

    Block parameters do _increase_ locality, but I don't think they improve it
    so much as they obscure essential nonlocality. Both phi nodes and block parameters express a relationship between multiple registers. Phi nodes
    place the expression of that relationship at the genesis of the
    'destination' register, while block parameters scatter the expression of relationship, not tying it clearly to either the 'destination' register or
    the 'source' registers.

    And this ties more ultimately into what phi nodes and block parameters
    enable, which is _principled mutability_. Pure static _single_ assignment
    is not sufficient to express all algorithms efficiently, so we need a
    limited way to assign to a register multiple times. (Perhaps we should
    rather call it Statically Enumerated Assignment?) In both schemes, there
    is a special class of register--either block parameters, or those
    registers whose geneses are phi nodes--which are assigned to multiple
    times. Phi nodes provide a focal point which enumerating those
    assignments (in the form of proxy registers).

    Thoughts?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)