• By Popular Demand

    From Quadibloc@21:1/5 to All on Wed Jan 24 16:14:46 2024
    A very common comment I have received from several people on my Concertina II ISA is that making the instruction stream block structured is a mistake.

    However, computers having a VLIW architecture do normally have a block structured instruction scheme, with the block being the very long instruction word. While I've included VLIW functionality in Concertina II and Concertina III, this has been to increase performance in some implementations, and, thus, is a relatively minor part of the ISA.

    What I've come up with now has the following characteristics:

    The normal instruction set no longer has block structure; it's been squeezed enough to go without that, and to provide variable-length instructions.

    But one can also choose to run in VLIW mode; then, the instruction stream
    is divided into blocks of eight 32-bit instructions, with one block header
    to indicate instruction predication.

    So block structure is only present where it belongs. VLIW code can't be distinguished from normal code by using the block header because
    instructions in normal code can cross block boundaries, so the second
    half of a 32-bit instruction could look like the start of a block header.
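
    To make the boundary-crossing point concrete, here is a minimal sketch. It is not taken from the Concertina documentation: the 32-byte block size (eight 32-bit slots) and the function name `straddlers` are assumptions for illustration only. It lays out normal-mode instructions of 2 or 4 bytes back to back and reports which ones cross a block boundary.

```python
def straddlers(lengths, block_bytes=32):
    """Return (index, byte_offset) for each instruction that crosses a block boundary.

    lengths     -- instruction lengths in bytes (2 or 4 in this sketch)
    block_bytes -- assumed block size: eight 32-bit slots = 32 bytes
    """
    crossing = []
    offset = 0
    for index, length in enumerate(lengths):
        first_block = offset // block_bytes
        last_block = (offset + length - 1) // block_bytes
        if first_block != last_block:
            crossing.append((index, offset))
        offset += length
    return crossing


# One 16-bit instruction followed by eight 32-bit instructions.
mixed = [2] + [4] * 8
# Eight 32-bit instructions that happen to stay aligned.
aligned = [4] * 8

print(straddlers(mixed))
print(straddlers(aligned))
```

    In the mixed stream the last instruction starts two bytes before the boundary, so its second half sits exactly where a VLIW block header would start. That is why the mode has to be known some other way rather than inferred from the bits.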

    Is this worthwhile, I wonder...

    John Savard

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to All on Wed Jan 24 20:05:23 2024
    Block structure is only applicable to 1-width of execution
    and fails for all other widths.....

    So the question becomes:: is your architecture designed for exactly
    one width of execution ???

  • From Quadibloc@21:1/5 to All on Thu Jan 25 01:47:07 2024
    On Wed, 24 Jan 2024 20:05:23 +0000, MitchAlsup1 wrote:

    > Block structure is only applicable to 1-width of execution
    > and fails for all other widths.....
    >
    > So the question becomes:: is your architecture designed for exactly
    > one width of execution ???

    Well, the VLIW mode is designed for eight-wide execution. But it
    can also work well with four-wide or two-wide, I would think, since
    it could still specify more efficient execution for those.

    But this new design is about _abandoning_ block structure *except*
    for VLIW programs. No more pseudo-immediates. 16-bit instructions
    no longer come in pairs.

    And here it is now:

    http://www.quadibloc.com/arch/ct20int.htm

    So basically I have finally taken your advice to dump block structure,
    except that Concertina IV still offers VLIW in addition to CISC and
    RISC; but now, VLIW is separate so the CISC/RISC instruction set is
    no longer disfigured by block structure.

    John Savard

  • From Quadibloc@21:1/5 to Quadibloc on Thu Jan 25 08:22:08 2024
    On Thu, 25 Jan 2024 01:47:07 +0000, Quadibloc wrote:

    > On Wed, 24 Jan 2024 20:05:23 +0000, MitchAlsup1 wrote:
    >
    >> Block structure is only applicable to 1-width of execution
    >> and fails for all other widths.....
    >>
    >> So the question becomes:: is your architecture designed for exactly
    >> one width of execution ???
    >
    > Well, the VLIW mode is designed for eight-wide execution. But it
    > can also work well with four-wide or two-wide, I would think, since
    > it could still specify more efficient execution for those.

    With Concertina II in its various incarnations, if a header left
    seven instructions in a block, I provided _six_ break bits with
    them, as there was always a break before the first one because the
    instructions needed to be fetched.

    With Concertina IV, on the other hand, the break bit happens to be
    the first bit of every instruction. So, while the block length of
    eight instructions controls the format of the header that provides
    predication, there actually would be nothing stopping an implementation
    from treating that as simply a notational convention for predication...
    and fetching and executing twelve instructions at a time.

    Although presumably the compiler would take the issue width of the
    target machine into account. So Concertina IV isn't necessarily
    block structured in a way that limits the issue widths it can
    work with, although that's just a happy accident, not something I
    intended. And, of course, memory is simpler if powers of two are
    fetched and executed.
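
    A minimal sketch of how a per-instruction break bit decouples issue grouping from the eight-instruction block. The details are assumptions, not the actual Concertina IV encoding: "first bit" is taken to mean the most significant bit of a 32-bit word, a set bit is taken to mean "a break comes before this instruction", and the name `issue_groups` is hypothetical.

```python
BREAK_BIT = 1 << 31  # assumption: "first bit" = most significant bit of a 32-bit word


def issue_groups(instructions, issue_width):
    """Split 32-bit instruction words into groups that may issue together.

    Assumption: a set break bit means "start a new group before this
    instruction". A group is also closed once it holds issue_width words.
    """
    groups = []
    current = []
    for word in instructions:
        starts_new_group = bool(word & BREAK_BIT)
        if current and (starts_new_group or len(current) == issue_width):
            groups.append(current)
            current = []
        current.append(word)
    if current:
        groups.append(current)
    return groups


# Twelve made-up instruction words; only the first and the tenth carry a break bit.
stream = [BREAK_BIT | 0] + list(range(1, 9)) + [BREAK_BIT | 9, 10, 11]

for width in (4, 8, 12):
    print(width, [len(group) for group in issue_groups(stream, width)])
```

    Nothing in the grouping loop refers to the block length of eight, so the same stream can be cut up for a four-wide, eight-wide, or twelve-wide machine. In this sketch the block would matter only for locating the predication header.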

    Remember: now the only header is for predication. No longer is
    decoding profoundly changed by some possible header values.

    John Savard

  • From John Dallman@21:1/5 to Quadibloc on Sun Feb 25 20:41:00 2024
    In article <uord1m$1s1mn$1@dont-email.me>, quadibloc@servername.invalid (Quadibloc) wrote:

    > But one can also choose to run in VLIW mode; then, the instruction
    > stream is divided into blocks of eight 32-bit instructions, with
    > one block header to indicate instruction predication.

    How does one enter or leave VLIW mode? Is the mode part of an execution context? Can code running in VLIW mode call non-VLIW code, and vice-versa?


    John
