• J1 Barrel Processor

    From Christopher Lozinski@21:1/5 to All on Fri Jul 7 22:16:01 2023
    The J1 is an amazing chip. Hard to improve on it, but it does have two weaknesses, both of which could benefit from a sharing economy. One is the shift register, and the other is the lack of math functions.

    With the J1, you get a choice between a small one bit shift register, or a large barrel shift register which more than doubles the size of the CPU from 70 to 150 LUTs. I would think that 8 J1 Forth cores could share a barrel shift register. That would
    increase the overall chip size, not by 100%, but by about 12.5 percent, maybe twice that if you add in control logic and networking.

    Of course two cores may want the barrel shift register at the same time, so you would need a pause option and some control logic.

    Here is the line of code which updates the state on the J1.
    { pc, dsp, st0, rsp } <= { pcN, dspN, st0N, rspN };

    The following code change would solve the pause problem.
    case (pause)
      1'b0: // No pause: update the state
        { pc, dsp, st0, rsp } <= { pcN, dspN, st0N, rspN };
      1'b1: // Pause: hold the current state
        { pc, dsp, st0, rsp } <= { pc, dsp, st0, rsp };
    endcase

    The simplest way to do the control logic is with a circulating one. On every clock cycle one of the cores gets access to the barrel shift register. On average a core would have to wait 3.5 clock cycles for access. It is interesting to note that the J1
    CPU runs at 80 Hz with barely any math functions. The Microcore has all the functions, and runs at 20 hz. So running at 80 Hz, and occasionally waiting for 4 cycles, is not that bad. An 80/20 MHz Barrel CPU. Of course reality will be worse than
    that. How does clock speed scale with number of cores?
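
    As a rough sketch, the circulating one is just a rotating one-hot grant register, and each core pauses when it wants the shifter but does not hold the grant. This is only an illustration; the signal names (grant, pause, core_wants_shift) are invented, not from the J1 source.

    // 8 cores sharing one barrel shifter via a rotating one-hot grant.
    // Assumes clk and an 8-bit core_wants_shift request vector exist.
    reg [7:0] grant = 8'b0000_0001;
    always @(posedge clk)
      grant <= { grant[6:0], grant[7] };   // rotate the single 1 each cycle

    // Core i stalls when it requests the shifter without holding the grant.
    wire [7:0] pause = core_wants_shift & ~grant;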

    And of course the J1's other problem is that it has almost no math functions. Worse yet, there is no space for additional opcodes. In contrast, the microCore, focused on real-time control, has some 82 instructions. https://github.com/microCore-VHDL/microCore/blob/master/documents/uCore_instructions.pdf

    So I could imagine a many-core J1 barrel processor, with two extra instruction bits, allowing for a bunch of shared math functions. Then on every clock cycle, every core gets exclusive access to about 8 math functions. With 8 such groupings there
    are 64 additional math functions. On every clock cycle, each core would also have access to its own 16 dedicated ALU functions. The larger functions could be shared. The more frequently used instructions would be in the dedicated ALUs. Popular
    shared functions, such as multiply, could even be available twice. Here is the study of instruction frequency.
    https://users.ece.cmu.edu/~koopman/stack_computers/sec6_3.html
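
    A hedged sketch of how the two extra bits might decode (field positions and names are invented for illustration; this is not an actual J1 encoding): widening from 16 to 18 bits, the top two bits can flag a shared-math instruction, leaving six bits to address the 64 shared functions.

    // Assumes an 18-bit instruction word 'insn'; all names are illustrative.
    wire       is_shared = (insn[17:16] == 2'b01);  // shared-math space
    wire [5:0] shared_fn = insn[5:0];               // one of 64 shared functions
    wire [2:0] group     = shared_fn[5:3];          // which of the 8 groups
    wire [2:0] func      = shared_fn[2:0];          // function within the group
    // A core would pause until the rotating grant phase matches 'group'.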

    It is interesting to note that the Parallax Propeller does something similar. Every 8th cycle, the 8 cores get access to the shared core memory and the shared CORDIC functions.

    I am also mindful that some math functions are faster, and some are slower. Or maybe the math functions are all located far from some of the cores, so by default they should get an extra clock cycle for the signal to propagate. With the pause option,
    the slower math functions could be given 2 clock cycles to execute, or even three. The multi clock cycle instructions could be pipelined.

    So what do you all think? Is this a good idea? Did I learn anything this semester? Would this be a good master’s project? Does anyone have a need for a cpu with 8 forth cores? Over on the AI and robotics discussion group, one person is
    controlling 14 motors, so he could use a 16 core CPU. Then there would be less FPGA development, and more software development, which is presumably faster. And another person is also controlling a lot of things. Details omitted.

    As for me, I finished the classes and exams for my first semester of graduate school in Digital Circuit Design. As a long-time software developer, when they first taught us Verilog synthesis, my reaction was that it was completely unintuitive. By the
    end of the semester, my Verilog term project, a frequency and duty-cycle meter, got a perfect score. Progress.

    My master's thesis to build a Forth CPU was approved, but there is no way I could design something better than the J1. I remember that I used to think that it was a nutty CPU: why was he mixing jumps and addresses in one instruction? Now I have the
    skills to understand and appreciate it. So there is no point building a single Forth CPU, but a many-core J1 barrel processor would be a most reasonable project. I could even reuse some of the 60+ VHDL math functions from the MicroCore.

    Comments? I am still new to all of this stuff. I have not yet built a chip as complex as this proposed one, so your expertise would be most appreciated.
    It is very hard to find people who have any idea what I am talking about.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lorem Ipsum@21:1/5 to Christopher Lozinski on Fri Jul 7 23:41:45 2023
    On Saturday, July 8, 2023 at 1:16:04 AM UTC-4, Christopher Lozinski wrote:
    The J1 is an amazing chip. Hard to improve on it, but it does have two weaknesses, both of which could benefit from a sharing economy. One is the shift register, and the other is the lack of math functions.

    Yes, you are new to the game. What you seem to not understand is that CPUs are not things unto themselves. They exist only in a world of applications. Whether or not a feature is useful on a given CPU depends on the application. So, what is the
    application setting the context in which you are judging the J1?


    With the J1, you get a choice between a small one bit shift register,

    Yes, a 1 bit shift register is very small. I think you mean a shift instruction that only shifts by one bit. Is there even a register for this? I don't think so.


    or a large barrel shift register which more than doubles the size of the CPU from 70 to 150 LUTs.

    If you need a barrel shifter, you have to pay the price. I recall a realization once, that a multiplier was hugely increased in size by the size of the required barrel shifter. The barrel shifter is basically a mux, and muxes are very real-estate
    intensive, just like the actual multiplier.


    I would think that 8 J1 Forth cores could share a barrel shift register. That would increase the overall chip size, not by 100%, but by about 12.5 percent, maybe twice that if you add in control logic and networking.

    So you have made a decision to have 8 CPUs? What's the application? I wonder if you have read the history of the J1? It was invented to be implemented in an FPGA as an alternative to the Xilinx microblaze soft core, or maybe the nanoblaze, I can't
    recall the names. The smaller one was hard coded to LUTs and registers, while the larger one was inferred HDL. The J1 was faster, used smaller code and did some operations more efficiently. It allowed more functions to be implemented with better
    encoding of images, if I recall. The point is, it was custom designed for the task at hand.

    Subsequent iterations are better for other tasks, but not "better" in any way for the original task. The "perfect" design is one that works and does the required job. There's no way to improve it if it meets all the requirements.


    Of course two cores may want the barrel shift register at the same time, so you would need a pause option and some control logic.

    Yes, the problems of optimization, complexity. A PhD at one of my jobs gave a talk about not optimizing until you know you have a requirement that needs optimization. Why? Because optimization reduces flexibility, increases design time, opens the door
    for more bugs, and increases the cost, both of the current design but also future improvements.

    This can be summarized as, "If it ain't broke, don't fix it".


    Here is the line of code which updates the state on the J1.
    { pc, dsp, st0, rsp } <= { pcN, dspN, st0N, rspN };

    The following code change would solve the pause problem.
    case (pause)
      1'b0: // No pause: update the state
        { pc, dsp, st0, rsp } <= { pcN, dspN, st0N, rspN };
      1'b1: // Pause: hold the current state
        { pc, dsp, st0, rsp } <= { pc, dsp, st0, rsp };
    endcase

    The simplest way to do the control logic is with a circulating one. On every clock cycle one of the cores gets access to the barrel shift register. On average a core would have to wait 3.5 clock cycles for access. It is interesting to note that the J1
    CPU runs at 80 Hz with barely any math functions. The Microcore has all the functions, and runs at 20 hz. So running at 80 Hz, and occasionally waiting for 4 cycles, is not that bad. An 80/20 MHz Barrel CPU. Of course reality will be worse than that.
    How does clock speed scale with number of cores?

    I assume you mean 20 MHz? That's pretty durn slow for a CPU in an FPGA, even one in a barrel processor design.

    Before deciding if an instruction is needed, it is typical to analyze the code you will be running on it. Have you done that? How do you know this instruction needs to be sped up?

    In a barrel processor, more virtual cores are created by adding pipeline stages. But instead of using them to speed up a single processor, they are running different processes/threads or whatever you wish to call them. So there's no need for look-ahead
    or pipeline stalls. Each processor gets its clock cycle to run the barrel shifter, but the shifter has to run at the full clock speed.

    If you can divide the stages of the CPU finer, and keep the delays balanced, you will see a proportional increase in speed. But at some point, you will have trouble dividing up the logic across the pipeline stages, and one stage will be slower than the
    others. That will cost you aggregate clock speed and so the speed of each process.
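
    A hedged sketch of that plumbing (all names invented, not from any actual design): each cycle belongs to one thread index, and per-thread state lives in small memories addressed by that index, which is exactly why no forwarding or stall logic is needed.

    // 8-slot barrel: tid advances every cycle; each virtual core's PC lives
    // in a small RAM (maps well to LUT RAM or block RAM). Assumes clk and a
    // next_pc computed by the execute logic.
    reg [2:0]  tid = 3'd0;          // which virtual core owns this cycle
    reg [12:0] pc_file [0:7];       // one program counter per thread

    always @(posedge clk) begin
      tid          <= tid + 3'd1;   // strict round robin
      pc_file[tid] <= next_pc;      // write back this thread's next PC
    end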


    And of course the J1's other problem is that it has almost no math functions. Worse yet, there is no space for additional opcodes. In contrast, the microCore, focused on real-time control, has some 82 instructions. https://github.com/microCore-VHDL/microCore/blob/master/documents/uCore_instructions.pdf

    I don't recall the details of the microcore, but it sounds like a lot more complexity. 82 instructions is a lot for a MISC design. Perhaps you are not familiar with the reason for looking at stack processors in FPGAs. It is because they can be made
    very simply, with small instructions. The 'M' stands for "minimal". I believe the GA144 processor has only 32 instructions, as does the b16. The F18A is the processor in the GA144, 144 of them.


    So I could imagine a many-core J1 barrel processor, with two extra instruction bits, allowing for a bunch of shared math functions.

    That's 18 bits now, getting to be even less MISC.


    Then on every clock cycle, every core gets exclusive access to about 8 math functions. With 8 such groupings there are 64 additional math functions. On every clock cycle, each core would also have access to its own 16 dedicated ALU functions. The
    larger functions could be shared. The more frequently used instructions would be in the dedicated ALUs. Popular shared functions, such as multiply, could even be available twice. Here is the study of instruction frequency.
    https://users.ece.cmu.edu/~koopman/stack_computers/sec6_3.html

    It is interesting to note that the Parallax Propeller does something similar. Every 8th cycle, the 8 cores get access to the shared core memory and the shared CORDIC functions.

    I am also mindful that some math functions are faster, and some are slower. Or maybe the math functions are all located far from some of the cores, so by default they should get an extra clock cycle for the signal to propagate. With the pause option,
    the slower math functions could be given 2 clock cycles to execute, or even three. The multi clock cycle instructions could be pipelined.

    So what do you all think? Is this a good idea?

    When you tell me what your design goal is, I will tell you if any of this is a good idea.


    Did I learn anything this semester? Would this be a good master’s project?

    I wanted to design a simple, yet fast graphics processor from FAST TTL for a master's project. But at U of Md, you don't get to pick a project and do it. You have to work with a professor, which means you work on something they like. I checked the
    areas the professors were working in and decided to take an exam instead.


    Does anyone have a need for a cpu with 8 forth cores?

    I've looked at this before. I have a CPU design that is much more streamlined. Designed 20 years ago, it focused on a small instruction space. I didn't really have a need for multi-cores, so I never pursued it. Being a single clock cycle design (
    including interrupt response time), it would have been easy to pipeline. I probably would have gone with N=4 on the first iteration, to ease the stage balancing issue. At N=8, I think it would be significantly more challenging. At N=16, I'm not sure
    it would run at a much higher processor speed than N=8, but there's no way to know except to try it.


    Over on the AI and robotics discussion group, one person is controlling 14 motors, so he could use a 16 core CPU. Then there would be less FPGA development, and more software development, which is presumably faster. And another person is also
    controlling a lot of things. Details omitted.

    Why is a barrel processor better than multiple single-core processors? If you make them small enough, and really fast, they can do math from simple operations. For example, the F18A CPU is 18 bits, but can do multiple precision math. The instructions
    for multiply are 1 bit at a time, but it runs at 700 MIPS, so multiplies are not so slow and the chip has no barrel shifter. You are not going to get 700 MIPS from an FPGA, but you can exceed 100 MIPS, maybe 200. Have you read the chart of CPU
    comparisons? I can never remember the guy's name. He has tabulated around 100 or maybe 200 soft cores with performance specs, both in MIPS and in MIPS/LUT and of course LUTs. Very useful to see what works and what doesn't. Someone here will remember
    his name, or he may read this. If you post in the FPGA group, I bet he will see it and respond, or you can search there. comp.arch.fpga
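
    For flavor, here is a hedged sketch of such a 1-bit multiply step, done shift-and-add style in the spirit of the F18A's multiply-step instruction (the real one differs in detail; names and widths are invented, and carry handling is omitted). Repeating the step 18 times multiplies two 18-bit operands.

    // Assumes clk, an 18-bit multiplicand mcand, and a mul_step strobe.
    reg  [17:0] acc, mlier;        // accumulator and multiplier
    wire [35:0] stepped = { (mlier[0] ? acc + mcand : acc), mlier };

    always @(posedge clk)
      if (mul_step)
        { acc, mlier } <= stepped >> 1;   // add (conditionally), then shift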


    As for me, I finished the classes and exams for my first semester of graduate school in Digital Circuit Design. As a long-time software developer, when they first taught us Verilog synthesis, my reaction was that it was completely unintuitive. By the
    end of the semester, my Verilog term project, a frequency and duty-cycle meter, got a perfect score. Progress.

    HDL is not much like most languages. But that's because concurrency is inherent to HDL, while it is different in other languages. You have to spawn tasks and such. I used Occam for a bit, but I don't recall how it works really. I just know when I
    code HDL, I'm describing what I want the hardware to do, in a way that specifies what the hardware is. Since I came from the hardware side, that's easy for me. I helped a software guy do an HDL project once. He really didn't want to learn my way. He
    had a quick and dirty test project to do (I think it was literally "hello, world") and with lots of LUTs and RAM, he could easily do it with a software approach. He just needed someone to help him over the bumps in the road learning that way. I was
    impressed that he got through it with relatively little trouble. That taught me that my way (describing the hardware I wanted) is not the only way to use HDL.


    My master's thesis to build a Forth CPU was approved, but there is no way I could design something better than the J1.

    Why? "better" is an issue of meeting requirements. Until you have requirements, there is no "better". J1 uses a 16 bit instruction. As I've said, most MISC processors use 8 or even 5 bit instructions. They often pack 5 bit instructions into 16 bit
    words, so they can be read 3 at a time from an external memory. My designs work with internal memory only, but I still like to keep it to 8 or 9 bits, depending on the particular FPGA. Some have 8 bit wide memories, others 9 bits (or multiples).

    It's not a requirement to do everything in single instructions. My CPU tries to optimize the use of memory space, so the instructions are designed to optimally use that space. The smallest instruction is a one-bit opcode and 7/8 bits of data, LOAD
    IMMEDIATE (like the prefix instruction in Transputers). The first consecutive use loads the immediate data onto the return stack, with sign extension. Subsequent uses shift the upper bits and load the immediate data into the lower 7/8 bits.
    Notice this imposes no requirements on the size of the registers! You can use the same instructions on an 8 bit, 12 bit, 16 bit, 18 bit, 24 bit, or 32 bit data path machine. When the "receiver" instruction is used, it either uses this data as is, or adds
    a few more bits to extend the data more. I think my last iteration added 4/5 bits for the jump/call instructions. They can be used without the prefix for short jumps or calls.
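
    A minimal sketch of how such a cascaded literal might look, assuming a 16-bit data path and an 8-bit instruction with a 1-bit opcode and 7 data bits (all names are invented; this is not the actual design, and the stack push itself is omitted):

    // Assumes clk and an 8-bit instruction word 'insn'.
    reg        lit_pending;        // was the previous instruction a literal?
    reg [15:0] tos;                // destination register
    wire       is_lit = insn[7];
    wire [6:0] imm    = insn[6:0];

    always @(posedge clk) begin
      if (is_lit) begin
        if (lit_pending)
          tos <= { tos[8:0], imm };       // shift up, bring in 7 more low bits
        else
          tos <= { {9{imm[6]}}, imm };    // first use: sign-extend
        lit_pending <= 1'b1;
      end else
        lit_pending <= 1'b0;
    end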


    I remember that I used to think that it was a nutty CPU: why was he mixing jumps and addresses in one instruction? Now I have the skills to understand and appreciate it. So there is no point building a single Forth CPU, but a many-core J1 barrel
    processor would be a most reasonable project. I could even reuse some of the 60+ VHDL math functions from the MicroCore.

    Interesting logic. Since the J1 is "perfect", you have given up on designing a new processor. So, instead, you want to multiplex it. What have you learned about pipelining? A barrel processor is essentially a pipelined processor, with different "
    plumbing" for the data items like stacks and registers. Be aware that a block RAM is ideal for multiplexing these.


    Comments? I am still new to all of this stuff. I have not yet built a chip as complex as this proposed one, so your expertise would be most appreciated.
    It is very hard to find people who have any idea what I am talking about.

    Try comp.arch.fpga. There are a few CPU oriented guys, but not so much stack processors. Most people think in terms of adding complexity, not reducing it.

    --

    Rick C.

    - Get 1,000 miles of free Supercharging
    - Tesla referral code - https://ts.la/richard11209

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Matthias Koch@21:1/5 to All on Sat Jul 8 16:58:54 2023
    Hi Christopher,

    a barrel processor is a very special case, usually driven by special circumstances. On the math routines, I maintain Mecrisp-Ice, which started as a fork of James Bowman's project, and now comes with interrupts, more opcodes, and a nice suite of math
    routines like trigs and FFT. Have a look before you dive deeper!

    https://mecrisp.sourceforge.net/

    Also consider a thesis involving RISC-V:

    https://github.com/BrunoLevy/learn-fpga/tree/master/FemtoRV/RTL/PROCESSOR

    Matthias

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Hugh Aguilar@21:1/5 to Christopher Lozinski on Tue Jul 11 21:10:15 2023
    On Friday, July 7, 2023 at 10:16:04 PM UTC-7, Christopher Lozinski wrote:
    My master's thesis to build a Forth CPU was approved, but there is no way I could design something better than the J1. I remember that I used to think that it was a nutty CPU: why was he mixing jumps and addresses in one instruction?
    Now I have the skills to understand and appreciate it. So there is no point building
    a single Forth CPU, but a many-core J1 barrel processor would be a most reasonable project. I could even reuse some of the 60+ VHDL math functions from the MicroCore.

    The world is full of maintenance programmers who have realized that they
    are too incompetent to ever write a program of their own, so their new plan
    is to find source-code written by a real programmer and make some modification to it, then claim that they are smarter than the original programmer.
    I don't have any respect for maintenance programmers.
    AFAIK, nobody in the world has respect for maintenance programmers.

    Go back to your original plan of designing something of your own.
    It might have weaknesses (first efforts usually do have weaknesses).
    Trolls might even describe it as a pile of crap --- but it will be your
    pile of crap --- you can have something of your own to be proud of.

    It is better to be a wolf that hunts mice, than a coyote that eats carrion.

    As for your idea regarding the J1, if this were a good idea then
    James Bowman would have already done it.
    I don't think that you have anything to offer.

    If you really want to be a maintenance programmer, then go get
    a job at Testra. You can be a maintenance programmer of MFX
    that I wrote --- then claim to be 10* smarter than I am!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to Hugh Aguilar on Wed Jul 12 18:46:13 2023
    On 12/07/2023 2:10 pm, Hugh Aguilar wrote:

    It is better to be a wolf that hunts mice, than a coyote that eats carrion.

    Wolves may not take kindly to men with wolf dysphoria.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From albert@21:1/5 to hughaguilar96@gmail.com on Wed Jul 12 13:10:19 2023
    In article <9a308d48-48fc-456a-b6ac-e97d958a5fc4n@googlegroups.com>,
    Hugh Aguilar <hughaguilar96@gmail.com> wrote:
    <SNIP>
    The world is full of maintenance programmers who have realized that they
    are too incompetent to ever write a program of their own, so their new plan is to find source-code written by a real programmer and make some modification to it, then claim that they are smarter than the original programmer.
    I don't have any respect for maintenance programmers.
    AFAIK, nobody in the world has respect for maintenance programmers.

    On the contrary. If there is someone who has to maintain a program
    you have written, he has to vastly surpass your mental capacity.
    (That is not to say that is normally the case; it is the kind of person
    you are.)

    Groetjes.
    --
    Don't praise the day before the evening. One swallow doesn't make spring.
    You must not say "hey" before you have crossed the bridge. Don't sell the
    hide of the bear until you shot it. Better one bird in the hand than ten in
    the air. First gain is a cat spinning. - the Wise from Antrim -

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Hugh Aguilar@21:1/5 to Christopher Lozinski on Wed Jul 12 12:24:42 2023
    On Friday, July 7, 2023 at 10:16:04 PM UTC-7, Christopher Lozinski wrote:
    The J1 is an amazing chip. Hard to improve on it,
    but it does have two weaknesses...

    My master's thesis to build a Forth CPU was approved,
    but there is no way I could design something better than the J1.

    There are a lot of amazing processors, all of which have various
    weaknesses and strengths, and a focus on specific applications.
    Way back in 1994, Testra built their MiniForth processor on the
    Lattice isp1048 PLD that was for motion-control, specifically
    a laser etcher. I wrote MFX, the assembler/simulator and Forth
    cross-compiler. My assembler did out-of-order scheduling of the instructions.

    I have a processor design that you might be interested in.
    Whether it is "amazing" or not remains to be seen.
    This is for the iCE40HX8K FPGA, which I'm told is the least
    expensive FPGA on the market. Your J1 needs a bigger FPGA.
    My design has support for running a byte-code VM in external memory. Internally, I have 6KW of code and 2KW of data. A program could be written
    to run entirely in internal memory, and it would be very fast, but it would have to also be very small. The idea with the byte-code VM is that it
    supports a large program of 64KW code and 64KW or more of data,
    and it supports a RTOS. The byte-code program would be quite slow,
    but the compiler supports FAST functions that compile as machine-code
    in internal memory; judicious use of FAST can boost the speed a lot.
    Mostly though, the speed is boosted because the ISRs are compiled into machine-code in internal memory; with most programs ISRs are the
    speed-critical aspect of the software (Pareto Analysis, except that it is
    more like 95/5 rather than 80/20). Another advantage of supporting
    a byte-code VM is that there can be more than one byte-code VM
    available --- I'm doing one for Forth --- a byte-code VM for C should be provided eventually though because most programmers (in America)
    demand C and won't consider Forth at all.

    I have an assembler and simulator already written and I'm working
    on the Forth now. If you are interested in trying this as your first
    Verilog project, I can show you my design. It is intended to be simple,
    with all of the instructions executing in a single clock-cycle, so it
    should be reasonable as a first effort for you. Contact me at: hughaguilar96@gmail.com
    This offer to see my design only goes out to Christopher Lozinski.
    All of the comp.lang.forth crowd will be ignored because you are
    loyal to Stephen Pelc. I have security measures built in to the design
    to prevent Stephen Pelc from doing a "clean room implementation"
    of my processor the way that he did with the RTX-2000. It is important
    that Stephen Pelc and also Red China should not make money from
    my R&D --- they are intellectual-property thieves --- they deserve to fail.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Christopher Lozinski@21:1/5 to All on Thu Jul 13 10:35:00 2023
    Thank you everyone for your replies.
    I am most interested in learning more about your cpu designs.
    I think that every cpu I look at has some good ideas.
    I am actually doing a talk about Forth CPUs at FPGA World in Stockholm in September.
    I will probably repeat it a few other places, as it evolves.
    I will cover the J1 family, Mecrisp-Ice, MicroCore, EP16/24/32, HowerJ's forth-cpu, and any others you want to suggest. There was one other also mentioned on here recently, I forget the name. He had a very nice video. Or I can stay quiet about your
    design if you prefer.

    I still think that there is opportunity in working with large numbers of Forth cores. I think a grid of Forth cores for image processing would be most interesting. Images are the one computational problem I know of which is really in a plane, so a
    systolic array is most appropriate. Plus it makes for a good demo. I am also busy learning about CORDIC, quaternions, and image processing in general. I find all of this stuff quite fascinating.

    Plus I am still finishing up a ton of homework. The semester ended, but a bunch of us are still doing the assignments. So sorry for the slow reply.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Brian Fox@21:1/5 to Christopher Lozinski on Thu Jul 13 13:09:05 2023
    On Thursday, July 13, 2023 at 1:35:02 PM UTC-4, Christopher Lozinski wrote:
    Thank you everyone for your replies.
    I am most interested in learning more about your cpu designs.
    I think that every cpu I look at has some good ideas.
    I am actually doing a talk about Forth CPUs at FPGA World in Stockholm in September.
    I will probably repeat it a few other places, as it evolves.
    I will cover the J1 family, Mecrisp-Ice, MicroCore, EP16/24/32, HowerJ's forth-cpu, and any others you want to suggest. There was one other also mentioned on here recently, I forget the name. He had a very nice video. Or I can stay quiet about your
    design if you prefer.

    I still think that there is opportunity in working with large numbers of Forth cores. I think a grid of Forth cores for image processing would be most interesting. Images are the one computational problem I know of which is really in a plane, so a
    systolic array is most appropriate. Plus it makes for a good demo. I am also busy learning about CORDIC, quaternions, and image processing in general. I find all of this stuff quite fascinating.

    Plus I am still finishing up a ton of homework. The semester ended, but a bunch of us are still doing the assignments. So sorry for the slow reply.

    For some historical overview (i.e., this is not brand new stuff) you might want to mention the RTX2000/2010
    by Harris. It was (is?) used in spacecraft because it is a RAD-hardened CPU. Due to the way it was designed, with the 8 MHz clock it performed 10M Forth instructions per second.

    One amazing spec to me is: 1 cycle sub-routine call and return!

    https://en.wikipedia.org/wiki/RTX2010

    Looks like it is re-marketed by Intersil (year 2000): https://www.mouser.com/catalog/specsheets/intersil_fn3961.pdf

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lorem Ipsum@21:1/5 to Brian Fox on Thu Jul 13 13:32:08 2023
    On Thursday, July 13, 2023 at 4:09:07 PM UTC-4, Brian Fox wrote:
    On Thursday, July 13, 2023 at 1:35:02 PM UTC-4, Christopher Lozinski wrote:
    Thank you everyone for your replies.
    I am most interested in learning more about your cpu designs.
    I think that every cpu I look at has some good ideas.
    I am actually doing a talk about Forth CPUs at FPGA World in Stockholm in September.
    I will probably repeat it a few other places, as it evolves.
    I will cover the J1 family, Mecrisp-Ice, MicroCore, EP16/24/32, HowerJ's forth-cpu, and any others you want to suggest. There was one other also mentioned on here recently, I forget the name. He had a very nice video. Or I can stay quiet about your
    design if you prefer.

    I still think that there is opportunity in working with large numbers of Forth cores. I think a grid of Forth cores for image processing would be most interesting. Images are the one computational problem I know of which is really in a plane, so a
    systolic array is most appropriate. Plus it makes for a good demo. I am also busy learning about CORDIC, quaternions, and image processing in general. I find all of this stuff quite fascinating.

    Plus I am still finishing up a ton of homework. The semester ended, but a bunch of us are still doing the assignments. So sorry for the slow reply.
    For some historical overview (i.e., this is not brand new stuff) you might want to mention the RTX2000/2010
    by Harris. It was (is?) used in spacecraft because it is a RAD-hardened CPU. Due to the way it was designed, with the 8 MHz clock it performed 10M Forth instructions per second.

    One amazing spec to me is: 1 cycle sub-routine call and return!

    https://en.wikipedia.org/wiki/RTX2010

    Looks like it is re-marketed by Intersil (year 2000): https://www.mouser.com/catalog/specsheets/intersil_fn3961.pdf

    Doing things in one clock cycle is not hard. You simply have to make that a design goal. My processor design does everything in one clock cycle, including handling interrupts. I don't claim the processor is a "Forth" processor, because that is a
    meaningless statement. Any processor that is capable of emulating the Forth architecture will run Forth. The RTX is no more a Forth processor than a Pentium or an ARM would be.

    A subroutine requires just two things, save the next instruction address on the return stack and jump to the address specified, easy to do in one clock cycle. A return is even easier, just jump to the address on the return stack and pop the return stack.
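
    As a hedged sketch of why both fit in one cycle (names and widths are invented, not any particular design): the return-address write, the stack-pointer update, and the jump are independent register updates, so they can all happen on the same clock edge.

    // Assumes clk, decoded strobes is_call/is_ret, and a call_target address.
    reg [12:0] pc;
    reg [12:0] rstack [0:15];   // return stack
    reg [3:0]  rsp;             // return stack pointer

    always @(posedge clk) begin
      if (is_call) begin
        rstack[rsp + 4'd1] <= pc + 13'd1;   // save the return address...
        rsp <= rsp + 4'd1;
        pc  <= call_target;                 // ...and jump, same edge
      end else if (is_ret) begin
        pc  <= rstack[rsp];                 // jump to the saved address...
        rsp <= rsp - 4'd1;                  // ...and pop, same edge
      end else
        pc <= pc + 13'd1;
    end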


    --

    Rick C.

    + Get 1,000 miles of free Supercharging
    + Tesla referral code - https://ts.la/richard11209

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Hugh Aguilar@21:1/5 to Brian Fox on Thu Jul 13 13:38:28 2023
    On Thursday, July 13, 2023 at 1:09:07 PM UTC-7, Brian Fox wrote:
    On Thursday, July 13, 2023 at 1:35:02 PM UTC-4, Christopher Lozinski wrote:
    Thank you everyone for your replies.
    I am most interested in learning more about your cpu designs.
    I think that every cpu I look at has some good ideas.
    I am actually doing a talk about Forth CPUs at FPGA World in Stockholm in September.
    I will probably repeat it a few other places, as it evolves.
    I will cover the J1 family, Mecrisp-Ice, MicroCore, EP16/24/32, HowerJ's forth-cpu, and any others you want to suggest. There was one other also mentioned on here recently, I forget the name. He had a very nice video. Or I can stay quiet about your
    design if you prefer.

    I still think that there is opportunity in working with large numbers of Forth cores. I think a grid of Forth cores for image processing would be most interesting. Images are the one computational problem I know of which is really in a plane, so a
    systolic array is most appropriate. Plus it makes for a good demo. I am also busy learning about CORDIC, quaternions, and image processing in general. I find all of this stuff quite fascinating.

    Plus I am still finishing up a ton of homework. The semester ended, but a bunch of us are still doing the assignments. So sorry for the slow reply.
    For some historical overview (i.e., this is not brand new stuff) you might want to mention the RTX2000/2010
    by Harris. It was (is?) used in spacecraft because it is a RAD-hardened CPU. Due to the way it was designed, with the 8 MHz clock it performed 10M Forth instructions per second.

    One amazing spec to me is: 1 cycle sub-routine call and return!

    https://en.wikipedia.org/wiki/RTX2010

    Looks like it is re-marketed by Intersil (year 2000): https://www.mouser.com/catalog/specsheets/intersil_fn3961.pdf

    Way back in 1994 at Testra I suggested first using the MC6812 programmed in C for the motion-control board, mostly because the MC6812 has a fast multiply. Testra only uses Forth, not C, though, so that idea died quickly.
    I then suggested the RTX-2000, which is a "Forth engine." This idea was also nixed
    because the RTX-2000 was too low performance and too high priced.
    The RTX-2000 is about the same performance as an MC68000, also at 8 MHz.
    They ended up developing the MiniForth, which ran at either 40 MHz or 80 MHz and executed up to five instructions in a single clock cycle.
    The 40 MHz version out-performed the competition's MC68000 board
    programmed in C and cost less --- this is the only case that I'm aware of in which Forth out-performed C and cost less --- everything from Forth Inc. or MPE is under-performing and over-priced.

    The RTX-2000 was a bad choice in 1994, and it is still a bad choice today.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Hugh Aguilar@21:1/5 to Lorem Ipsum on Thu Jul 13 13:55:37 2023
    On Thursday, July 13, 2023 at 1:32:10 PM UTC-7, Lorem Ipsum wrote:
    The RTX is no more a Forth processor than a Pentium or an ARM would be.

    Rick Collins doesn't know what he is talking about!
    Of course the RTX-2000 is a Forth processor. The Pentium or ARM are not.

    A subroutine requires just two things,
    save the next instruction address on the return stack and jump to the address specified,
    easy to do in one clock cycle.

    In my design, the subroutine call is two instructions because it does two things.

    A return is even easier, just jump to the address on the return stack and pop the return stack.

    The subroutine return is one instruction because it does one thing.

    As a general rule of thumb, the people who claim that everything is easy
    are those who have not done anything. I think Rick Collins is a total fake. Christopher Lozinski admitted that he has taken one class on the subject
    and is not an expert. Most likely, Rick Collins has also taken one class
    on the subject, got a 'C', and now spends all of his time on comp.lang.forth pretending to be the world's expert on the subject, lording over the
    admitted novices such as Christopher Lozinski. Rick Collins is a total fake!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lorem Ipsum@21:1/5 to Hugh Aguilar on Thu Jul 13 16:54:16 2023
    On Thursday, July 13, 2023 at 4:55:39 PM UTC-4, Hugh Aguilar wrote:
    On Thursday, July 13, 2023 at 1:32:10 PM UTC-7, Lorem Ipsum wrote:
    The RTX is no more a Forth processor than a Pentium or an ARM would be.
    Rick Collins doesn't know what he is talking about!
    Of course the RTX-2000 is a Forth processor. The Pentium or ARM are not.

    As usual, Hugh is off his meds again. There's no definition of a "Forth processor", so no one can say Hugh is wrong or right. My point is, virtually any processor will run the Forth language and so, is just as much a "Forth processor" as the RTX chip.
    Hugh literally can't accept or even understand this.


    A subroutine requires just two things,
    save the next instruction address on the return stack and jump to the address specified,
    easy to do in one clock cycle.
    In my design, the subroutine call is two instructions because it does two things.

    You are not very experienced in FPGA design, so this may be due to that limitation. Or maybe you have some unusual goals in your architecture that forces this choice. I did my design 20 years ago, when FPGAs at reasonable prices were much smaller. So
    I wanted to optimize program size. One of the most used operations in Forth programs is the subroutine call. It makes little sense to turn that into two instructions, especially if the instructions are large, such as many which are 16 bits.


    A return is even easier, just jump to the address on the return stack and pop the return stack.
    The subroutine return is one instruction because it does one thing.

    I've just explained that it does two things.

    1) Jump to address on the top of return stack.

    2) Discard the data on the top of return stack.

    See, I've even counted them for you, 1... 2...

    I think it was Jeff Fox who talked about poor Forth programmers who can't count, and so muck up the stack on exit from a word.


    As a general rule of thumb, the people who claim that everything is easy
    are those who have not done anything.

    And yet, I've designed multiple stack processors for FPGAs and used them in commercial designs.


    I think Rick Collins is a total fake.

    Of course. Your hallucinations are well documented here in c.l.f.


    Christopher Lozinski admitted that he has taken one class on the subject
    and is not an expert. Most likely, Rick Collins has also taken one class

    I've taken zero classes in processor design. It's not a topic that requires a class. Most of the techniques that would be used in a small CPU for an FPGA don't use complicated features, because that bloats the size and complexity, slowing operation
    and increasing cost. Most people can understand that easily, without spending much time on it. Christopher has some preconceptions of what he wants his processor to be, things that excite him. Nothing wrong with that. But his processor design will
    ultimately be looking for an application it is optimal for, and he may not find one. Hard to say.


    on the subject, got a 'C', and now spends all of his time on comp.lang.forth pretending to be the world's expert on the subject, lording over the admitted novices such as Christopher Lozinski. Rick Collins is a total fake!

    And yet, ever so much more real than that Hugh guy, who does all manner of work in his basement (or maybe his mom's basement) and never gets paid for any of it. I really do feel sorry for the guy. He could do much better for himself as he is clearly a
    very sharp cookie. But until he acknowledges his problems, and gets help for them, he will live life here, in c.l.f., never seeing any of his work come to fruition.

    --

    Rick C.

    -- Get 1,000 miles of free Supercharging
    -- Tesla referral code - https://ts.la/richard11209

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to Brian Fox on Fri Jul 14 16:05:36 2023
    On 14/07/2023 6:09 am, Brian Fox wrote:
    ...
    One amazing spec to me is: 1 cycle sub-routine call and return!

    If it sounds too good to be true, it usually is. '1 cycle' instructions
    sound great compared to CPUs of the 1970's. But it would be comparing
    apples and oranges. My intro into 1 cycle CPUs was AVR 8-bit. It wasn't
    at all positive. I enjoy saving bytes and gaining speed but when every instruction is a minimum of 16-bits and the speed is the same, it's a
    lost cause.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lorem Ipsum@21:1/5 to dxforth on Thu Jul 13 23:51:45 2023
    On Friday, July 14, 2023 at 2:05:41 AM UTC-4, dxforth wrote:
    On 14/07/2023 6:09 am, Brian Fox wrote:
    ...
    One amazing spec to me is: 1 cycle sub-routine call and return!
    If it sounds too good to be true, it usually is. '1 cycle' instructions sound great compared to CPUs of the 1970's. But it would be comparing
    apples and oranges. My intro into 1 cycle CPUs was AVR 8-bit. It wasn't
    at all positive. I enjoy saving bytes and gaining speed but when every instruction is a minimum of 16-bits and the speed is the same, it's a
    lost cause.

    Are you saying the AVR has 16 bit instructions? I've never worked with them at that level.

    --

    Rick C.

    -+ Get 1,000 miles of free Supercharging
    -+ Tesla referral code - https://ts.la/richard11209

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to Lorem Ipsum on Fri Jul 14 18:16:15 2023
    On 14/07/2023 4:51 pm, Lorem Ipsum wrote:
    On Friday, July 14, 2023 at 2:05:41 AM UTC-4, dxforth wrote:
    On 14/07/2023 6:09 am, Brian Fox wrote:
    ...
    One amazing spec to me is: 1 cycle sub-routine call and return!
    If it sounds too good to be true, it usually is. '1 cycle' instructions
    sound great compared to CPUs of the 1970's. But it would be comparing
    apples and oranges. My intro into 1 cycle CPUs was AVR 8-bit. It wasn't
    at all positive. I enjoy saving bytes and gaining speed but when every
    instruction is a minimum of 16-bits and the speed is the same, it's a
    lost cause.

    Are you saying the AVR has 16 bit instructions? I've never worked with them at that level.

    8-bit operations. 16-bit long instructions. A few instructions work
    with register pairs (16-bit operation).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Brian Fox@21:1/5 to dxforth on Fri Jul 14 07:07:03 2023
    On Friday, July 14, 2023 at 2:05:41 AM UTC-4, dxforth wrote:
    On 14/07/2023 6:09 am, Brian Fox wrote:
    ...
    One amazing spec to me is: 1 cycle sub-routine call and return!
    If it sounds too good to be true, it usually is. '1 cycle' instructions sound great compared to CPUs of the 1970's. But it would be comparing
    apples and oranges. My intro into 1 cycle CPUs was AVR 8-bit. It wasn't
    at all positive. I enjoy saving bytes and gaining speed but when every instruction is a minimum of 16-bits and the speed is the same, it's a
    lost cause.

    Perhaps a difference worth considering with the RTX is that in some cases
    multiple instructions can be fetched in one 16-bit read and then operate
    in parallel in the processor. The room to do that was there because
    no bits are required for register selection and the instruction set
    is very small. This did require a smarter compiler, however.

    Generally in my experience call/return on a register machine
    uses many more clocks than one, because you have to push/pop
    registers most of the time.

    But as with all things in engineering there is no free lunch.
    "Fast, Good, Cheap. Pick two"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lorem Ipsum@21:1/5 to Brian Fox on Fri Jul 14 12:27:36 2023
    On Friday, July 14, 2023 at 10:07:05 AM UTC-4, Brian Fox wrote:
    On Friday, July 14, 2023 at 2:05:41 AM UTC-4, dxforth wrote:
    On 14/07/2023 6:09 am, Brian Fox wrote:
    ...
    One amazing spec to me is: 1 cycle sub-routine call and return!
    If it sounds too good to be true, it usually is. '1 cycle' instructions sound great compared to CPUs of the 1970's. But it would be comparing apples and oranges. My intro into 1 cycle CPUs was AVR 8-bit. It wasn't
    at all positive. I enjoy saving bytes and gaining speed but when every instruction is a minimum of 16-bits and the speed is the same, it's a
    lost cause.
    Perhaps a difference worth considering with the RTX is that in some cases
    multiple instructions can be fetched in one 16-bit read and then operate
    in parallel in the processor. The room to do that was there because
    no bits are required for register selection and the instruction set
    is very small. This did require a smarter compiler, however.

    TI designed a "VLIW" DSP processor that is actually 8 processors, each with a 32 bit instruction word. I don't see this as VLIW myself. VLIW was originally intended to be an instruction format that had very little encoding. Rather, control points in
    the processor would correspond with a bit in the instruction, minimizing the decoding required to effect the instruction.

    This is essentially what most people are doing when they talk about "multiple instructions" per word in stack machines. The b16 is this way, as is the F18A. It improves throughput when you have variable sized instructions or when instructions are
    fetched from external memory. Internally in an FPGA, this is of limited value.
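
    A hedged sketch of that packing (field layout invented for illustration; the actual b16 and F18A layouts differ in detail): three 5-bit slots fit in one 16-bit fetch, so a single memory read feeds three sequential executes.

    // Assumes a fetched 16-bit word 'iword'.
    wire [4:0] slot0   = iword[15:11];
    wire [4:0] slot1   = iword[10:6];
    wire [4:0] slot2   = iword[5:1];
    wire       ret_bit = iword[0];   // leftover bit, e.g. an RTX-style return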

    I believe the RTX did have some capability of parallel execution of some instructions. I know people rave about the "return" bit, a 1-bit field that does a return. It can be used in parallel with any instruction that is not already altering the
    program flow, or the return stack.


    Generally in my experience call/return on a register machine
    uses many more clocks than one, because you have to push/pop
    registers most of the time.

    You have to save the registers you will use.


    But as with all things in engineering there is no free lunch.
    "Fast, Good, Cheap. Pick two"

    Not sure how that trope applies here.

    --

    Rick C.

    +- Get 1,000 miles of free Supercharging
    +- Tesla referral code - https://ts.la/richard11209

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Brian Fox@21:1/5 to Lorem Ipsum on Fri Jul 14 21:04:13 2023
    On Friday, July 14, 2023 at 3:27:38 PM UTC-4, Lorem Ipsum wrote:

    "Fast, Good, Cheap. Pick two"
    Not sure how that trope applies here.

    I was considering stack machine versus register machine
    advantages/tradeoffs as I wrote it.

    I am sure there is a better trope. I just don't have one.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to Brian Fox on Sat Jul 15 15:24:10 2023
    On 15/07/2023 2:04 pm, Brian Fox wrote:
    On Friday, July 14, 2023 at 3:27:38 PM UTC-4, Lorem Ipsum wrote:

    "Fast, Good, Cheap. Pick two"
    Not sure how that trope applies here.

    I was considering stack machine versus register machine
    advantages/tradeoffs as I wrote it.

    I am sure there is a better trope. I just don't have one.

    In the case of AVR8 there is some sort of trade-off at play. These
    devices have quite small flash yet instruction sizes are relatively
    large. Were there a better option Atmel (or a competitor) would have
    used it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lorem Ipsum@21:1/5 to dxforth on Sat Jul 15 00:34:52 2023
    On Saturday, July 15, 2023 at 1:24:15 AM UTC-4, dxforth wrote:
    On 15/07/2023 2:04 pm, Brian Fox wrote:
    On Friday, July 14, 2023 at 3:27:38 PM UTC-4, Lorem Ipsum wrote:

    "Fast, Good, Cheap. Pick two"
    Not sure how that trope applies here.

    I was considering stack machine versus register machine advantages/tradeoffs as I wrote it.

    I am sure there is a better trope. I just don't have one.
    In the case of AVR8 there is some sort of trade-off at play. These
    devices have quite small flash yet instruction sizes are relatively
    large. Were there a better option Atmel (or a competitor) would have
    used it.

    Better in what way? Often decisions are made so the product does not appear "goofy". For example, users are biased to hate instruction words that are not powers of 2. I suppose that's because of the typical mixing of data and instructions. I think
    some of the PICs have 12 bit instructions and no mixing. I suppose that would be a Harvard architecture. I can't think of any others.

    In FPGAs, I tend to fit the instruction size to the block RAM width, which is typically 1, 2, 4, 8, 9, 16, 18, 32 or 36. I've actually used 8 and 9 bits, with the same instruction set. In 9 bits the immediate data fields get an extra bit which can be
    significant, and there are nearly twice as many opcodes available, which is hard to make use of, really. The 8 bit instruction size provides for more than 16 instructions, if I recall. I don't know what to do with over 32 instructions and this makes
    the decode, etc. more complex. So, mostly the 9 bit instruction has an extra immediate bit and a few extra used instructions.

    When I look at other processors, that seem very effective, they often have only 16 to 32 instructions. The b16 has been well received with a 5 bit instruction word. I don't recall how immediate data is handled, but it's probably like the F18a, which is
    also 5 bit instructions in a 16 (or is it 18?) bit word. The F18A uses either the remainder of the current word, or if it's not large enough, the next instruction word.

    To prevent waste of instruction space, my instruction format has a 1-bit instruction for a literal, which can be cascaded to build up any size operand that you wish, and is the way addresses are extended for the call and jump instructions, which have a
    4/5 bit immediate field. Addresses are relative, so jumping can be -8 to +7 or -16 to +15 (8/9 bit instructions) without extending the address with literal instructions. This makes looping pretty efficient.
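
    As a rough worked example of that address extension (assuming the 9-bit form with a 5-bit jump field; the exact encoding is not from the actual design): a relative jump of +300 does not fit in -16 to +15, so the assembler would first emit literal-prefix instructions carrying the upper bits of 300 (binary 100101100), and the jump itself supplies the low 5 bits (01100), the same cascading used for data literals.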

    Sorry for the rambling.

    --

    Rick C.

    ++ Get 1,000 miles of free Supercharging
    ++ Tesla referral code - https://ts.la/richard11209

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to Lorem Ipsum on Sun Jul 16 20:45:48 2023
    On 15/07/2023 5:34 pm, Lorem Ipsum wrote:
    On Saturday, July 15, 2023 at 1:24:15 AM UTC-4, dxforth wrote:
    On 15/07/2023 2:04 pm, Brian Fox wrote:
    On Friday, July 14, 2023 at 3:27:38 PM UTC-4, Lorem Ipsum wrote:

    "Fast, Good, Cheap. Pick two"
    Not sure how that trope applies here.

    I was considering stack machine versus register machine
    advantages/tradeoffs as I wrote it.

    I am sure there is a better trope. I just don't have one.
    In the case of AVR8 there is some sort of trade-off at play. These
    devices have quite small flash yet instruction sizes are relatively
    large. Were there a better option Atmel (or a competitor) would have
    used it.

    Better in what way? Often decisions are made so the product does not appear "goofy". For example, users are biased to hate instruction words that are not powers of 2. I suppose that's because of the typical mixing of data and instructions. I think
    some of the PICs have 12 bit instructions and no mixing. I suppose that would be a Harvard architecture. I can't think of any others.
    ...

    The 8085 had few registers, variable-length instructions as short as one
    byte, 16-bit push/pops (also one byte). To my mind the latter would have
    been a better model for the flash sizes typically found in AVR devices.
    What use is a 1 cycle instruction CPU if the program doesn't fit?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lorem Ipsum@21:1/5 to dxforth on Sun Jul 16 03:48:45 2023
    On Sunday, July 16, 2023 at 6:45:51 AM UTC-4, dxforth wrote:
    On 15/07/2023 5:34 pm, Lorem Ipsum wrote:
    On Saturday, July 15, 2023 at 1:24:15 AM UTC-4, dxforth wrote:
    On 15/07/2023 2:04 pm, Brian Fox wrote:
    On Friday, July 14, 2023 at 3:27:38 PM UTC-4, Lorem Ipsum wrote:

    "Fast, Good, Cheap. Pick two"
    Not sure how that trope applies here.

    I was considering stack machine versus register machine
    advantages/tradeoffs as I wrote it.

    I am sure there is a better trope. I just don't have one.
    In the case of AVR8 there is some sort of trade-off at play. These
    devices have quite small flash yet instruction sizes are relatively
    large. Were there a better option Atmel (or a competitor) would have
    used it.

    Better in what way? Often decisions are made so the product does not appear "goofy". For example, users are biased to hate instruction words that are not powers of 2. I suppose that's because of the typical mixing of data and instructions. I think
    some of the PICs have 12 bit instructions and no mixing. I suppose that would be a Harvard architecture. I can't think of any others.
    ...

    The 8085 had few registers, variable-length instructions as short as one byte, 16-bit push/pops (also one byte). To my mind the latter would have been a better model for the flash sizes typically found in AVR devices.
    What use is a 1 cycle instruction CPU if the program doesn't fit?

    Sorry, I don't follow what you are trying to say.

    --

    Rick C.

    --- Get 1,000 miles of free Supercharging
    --- Tesla referral code - https://ts.la/richard11209

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to Lorem Ipsum on Mon Jul 17 13:21:05 2023
    On 16/07/2023 8:48 pm, Lorem Ipsum wrote:
    On Sunday, July 16, 2023 at 6:45:51 AM UTC-4, dxforth wrote:
    On 15/07/2023 5:34 pm, Lorem Ipsum wrote:
    On Saturday, July 15, 2023 at 1:24:15 AM UTC-4, dxforth wrote:
    On 15/07/2023 2:04 pm, Brian Fox wrote:
    On Friday, July 14, 2023 at 3:27:38 PM UTC-4, Lorem Ipsum wrote:

    "Fast, Good, Cheap. Pick two"
    Not sure how that trope applies here.

    I was considering stack machine versus register machine
    advantages/tradeoffs as I wrote it.

    I am sure there is a better trope. I just don't have one.
    In the case of AVR8 there is some sort of trade-off at play. These
    devices have quite small flash yet instruction sizes are relatively
    large. Were there a better option Atmel (or a competitor) would have
    used it.

    Better in what way? Often decisions are made so the product does not appear "goofy". For example, users are biased to hate instruction words that are not powers of 2. I suppose that's because of the typical mixing of data and instructions. I think
    some of the PICs have 12 bit instructions and no mixing. I suppose that would be a Harvard architecture. I can't think of any others.
    ...

    The 8085 had few registers, variable-length instructions as short as one
    byte, 16-bit push/pops (also one byte). To my mind the latter would have
    been a better model for the flash sizes typically found in AVR devices.
    What use is a 1 cycle instruction CPU if the program doesn't fit?

    Sorry, I don't follow what you are trying to say.

    Current 8-bit CPU's are not memory efficient.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lorem Ipsum@21:1/5 to dxforth on Sun Jul 16 22:38:57 2023
    On Sunday, July 16, 2023 at 11:21:09 PM UTC-4, dxforth wrote:
    On 16/07/2023 8:48 pm, Lorem Ipsum wrote:
    On Sunday, July 16, 2023 at 6:45:51 AM UTC-4, dxforth wrote:
    On 15/07/2023 5:34 pm, Lorem Ipsum wrote:
    On Saturday, July 15, 2023 at 1:24:15 AM UTC-4, dxforth wrote:
    On 15/07/2023 2:04 pm, Brian Fox wrote:
    On Friday, July 14, 2023 at 3:27:38 PM UTC-4, Lorem Ipsum wrote:
    "Fast, Good, Cheap. Pick two"
    Not sure how that trope applies here.

    I was considering stack machine versus register machine
    advantages/tradeoffs as I wrote it.

    I am sure there is a better trope. I just don't have one.
    In the case of AVR8 there is some sort of trade-off at play. These
    devices have quite small flash yet instruction sizes are relatively large. Were there a better option Atmel (or a competitor) would have used it.

    Better in what way? Often decisions are made so the product does not appear "goofy". For example, users are biased to hate instruction words that are not powers of 2. I suppose that's because of the typical mixing of data and instructions. I think
    some of the PICs have 12 bit instructions and no mixing. I suppose that would be a Harvard architecture. I can't think of any others.
    ...

    The 8085 had few registers, variable-length instructions as short as one >> byte, 16-bit push/pops (also one byte). To my mind the latter would have >> been a better model for the flash sizes typically found in AVR devices. >> What use is a 1 cycle instruction CPU if the program doesn't fit.

    Sorry, I don't follow what you are trying to say.
    Current 8-bit CPU's are not memory efficient.

    And this is relevant to the conversation in what way? I'm just now following the flow of thought. I mentioned that instruction sized vary, but are mostly powers of two and gave an exception. You made a comment that doesn't seem to flow from that. I
    don't get the connection.

    --

    Rick C.

    --+ Get 1,000 miles of free Supercharging
    --+ Tesla referral code - https://ts.la/richard11209

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to Lorem Ipsum on Mon Jul 17 20:36:22 2023
    On 17/07/2023 3:38 pm, Lorem Ipsum wrote:
    On Sunday, July 16, 2023 at 11:21:09 PM UTC-4, dxforth wrote:
    On 16/07/2023 8:48 pm, Lorem Ipsum wrote:
    On Sunday, July 16, 2023 at 6:45:51 AM UTC-4, dxforth wrote:
    On 15/07/2023 5:34 pm, Lorem Ipsum wrote:
    On Saturday, July 15, 2023 at 1:24:15 AM UTC-4, dxforth wrote:
    On 15/07/2023 2:04 pm, Brian Fox wrote:
    On Friday, July 14, 2023 at 3:27:38 PM UTC-4, Lorem Ipsum wrote: >>>>>>>
    "Fast, Good, Cheap. Pick two"
    Not sure how that trope applies here.

    I was considering stack machine versus register machine
    advantages/tradeoffs as I wrote it.

    I am sure there is a better trope. I just don't have one.
    In the case of AVR8 there is some sort of trade-off at play. These >>>>>> devices have quite small flash yet instruction sizes are relatively >>>>>> large. Were there a better option Atmel (or a competitor) would have >>>>>> used it.

    Better in what way? Often decisions are made so the product does not appear "goofy". For example, users are biased to hate instruction words that are not powers of 2. I suppose that's because of the typical mixing of data and instructions. I think
    some of the PICs have 12 bit instructions and no mixing. I suppose that would be a Harvard architecture. I can't think of any others.
    ...

    The 8085 had few registers, variable-length instructions as short as one >>>> byte, 16-bit push/pops (also one byte). To my mind the latter would have >>>> been a better model for the flash sizes typically found in AVR devices. >>>> What use is a 1 cycle instruction CPU if the program doesn't fit.

    Sorry, I don't follow what you are trying to say.
    Current 8-bit CPU's are not memory efficient.

    And this is relevant to the conversation in what way? I'm just now following the flow of thought. I mentioned that instruction sized vary, but are mostly powers of two and gave an exception. You made a comment that doesn't seem to flow from that. I
    don't get the connection.


    It's news to me customers are biased towards instructions being a power (multiple?)
    of two. They buy what's available which at the moment is 1 cycle CPU's. Even if
    customers are aware it may not be best fit, what are they going to do - design their
    own CPU? Manufacturers have customers by the short and curlies.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lorem Ipsum@21:1/5 to dxforth on Mon Jul 17 15:47:58 2023
    On Monday, July 17, 2023 at 6:37:45 AM UTC-4, dxforth wrote:
    On 17/07/2023 3:38 pm, Lorem Ipsum wrote:
    On Sunday, July 16, 2023 at 11:21:09 PM UTC-4, dxforth wrote:
    On 16/07/2023 8:48 pm, Lorem Ipsum wrote:
    On Sunday, July 16, 2023 at 6:45:51 AM UTC-4, dxforth wrote:
    On 15/07/2023 5:34 pm, Lorem Ipsum wrote:
    On Saturday, July 15, 2023 at 1:24:15 AM UTC-4, dxforth wrote: >>>>>> On 15/07/2023 2:04 pm, Brian Fox wrote:
    On Friday, July 14, 2023 at 3:27:38 PM UTC-4, Lorem Ipsum wrote: >>>>>>>
    "Fast, Good, Cheap. Pick two"
    Not sure how that trope applies here.

    I was considering stack machine versus register machine
    advantages/tradeoffs as I wrote it.

    I am sure there is a better trope. I just don't have one.
    In the case of AVR8 there is some sort of trade-off at play. These >>>>>> devices have quite small flash yet instruction sizes are relatively >>>>>> large. Were there a better option Atmel (or a competitor) would have >>>>>> used it.

    Better in what way? Often decisions are made so the product does not appear "goofy". For example, users are biased to hate instruction words that are not powers of 2. I suppose that's because of the typical mixing of data and instructions. I
    think some of the PICs have 12 bit instructions and no mixing. I suppose that would be a Harvard architecture. I can't think of any others.
    ...

    The 8085 had few registers, variable-length instructions as short as one
    byte, 16-bit push/pops (also one byte). To my mind the latter would have
    been a better model for the flash sizes typically found in AVR devices. >>>> What use is a 1 cycle instruction CPU if the program doesn't fit.

    Sorry, I don't follow what you are trying to say.
    Current 8-bit CPU's are not memory efficient.

    And this is relevant to the conversation in what way? I'm just now following the flow of thought. I mentioned that instruction sized vary, but are mostly powers of two and gave an exception. You made a comment that doesn't seem to flow from that. I
    don't get the connection.

    It's news to me customers are biased towards instructions being a power (multiple?)
    of two. They buy what's available which at the moment is 1 cycle CPU's. Even if
    customers are aware it may not be best fit, what are they going to do - design their
    own CPU? Manufacturers have customers by the short and curlies.

    There are times you make absolutely no sense. Customers have choice. The vast majority of CPUs have instruction sizes of 8, 16, 32, or 64 bits. Some, when addresses are combined with the opcode, will result in 48 bit instructions, but otherwise, the
    non-binary power sizes are very unusual. The only one I can even think of is the 12 bit instruction PIC.

    So, what are you trying to say?

    --

    Rick C.

    -+- Get 1,000 miles of free Supercharging
    -+- Tesla referral code - https://ts.la/richard11209

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lorem Ipsum@21:1/5 to dxforth on Mon Jul 17 19:46:50 2023
    On Monday, July 17, 2023 at 10:18:39 PM UTC-4, dxforth wrote:
    On 18/07/2023 8:47 am, Lorem Ipsum wrote:
    On Monday, July 17, 2023 at 6:37:45 AM UTC-4, dxforth wrote:
    On 17/07/2023 3:38 pm, Lorem Ipsum wrote:
    On Sunday, July 16, 2023 at 11:21:09 PM UTC-4, dxforth wrote:
    On 16/07/2023 8:48 pm, Lorem Ipsum wrote:
    On Sunday, July 16, 2023 at 6:45:51 AM UTC-4, dxforth wrote:
    On 15/07/2023 5:34 pm, Lorem Ipsum wrote:
    On Saturday, July 15, 2023 at 1:24:15 AM UTC-4, dxforth wrote: >>>>>>>> On 15/07/2023 2:04 pm, Brian Fox wrote:
    On Friday, July 14, 2023 at 3:27:38 PM UTC-4, Lorem Ipsum wrote: >>>>>>>>>
    "Fast, Good, Cheap. Pick two"
    Not sure how that trope applies here.

    I was considering stack machine versus register machine
    advantages/tradeoffs as I wrote it.

    I am sure there is a better trope. I just don't have one.
    In the case of AVR8 there is some sort of trade-off at play. These >>>>>>>> devices have quite small flash yet instruction sizes are relatively >>>>>>>> large. Were there a better option Atmel (or a competitor) would have
    used it.

    Better in what way? Often decisions are made so the product does not appear "goofy". For example, users are biased to hate instruction words that are not powers of 2. I suppose that's because of the typical mixing of data and instructions. I
    think some of the PICs have 12 bit instructions and no mixing. I suppose that would be a Harvard architecture. I can't think of any others.
    ...

    The 8085 had few registers, variable-length instructions as short as one
    byte, 16-bit push/pops (also one byte). To my mind the latter would have
    been a better model for the flash sizes typically found in AVR devices.
    What use is a 1 cycle instruction CPU if the program doesn't fit. >>>>>
    Sorry, I don't follow what you are trying to say.
    Current 8-bit CPU's are not memory efficient.

    And this is relevant to the conversation in what way? I'm just now following the flow of thought. I mentioned that instruction sized vary, but are mostly powers of two and gave an exception. You made a comment that doesn't seem to flow from that. I
    don't get the connection.

    It's news to me customers are biased towards instructions being a power (multiple?)
    of two. They buy what's available which at the moment is 1 cycle CPU's. Even if
    customers are aware it may not be best fit, what are they going to do - design their
    own CPU? Manufacturers have customers by the short and curlies.

    There are times you make absolutely no sense. Customers have choice. The vast majority of CPUs have instruction sizes of 8, 16, 32, or 64 bits. Some, when addresses are combined with the opcode, will result in 48 bit instructions, but otherwise, the
    non-binary power sizes are very unusual. The only one I can even think of is the 12 bit instruction PIC.

    So, what are you trying to say?
    I've already said it. Current 8-bit CPU's are not memory efficient. Users didn't
    decide that.

    Ok, enjoy.

    --

    Rick C.

    -++ Get 1,000 miles of free Supercharging
    -++ Tesla referral code - https://ts.la/richard11209

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to Lorem Ipsum on Tue Jul 18 12:18:37 2023
    On 18/07/2023 8:47 am, Lorem Ipsum wrote:
    On Monday, July 17, 2023 at 6:37:45 AM UTC-4, dxforth wrote:
    On 17/07/2023 3:38 pm, Lorem Ipsum wrote:
    On Sunday, July 16, 2023 at 11:21:09 PM UTC-4, dxforth wrote:
    On 16/07/2023 8:48 pm, Lorem Ipsum wrote:
    On Sunday, July 16, 2023 at 6:45:51 AM UTC-4, dxforth wrote:
    On 15/07/2023 5:34 pm, Lorem Ipsum wrote:
    On Saturday, July 15, 2023 at 1:24:15 AM UTC-4, dxforth wrote: >>>>>>>> On 15/07/2023 2:04 pm, Brian Fox wrote:
    On Friday, July 14, 2023 at 3:27:38 PM UTC-4, Lorem Ipsum wrote: >>>>>>>>>
    "Fast, Good, Cheap. Pick two"
    Not sure how that trope applies here.

    I was considering stack machine versus register machine
    advantages/tradeoffs as I wrote it.

    I am sure there is a better trope. I just don't have one.
    In the case of AVR8 there is some sort of trade-off at play. These >>>>>>>> devices have quite small flash yet instruction sizes are relatively >>>>>>>> large. Were there a better option Atmel (or a competitor) would have >>>>>>>> used it.

    Better in what way? Often decisions are made so the product does not appear "goofy". For example, users are biased to hate instruction words that are not powers of 2. I suppose that's because of the typical mixing of data and instructions. I
    think some of the PICs have 12 bit instructions and no mixing. I suppose that would be a Harvard architecture. I can't think of any others.
    ...

    The 8085 had few registers, variable-length instructions as short as one >>>>>> byte, 16-bit push/pops (also one byte). To my mind the latter would have >>>>>> been a better model for the flash sizes typically found in AVR devices. >>>>>> What use is a 1 cycle instruction CPU if the program doesn't fit.

    Sorry, I don't follow what you are trying to say.
    Current 8-bit CPU's are not memory efficient.

    And this is relevant to the conversation in what way? I'm just now following the flow of thought. I mentioned that instruction sized vary, but are mostly powers of two and gave an exception. You made a comment that doesn't seem to flow from that. I
    don't get the connection.

    It's news to me customers are biased towards instructions being a power (multiple?)
    of two. They buy what's available which at the moment is 1 cycle CPU's. Even if
    customers are aware it may not be best fit, what are they going to do - design their
    own CPU? Manufacturers have customers by the short and curlies.

    There are times you make absolutely no sense. Customers have choice. The vast majority of CPUs have instruction sizes of 8, 16, 32, or 64 bits. Some, when addresses are combined with the opcode, will result in 48 bit instructions, but otherwise, the
    non-binary power sizes are very unusual. The only one I can even think of is the 12 bit instruction PIC.

    So, what are you trying to say?

    I've already said it. Current 8-bit CPU's are not memory efficient. Users didn't
    decide that.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From none) (albert@21:1/5 to gnuarm.deletethisbit@gmail.com on Tue Jul 18 07:27:29 2023
    In article <a262ff14-cd28-4713-875e-73c604b9faa5n@googlegroups.com>,
    Lorem Ipsum <gnuarm.deletethisbit@gmail.com> wrote:
    There are times you make absolutely no sense. Customers have choice.
    The vast majority of CPUs have instruction sizes of 8, 16, 32, or 64
    bits. Some, when addresses are combined with the opcode, will result in
    48 bit instructions, but otherwise, the non-binary power sizes are very >unusual. The only one I can even think of is the 12 bit instruction
    PIC.

    This makes sense because of memories. Memories in multiples of bytes
    (octets) are practical and have gained the upper hand not only in
    hardware but also in software.
    It is unimaginable that the billion euro's investment needed for
    10 bit memories are duplicated.
    This more or less dictates a decision that busses are a multiple of
    8 and consequences for the CPU architectures.

    Rick C.
    Groetjes Albert
    --
    Don't praise the day before the evening. One swallow doesn't make spring.
    You must not say "hey" before you have crossed the bridge. Don't sell the
    hide of the bear until you shot it. Better one bird in the hand than ten in
    the air. First gain is a cat spinning. - the Wise from Antrim -

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From minforth@21:1/5 to none albert on Mon Jul 17 22:51:03 2023
    none albert schrieb am Dienstag, 18. Juli 2023 um 07:27:34 UTC+2:
    In article <a262ff14-cd28-4713...@googlegroups.com>,
    Lorem Ipsum <gnuarm.del...@gmail.com> wrote:
    There are times you make absolutely no sense. Customers have choice.
    The vast majority of CPUs have instruction sizes of 8, 16, 32, or 64
    bits. Some, when addresses are combined with the opcode, will result in
    48 bit instructions, but otherwise, the non-binary power sizes are very >unusual. The only one I can even think of is the 12 bit instruction
    PIC.
    This makes sense because of memories. Memories in multiples of bytes
    (octets) are practical and have gained the upper hand not only in
    hardware but also in software.
    It is unimaginable that the billion euro's investment needed for
    10 bit memories are duplicated.
    This more or less dictates a decision that busses are a multiple of
    8 and consequences for the CPU architectures.

    Adding: most CPUS in the world are NOT used for computing but for device controls. F.ex. most analog-to-digital or digital-to-analog converters
    have bandwiths that are not powers of two.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lorem Ipsum@21:1/5 to none albert on Tue Jul 18 01:19:51 2023
    On Tuesday, July 18, 2023 at 1:27:34 AM UTC-4, none albert wrote:
    In article <a262ff14-cd28-4713...@googlegroups.com>,
    Lorem Ipsum <gnuarm.del...@gmail.com> wrote:
    There are times you make absolutely no sense. Customers have choice.
    The vast majority of CPUs have instruction sizes of 8, 16, 32, or 64
    bits. Some, when addresses are combined with the opcode, will result in
    48 bit instructions, but otherwise, the non-binary power sizes are very >unusual. The only one I can even think of is the 12 bit instruction
    PIC.
    This makes sense because of memories. Memories in multiples of bytes (octets) are practical and have gained the upper hand not only in
    hardware but also in software.

    You seem to be conflating data memory and program memory. There is no reason for the two to be common, really. Most program storage in smaller MCUs is Flash, while the data is mostly used in RAM. There is no reason to limit flash program storage to
    bytes. There is no "upper hand".


    It is unimaginable that the billion euro's investment needed for
    10 bit memories are duplicated.

    That makes no sense. Memory is no different from logic. If you need 10 bit memory, enter a 10 for bit width in the tool that designs the memory.


    This more or less dictates a decision that busses are a multiple of
    8 and consequences for the CPU architectures.

    Sorry, I don't see your logic.

    --

    Rick C.

    +-- Get 1,000 miles of free Supercharging
    +-- Tesla referral code - https://ts.la/richard11209

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lorem Ipsum@21:1/5 to dxforth on Tue Jul 18 21:47:25 2023
    On Wednesday, July 19, 2023 at 12:27:01 AM UTC-4, dxforth wrote:
    On 18/07/2023 3:27 pm, albert wrote:
    In article <a262ff14-cd28-4713...@googlegroups.com>,
    Lorem Ipsum <gnuarm.del...@gmail.com> wrote:
    There are times you make absolutely no sense. Customers have choice.
    The vast majority of CPUs have instruction sizes of 8, 16, 32, or 64
    bits. Some, when addresses are combined with the opcode, will result in >> 48 bit instructions, but otherwise, the non-binary power sizes are very >> unusual. The only one I can even think of is the 12 bit instruction
    PIC.

    This makes sense because of memories. Memories in multiples of bytes (octets) are practical and have gained the upper hand not only in
    hardware but also in software.
    It is unimaginable that the billion euro's investment needed for
    10 bit memories are duplicated.
    This more or less dictates a decision that busses are a multiple of
    8 and consequences for the CPU architectures.
    Only issue would be data stored in program memory. FlashForth handles
    it through de-blocking and address munging. To a user, data in program memory appears byte-addressed and word-aligned when in reality it's a different beast altogether.

    No rocket science there. Harvard architectures have been used in many devices, some with mismatched data sizes.

    I design my own processors. I have typically used a 4 or 5 bit instruction word, which never really matches the data paths. In fact, the data paths are not coded in any meaningful way other than a constant at compile time. The instruction set is data
    path agnostic. Read only type memory (even if it's only in use and actually in RAM) can be added to data memory as well as to the instruction space.

    I don't see any constraints, other than that users like to see lots of symmetry. Meanwhile, the very symmetric 68000 and Power PC architectures have faded away to be replaced by the Intel lines. But maybe I shouldn't drag large CPUs into what is
    essentially a small CPU discussion. I'm not so familiar with the various small CPU ISAs. I know some are positively bizarre, like the 1802. I can't recall the heritage of that CPU, but it has to have some history to be so odd.

    --

    Rick C.

    +-+ Get 1,000 miles of free Supercharging
    +-+ Tesla referral code - https://ts.la/richard11209

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From dxforth@21:1/5 to albert on Wed Jul 19 14:26:57 2023
    On 18/07/2023 3:27 pm, albert wrote:
    In article <a262ff14-cd28-4713-875e-73c604b9faa5n@googlegroups.com>,
    Lorem Ipsum <gnuarm.deletethisbit@gmail.com> wrote:
    There are times you make absolutely no sense. Customers have choice.
    The vast majority of CPUs have instruction sizes of 8, 16, 32, or 64
    bits. Some, when addresses are combined with the opcode, will result in
    48 bit instructions, but otherwise, the non-binary power sizes are very
    unusual. The only one I can even think of is the 12 bit instruction
    PIC.

    This makes sense because of memories. Memories in multiples of bytes
    (octets) are practical and have gained the upper hand not only in
    hardware but also in software.
    It is unimaginable that the billion euro's investment needed for
    10 bit memories are duplicated.
    This more or less dictates a decision that busses are a multiple of
    8 and consequences for the CPU architectures.

    Only issue would be data stored in program memory. FlashForth handles
    it through de-blocking and address munging. To a user, data in program
    memory appears byte-addressed and word-aligned when in reality it's a
    different beast altogether.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From none) (albert@21:1/5 to gnuarm.deletethisbit@gmail.com on Wed Jul 19 12:33:12 2023
    In article <f3836786-0a36-4d32-a8c7-481f247ea263n@googlegroups.com>,
    Lorem Ipsum <gnuarm.deletethisbit@gmail.com> wrote:
    <SNIP>
    No rocket science there. Harvard architectures have been used in many >devices, some with mismatched data sizes.

    Harvard architectures with the possibility to write program memory
    are in fact botched Newman architectures.

    I don't see any constraints, other than that users like to see lots of >symmetry. Meanwhile, the very symmetric 68000 and Power PC
    architectures have faded away to be replaced by the Intel lines. But

    The newest development is that the CISCy Intel lines fades
    away quickly to be replaced by the all too symmetric risc-V.
    Users (like me) like that.

    Rick C.

    Groetjes Albert
    --
    Don't praise the day before the evening. One swallow doesn't make spring.
    You must not say "hey" before you have crossed the bridge. Don't sell the
    hide of the bear until you shot it. Better one bird in the hand than ten in
    the air. First gain is a cat spinning. - the Wise from Antrim -

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lorem Ipsum@21:1/5 to none albert on Wed Jul 19 09:25:05 2023
    On Wednesday, July 19, 2023 at 6:33:16 AM UTC-4, none albert wrote:
    In article <f3836786-0a36-4d32...@googlegroups.com>,
    Lorem Ipsum <gnuarm.del...@gmail.com> wrote:
    <SNIP>
    No rocket science there. Harvard architectures have been used in many >devices, some with mismatched data sizes.
    Harvard architectures with the possibility to write program memory
    are in fact botched Newman architectures.

    If you say so.


    I don't see any constraints, other than that users like to see lots of >symmetry. Meanwhile, the very symmetric 68000 and Power PC
    architectures have faded away to be replaced by the Intel lines. But
    The newest development is that the CISCy Intel lines fades
    away quickly to be replaced by the all too symmetric risc-V.
    Users (like me) like that.

    Hmmm... I guess you are relating your dreams. Risc-V seems to be catching on in the MCU world where royalties are anathema, well, at least for the Chinese market. I've not seen any indication of Intel losing market share to the Risc-V in their
    pricier product lines.

    I realize that you like to pull people's legs and often live in a fantasy world. Why do you want to pretend like Risc-V is a significant CPU competing with the Intel lines?

    --

    Rick C.

    ++- Get 1,000 miles of free Supercharging
    ++- Tesla referral code - https://ts.la/richard11209

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Matthias Koch@21:1/5 to All on Wed Jul 19 21:56:21 2023
    I realize that you like to pull people's legs and often live in a fantasy world. Why do you want to pretend like Risc-V is a significant CPU competing with the Intel lines?

    Haha, wait for it! RISC-V is advancing quickly. Have you seen the announcement that the Debian team plans official riscv64 architecture support in the upcoming version 13 "Trixie"?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lorem Ipsum@21:1/5 to Matthias Koch on Wed Jul 19 16:01:12 2023
    On Wednesday, July 19, 2023 at 3:56:26 PM UTC-4, Matthias Koch wrote:
    I realize that you like to pull people's legs and often live in a fantasy world. Why do you want to pretend like Risc-V is a significant CPU competing with the Intel lines?
    Haha, wait for it! RISC-V is advancing quickly. Have you seen the announcement that the Debian team plans official riscv64 architecture support in the upcoming version 13 "Trixie"?

    Linux already runs on the rPi and many other small processors. Doesn't mean they are competing with Intel in laptops and desktops in any meaningful way.

    I have one of the original rPis. It sucks as a desktop machine, even running just one tab in a browser. Maybe it would be a bit better with an rPi4, but it's not competition for mainstream processors.

    Are you seriously going to run compiles or FPGA simulations on a Risc-V instead of an Intel or AMD processor?

    What is different about the Risc-V compared to the 68000 family or the Power-PC family? Why would it compete when the others did not? Didn't the Power-PC have the full backing of Sun and IBM, yet it still could not keep up? Even Apple switched to
    Intel processors.

    I don't know how Risc-V will do in the mobile market. From what I've read, the reason it is getting traction is simply because there are no royalties to pay. So it appeals to the Chinese market. Have they ever produced anything that was significant in
    the mainstream markets?

    --

    Rick C.

    +++ Get 1,000 miles of free Supercharging
    +++ Tesla referral code - https://ts.la/richard11209

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to Matthias Koch on Thu Jul 20 07:52:26 2023
    Matthias Koch <m.cook@gmx.net> writes:
    Have you seen the announcement that the Debian team plans official riscv64 architecture support in the upcoming version 13 "Trixie"?

    Ok, lack of support by Debian explains why the Visionfive Starfive V1
    that we have had for more than a year had a Fedora image to work with.

    Anyway, RISC-V still has quite a bit of catching up ahead of it, but
    it has a lot of mindshare, which helps a lot. I expect, though that
    it will first eat ARM's lunch before eating Intel's and AMD's dinner,
    if that ever happens.

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2023: https://euro.theforth.net/2023

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to Lorem Ipsum on Thu Jul 20 08:00:50 2023
    Lorem Ipsum <gnuarm.deletethisbit@gmail.com> writes:
    I have one of the original rPis. It sucks as a desktop machine, even runni= >ng just one tab in a browser. Maybe it would be a bit better with an rPi4,=
    but it's not competition for mainstream processors. =20

    It's a lot better with the rPi4. Concerning competetiveness with
    Intel's low-end cores:

    sieve bubble matrix fib fft release; CPU
    0.108 0.135 0.058 0.104 0.039 20230114; Celeron N4500 (Tremont) 2800MHz
    0.236 0.276 0.124 0.260 0.119 20230629; BCM2835 (1500MHz A73, Raspberry Pi 4) 0.519 0.555 0.483 0.797 0.729 20220226; 1GHz U74 (JH7100, Visionfive V1)

    What is different about the Risc-V compared to the 68000 family or the Powe= >r-PC family? Why would it compete when the others did not? Didn't the Pow= >er-PC have the full backing of Sun and IBM, yet it still could not keep up?=

    PowerPC had and still has backing by IBM, but not Sun. It seems that
    in the early 2000s the goals of the three components of the AIM
    consortium diverged. IBM wanted high-performance CPUs aimed at
    supercomputers and video game consoles, Motorola wanted (and produced)
    embedded CPUs, and Apple wanted low-power-consumption CPUs for cheap. Apparently Apple did not want to pay enough for Motorola or IBM to
    invest enough to develop CPUs competetive in the metrics relevant for
    Apple with what Intel was producing for the PC market.

    Even Apple switched to Intel processors.=20

    Maybe it's news to you, but Apple switched from Intel to ARM ("Apple
    Silicon") several years ago.

    I don't know how Risc-V will do in the mobile market. From what I've read,=
    the reason it is getting traction is simply because there are no royalties= to pay.

    The reason RISC-V is gaining traction is because it has academia
    behind it. Commercial architectures have been shackled by
    rights-holders whims, which makes it problematic to use commercial architectures in courses or research projects. So the people at
    Berkeley started RISC-V in order to get rid of the shackles. And many
    in academia choose RISC-V unless there are good reasons to choose
    something else; e.g., I am currently writing a paper where the
    examples use RISC-V. So in 10 years the market will have lots of
    people who know RISC-V.

    So it appeals to the Chinese market.

    Not having to pay royalties (ARM tax) for the architecture appeals to
    all capitalists, not just the Chinese. One example is WD, who AFAIK
    are switching from ARM to RISC-V for their hard disk and SSD
    controllers.

    - anton
    --
    M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
    comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
    New standard: https://forth-standard.org/
    EuroForth 2023: https://euro.theforth.net/2023

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From none) (albert@21:1/5 to gnuarm.deletethisbit@gmail.com on Thu Jul 20 19:12:59 2023
    In article <f7a2be1f-9d98-46da-8762-9b4182347045n@googlegroups.com>,
    Lorem Ipsum <gnuarm.deletethisbit@gmail.com> wrote:
    <SNI
    I don't know how Risc-V will do in the mobile market. From what I've
    read, the reason it is getting traction is simply because there are no >royalties to pay. So it appeals to the Chinese market.

    My Forth (ciforth) runs on the RISC-V, under linux and I have tested the programs to control a midi keyboard. (My classic example of a
    sophisticated program).
    It is hard to see that the lithography machines China develops are
    going to be used to produce Intel or even ARM compatible machines.

    Have they ever
    produced anything that was significant in the mainstream markets?
    Dont be ridiculous. All but the most technically advanced stuff
    is produced in China.
    On the more high tech side solar cells and rare earth magnets come to mind.
    I have here a 2 XEON total 56 cores 25 Gbyte RAM HP workstation.
    All but a few parts are produced in China, Korea and Taiwan.

    The more visionary capitalist (Gates, Musk) realize that we are near
    a point of infinite production capacity. Something has to give way.
    We have a choice of total destruction of the original idea of communism.

    Rick C.

    Groetjes Albert
    --
    Don't praise the day before the evening. One swallow doesn't make spring.
    You must not say "hey" before you have crossed the bridge. Don't sell the
    hide of the bear until you shot it. Better one bird in the hand than ten in
    the air. First gain is a cat spinning. - the Wise from Antrim -

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From none) (albert@21:1/5 to m.cook@gmx.net on Thu Jul 20 19:35:25 2023
    In article <u99f56$29sk4$1@dont-email.me>,
    Matthias Koch <m.cook@gmx.net> wrote:

    I realize that you like to pull people's legs and often live in a
    fantasy world. Why do you want to pretend like Risc-V is a significant
    CPU competing with the Intel lines?

    Haha, wait for it! RISC-V is advancing quickly. Have you seen the >announcement that the Debian team plans official riscv64 architecture
    support in the upcoming version 13 "Trixie"?

    Linux's in the armbian sphere are already available for a long time.
    My DshanNezha system has a D1H processor. This System On A CHip has
    a 1000 + page exhaustive documentation which allow me to control
    gpio lines ("blinking leds") and a 30 Khz serial line (midi)
    with ease. Using a 64 bit port of ciforth for RISCV .
    The io library had to be adapted from ARM pi's , but the
    Forth relies only on system calls, and failed no tests.

    The family of 64 bits ciforths has now a new member RISCV https://home.hccnet.nl/a.w.m.van.der.horst/lina.html

    Boards are available. I ordered from https://www.analoglamb.com/
    a reliable supplier. (Warning this board is under Linux boards, not
    RISCV). If you want to experiment, pay 4 euro's more for the
    docking board.
    You don't want to connect to .5 mm gold fingers.
    The total cost is then 30 euro's

    Groetjes Albert
    --
    Don't praise the day before the evening. One swallow doesn't make spring.
    You must not say "hey" before you have crossed the bridge. Don't sell the
    hide of the bear until you shot it. Better one bird in the hand than ten in
    the air. First gain is a cat spinning. - the Wise from Antrim -

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)