Forum: >>> Magnum BBS <<<

GA Misc chips. Rate you can address and execute off an external Srams?

From Wayne morellini@21:1/5 to All on Wed Aug 3 07:42:33 2022

On the GA chips, there is no full external memory bus, and a multi-node scheme has to be set up for a node to use an SRAM chip and execute code off it. I presumed this would take a lot of cycles per fetch, but then realised you could use nodes to synchronise sequential and random addressing and fetch.

How many cycles does it take to send an address (say, for the next word), fetch it, and start executing it?

Thanks.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Rick C@21:1/5 to Wayne morellini on Wed Aug 3 09:47:12 2022

What type of memory are you referring to?

--

Rick C.

- Get 1,000 miles of free Supercharging
- Tesla referral code - https://ts.la/richard11209

From Wayne morellini@21:1/5 to Wayne morellini on Wed Aug 3 10:03:14 2022

On Thursday, August 4, 2022 at 12:42:34 AM UTC+10, Wayne morellini wrote:

..use an Sram chip

..

Static ram.

From Rick C@21:1/5 to Wayne morellini on Wed Aug 3 13:48:32 2022

On Wednesday, August 3, 2022 at 1:03:15 PM UTC-4, Wayne morellini wrote:

On Thursday, August 4, 2022 at 12:42:34 AM UTC+10, Wayne morellini wrote:

..use an Sram chip

..

Static ram.

Static RAMs are getting harder to come by, but they all (the engineering-qualified "all") have one thing in common: they use very simple timing to control operation, with no clocks. In a generic SRAM, reads are purely asynchronous, only requiring WE to remain deasserted and an address to be set up. Then X ns after the last signal has changed, the output data will be stable and can be read.

A write is a bit more work. You have to set up the data and address, wait some minimum time, which may be zero (and may be different for the address and data, I'm not sure), and assert WE. Then wait some minimum time (usually similar to the read access time) and deassert WE.

Sometimes there is a read enable that simply tristates the data out; other times it is also involved in the write cycle timing. Keep it deasserted when not reading and it should not be an issue.

Bottom line is, you need to check the data sheets.
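
As a rough illustration of the read and write sequences described above, here is a minimal Python sketch. The timing names (t_aa, t_as, t_wp, t_dw, t_dh) and their values are placeholders in the style of a typical asynchronous SRAM data sheet, not figures for any real part, and the model ignores chip select and output enable.

```python
from dataclasses import dataclass


@dataclass
class SramTiming:
    """Hypothetical data sheet values in nanoseconds (placeholders only)."""
    t_aa: float = 10.0  # address valid to data valid (read access time)
    t_as: float = 0.0   # address setup before WE is asserted
    t_wp: float = 8.0   # minimum WE pulse width
    t_dw: float = 6.0   # data setup before WE is deasserted
    t_dh: float = 0.0   # data hold after WE is deasserted


def read_sequence(t: SramTiming):
    """Return (time_ns, action) steps for one asynchronous read."""
    return [
        (0.0, "keep WE deasserted, drive address"),
        (t.t_aa, "data out is stable, sample it"),
    ]


def write_sequence(t: SramTiming):
    """Return (time_ns, action) steps for one asynchronous write."""
    assert_we = t.t_as
    # WE must stay asserted long enough for both the pulse width and data setup.
    deassert_we = assert_we + max(t.t_wp, t.t_dw)
    return [
        (0.0, "drive address and data"),
        (assert_we, "assert WE"),
        (deassert_we, "deassert WE"),
        (deassert_we + t.t_dh, "release data bus"),
    ]


def cycle_time(steps):
    """Time of the last step, i.e. the minimum length of the whole access."""
    return steps[-1][0]
```

Calling `cycle_time(read_sequence(SramTiming()))` gives the shortest read the placeholder part would allow; real parts add further constraints, so the data sheet still has the last word.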

One big problem is the dependence of the F18A timing on temperature, voltage and process, so you have to leave wide margins in your timing calculations.

I'll let you map this to GA144 chip timings. They never specified the memory interface in a way that lets you actually design a memory interface with it. It was actually intended to be used with SDRAM, which is much lower power and lower cost. It's hard to even find much SRAM these days, and what's out there is very expensive. Check Digikey.
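
One hedged way to start that mapping is to count how many instruction times a node must wait for the SRAM. The sketch below assumes a nominal F18A instruction time of about 1.5 ns (roughly the 650 to 700 MHz figures quoted elsewhere in this thread) and a made-up 10 ns SRAM. The `fast_factor` derating is a hypothetical parameter for a node that runs faster than nominal, and the cost of passing words between nodes is ignored.

```python
import math


def wait_slots(access_ns: float, instr_ns: float, fast_factor: float = 1.0) -> int:
    """Instruction times to wait so the SRAM data is valid even on a fast node.

    fast_factor < 1 shortens the instruction time to model a node running
    faster than nominal (cold, high voltage, fast silicon). A faster node
    needs more wait slots to cover the same SRAM access time.
    """
    shortest_instr_ns = instr_ns * fast_factor
    return math.ceil(access_ns / shortest_instr_ns)


NOMINAL_INSTR_NS = 1.5   # assumption, not a guaranteed figure
SRAM_ACCESS_NS = 10.0    # placeholder access time

nominal = wait_slots(SRAM_ACCESS_NS, NOMINAL_INSTR_NS)
derated = wait_slots(SRAM_ACCESS_NS, NOMINAL_INSTR_NS, fast_factor=0.7)
```

With these placeholder numbers the margin alone adds several wait slots per access, which is why the wide margins matter.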

--

Rick C.

+ Get 1,000 miles of free Supercharging
+ Tesla referral code - https://ts.la/richard11209

From Rick C@21:1/5 to Rick C on Wed Aug 3 16:34:50 2022

I almost forgot. The fact that the memory interface was intended for SDRAM means the address bus is only 18 bits wide, I believe. So unless you cobble more nodes into the equation, you will be limited to a quarter megaword (256K words) of memory. So it gets even more hairy.
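
The quarter-megaword figure follows directly from the 18-bit address width. A small sketch of the arithmetic, where `bank_bits` is a hypothetical count of extra address lines supplied by additional nodes or a bank register:

```python
WORD_BITS = 18     # F18A word size
ADDRESS_BITS = 18  # address width as stated in the post


def addressable_words(address_bits: int, bank_bits: int = 0) -> int:
    """Number of distinct words reachable with the given address lines."""
    return 1 << (address_bits + bank_bits)


words = addressable_words(ADDRESS_BITS)                # 262144 words
total_bits = words * WORD_BITS                         # 4718592 bits
with_two_bank_bits = addressable_words(ADDRESS_BITS, bank_bits=2)  # 1048576 words
```

Each extra bank bit doubles the reachable memory, at the price of more pins and more code to manage the bank.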

Does Green Arrays not have an app note on this?

--

Rick C.

-- Get 1,000 miles of free Supercharging
-- Tesla referral code - https://ts.la/richard11209

From Wayne morellini@21:1/5 to gnuarm.del...@gmail.com on Thu Aug 4 06:13:07 2022

Well, thanks for this. I presumed it was SRAM, as I stopped following it years back.

I was trying to ask what the real-world maximum was, as I figured
somebody might have such experience.

Once I have such measurements, I can work out the ceiling and estimate how much
less it will be for various chips and for what I want. That ballparks the possibility, or
impossibility, of different scenarios, and shows whether it's worth going further.

I chose SRAM because it should be the simplest to drive to higher speeds. Admittedly,
a prefetch of the next word would help performance a little. But these are only peak
values, and have to be trimmed to use many PSRAM memories (pseudo SRAM, which is
DRAM with its own internal timing and refresh, presented through an external SRAM
interface). SRAM is only an ideal starting point. I don't know if there is any PSRAM that
can keep up, but it's out there and common. I'll just start looking for something which
can saturate the soft bus and work downwards. 700 MHz would be ideal, but in reality,
if it's not on chip, I doubt I'm going to find that in the marketplace cheap, or at all.

The real situation is that the GA memories only allow basic functionality for that project,
but the alternatives also have compromises, like using miniature MCU boards bigger
than a GA system with external memory. So a GA with high-speed memory
makes sense and offers a lot of functionality. The 18-bit restriction is fine, and I would
look at bank switching anyway. A full-speed execute bus and more memory per node
would be great, but enough speed on the soft bus would do a lot.

I was sizing up an ATtiny85 to fit in an RCA plug today, but need a way to make a little
board, or bendable board, to fit in there. I'm wondering about the possibility of using
connectors instead of a board.

I must say that the industry is missing out by not making a single, dual, or more
channel serial bus memory format, as mobile serial is hitting over 20 Gb/s. This
would allow a serial I/O interface with as many of these memories on the chip
circumference as there are processors there.

From Wayne morellini@21:1/5 to All on Sun Aug 7 07:57:06 2022

There is supposed to be a B version of the GA architecture coming, but I hope for
something that has hardware versions of the execution bus (and a serial-to-parallel bus
memory version with timing at the highest speed possible) and a DMA io/distributor, with
new internode communications.

If they could have an option to package large PSRAM memory and storage dies in the same
package, for the developer chips, that would be great. The chip is so low heat, they
could vertically stack them. We will see what's next.

I'm going to come out with it: I was looking at the possibility of mounting such chips on top of or
below the GA144 package, to reduce overall system size, and having the circuit board
mounted likewise, to keep the package close to 1 cm by 1 cm. I just had another idea,
but deleted it, and another one suitable for the microcontroller board
market. But it might be possible to affix the chips and circuits within a millimetre or so
on the chip package face. What is the best low-cost way to do it?

I'm actually looking for the smallest prototyping board possible for this package.

I still can't understand why interpreted execution through a software bus has to be slow
(I imagined it had a lot to do with the maximum data rate. Anybody know?). You are
basically sending an 18-bit address value and retrieving an 18-bit data value, then branching
to a routine on each shift to get each instruction on its own (which might need a duplicate), shifting to
align with the lookup table structure to branch to operations, then duplicating and repeating for the next
instruction until you finish the 18-bit sequence. So, that is slow.
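
The fetch, shift and dispatch loop described above can be sketched in a few lines of Python to show where the steps go. The 5/5/5/3 slot split is assumed here as the F18A word layout and should be checked against the GreenArrays documentation; the four opcodes are invented for the illustration, and the real chip's opcode encoding is ignored.

```python
SLOT_WIDTHS = (5, 5, 5, 3)  # assumed slot layout, 18 bits in total


def pack_slots(ops):
    """Build an 18-bit word from four slot values (helper for the demo)."""
    word = 0
    for width, op in zip(SLOT_WIDTHS, ops):
        assert 0 <= op < (1 << width)
        word = (word << width) | op
    return word


def unpack_slots(word):
    """Shift and mask the word into its slots, most significant slot first."""
    slots, shift = [], 18
    for width in SLOT_WIDTHS:
        shift -= width
        slots.append((word >> shift) & ((1 << width) - 1))
    return slots


# Hypothetical opcodes; only values 0 to 3 are defined in this sketch.
def op_nop(stack): pass
def op_one(stack): stack.append(1)
def op_dup(stack): stack.append(stack[-1])
def op_add(stack): stack.append(stack.pop() + stack.pop())

DISPATCH = {0: op_nop, 1: op_one, 2: op_dup, 3: op_add}


def interpret(words):
    """Run a list of words; count one step per fetch and one per dispatch."""
    stack, steps = [], 0
    for word in words:              # one external fetch per word
        steps += 1
        for opcode in unpack_slots(word):
            DISPATCH[opcode](stack)  # table lookup and branch per slot
            steps += 1
    return stack, steps


program = [pack_slots([1, 2, 3, 0])]  # one, dup, add, nop
result = interpret(program)
```

Even in this toy version each word costs one fetch plus one table lookup and branch per slot, and on a real node every one of those steps is several instructions.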

But if only you could load 18 bits in, like serial instructions, and execute; but then it has
no context for any branch in the scheme. You would have to capture those branches
and redirect to the right place, slowing things down, if that is possible at all (I don't imagine
serial execution supports branching; I haven't looked at it for years). So your routine
then has to load the VM version of the instruction word, execute instructions through local
branches, and, if there is a branch, interpret that branch according to the large address space.

So, a thought last night is to code everything as particles in large memory, execute within
the small address space of the particle, then address the larger address space as you would
in bank switching. Hard to program, but highest throughput. Now, can you use
a second node to handle addressing to speed things up (well, you have to, as you
don't have a second 18 lines on the first node, to my understanding)? Or would it
be better to use the second node in parallel to run a second parallel memory
access, which means dual-chip or dual-port memory? So here we might get 325+ MHz, and a
lot of mucking around. I don't know this chip well enough to tell if such port execution is
possible at speed, and if the particle-to-particle transfer can work at speed.

If only the 6-bit port on the chip was mapped to the external address bus pins,
then addressing within the particle would resolve automatically, and a software
write to the address pins would produce a particle bank swap. Now you just have to
store the current particle address to start to produce a call/return mechanism.
So that may be several memory cycles to do a call to a word routine in any particle in
the large memory. Programming to stay within a particle as much as possible
reduces the average time used by that mechanism. But if you treat each 64-word
particle as a call, without the ability to externally address internal word routines, you
reduce this amount even further.
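
A minimal sketch of that particle scheme, assuming 64-word particles with a 6-bit local address and a bank number that would be written to the external address pins. The class and method names are hypothetical and nothing here models real GA144 hardware; it only counts how often a bank swap would be needed.

```python
PARTICLE_BITS = 6                   # 64 words per particle, as in the post
PARTICLE_SIZE = 1 << PARTICLE_BITS


def split(address):
    """Split a large-memory address into (particle number, local offset)."""
    return address >> PARTICLE_BITS, address & (PARTICLE_SIZE - 1)


class ParticleMachine:
    def __init__(self):
        self.bank = 0        # current particle, would drive the address pins
        self.offset = 0      # 6-bit address inside the particle
        self.returns = []    # saved (bank, offset) pairs for call and return
        self.bank_swaps = 0  # how many times the bank had to change

    def _goto(self, bank, offset):
        if bank != self.bank:
            self.bank_swaps += 1  # the expensive software write
        self.bank, self.offset = bank, offset

    def jump_local(self, offset):
        """Branch inside the current particle: no bank swap needed."""
        self.offset = offset & (PARTICLE_SIZE - 1)

    def call_far(self, address):
        """Call a routine anywhere in large memory, saving the return point."""
        self.returns.append((self.bank, self.offset))
        self._goto(*split(address))

    def ret(self):
        """Return to the saved particle and offset."""
        self._goto(*self.returns.pop())
```

The variant where a particle can only be entered as a whole would correspond to always calling with a local offset of zero, so only the particle number needs to be stored.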

Please: an external PSRAM interface (or whatever goes full speed cheaply) and a serial-to-
parallel version (I forget which embedded standard has this, but a few serial ports
that support this would be great to distribute data stores around the chip
circumference). Still, 128 words of internal SRAM would also help (even 512).

From Wayne morellini@21:1/5 to gnuarm.del...@gmail.com on Sun Aug 7 07:52:00 2022

So, I saw in your post ages ago that you could only get up to 50 MHz accessing
external memory on the GA144. I had suspected it might be like that, until I read
into the SDRAM app note. With PSRAM availability, it's a mystery why a native bus
wasn't implemented, as it's basically like the internal bus. These MISC designs originally
came about to use cache SRAM. I'm very grateful the market made the pseudo
SRAM concept, as you can stick all sorts of commodity RAM chips into the package
and, with a cheap sliver of a memory controller circuit, translate that to an SRAM
interface. But, thinking about it, the same thing goes for any memory interface. My two-
wire parallel-to-serial auto-execute interface could just be a super simple controller
to commodity mobile memories (though these have major issues with value
shift). But there is bound to be a runner-up happy to use up their manufacturing
capacity in the PSRAM market, which includes a lot of microcontrollers.

So, the external memory for the 144 is not going to be useful for me, and there is not
enough internal memory. I had started to think of some quick techniques to do reads
and writes more quickly, even whether I could use port execution to somehow execute words
from a serial memory in a controllable order, or redesign an interpreter to get
closer to half of 650 MHz (say 213 MHz or less). Looks like they took my advice from many years
back and put a software DMA in, where I had wanted a very simple digital circuit instead,
allowing ports to communicate through it. No real figures are given. I was hunting
down a very small prototyping board for it, and looking at ways of powering a core of the
GA144 off the RCA video socket.

But, it would be too complex for my users. Some will get it, some others would think
those people were legends and most wouldn't. Colorforth would be enough of an
abstraction.

A colorforth-like virtual machine could run on something like an ATtiny85
(or whatever chips have enough memory and can ingress video, manipulate it enough,
and output a video stream in a single-chip package, likely connected together without
a mainboard), and maintain code compatibility with versions based on other
chips, with the same VM abstracted instruction set shared with the other project
and between products. The other project's instruction set architecture is going to be
different from the VM but still compatible. You can then just use this before you order a
custom chip. I actually had figured out a way to do an object-oriented processing system
which could enable a primitive system like an Atari 2600 to run complex 2D
games similar to Sonic the Hedgehog. So the retro processor project is going to be
interesting (the community processor is just a simple embedded design with good I/O; the
retro was actually going to be a very sophisticated processor, which I hadn't
remembered before, but I had identified that, with the game features dropped, it looked good
for embedded). These are just some of several different architecture proposals over
the decades. The retro one was an old 8-bit design proposal I had, which I am
applying new ideas to in 8- or 16-bit form. As you can see in the other thread, I have
located a minute processor FPGA card due for release that this could be tested on. The
other design proposals likely don't need instruction set validation as, instruction for
instruction, they are equivalent to, and can map to, an existing validated
instruction set. Your main issue is if you want to change little things to improve coding
and workflow, like testing jumps, where you have to find out if that is a good idea. So
the community processor is Just another MISC processor in another format (JaMISC),
and will compile code unaltered (if not self-changing, on the same address and I/O
space, of course).

You will also likely be able to find JavaScript APIs for these chips to optionally use, and
roll the VM into that, to make development more comfortable (before native
APIs are ready), but people don't need it; it would have a very small abstracted native
substrate to interface to.

So, there is still significant research to go to calculate which cheap part has enough processor to
properly handle the workload. I'm also identifying the project for a commercial
version with high-volume sales, as it substantially overlaps with a few other
planned projects. So it's still not a waste of time as a low-volume DIY. A shame; there is
a lot not said, but the GA144 version would have been pretty interesting.

From Wayne morellini@21:1/5 to Wayne morellini on Sun Aug 7 07:47:10 2022

I tried to post this a few nights back, but lost the post.

I'm feeling significantly better after a bad patch (I used lots of vitamin C and maybe pine
needle tea, had accidental bad reactions to two treatments, and found I had bought the
wrong version of something else I really needed, which was more harmful than good.
Then some flu, and heaps of unexpected problems and people causing me problems.
And I'm on cream for one cancer, and getting another cut out). It is now time to
rebuild and start to catch up on doing some things again. But I have discovered that all the
deterioration has made things marginal, such that even if I get well enough to do stuff,
slight sickness will make me less functional, which is an issue when doing complex technical
stuff. So that's where it stands with these projects.

I'll split the post.

From Wayne morellini@21:1/5 to Wayne morellini on Sat Sep 3 23:28:07 2022

Syncing video filter project threads.

From Wayne morellini@21:1/5 to Wayne morellini on Sat Sep 3 23:27:14 2022

Syncing video filter project threads.

From Wayne morellini@21:1/5 to Wayne morellini on Sun Sep 4 08:41:42 2022

Video filter project

https://groups.google.com/g/comp.lang.forth/c/S-_qe2Kh6gk/m/6DH2cIxEBwAJ
