Forum: >>> Magnum BBS <<<

Low Power Super Computing

From Rick C@21:1/5 to All on Tue May 31 19:34:04 2022

I watched a video (well, part of it anyway) about the current top dog super computer that performs 52.2 GFLOPS per watt. I think that's the territory of the GA144, no? I can't recall how many watts it is, but I'm thinking it's around 1 watt running flat out. Of course, it doesn't do floating point ops natively, so not really a good comparison. But for MIPS, it's about 100 GIPS per watt.

Not too shabby for a 12 year old design.

--

Rick C.

- Get 1,000 miles of free Supercharging
- Tesla referral code - https://ts.la/richard11209

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Marcel Hendrix@21:1/5 to gnuarm.del...@gmail.com on Tue May 31 21:52:55 2022

On Wednesday, June 1, 2022 at 4:34:06 AM UTC+2, gnuarm.del...@gmail.com wrote:

I watched a video (well, part of it anyway) about the current top dog super computer that performs 52.2 GFLOPS per watt. I think that's the territory of the GA144, no? I can't recall how many watts it is, but I'm thinking it's around 1 watt running flat out. Of course, it doesn't do floating point ops natively, so not really a good comparison. But for MIPS, it's about 100 GIPS per watt.

Not too shabby for a 12 year old design.

Is there no theoretical limit on the GFLOPS/MIPS given a certain manufacturing process and maybe a few other parameters?

-marcel

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Rick C@21:1/5 to Marcel Hendrix on Tue May 31 23:00:04 2022

On Wednesday, June 1, 2022 at 12:52:56 AM UTC-4, Marcel Hendrix wrote:

On Wednesday, June 1, 2022 at 4:34:06 AM UTC+2, gnuarm.del...@gmail.com wrote:

I watched a video (well, part of it anyway) about the current top dog super computer that performs 52.2 GFLOPS per watt. I think that's the territory of the GA144, no? I can't recall how many watts it is, but I'm thinking it's around 1 watt running flat out. Of course, it doesn't do floating point ops natively, so not really a good comparison. But for MIPS, it's about 100 GIPS per watt.

Not too shabby for a 12 year old design.

Is there no theoretical limit on the GFLOPS/MIPS given a certain manufacturing process and maybe a few other parameters?

Yes, there is a theoretical limit on the energy used for a given computation. I remember a Scientific American paper about it back when they actually had papers, before they became another Discover magazine.

--

Rick C.

+ Get 1,000 miles of free Supercharging
+ Tesla referral code - https://ts.la/richard11209

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From none) (albert@21:1/5 to gnuarm.deletethisbit@gmail.com on Wed Jun 1 09:54:21 2022

In article <d0c1bdda-dade-44bc-8626-5c7bb5298a57n@googlegroups.com>,
Rick C <gnuarm.deletethisbit@gmail.com> wrote:

I watched a video (well, part of it anyway) about the current top dog
super computer that performs 52.2 GFLOPS per watt. I think that's the
territory of the GA144, no? I can't recall how many watts it is, but
I'm thinking it's around 1 watt running flat out. Of course, it doesn't
do floating point ops natively, so not really a good comparison. But
for MIPS, it's about 100 GIPS per watt.

Not too shabby for a 12 year old design.

I'd love to see the hurdle of 1000 sensible instructions
per second broken on a GA144 chip.

Groetjes Albert
--
"in our communism country Viet Nam, people are forced to be
alive and in the western country like US, people are free to
die from Covid 19 lol" duc ha
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From none) (albert@21:1/5 to gnuarm.deletethisbit@gmail.com on Wed Jun 1 10:00:38 2022

In article <0beff084-7928-42bd-a068-0f8ed5603590n@googlegroups.com>,
Rick C <gnuarm.deletethisbit@gmail.com> wrote:

On Wednesday, June 1, 2022 at 12:52:56 AM UTC-4, Marcel Hendrix wrote:

On Wednesday, June 1, 2022 at 4:34:06 AM UTC+2, gnuarm.del...@gmail.com wrote:

I watched a video (well, part of it anyway) about the current top
dog super computer that performs 52.2 GFLOPS per watt. I think that's
the territory of the GA144, no? I can't recall how many watts it is, but
I'm thinking it's around 1 watt running flat out. Of course, it doesn't
do floating point ops natively, so not really a good comparison. But for
MIPS, it's about 100 GIPS per watt.

Not too shabby for a 12 year old design.

Is there no theoretical limit on the GFLOPS/MIPS given a certain
manufacturing process and maybe a few other parameters?

Yes, there is a theoretical limit on the energy used for a given
computation. I remember a Scientific American paper about it back when
they actually had papers, before they become another Discover magazine.

I remember another article in the same SA about reversible
computation, which doesn't increase entropy and so, in principle, requires no energy consumption. Apparently reversible computation can calculate anything.

P.S.
I dropped my subscription when they started expressing energy consumption as an equivalent number of hairdryers.

Groetjes Albert
--
"in our communism country Viet Nam, people are forced to be
alive and in the western country like US, people are free to
die from Covid 19 lol" duc ha
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Anton Ertl@21:1/5 to Rick C on Wed Jun 1 12:06:39 2022

Rick C <gnuarm.deletethisbit@gmail.com> writes:

I watched a video (well, part of it anyway) about the current top dog super
computer that performs 52.2 GFLOPS per watt. I think that's the territory of the GA144, no?

No.

I can't recall how many watts it is, but I'm thinking it's around 1 watt running flat out. Of course, it doesn't do floating point ops natively, so not really a good comparison. But for MIPS, it's about 100 GIPS per watt.

Doing what? Supercomputers are evaluated using the linpack benchmark,
which solves a dense system of linear equations <https://www.top500.org/project/linpack/>, something that
supercomputers tend to do not just for benchmarking.
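
To make the benchmark concrete, here is a minimal sketch of the kind of work Linpack measures: solving a dense system A x = b by Gaussian elimination with partial pivoting, while counting floating-point operations. It is a pure-Python illustration, not the actual benchmark code; the function name `solve_dense` and the tiny 2x2 system are made up for the example. For large n the count is dominated by a term of about 2/3 n^3, and a FLOPS rating is that count divided by the run time.

```python
def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Returns (x, flops), where flops counts the floating-point
    multiplies, divides, adds and subtracts performed.
    """
    n = len(b)
    a = [row[:] for row in a]   # work on copies, leave the inputs alone
    b = b[:]
    flops = 0

    # Forward elimination.
    for k in range(n):
        # Partial pivoting: pick the row with the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            flops += 1
            for j in range(k + 1, n):
                a[i][j] -= m * a[k][j]
                flops += 2
            b[i] -= m * b[k]
            flops += 2

    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i]
        for j in range(i + 1, n):
            s -= a[i][j] * x[j]
            flops += 2
        x[i] = s / a[i][i]
        flops += 1
    return x, flops


# Hypothetical example system: 2x + y = 3, x + 3y = 5.
x, flops = solve_dense([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
print(x, flops)
```

A chip without native floating point would have to emulate every one of those operations in software, which is why an instructions-per-watt figure does not translate directly into this metric.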

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2021: https://euro.theforth.net/2021

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Anton Ertl@21:1/5 to Marcel Hendrix on Wed Jun 1 12:13:55 2022

Marcel Hendrix <mhx@iae.nl> writes:

Is there no theoretical limit on the GFLOPS/MIPS given a certain manufacturing process and maybe a few other parameters?

Not that I know of.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2021: https://euro.theforth.net/2021

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Anton Ertl@21:1/5 to Rick C on Wed Jun 1 12:14:52 2022

Rick C <gnuarm.deletethisbit@gmail.com> writes:

Yes, there is a theoretical limit on the energy used for a given computation.

But it has nothing to do with semiconductor processes.

You are thinking of the Landauer limit <https://en.wikipedia.org/wiki/Landauer%27s_principle>, which is far
below the power dissipation of computers implemented in current
processes.
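
A rough back-of-the-envelope sketch of how far below: Landauer's bound for erasing one bit is k T ln 2. The code below evaluates it at an assumed temperature of 300 K and compares it with the energy per operation implied by the 52.2 GFLOPS per watt figure from the start of the thread. The function names are made up for the example, and a floating-point operation is of course not a single bit erasure, so the ratio is only an order-of-magnitude indication.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)


def landauer_limit_joules(temperature_kelvin):
    """Minimum energy to erase one bit at the given temperature: k T ln 2."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


def joules_per_op(giga_ops_per_watt):
    """Energy per operation implied by an efficiency in giga-ops per watt."""
    return 1.0 / (giga_ops_per_watt * 1e9)


limit = landauer_limit_joules(300.0)   # assumption: room temperature
per_flop = joules_per_op(52.2)         # figure quoted in the thread

print(f"Landauer limit at 300 K: {limit:.3e} J per bit erased")
print(f"Energy per FLOP at 52.2 GFLOPS/W: {per_flop:.3e} J")
print(f"Ratio: {per_flop / limit:.2e}")
```

Under these assumptions the machine spends on the order of billions of times the Landauer energy per operation.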

The thing about reversible computation is that it does not erase
memory (what costs energy in Landauer's principle), so it would allow
going below the Landauer limit, in a sense. However, you still need
some energy to drive the computation in a specific direction, and more
for driving it faster (at least that's what I read at one point).

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2021: https://euro.theforth.net/2021

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Marcel Hendrix@21:1/5 to Anton Ertl on Wed Jun 1 10:00:13 2022

On Wednesday, June 1, 2022 at 2:27:35 PM UTC+2, Anton Ertl wrote:
[..]

The thing about reversible computation is that it does not erase
memory (what costs energy in Landauer's principle), so it would allow
going below the Landauer limit, in a sense. However, you still need
some energy to drive the computation in a specific direction, and more
for driving it faster (at least that's what I read at one point).

I expected there to be a minimum amount of energy
to push a bunch of electrons from one detectable state
to another one. Might be same principle as Landauer, but
his idea that information and energy are somehow related
I find hard to grasp.

A boundary that is maybe more of practical concern: are there
theoretical limits related to pipelining (i.e. branch removal)
and/or parallel computing?

The human brain does not seem to have much of a problem with the speed
of communication (between cells), and doesn't overheat. Unfortunately,
it most-times refuses to compute exactly what I want.

-marcel

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Anton Ertl@21:1/5 to Marcel Hendrix on Wed Jun 1 17:29:37 2022

Marcel Hendrix <mhx@iae.nl> writes:

On Wednesday, June 1, 2022 at 2:27:35 PM UTC+2, Anton Ertl wrote:
[..]

The thing about reversible computation is that it does not erase
memory (what costs energy in Landauer's principle), so it would allow
going below the Landauer limit, in a sense. However, you still need
some energy to drive the computation in a specific direction, and more
for driving it faster (at least that's what I read at one point).

I think I read it in a collection by Feynman (who held a regular
lecture about physics of computation in the 1980s).

I expected there to be a minimum amount of energy
to push a bunch of electrons from one detectable state
to another one.

I think that's already too implementation-specific for this kind of
reasoning.

Might be same principle as Landauer, but
his idea that information and energy are somehow related
I find hard to grasp.

Information and entropy are related. E.g., consider Maxwell's demon.

A boundary that is maybe more of practical concern: are there
theoretical limits related to pipelining (i.e. branch removal)

Pipelining is not the same as branch removal. In (hardware)
pipelining, every pipeline stage adds ~5 gate delays to the delay of
the whole thing, for the holding latches, and for the jitter etc. of
the pipeline stage. It also adds to the power needs (both for the
additional gates and due to clocking higher). Intel planned to deepen
the Pentium 4 pipeline [sprangle&carmean02] in the Tejas (and AMD also
worked on a deeply pipelined CPU at the same time), but both projects
were cancelled in 2005; my guess is that there was a promising cooling technology that did not work out, so they could not produce CPUs with
such a high power density as planned.
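
The diminishing return from deeper pipelines can be sketched with a toy model: if the logic of one instruction takes a fixed number of gate delays in total and every stage adds a constant overhead (the ~5 gate delays mentioned above), then the cycle time is total_logic / stages + overhead. The total logic depth of 200 gate delays used below is a made-up number for illustration, not a figure for any real CPU, and the model ignores power and branch penalties entirely.

```python
def cycle_time(total_logic_delays, stages, overhead_per_stage=5.0):
    """Cycle time in gate delays for a pipeline with the given stage count."""
    if stages < 1:
        raise ValueError("need at least one stage")
    return total_logic_delays / stages + overhead_per_stage


def clock_speedup(total_logic_delays, stages_from, stages_to,
                  overhead_per_stage=5.0):
    """Clock-rate ratio gained by changing the pipeline depth."""
    return (cycle_time(total_logic_delays, stages_from, overhead_per_stage)
            / cycle_time(total_logic_delays, stages_to, overhead_per_stage))


# Hypothetical design with 200 gate delays of logic per instruction.
for stages in (10, 20, 40, 80):
    print(stages, cycle_time(200.0, stages))

# Doubling the depth from 20 to 40 stages raises the clock by 1.5x, not 2x.
print(clock_speedup(200.0, 20, 40))
```

In this model the clock rate can never exceed one cycle per overhead period, no matter how many stages are added.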

Branch prediction helps avoid the branch penalty of deep pipelines;
you cannot predict a really random branch, but apparently patterns in
the data that we don't see easily can be used by branch predictors.

@InProceedings{sprangle&carmean02,
author = {Eric Sprangle and Doug Carmean},
title = {Increasing Processor Performance by Implementing
Deeper Pipelines},
crossref = {isca02},
pages = {25--34},
url = {http://www.cs.cmu.edu/afs/cs/academic/class/15740-f03/public/doc/discussions/uniprocessors/technology/deep-pipelines-isca02.pdf},
annote = {This paper starts with the Willamette (Pentium~4)
pipeline and discusses and evaluates changes to the
pipeline length. In particular, it gives numbers on
how lengthening various latencies would affect IPC;
on a per-cycle basis the ALU latency is most
important, then L1 cache, then L2 cache, then branch
misprediction; however, the total effect of
lengthening the pipeline to double the clock rate
gives the reverse order (because branch
misprediction gains more cycles than the other
latencies). The paper reports 52 pipeline stages
with 1.96 times the original clock rate as optimal
for the Pentium~4 microarchitecture, resulting in a
reduction of 1.45 of core time and an overall
speedup of about 1.29 (including waiting for
memory). Various other topics are discussed, such as
nonlinear effects when introducing bypasses, and
varying cache sizes. Recommended reading.}
}

and/or parallel computing?

Amdahl's law. Often underestimated, often overestimated.
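
For reference, Amdahl's law says that if a fraction p of the work can be spread over n processors and the rest stays serial, the speedup is 1 / ((1 - p) + p / n). A minimal sketch follows; the 95% parallel fraction is an arbitrary assumption, and 144 is simply the core count of the GA144.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup predicted by Amdahl's law."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("need at least one processor")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# Assumed workload: 95% parallelizable, run on 1, 16 and 144 processors.
for n in (1, 16, 144):
    print(n, round(amdahl_speedup(0.95, n), 2))
```

With a 5% serial part the speedup can never exceed 20, however many processors are used.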

The human brain does not seem to have much of a problem with the speed
of communication (between cells),

It does not compute very fast.

and doesn't overheat.

Actually humans are reported to spend 25% of their energy on the
brain, and certainly more when people are thinking hard. And it can
become too hot.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: http://www.euroforth.org/ef22/cfp.html

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Rick C@21:1/5 to Marcel Hendrix on Wed Jun 1 13:45:52 2022

On Wednesday, June 1, 2022 at 1:00:14 PM UTC-4, Marcel Hendrix wrote:

On Wednesday, June 1, 2022 at 2:27:35 PM UTC+2, Anton Ertl wrote:
[..]

The thing about reversible computation is that it does not erase
memory (what costs energy in Landauer's principle), so it would allow
going below the Landauer limit, in a sense. However, you still need
some energy to drive the computation in a specific direction, and more
for driving it faster (at least that's what I read at one point).

I expected there to be a minimum amount of energy
to push a bunch of electrons from one detectable state
to another one. Might be same principle as Landauer, but
his idea that information and energy are somehow related
I find hard to grasp.

A boundary that is maybe more of practical concern: are there
theoretical limits related to pipelining (i.e. branch removal)
and/or parallel computing?

The human brain does not seem to have much of a problem with the speed
of communication (between cells), and doesn't overheat. Unfortunately,
it most-times refuses to compute exactly what I want.

The analysis has to be abstract. Electrons are not the only way to perform logic.

--

Rick C.

-- Get 1,000 miles of free Supercharging
-- Tesla referral code - https://ts.la/richard11209

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From none) (albert@21:1/5 to mhx@iae.nl on Thu Jun 2 11:35:06 2022

In article <6e8d4873-b1bc-4973-b5ae-7d8d58faa7ebn@googlegroups.com>,
Marcel Hendrix <mhx@iae.nl> wrote:

On Wednesday, June 1, 2022 at 2:27:35 PM UTC+2, Anton Ertl wrote:
[..]

The thing about reversible computation is that it does not erase
memory (what costs energy in Landauer's principle), so it would allow
going below the Landauer limit, in a sense. However, you still need
some energy to drive the computation in a specific direction, and more
for driving it faster (at least that's what I read at one point).

I expected there to be a minimum amount of energy
to push a bunch of electrons from one detectable state
to another one. Might be same principle as Landauer, but
his idea that information and energy are somehow related
I find hard to grasp.

Not the same as Landauer. Electric energy and gravitational energy
are types of free energy. They can be converted into each other
without loss. Theoretically. Going from 95 to 99 to 99.9 % is
possible, but it requires more and more sophistication.
That kind of thing. Lossless in a hard to reach limit.

A boundary that is maybe more of practical concern: are there
theoretical limits related to pipelining (i.e. branch removal)
and/or parallel computing?

I ignored that article because it wasn't practical, and I
didn't see consequences for real life.

groetjes Albert
--
"in our communism country Viet Nam, people are forced to be
alive and in the western country like US, people are free to
die from Covid 19 lol" duc ha
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)
