ISTR playing with de-encapsulated DRAMs as image sensors
back in school (DRAM being relatively new technology, then).
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays. Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
On 12/29/22 8:33 AM, Dimiter_Popoff wrote:
On 12/29/2022 15:16, Don Y wrote:
ISTR playing with de-encapsulated DRAMs as image sensors
back in school (DRAM being relatively new technology, then).
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays. Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
Hah, Don, consider yourself lucky if you find a camera you have
enough documentation to use at all, serial or whatever.
The MIPI standards are only for politburo members (last time I looked
you need to make several millions annually to be able to *apply*
for membership, which of course costs thousands, annually again).
If you are looking for the very latest standards, yes. Enough data is
out there to handle a lot of basic MIPI operations. Since the small
player isn't going to be implementing the low-level interface
themselves (or at least shouldn't be trying to), unless you are
working with a bleeding-edge camera (which you probably can't actually
buy if you are a small player) you can generally find enough
information to use the camera.
My experience is that if you can actually buy the camera normally,
the data needed to use it will be available.
The big problem is "grey market" cameras; via unauthorized
distributors, you are at the mercy of the distributor to get you the needed data.
Not sure about USB, perhaps USB cameras are covered in the standard
(yet to deal with that one).
There is a USB video standard, and many USB cameras can just be plugged
in and used.
On 12/29/2022 19:21, Richard Damon wrote:
On 12/29/22 8:33 AM, Dimiter_Popoff wrote:
On 12/29/2022 15:16, Don Y wrote:
ISTR playing with de-encapsulated DRAMs as image sensors
back in school (DRAM being relatively new technology, then).
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays. Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
Hah, Don, consider yourself lucky if you find a camera you have
enough documentation to use at all, serial or whatever.
The MIPI standards are only for politburo members (last time I looked
you need to make several millions annually to be able to *apply*
for membership, which of course costs thousands, annually again).
If you are looking for the very latest standards, yes. Enough data is
out there to handle a lot of basic MIPI operations. Since the small
player isn't going to be trying to implement the low level interface
themselves (or at least shouldn't be trying to),
So how does one use a MIPI camera without using the low level interface?
unless you are trying to work with a bleeding edge camera (which you
probably can't actually buy if you are a small player) you can tend to
find enough information to use the camera.
That is fair enough, as long as we are talking about some internal
sensor specifics of the "bleeding edge" cameras.
My experience is if you can actually buy the camera normally, there
will be the data available to use it.
That's really reassuring. I am more interested in talking to MIPI
display modules than to cameras (at least the sequence is this) but
still.
The big problem is "Grey Market" cameras, via unauthorized
distributors you are at the mercy of the distributor to get you the
needed data.
Don't they conform to the MIPI standard? (which I have no access to).
Not sure about USB, perhaps USB cameras are covered in the standard
(yet to deal with that one).
There is a USB video standard, and many USB cameras can just be
plugged in and used.
OK, I thought I had seen that some years ago. Might be an escape (though cameras found in phones and tablets etc. are probably all MIPI).
On Thursday, December 29, 2022 at 12:06:40 PM UTC-5, Richard Damon wrote:
On 12/29/22 8:16 AM, Don Y wrote:
ISTR playing with de-encapsulated DRAMs as image sensors
back in school (DRAM being relatively new technology, then).
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays. Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
Using a DRAM in that manner would only give you a single bit value for
each pixel (maybe some more modern memories store multiple bits in a
cell so you get a few grey levels).
You could probably modulate the timing of the scans to get a range of
grey scale, even if small. Let the chip integrate for 1 unit, 2 units,
4 units, etc. of time. I'm assuming light responsiveness of the human
eye is logarithmic, rather than linear. If not, then 1, 2, 3, 4 units
of time. Even 16 levels of grey is much better than black and white.
It would be a bit of processing to translate the thermometer codes into
pixel values, but just time consuming, not hard.
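The thermometer-code translation described above can be sketched in a few lines. This is a hypothetical illustration, assuming one binary frame captured per exposure step (the function name and frame layout are invented for the example; a real DRAM sensor exposes no such API):

```python
# Sketch: convert per-exposure binary frames ("thermometer code") into
# grey levels. frames[k] is the bitmap captured after the k-th exposure
# step; a pixel that has tripped stays tripped in longer exposures.

def thermometer_to_grey(frames):
    """frames: list of 2-D lists of 0/1, shortest exposure first.
    Returns a 2-D list of grey levels: the index of the first
    exposure at which each pixel tripped (len(frames) = never)."""
    rows, cols = len(frames[0]), len(frames[0][0])
    grey = [[len(frames)] * cols for _ in range(rows)]
    for k, frame in enumerate(frames):          # shortest exposure first
        for r in range(rows):
            for c in range(cols):
                if frame[r][c] and grey[r][c] == len(frames):
                    grey[r][c] = k              # first exposure that tripped
    return grey

# A bright pixel trips even at the shortest exposure (grey 0);
# a dark one never trips (grey = number of exposure steps).
frames = [
    [[1, 0], [0, 0]],   # 1 unit
    [[1, 1], [0, 0]],   # 2 units
    [[1, 1], [1, 0]],   # 4 units
]
print(thermometer_to_grey(frames))  # [[0, 1], [2, 3]]
```

With exposure times doubling each step, the resulting grey scale is roughly logarithmic in light intensity, which matches the eye-response assumption above.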
On 12/29/22 8:16 AM, Don Y wrote:
ISTR playing with de-encapsulated DRAMs as image sensors
back in school (DRAM being relatively new technology, then).
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays. Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
Using a DRAM in that manner would only give you a single bit value for each pixel (maybe some more modern memories store multiple bits in a cell so you get
a few grey levels).
There are some CMOS sensors that let you address pixels individually and in a random order (like you got with the DRAM) but by its nature, such a readout method tends to be slow, and space inefficient, so these interfaces tend to be
only available on smaller camera arrays.
That is why most sensors read out via row/column shift registers to a pixel serial (maybe multiple pixels per clock) output, and if the camera includes its
own A/D conversion, might serialize the results to minimize interconnect.
On 12/29/22 12:45 PM, Dimiter_Popoff wrote:
On 12/29/2022 19:21, Richard Damon wrote:
On 12/29/22 8:33 AM, Dimiter_Popoff wrote:
On 12/29/2022 15:16, Don Y wrote:
ISTR playing with de-encapsulated DRAMs as image sensors
back in school (DRAM being relatively new technology, then).
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays. Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
Hah, Don, consider yourself lucky if you find a camera you have
enough documentation to use at all, serial or whatever.
The MIPI standards are only for politburo members (last time I looked
you need to make several millions annually to be able to *apply*
for membership, which of course costs thousands, annually again).
If you are looking for the very latest standards, yes. Enough data is
out there to handle a lot of basic MIPI operations. Since the small
player isn't going to be trying to implement the low level interface
themselves (or at least shouldn't be trying to),
So how does one use a MIPI camera without using the low level interface?
You use a chip that has a MIPI interface, either a CPU or FPGA with a
built-in MIPI interface, or a MIPI converter chip that converts the MIPI
interface into something you can deal with.
unless you are trying to work with a bleeding edge camera (which you
probably can't actually buy if you are a small player) you can tend
to find enough information to use the camera.
That is fair enough, as long as we are talking about some internal
sensor specifics of the "bleeding edge" cameras.
Bleeding-edge cameras/displays may need newer versions of MIPI than may
be easy to find in the consumer market. They may need bleeding-edge
processors.
As I mention below, more important are the configuration registers,
which might be harder to get for bleeding edge parts. This is often proprietary, as knowing what is adjustable is often part of the secret
sauce for those cameras.
My experience is if you can actually buy the camera normally, there
will be the data available to use it.
That's really reassuring. I am more interested in talking to MIPI
display modules than to cameras (at least the sequence is this) but
still.
So you want a chip with MIPI DSI capability built in, or a converter chip.
On 12/29/2022 10:06 AM, Richard Damon wrote:
On 12/29/22 8:16 AM, Don Y wrote:
ISTR playing with de-encapsulated DRAMs as image sensors
back in school (DRAM being relatively new technology, then).
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays. Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
Using a DRAM in that manner would only give you a single bit value for
each pixel (maybe some more modern memories store multiple bits in a
cell so you get a few grey levels).
I mentioned the DRAM reference only as an exemplar of how a "true"
parallel, random access interface could exist.
There are some CMOS sensors that let you address pixels individually
and in a random order (like you got with the DRAM) but by its nature,
such a readout method tends to be slow, and space inefficient, so
these interfaces tend to be only available on smaller camera arrays.
But, if you are processing the image, such an approach can lead to
higher throughput than having to transfer a serial data stream into
memory (thus consuming memory bandwidth).
That is why most sensors read out via row/column shift registers to a
pixel serial (maybe multiple pixels per clock) output, and if the
camera includes its own A/D conversion, might serialize the results to
minimize interconnect.
Yes, but then you have to store it in memory in order to examine it.
I.e., if your goal isn't just to pass the image out to a display,
then having to unpack the serial stream into RAM is an added cost.
On 12/29/22 2:26 PM, Don Y wrote:
On 12/29/2022 10:06 AM, Richard Damon wrote:
On 12/29/22 8:16 AM, Don Y wrote:
ISTR playing with de-encapsulated DRAMs as image sensors
back in school (DRAM being relatively new technology, then).
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays. Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
Using a DRAM in that manner would only give you a single bit value for
each pixel (maybe some more modern memories store multiple bits in a
cell so you get a few grey levels).
I mentioned the DRAM reference only as an exemplar of how a "true"
parallel, random access interface could exist.
Right, and cameras based on parallel random access do exist, but tend to be on
the smaller and slower end of the spectrum.
There are some CMOS sensors that let you address pixels individually
and in a random order (like you got with the DRAM) but by its nature,
such a readout method tends to be slow, and space inefficient, so these
interfaces tend to be only available on smaller camera arrays.
But, if you are processing the image, such an approach can lead to
higher throughput than having to transfer a serial data stream into
memory (thus consuming memory bandwidth).
My guess is that in almost all cases, the need to send the address to
the camera and then get back the pixel value is going to use up more
total bandwidth than getting the image in a stream. The one exception
would be if you need just a very small percentage of the array data,
and it is scattered over the array so a Region of Interest operation
can't be used.
That is why most sensors read out via row/column shift registers to a
pixel serial (maybe multiple pixels per clock) output, and if the
camera includes its own A/D conversion, might serialize the results to
minimize interconnect.
Yes, but then you have to store it in memory in order to examine it.
I.e., if your goal isn't just to pass the image out to a display,
then having to unpack the serial stream into RAM is an added cost.
Unless you make sure you get a camera with the same image format and timing as
your display.
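Richard's bandwidth guess is easy to check with rough arithmetic. A hypothetical sketch follows; the cycle counts are illustrative assumptions, not taken from any sensor or DRAM datasheet:

```python
# Back-of-envelope comparison: bus transactions to fetch pixels by
# random access (address phase + data phase per pixel) versus a
# streamed readout (one data word per pixel per clock). All figures
# here are illustrative assumptions.

def random_access_cycles(pixels, addr_cycles=2, data_cycles=1):
    # e.g. RAS/CAS-style addressing: two address edges, then the data
    return pixels * (addr_cycles + data_cycles)

def streamed_cycles(pixels, pixels_per_clock=1):
    # shift-register readout: pixels arrive every clock, no addressing
    return pixels // pixels_per_clock

frame = 640 * 480                    # a VGA frame
print(random_access_cycles(frame))   # 921600 cycles
print(streamed_cycles(frame))        # 307200 cycles

# Random access only wins for a small, scattered subset of the array:
subset = frame // 20                 # 5% of the pixels
print(random_access_cycles(subset) < streamed_cycles(frame))  # True
```

Under these assumptions the full-frame random-access readout costs roughly 3x the streamed readout, while fetching a scattered 5% subset costs well under one streamed frame, which matches the "very small percentage" exception above.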
On 12/29/2022 20:48, Richard Damon wrote:
On 12/29/22 12:45 PM, Dimiter_Popoff wrote:
On 12/29/2022 19:21, Richard Damon wrote:
On 12/29/22 8:33 AM, Dimiter_Popoff wrote:
On 12/29/2022 15:16, Don Y wrote:
ISTR playing with de-encapsulated DRAMs as image sensors
back in school (DRAM being relatively new technology, then).
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays. Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
Hah, Don, consider yourself lucky if you find a camera you have
enough documentation to use at all, serial or whatever.
The MIPI standards are only for politburo members (last time I looked
you need to make several millions annually to be able to *apply*
for membership, which of course costs thousands, annually again).
If you are looking for the very latest standards, yes. Enough data
is out there to handle a lot of basic MIPI operations. Since the
small player isn't going to be trying to implement the low level
interface themselves (or at least shouldn't be trying to),
So how does one use a MIPI camera without using the low level interface?
You use a chip that has a mipi interface, either a CPU or FPGA with a
built in MIPI interface or a MIPI converter chip that converts the
MIPI interface into something you can deal with.
An FPGA with MIPI would do, I have not looked for one yet.
unless you are trying to work with a bleeding edge camera (which you
probably can't actually buy if you are a small player) you can tend
to find enough information to use the camera.
That is fair enough, as long as we are talking about some internal
sensor specifics of the "bleeding edge" cameras.
Bleeding Edge cameras/displays may need newer versions of MIPI than
may be easy to find in the consumer market. They may need bleeding
edge processors.
Well, a 64-bit, GHz-range, 4- or 8-core Power architecture part should
be plenty. But I am not after bleeding-edge cameras; a decent one I
can control will do.
As I mention below, more important are the configuration registers,
which might be harder to get for bleeding edge parts. This is often
proprietary, as knowing what is adjustable is often part of the secret
sauce for those cameras.
Do you get that sort of data for decent cameras? Sort of like how
to focus it etc.? Or do you have to rely on black box "converters",
like with wifi modules which won't let you get around their tcp/ip
stack?
My experience is if you can actually buy the camera normally, there
will be the data available to use it.
That's really reassuring. I am more interested in talking to MIPI
display modules than to cameras (at least the sequence is this) but
still.
So you want a chip with MIPI DSI capability built in, or a convert chip.
Not really, no. I want to be able to put the framebuffer data into
the display like I have been doing with RGB, hsync, vsync etc., via
a parallel or LVDS interface. Is there enough info out there on how to
do this with an FPGA? I think I have enough info to do HDMI this way,
but not MIPI. Well, my guess is that pixel data will still be pixel
data etc.; can't be that hard.
On 12/29/2022 2:09 PM, Richard Damon wrote:
On 12/29/22 2:26 PM, Don Y wrote:
On 12/29/2022 10:06 AM, Richard Damon wrote:
On 12/29/22 8:16 AM, Don Y wrote:
ISTR playing with de-encapsulated DRAMs as image sensors
back in school (DRAM being relatively new technology, then).
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays. Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
Using a DRAM in that manner would only give you a single bit value
for each pixel (maybe some more modern memories store multiple bits
in a cell so you get a few grey levels).
I mentioned the DRAM reference only as an exemplar of how a "true"
parallel, random access interface could exist.
Right, and cameras based on parallel random access do exist, but tend
to be on the smaller and slower end of the spectrum.
There are some CMOS sensors that let you address pixels individually
and in a random order (like you got with the DRAM) but by its
nature, such a readout method tends to be slow, and space
inefficient, so these interfaces tend to be only available on
smaller camera arrays.
But, if you are processing the image, such an approach can lead to
higher throughput than having to transfer a serial data stream into
memory (thus consuming memory bandwidth).
My guess is that in almost all cases, the need to send the address to
teh camera and then get back the pixel value is going to use up more
total bandwidth than getting the image in a stream. The one exception
would be if you need just a very small percentage of the array data,
and it is scattered over the array so a Region of Interest operation
can't be used.
No, you're missing the nature of the DRAM example.
You don't "send" the address of the memory cell desired *to* the DRAM.
You simply *address* the memory cell, directly. I.e., if there are
N locations in the DRAM, then N addresses in your address space are
consumed by it; one for each location in the array.
I'm looking for *that* sort of "direct access" in a camera.
I could *emulate* it by building a module that implements <whatever> interface to <whichever> camera and deserializes the data into a
RAM. Then, mapping that *entire* RAM into the address space of the
host processor.
(Keeping the RAM updated would require a pseudo dual-ported architecture; possibly toggling between an "active" RAM and an "updated" RAM so that
the full bandwidth of the RAM was available to the host)
Having the host processor (DMA, etc.) perform this task means it loses bandwidth to the "deserialization" activity.
That is why most sensors read out via row/column shift registers to
a pixel serial (maybe multiple pixels per clock) output, and if the
camera includes its own A/D conversion, might serialize the results
to minimize interconnect.
Yes, but then you have to store it in memory in order to examine it.
I.e., if your goal isn't just to pass the image out to a display,
then having to unpack the serial stream into RAM is an added cost.
Unless you make sure you get a camera with the same image format and
timing as your display.
I typically don't "display" the images captured. Rather, I use the
cameras as sensors: is there anything in the path of the closing
(or opening) garage door that should cause me to inhibit/abort
those actions? has the mail truck appeared at the mailbox, yet,
today? *who* is standing at the front door?
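The "active"/"updated" RAM toggling Don describes can be modelled in a few lines. This is a hypothetical software analogue of the hardware scheme (the class and method names are invented; the real thing would be bus-switch logic in front of two RAMs, not Python):

```python
# Sketch of the ping-pong buffer scheme: the deserializer fills one
# buffer while the host reads the other at full bandwidth; the two
# swap roles each time a complete frame has been captured, so the
# host never sees a partially written ("torn") frame.

class PingPongFrameStore:
    def __init__(self, size):
        self._bufs = [bytearray(size), bytearray(size)]
        self._active = 0                    # index the host reads from

    def capture(self, frame_bytes):
        """Deserializer side: fill the inactive buffer, then swap."""
        back = self._bufs[1 - self._active]
        back[:] = frame_bytes
        self._active = 1 - self._active     # completed frame goes live

    def read(self, offset, length):
        """Host side: random access into the last complete frame."""
        buf = self._bufs[self._active]
        return bytes(buf[offset:offset + length])

store = PingPongFrameStore(4)
store.capture(b"\x01\x02\x03\x04")
print(store.read(1, 2))                     # b'\x02\x03'
store.capture(b"\x05\x06\x07\x08")
print(store.read(0, 2))                     # b'\x05\x06'
```

Mapping the active buffer into the host's address space would then give the "chunk of memory" view of the camera that the original question asked about, at the cost of one frame of latency.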
On 12/30/2022 0:57, Don Y wrote:
On 12/29/2022 2:09 PM, Richard Damon wrote:
On 12/29/22 2:26 PM, Don Y wrote:
On 12/29/2022 10:06 AM, Richard Damon wrote:
On 12/29/22 8:16 AM, Don Y wrote:
ISTR playing with de-encapsulated DRAMs as image sensors
back in school (DRAM being relatively new technology, then).
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays. Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
Using a DRAM in that manner would only give you a single bit value for
each pixel (maybe some more modern memories store multiple bits in a
cell so you get a few grey levels).
I mentioned the DRAM reference only as an exemplar of how a "true"
parallel, random access interface could exist.
Right, and cameras based on parallel random access do exist, but tend
to be on the smaller and slower end of the spectrum.
There are some CMOS sensors that let you address pixels individually
and in a random order (like you got with the DRAM) but by its nature,
such a readout method tends to be slow, and space inefficient, so
these interfaces tend to be only available on smaller camera arrays.
But, if you are processing the image, such an approach can lead to
higher throughput than having to transfer a serial data stream into
memory (thus consuming memory bandwidth).
My guess is that in almost all cases, the need to send the address to
the camera and then get back the pixel value is going to use up more
total bandwidth than getting the image in a stream. The one exception
would be if you need just a very small percentage of the array data,
and it is scattered over the array so a Region of Interest operation
can't be used.
No, you're missing the nature of the DRAM example.
You don't "send" the address of the memory cell desired *to* the DRAM.
You simply *address* the memory cell, directly. I.e., if there are
N locations in the DRAM, then N addresses in your address space are
consumed by it; one for each location in the array.
I'm looking for *that* sort of "direct access" in a camera.
I could *emulate* it by building a module that implements <whatever>
interface to <whichever> camera and deserializes the data into a
RAM. Then, mapping that *entire* RAM into the address space of the
host processor.
(Keeping the RAM updated would require a pseudo dual-ported architecture;
possibly toggling between an "active" RAM and an "updated" RAM so that
the full bandwidth of the RAM was available to the host)
Having the host processor (DMA, etc.) perform this task means it loses
bandwidth to the "deserialization" activity.
Well of course but are you sure you can really win much? At first
glance you'd be able to halve the memory bandwidth.
But then you may
run into problems with "doppler" kind of effects (clearly not Doppler
but you get the idea) if you access the frame being acquired; so you'll
want that double buffering you are talking about elsewhere (one frame
being acquired and one having been acquired prior to that). Which would
mean that somewhere something will have to do the copying you want to avoid...
Since you have already done it with USB cameras I think the practical
way is to just keep doing it this way, may be not USB if you can
find some more economic way to do it, MIPI or whatever.
On 12/29/22 5:57 PM, Don Y wrote:
On 12/29/2022 2:09 PM, Richard Damon wrote:
On 12/29/22 2:26 PM, Don Y wrote:
On 12/29/2022 10:06 AM, Richard Damon wrote:
On 12/29/22 8:16 AM, Don Y wrote:
ISTR playing with de-encapsulated DRAMs as image sensors
back in school (DRAM being relatively new technology, then).
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays. Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
Using a DRAM in that manner would only give you a single bit value for
each pixel (maybe some more modern memories store multiple bits in a
cell so you get a few grey levels).
I mentioned the DRAM reference only as an exemplar of how a "true"
parallel, random access interface could exist.
Right, and cameras based on parallel random access do exist, but tend
to be on the smaller and slower end of the spectrum.
There are some CMOS sensors that let you address pixels individually
and in a random order (like you got with the DRAM) but by its nature,
such a readout method tends to be slow, and space inefficient, so
these interfaces tend to be only available on smaller camera arrays.
But, if you are processing the image, such an approach can lead to
higher throughput than having to transfer a serial data stream into
memory (thus consuming memory bandwidth).
My guess is that in almost all cases, the need to send the address to
the camera and then get back the pixel value is going to use up more
total bandwidth than getting the image in a stream. The one exception
would be if you need just a very small percentage of the array data,
and it is scattered over the array so a Region of Interest operation
can't be used.
No, you're missing the nature of the DRAM example.
You don't "send" the address of the memory cell desired *to* the DRAM.
You simply *address* the memory cell, directly. I.e., if there are
N locations in the DRAM, then N addresses in your address space are
consumed by it; one for each location in the array.
No, look at your DRAM timing again: the transaction begins with sending
the address, typically over two clock edges with RAS and CAS, and then
after a couple of clock cycles you get the answer back on the data bus.
Yes, the addresses come from an address bus, using address space out of
the processor, but it is a multi-cycle operation. Typically, you read
back a "burst" with some minimal caching on the processor side, but
that is more a minor detail.
I'm looking for *that* sort of "direct access" in a camera.
It's been a while, but I thought some CMOS cameras could work on a
similar basis: strobe a row/column address from pins on the camera, and
a few clock cycles later you get a burst out of the camera starting at
the addressed cell.
On 12/29/2022 6:33 AM, Dimiter_Popoff wrote:
On 12/29/2022 15:16, Don Y wrote:
ISTR playing with de-encapsulated DRAMs as image sensors
back in school (DRAM being relatively new technology, then).
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays. Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
Hah, Don, consider yourself lucky if you find a camera you have
enough documentation to use at all, serial or whatever.
The MIPI standards are only for politburo members (last time I looked
you need to make several millions annually to be able to *apply*
for membership, which of course costs thousands, annually again).
Not sure about USB, perhaps USB cameras are covered in the standard
(yet to deal with that one).
I built my prototypes (proof-of-principle) using COTS USB cameras.
But, getting the data out of the serial data stream and into RAM so
it can be analyzed consumes memory bandwidth.
I'm currently trying to sort out an approximate cost factor "per
camera" (per video stream) and looking for ways that I can cut costs
(memory bandwidth requirements) to allow greater numbers of
cameras or higher frame rates.
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays. Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
I built my prototypes (proof-of-principle) using COTS USB cameras.
But, getting the data out of the serial data stream and into RAM so
it can be analyzed consumes memory bandwidth.
I'm currently trying to sort out an approximate cost factor "per
camera" (per video stream) and looking for ways that I can cut costs
(memory bandwidth requirements) to allow greater numbers of
cameras or higher frame rates.
You aren't going to find anything low cost ... if you want bandwidth
for multiple cameras, you need to look into bus based frame grabbers.
They still exist, but are (relatively) expensive and getting harder to
find.
On 12/30/22 2:27 AM, Don Y wrote:
Hi George!
[Hope you are faring well... enjoying the COLD! ;) ]
On 12/29/2022 10:29 PM, George Neuner wrote:
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays. Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
I built my prototypes (proof-of-principle) using COTS USB cameras.
But, getting the data out of the serial data stream and into RAM so
it can be analyzed consumes memory bandwidth.
I'm currently trying to sort out an approximate cost factor "per
camera" (per video stream) and looking for ways that I can cut costs
(memory bandwidth requirements) to allow greater numbers of
cameras or higher frame rates.
You aren't going to find anything low cost ... if you want bandwidth
for multiple cameras, you need to look into bus based frame grabbers.
They still exist, but are (relatively) expensive and getting harder to
find.
So, my options are:
- reduce the overall frame rate such that N cameras can
be serviced by the USB (or whatever) interface *and*
the processing load
- reduce the resolution of the cameras (a special case of the above)
- reduce the number of cameras "per processor" (again, above)
- design a "camera memory" (frame grabber) that I can install
multiply on a single host
- develop distributed algorithms to allow more bandwidth to
effectively be applied
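The first three options are really one trade-off against a fixed transport budget. A sketch, assuming a round usable-payload figure for the bus (the 400 MB/s number is an invented illustration, not a measured USB figure):

```python
def max_cameras(bus_bw_MBps, width, height, bytes_per_pixel, fps):
    """How many full-rate streams fit in a given bus budget."""
    per_cam = width * height * bytes_per_pixel * fps / 1e6  # MB/s
    return int(bus_bw_MBps // per_cam)

# Assumed 400 MB/s of usable bus payload:
print(max_cameras(400, 1920, 1080, 2, 30))  # 3 full-rate 1080p streams
print(max_cameras(400, 1920, 1080, 2, 15))  # halving fps doubles the count
```

Reducing resolution or frame rate, or splitting cameras across hosts, all just move terms in this one inequality.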
The fact that you are starting from the concept of using "USB cameras" sort of saddles you with that limit.
My personal thought on your problem is that you want to put a "cheap" processor right on each camera, using a processor with a direct camera interface to pull in the image, do your processing locally, and send the results over some comm link to
the central core.
It is unclear what your actual image requirements per camera are, so it is hard
to say what level of camera and processor you will need.
My first feeling is you seem to be assuming a fairly cheap camera and then doing some fairly simple processing over the partial image, in which case you might even be able to live with a camera that uses a crude SPI interface to bring the frame in, and a very simple processor.
On 12/30/2022 9:24 AM, Richard Damon wrote:
The fact that you are starting from the concept of using "USB cameras"
sort of saddles you with that limit.
My personal thought on your problem is that you want to put a "cheap"
processor right on each camera, using a processor with a direct camera
interface to pull in the image, do your processing locally, and send the
results over some comm link to the central core.
If I went the frame-grabber approach, that would be how I would address the hardware. But, it doesn't scale well. I.e., at what point do you throw in the towel and say there are too many concurrent images in the scene to
pile them all onto a single "host" processor?
ISTM that the better solution is to develop algorithms that can
process portions of the scene, concurrently, on different "hosts".
Then, coordinate these "partial results" to form the desired result.
I already have a "camera module" (host+USB camera) that has adequate processing power to handle a "single camera scene". But, these all
assume the scene can be easily defined to fit in that camera's field
of view. E.g., point a camera across the path of a garage door and have
it "notice" any deviation from the "unobstructed" image.
When the scene gets too large to represent in enough detail in a single camera's field of view, then there needs to be a way to coordinate
multiple cameras to a single (virtual?) host. If those cameras were just "chunks of memory", then the *imagery* would be easy to examine in a single host -- though the processing power *might* need to increase geometrically (depending on your current goal)
Moving the processing to "host per camera" implementation gives you more MIPS. But, makes coordinating partial results tedious.
It is unclear what your actual image requirements per camera are, so it
is hard to say what level of camera and processor you will need.
My first feeling is you seem to be assuming a fairly cheap camera and
then doing some fairly simple processing over the partial image, in
which case you might even be able to live with a camera that uses a
crude SPI interface to bring the frame in, and a very simple processor.
I use A LOT of cameras. But, I should be able to swap the camera (upgrade/downgrade) and still rely on the same *local* compute engine.
E.g., some of my cameras have Ir illuminators; it's not important
in others; some are PTZ; others fixed.
Watching for an obstruction in the path of a garage door (open/close)
has different requirements than trying to recognize a visitor at the front door. Or, identify the locations of the occupants of a facility.
If I went the frame-grabber approach, that would be how I would address the
hardware. But, it doesn't scale well. I.e., at what point do you throw in
the towel and say there are too many concurrent images in the scene to
pile them all onto a single "host" processor?
That's why I didn't suggest that method. I was suggesting each camera has its own tightly coupled processor that handles the needs of THAT camera.
ISTM that the better solution is to develop algorithms that can
process portions of the scene, concurrently, on different "hosts".
Then, coordinate these "partial results" to form the desired result.
I already have a "camera module" (host+USB camera) that has adequate
processing power to handle a "single camera scene". But, these all
assume the scene can be easily defined to fit in that camera's field
of view. E.g., point a camera across the path of a garage door and have
it "notice" any deviation from the "unobstructed" image.
And if one camera can't fit the full scene, you use two cameras, each with their own processor, and they each process their own image.
The only problem is if your image-processing algorithm needs to compare parts of
the images between the two cameras, which seems unlikely.
It does mean that if you are trying to track something across the cameras, you need enough overlap to allow them to hand off the object while it is in the overlap.
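That hand-off-in-the-overlap rule can be sketched as a tiny ownership function (the one-dimensional world coordinates and camera ranges below are invented purely for illustration):

```python
# Hypothetical layout: camera A covers x in [0, 100), camera B covers
# x in [80, 180) in a shared world coordinate, so [80, 100) is overlap.
A_RANGE, B_RANGE = (0, 100), (80, 180)

def owner(x, current="A"):
    """Decide which camera's node should track an object at world x.
    Inside the overlap, keep the current owner (hysteresis) so the
    hand-off happens only when both cameras can see the object."""
    in_a = A_RANGE[0] <= x < A_RANGE[1]
    in_b = B_RANGE[0] <= x < B_RANGE[1]
    if in_a and in_b:
        return current
    return "A" if in_a else "B"

print(owner(50))        # A: only camera A sees it
print(owner(90, "A"))   # still A: in the overlap, no churn
print(owner(120, "A"))  # B: object has left A's field entirely
```

Without the overlap band, an object could sit in a blind seam and neither node would own it.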
When the scene gets too large to represent in enough detail in a single
camera's field of view, then there needs to be a way to coordinate
multiple cameras to a single (virtual?) host. If those cameras were just "chunks of memory", then the *imagery* would be easy to examine in a single host -- though the processing power *might* need to increase geometrically (depending on your current goal)
Yes, but your "chunks of memory" model just doesn't exist as a viable camera model.
The CMOS cameras with addressable pixels have "access times" significantly worse than your typical memory (and pixels are read-once), so they don't really meet that model. Some of them do allow for specifying multiple small regions of interest and downloading just those regions, but this then starts to require moderate processor overhead to be loading all these regions and updating the grabber to
put them where you want.
And yes, it does mean that there might be some cases where you need a core module that has TWO cameras connected to a single processor, either to get a wider field of view, or to combine two different types of camera (maybe a high
res black and white to a low res color if you need just minor color information, or combine a visible camera to a thermal camera). These just become another tool in your tool box.
Moving the processing to "host per camera" implementation gives you more MIPS. But, makes coordinating partial results tedious.
Depends on what sort of partial results you are looking at.
It is unclear what your actual image requirements per camera are, so it is hard to say what level of camera and processor you will need.
My first feeling is you seem to be assuming a fairly cheap camera and then doing some fairly simple processing over the partial image, in which case you might even be able to live with a camera that uses a crude SPI interface
to bring the frame in, and a very simple processor.
I use A LOT of cameras. But, I should be able to swap the camera
(upgrade/downgrade) and still rely on the same *local* compute engine.
E.g., some of my cameras have Ir illuminators; it's not important
in others; some are PTZ; others fixed.
Doesn't sound reasonable. If you downgrade a camera, you can't count on it being able to meet the same requirements, or you over-specced the initial camera.
You put on a camera a processor capable of handling the tasks you expect out of
that set of hardware. One type of processor likely can handle a variety of different camera setups.
Watching for an obstruction in the path of a garage door (open/close)
has different requirements than trying to recognize a visitor at the front door. Or, identify the locations of the occupants of a facility.
Yes, so you don't want to "Pay" for the capability to recognize a visitor in your garage door sensor, so you use different levels of sensor/processor.
On 12/30/2022 11:02 AM, Richard Damon wrote:
If I went the frame-grabber approach, that would be how I would address the
hardware. But, it doesn't scale well. I.e., at what point do you throw in
the towel and say there are too many concurrent images in the scene to
pile them all onto a single "host" processor?
Thats why I didn't suggest that method. I was suggesting each camera
has its own tightly coupled processor that handles the need of THAT
My existing "module" handles a single USB camera (with a fairly heavy-weight processor).
But, being USB-based, there is no way to look at *part* of an image.
And, I have to pay a relatively high cost (capturing the entire
image from the serial stream) to look at *any* part of it.
*If* a "camera memory" was available, I would site N of these
in the (64b) address space of the host and let the host pick
and choose which parts of which images it wanted to examine...
without worrying about all of the bandwidth that would have been
consumed deserializing those N images into that memory (which is
a continuous process)
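If such a "camera memory" existed, examining part of an image would be nothing but address arithmetic. A sketch of the row-major offset calculation the host would do (frame geometry below is an illustrative assumption):

```python
def pixel_offset(row, col, width, bytes_per_pixel, frame_base=0):
    """Byte address of pixel (row, col) in a row-major frame buffer.
    With a memory-mapped camera, the host reads just a sub-window by
    computing offsets like this, instead of streaming the whole frame."""
    return frame_base + (row * width + col) * bytes_per_pixel

# Assumed 640x480 frame, 2 bytes/pixel: where does pixel (100, 320) live?
print(pixel_offset(100, 320, 640, 2))  # byte offset 128640
```

The bandwidth saving comes from touching only the addressed bytes; the unread rest of the frame costs nothing on the host bus.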
ISTM that the better solution is to develop algorithms that can
process portions of the scene, concurrently, on different "hosts".
Then, coordinate these "partial results" to form the desired result.
I already have a "camera module" (host+USB camera) that has adequate
processing power to handle a "single camera scene". But, these all
assume the scene can be easily defined to fit in that camera's field
of view. E.g., point a camera across the path of a garage door and have it "notice" any deviation from the "unobstructed" image.
And if one camera can't fit the full scene, you use two cameras, each
with their own processor, and they each process their own image.
That's the above approach, but...
The only problem is if your image-processing algorithm needs to compare
parts of the images between the two cameras, which seems unlikely.
Consider watching a single room (e.g., a lobby at a business) and
tracking the movements of "visitors". It's unlikely that an individual's movements would always be constrained to a single camera field. There will be times when he/she is "half-in" a field (and possibly NOT in the other, HALF in the other or ENTIRELY in the other). You can't ignore cases where the entire object (or, your notion of what that object's characteristics might be) is not entirely in the field as that leaves a vulnerability.
For example, I watch our garage door with *four* cameras. A camera is positioned on each side (door jamb) of the door "looking at" the other camera. This because a camera can't likely see the full height of the door opening ON ITS SIDE OF THE DOOR (so, the opposing camera watches "my side" and I'll watch *its* side!).
[The other two cameras are similarly positioned on the overhead *track*
onto which the door rolls, when open]
An object in (or near) the doorway can be visible in one (either) or
both cameras, depending on where it is located. Additionally, one of
those manifestations may be only "partial" as regards to where it is
located and intersects the cameras' fields of view.
The "cost" of watching the door is only the cost of the actual *cameras*.
The cost of the compute resources is amortized over the rest of the system
as those can be used for other, non-camera, non-garage related activities.
It does say that if trying to track something across the cameras, you
need enough overlap to allow them to hand off the object when it is in
the overlap.
And, objects that consume large portions of a camera's field of view
require similar handling (unless you can always guarantee that cameras
and targets are "far apart")
When the scene gets too large to represent in enough detail in a single
camera's field of view, then there needs to be a way to coordinate
multiple cameras to a single (virtual?) host. If those cameras were just
"chunks of memory", then the *imagery* would be easy to examine in a single
host -- though the processing power *might* need to increase geometrically
(depending on your current goal)
Yes, but your "chunks of memory" model just doesn't exist as a viable
camera model.
Apparently not -- in the COTS sense. But, that doesn't mean I can't
build a "camera memory emulator".
The downside is that this increases the cost of the "actual camera"
(see my above comment wrt amortization).
And, it just moves the point at which a single host (of fixed capabilities) can no longer handle the scene's complexity. (when you have 10 cameras?)
The CMOS cameras with addressable pixels have "access times"
significantly worse than your typical memory (and are read-once) so
don't really meet that model. Some of them do allow for specifying
multiple small regions of interest and downloading just those
regions, but this then starts to require moderate processor overhead
to be loading all these regions and updating the grabber to put them
where you want.
You would, instead, let the "camera memory emulator" capture the entire
image from the camera and place the entire image in a contiguous
region of memory (from the perspective of the host). The cost of capturing the portions that are not used is hidden *in* the cost of the "emulator".
And yes, it does mean that there might be some cases where you need a
core module that has TWO cameras connected to a single processor,
either to get a wider field of view, or to combine two different types
of camera (maybe a high res black and white to a low res color if you
need just minor color information, or combine a visible camera to a
thermal camera). These just become another tool in your tool box.
I *think* (uncharted territory) that the better investment is to develop algorithms that let me distribute the processing among multiple
(single) "camera modules/nodes". How would your "two camera" exemplar address an application requiring *three* cameras? etc.
I can, currently, distribute this processing by treating the
region of memory into which a (local) camera's imagery is
deserialized as a "memory object" and then exporting *access*
to that object to other similar "camera modules/nodes".
But, the access times of non-local memory are horrendous, given
that the contents are ephemeral (if accesses could be *cached*
on each host needing them, then these costs diminish).
So, I need to come up with algorithms that let me export abstractions
instead of raw data.
Moving the processing to "host per camera" implementation gives you more MIPS. But, makes coordinating partial results tedious.
Depends on what sort of partial results you are looking at.
"Bob's *head* is at X,Y+H,W in my image -- but, his body is not visible"
"Ah! I was wondering whose legs those were in *my* image!"
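Coordinating those partial results amounts to merging per-camera sightings into one record per target. A toy sketch (matching by a shared label here, where a real system would have to match by geometry and appearance; all names are illustrative):

```python
def merge_partials(observations):
    """Combine per-camera partial sightings into whole-person records.
    Each observation is (camera, person, body_part)."""
    people = {}
    for cam, person, part in observations:
        people.setdefault(person, []).append((cam, part))
    return people

obs = [("cam1", "Bob", "head"), ("cam2", "Bob", "legs"),
       ("cam2", "Tom", "whole")]
merged = merge_partials(obs)
print(merged["Bob"])  # Bob seen partially by both cameras
```

The tedious part Don mentions is exactly the hidden assumption here: deciding that cam1's "head" and cam2's "legs" belong to the same person.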
It is unclear what your actual image requirements per camera are, so
it is hard to say what level of camera and processor you will need.
My first feeling is you seem to be assuming a fairly cheap camera
and then doing some fairly simple processing over the partial image,
in which case you might even be able to live with a camera that uses
a crude SPI interface to bring the frame in, and a very simple
processor.
I use A LOT of cameras. But, I should be able to swap the camera
(upgrade/downgrade) and still rely on the same *local* compute engine.
E.g., some of my cameras have Ir illuminators; it's not important
in others; some are PTZ; others fixed.
Doesn't sound reasonable. If you downgrade a camera, you can't count
on it being able to meet the same requirements, or you over speced the
initial camera.
Sorry, I was using up/down relative to "nominal camera", not "specific
camera previously selected for application". I'd *really* like to just have a single "camera module" (module = CPU+I/O) instead of one for camera type A and another for camera type B, etc.
You put on a camera a processor capable of handling the tasks you
expect out of that set of hardware. One type of processor likely can
handle a variety of different camera setups.
Exactly. If a particular instance has an Ir illuminator, then you include controls for that in *the* "camera module". If another instance doesn't have
this ability, then those controls go unused.
Watching for an obstruction in the path of a garage door (open/close)
has different requirements than trying to recognize a visitor at the
front
door. Or, identify the locations of the occupants of a facility.
Yes, so you don't want to "Pay" for the capability to recognize a
visitor in your garage door sensor, so you use different levels of
sensor/processor.
Exactly. But, the algorithms that do the scene analysis can be the same; you just parameterize the image and the objects within it that you seek.
There will likely be some combinations that exceed the capabilities of
the hardware to process in real-time. So, you fall back to lower
frame rates or let the algorithms drop targets ("You watch Bob, I'll
watch Tom!")
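That fall-back ("You watch Bob, I'll watch Tom!") can be sketched as simple load shedding: keep the targets that fit the local budget and hand the rest to peer nodes (all costs, budgets, and node names below are invented for illustration):

```python
def shed_load(targets, cost_per_target, budget, peers):
    """Keep what fits this node's processing budget; distribute the
    spill-over round-robin among peer nodes."""
    keep_n = max(1, int(budget // cost_per_target))
    mine, spill = targets[:keep_n], targets[keep_n:]
    handoff = {p: [] for p in peers}
    for i, t in enumerate(spill):
        handoff[peers[i % len(peers)]].append(t)
    return mine, handoff

mine, handoff = shed_load(["Bob", "Tom", "Sue"], cost_per_target=40,
                          budget=80, peers=["node2"])
print(mine)     # ['Bob', 'Tom']
print(handoff)  # {'node2': ['Sue']}
```

Dropping the frame rate is the other axis: it lowers `cost_per_target` instead of shrinking the target list.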
On 12/30/22 4:59 PM, Don Y wrote:
My existing "module" handles a single USB camera (with a fairly heavy-weight processor).
But, being USB-based, there is no way to look at *part* of an image.
And, I have to pay a relatively high cost (capturing the entire
image from the serial stream) to look at *any* part of it.
Yep, having chosen USB as your interface, you have limited yourself.
Since you say you have a fairly heavy-weight processor, that frame grab likely
isn't your limiting factor.
*If* a "camera memory" was available, I would site N of these
in the (64b) address space of the host and let the host pick
and choose which parts of which images it wanted to examine...
without worrying about all of the bandwidth that would have been
consumed deserializing those N images into that memory (which is
a continuous process)
But such a camera would almost certainly be designed for the processor to be on
the same board as the camera (or be VERY slow to access), so it is much less apt to allow you to add multiple cameras to one processor.
ISTM that the better solution is to develop algorithms that can
process portions of the scene, concurrently, on different "hosts".
Then, coordinate these "partial results" to form the desired result.
I already have a "camera module" (host+USB camera) that has adequate
processing power to handle a "single camera scene". But, these all
assume the scene can be easily defined to fit in that camera's field
of view. E.g., point a camera across the path of a garage door and have it "notice" any deviation from the "unobstructed" image.
And if one camera can't fit the full scene, you use two cameras, each with their own processor, and they each process their own image.
That's the above approach, but...
The only problem is if your image-processing algorithm needs to compare parts of the images between the two cameras, which seems unlikely.
Consider watching a single room (e.g., a lobby at a business) and
tracking the movements of "visitors". It's unlikely that an individual's movements would always be constrained to a single camera field. There will be times when he/she is "half-in" a field (and possibly NOT in the other, HALF in the other or ENTIRELY in the other). You can't ignore cases where the entire object (or, your notion of what that object's characteristics might be) is not entirely in the field as that leaves a vulnerability.
Sounds like you aren't overlapping your cameras enough or have insufficient coverage. Maybe your problem is the wrong field of view for your lens. Maybe you
need fewer but better cameras with wider fields of view.
This might be due to trying to use "stock" inexpensive USB cameras.
For example, I watch our garage door with *four* cameras. A camera is
positioned on each side (door jamb) of the door "looking at" the other
camera. This because a camera can't likely see the full height of the door opening ON ITS SIDE OF THE DOOR (so, the opposing camera watches "my side" and I'll watch *its* side!).
Right, and if ANY see a problem, you stop. So no need for inter-camera coordination.
[The other two cameras are similarly positioned on the overhead *track*
onto which the door rolls, when open]
An object in (or near) the doorway can be visible in one (either) or
both cameras, depending on where it is located. Additionally, one of
those manifestations may be only "partial" as regards to where it is
located and intersects the cameras' fields of view.
But since you aren't trying to ID, only Detect, there still isn't a need for camera-camera processing, just camera-door controller
When the scene gets too large to represent in enough detail in a single
camera's field of view, then there needs to be a way to coordinate
multiple cameras to a single (virtual?) host. If those cameras were just
"chunks of memory", then the *imagery* would be easy to examine in a single
host -- though the processing power *might* need to increase geometrically
(depending on your current goal)
Yes, but your "chunks of memory" model just doesn't exist as a viable camera model.
Apparently not -- in the COTS sense. But, that doesn't mean I can't
build a "camera memory emulator".
The downside is that this increases the cost of the "actual camera"
(see my above comment wrt ammortization).
Yep, implementing this likely costs more than giving the camera a dedicated moderate processor to do the major work. It might not handle the actual ID problem of your doorbell, but it could likely process the live video, take a snapshot of a region with a good view of the visitor coming, and send just that to your master system for ID.
The CMOS cameras with addressable pixels have "access times" significantly worse than your typical memory (and are read-once), so they don't really meet that model. Some of them do allow for specifying multiple small regions of interest and downloading just those regions, but this then starts to
require moderate processor overhead to be loading all these regions and
updating the grabber to put them where you want.
You would, instead, let the "camera memory emulator" capture the entire
image from the camera and place the entire image in a contiguous
region of memory (from the perspective of the host). The cost of capturing the portions that are not used is hidden *in* the cost of the "emulator".
Yep, you could build your system with a two-port memory buffer between the frame grabber loading with one port, and the decoding processor on the other.
The most cost-effective way to do this is likely a commercial frame grabber with built-in "two-port" memory that sits in a slot of a PC-type computer. These would likely not work with a "USB camera" (why would you need a frame grabber with a camera that has it built in?), so it would totally change your cost models.
If your current design method is based on using USB cameras, trying to do a full custom interface may be out of your field of operation.
And yes, it does mean that there might be some cases where you need a core module that has TWO cameras connected to a single processor, either to get a wider field of view, or to combine two different types of camera (maybe a high res black and white to a low res color if you need just minor color information, or combine a visible camera to a thermal camera). These just become another tool in your tool box.
I *think* (uncharted territory) that the better investment is to develop
algorithms that let me distribute the processing among multiple
(single) "camera modules/nodes". How would your "two camera" exemplar
address an application requiring *three* cameras? etc.
The first question is: what processing are you thinking of that needs images
from 3 cameras?
Note, my two-camera example was a case where the processing that needed to be done did need data from two cameras.
If you have another task that needs a different camera, you just build a system
with one two-camera module and one 1-camera module, relaying back to a central control, or you nominate one of the modules to be central control if the load there is light enough.
Your garage door example would be built from 4 separate and independent 1-camera modules, either going to one as the master, or to a 5th module acting as
the master.
The cases I can think of for needing to process three cameras together would be:
1) a system stitching images from 3 cameras and generating a single image out of
it, but that totally breaks your concept of needing only bits of the images; that inherently is using most of each camera, and doing some stitching processing on the overlaps.
2) A multi-spectrum system, where again, you are taking the ENTIRE scene from the three cameras and producing a merged "false-color" image from them. Again,
this also breaks your partial-image model.
I can, currently, distribute this processing by treating the
region of memory into which a (local) camera's imagery is
deserialized as a "memory object" and then exporting *access*
to that object to other similar "camera modules/nodes".
But, the access times of non-local memory are horrendous, given
that the contents are ephemeral (if accesses could be *cached*
on each host needing them, then these costs diminish).
So, I need to come up with algorithms that let me export abstractions
instead of raw data.
Sounds like your current design is very centralized. This limits its scalability.
My first feeling is you seem to be assuming a fairly cheap camera and then
doing some fairly simple processing over the partial image, in which case
you might even be able to live with a camera that uses a crude SPI
interface to bring the frame in, and a very simple processor.
I use A LOT of cameras. But, I should be able to swap the camera
(upgrade/downgrade) and still rely on the same *local* compute engine.
E.g., some of my cameras have Ir illuminators; it's not important
in others; some are PTZ; others fixed.
Doesn't sound reasonable. If you downgrade a camera, you can't count on it
being able to meet the same requirements, or you over-specced the initial
camera.
Sorry, I was using up/down relative to "nominal camera", not "specific camera
previously selected for application".  I'd *really* like to just have a
single "camera module" (module = CPU+I/O) instead of one for camera type A
and another for camera type B, etc.
That only works if you are willing to spend for the sports car, even if you just need it to go around the block.
It depends a bit on how much span of capability you need. A $10 camera likely has a very different interface than a $30,000 camera, so will need a different board. Some boards might handle multiple camera interface types if it
doesn't add a lot to the board, but you are apt to find that you need to make some choice.
Then some tasks will just need a lot more computer power than others. Yes, you
can just put too much computer power on the simple tasks (and it might make
sense to design the higher-end processor early), but ultimately you are going to want the less expensive lower-end processors.
You put on a camera a processor capable of handling the tasks you expect out
of that set of hardware. One type of processor likely can handle a variety
of different camera setups with
Exactly.  If a particular instance has an Ir illuminator, then you include
controls for that in *the* "camera module".  If another instance doesn't have
this ability, then those controls go unused.
Yes, auxiliary functionality is often cheap to include the hooks for.
Exactly.  But, the algorithms that do the scene analysis can be the same;
you just parameterize the image and the objects within it that you seek.
Watching for an obstruction in the path of a garage door (open/close)
has different requirements than trying to recognize a visitor at the front
door.  Or, identify the locations of the occupants of a facility.
Yes, so you don't want to "Pay" for the capability to recognize a visitor in
your garage door sensor, so you use different levels of sensor/processor.
Actually, "Tracking" can be a very different type of algorithm than "Detecting". You might be able to use a Tracking-based algorithm to Detect, but
likely a much simpler algorithm can be used (needing fewer resources) to just detect.
There will likely be some combinations that exceed the capabilities of
the hardware to process in real-time. So, you fall back to lower
frame rates or let the algorithms drop targets ("You watch Bob, I'll
watch Tom!")
On 12/29/2022 5:40 PM, Richard Damon wrote:
On 12/29/22 5:57 PM, Don Y wrote:
On 12/29/2022 2:09 PM, Richard Damon wrote:
On 12/29/22 2:26 PM, Don Y wrote:
On 12/29/2022 10:06 AM, Richard Damon wrote:
On 12/29/22 8:16 AM, Don Y wrote:
ISTR playing with de-encapsulated DRAMs as image sensors
back in school (DRAM being relatively new technology, then).
But, most cameras seem to have (bit- or word-) serial interfaces
nowadays.  Are there any (mainstream/high volume) devices that
"look" like a chunk of memory, in their native form?
Using a DRAM in that manner would only give you a single bit value
for each pixel (maybe some more modern memories store multiple
bits in a cell so you get a few grey levels).
I mentioned the DRAM reference only as an exemplar of how a "true"
parallel, random access interface could exist.
Right, and cameras based on parallel random access do exist, but
tend to be on the smaller and slower end of the spectrum.
There are some CMOS sensors that let you address pixels
individually and in a random order (like you got with the DRAM)
but by its nature, such a readout method tends to be slow, and
space inefficient, so these interfaces tend to be only available
on smaller camera arrays.
But, if you are processing the image, such an approach can lead to
higher throughput than having to transfer a serial data stream into
memory (thus consuming memory bandwidth).
My guess is that in almost all cases, the need to send the address
to the camera and then get back the pixel value is going to use up
more total bandwidth than getting the image in a stream. The one
exception would be if you need just a very small percentage of the
array data, and it is scattered over the array so a Region of
Interest operation can't be used.
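To put a rough number on that trade-off, here is a back-of-envelope sketch; the cycle counts are illustrative assumptions only, not figures from any sensor datasheet:

```python
# Break-even estimate: randomly addressed reads vs. streaming the whole
# frame. All cycle counts are illustrative assumptions.

def breakeven_fraction(addr_cycles=4, data_cycles=1, stream_cycles=1):
    """Fraction of the array below which random access uses less bandwidth.

    Random access spends (addr_cycles + data_cycles) per pixel fetched;
    streaming spends stream_cycles for *every* pixel, needed or not.
    """
    return stream_cycles / (addr_cycles + data_cycles)

print(breakeven_fraction())  # 0.2: random access only wins if you need
                             # fewer than ~20% of the pixels
```

With these assumed cycle counts, random access pays off only when you want less than a fifth of the array; with longer address phases the break-even fraction shrinks further.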
No, you're missing the nature of the DRAM example.
You don't "send" the address of the memory cell desired *to* the DRAM.
You simply *address* the memory cell, directly. I.e., if there are
N locations in the DRAM, then N addresses in your address space are
consumed by it; one for each location in the array.
No, look at your DRAM timing again: the transaction begins with the
sending of the address over typically two clock edges with RAS and
CAS, and then after a couple of clock cycles you get back the answer
on the data bus.
But it's a single memory reference. Look at what happens when you deserialize a USB video stream into that same DRAM. The DMAC has
tied up the bus for the same amount of time that the processor
would have if it read those same N locations.
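The flip side of the earlier bandwidth argument, in the same back-of-envelope style (frame size and the 5% figure are illustrative assumptions): a deserializing DMA touches every pixel every frame, while a mapped "camera memory" is only read where the algorithm actually looks.

```python
# Bus accesses per frame: streaming/DMA vs. memory-mapped examination.
# Numbers are illustrative assumptions.

def stream_accesses(pixels):
    return pixels                       # every frame, every pixel crosses the bus

def mapped_accesses(pixels, fraction_examined):
    return int(pixels * fraction_examined)

frame = 640 * 480
print(stream_accesses(frame))           # 307200 accesses, unconditionally
print(mapped_accesses(frame, 0.05))     # 15360 if only 5% is examined
```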
Yes, the addresses come from an address bus, using address space out
of the processor, but it is a multi-cycle operation. Typically, you
read back a "burst" with some minimal caching on the processor side,
but that is more a minor detail.
I'm looking for *that* sort of "direct access" in a camera.
Its been awhile, but I thought some CMOS cameras could work on a
similar basis, strobe a Row/Column address from pins on the camera,
and a few clock cycles later you got a burst out of the camera
starting at the address cell.
I don't want the camera to decide which pixels *it* thinks I want to see.
It sends me a burst of a row -- but the next part of the image I may have wanted to access may have been down the same *column*. Or, in another
part of the image entirely.
Serial protocols inherently deliver data in a predefined pattern
(often intended for display). Scene analysis doesn't necessarily
conform to that same pattern.
Isn't there a camera doing a protocol which allows you to request
a specific area only to be transferred? RFB like, VNC does that
all the time.
On 12/31/2022 4:15 AM, Dimiter_Popoff wrote:
Serial protocols inherently deliver data in a predefined pattern
(often intended for display). Scene analysis doesn't necessarily
conform to that same pattern.
Isn't there a camera doing a protocol which allows you to request
a specific area only to be transferred? RFB like, VNC does that
all the time.
That only makes sense if you know, a priori, which part(s) of the
image you might want to examine. E.g., it would work for
"exposing" just the portion of the field that "overlaps" some
other image. I can get fixed parts of partial frames from
*other* cameras just by ensuring the other camera puts that
portion of the image in a particular memory object and then
export that memory object to the node that wants it.
But, if a target can move into or out of the exposed area, then
you have to make a return trip to the camera to request MORE of
the field.
When your targets are "far away" (like a surveillance camera
monitoring a parking lot), targets don't move from their
previous noted positions considerably from one frame to the
next.
But, when the camera and targets are in close proximity,
there's greater (apparent) relative motion in the same
frame-interval.  So, knowing where (x,y,W,H) the portion of
the image of interest lay, previously, is less predictive
of where it may lie currently.
Having the entire image available means the software
can look <wherever> and <whenever>.
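The distance effect above can be put in numbers: apparent motion in pixels scales inversely with distance. The focal length in pixels, target speed, and frame rate below are illustrative assumptions:

```python
# Why "where it was last frame" predicts less at close range.
# Parameters are illustrative assumptions, not measured values.

def pixel_motion_per_frame(speed_m_s, distance_m, focal_px, fps):
    """Approximate sideways motion, in pixels, between consecutive frames."""
    return (speed_m_s / fps) * focal_px / distance_m

far  = pixel_motion_per_frame(1.5, 20.0, 800, 30)  # parking-lot range
near = pixel_motion_per_frame(1.5, 1.0, 800, 30)   # camera at arm's length
print(far, near)  # 2.0 vs 40.0 px: the near-field search window is ~20x wider
```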
On 12/31/2022 20:16, Don Y wrote:
On 12/31/2022 4:15 AM, Dimiter_Popoff wrote:
Serial protocols inherently deliver data in a predefined pattern
(often intended for display). Scene analysis doesn't necessarily
conform to that same pattern.
Isn't there a camera doing a protocol which allows you to request
a specific area only to be transferred? RFB like, VNC does that
all the time.
That only makes sense if you know, a priori, which part(s) of the
image you might want to examine. E.g., it would work for
"exposing" just the portion of the field that "overlaps" some
other image. I can get fixed parts of partial frames from
*other* cameras just by ensuring the other camera puts that
portion of the image in a particular memory object and then
export that memory object to the node that wants it.
But, if a target can move into or out of the exposed area, then
you have to make a return trip to the camera to request MORE of
the field.
When your targets are "far away" (like a surveillance camera
monitoring a parking lot), targets don't move from their
previous noted positions considerably from one frame to the
next.
But, when the camera and targets are in close proximity,
there's greater (apparent) relative motion in the same
frame-interval.  So, knowing where (x,y,W,H) the portion of
the image of interest lay, previously, is less predictive
of where it may lie currently.
Having the entire image available means the software
can look <wherever> and <whenever>.
Well yes, obviously so, but this is valid whatever the interface.
Direct access to the sensor cells can't be double buffered, so
you will have to transfer anyway to get a static copy of the
frame you are analyzing.
Perhaps you could find a way to make yourself some camera module
using an existing one, MIPI or even USB, since you are looking for low overall cost; and add some MCU board to it to do the buffering and
transfer areas on request. Or may be put enough CPU power together with
each camera to do most if not all of the analysis... Depending on
which achieves the lowest cost. But I can't say much on cost, that's
pretty far from me (as you know).
[Hope you are faring well... enjoying the COLD! ;) ]
On 12/30/2022 11:02 AM, Richard Damon wrote:
So, my options are:
- reduce the overall frame rate such that N cameras can
be serviced by the USB (or whatever) interface *and*
the processing load
-  reduce the resolution of the cameras (a special case of the above)
-  reduce the number of cameras "per processor" (again, above)
- design a "camera memory" (frame grabber) that I can install
multiply on a single host
- develop distributed algorithms to allow more bandwidth to
effectively be applied
The fact that you are starting from the concept of using "USB Cameras" sort
of starts you with that sort of limit.
My personal thought on your problem is you want to put a "cheap" processor
right on each camera, using a processor with a direct camera interface to
pull in the image and do your processing, and send the results over some
comm-link to the center core.
If I went the frame-grabber approach, that would be how I would address the
hardware.  But, it doesn't scale well.  I.e., at what point do you throw in
the towel and say there are too many concurrent images in the scene to
pile them all onto a single "host" processor?
That's why I didn't suggest that method. I was suggesting each camera has its
own tightly coupled processor that handles the needs of THAT
My existing "module" handles a single USB camera (with a fairly heavy-weight
processor).
But, being USB-based, there is no way to look at *part* of an image.
And, I have to pay a relatively high cost (capturing the entire
image from the serial stream) to look at *any* part of it.
*If* a "camera memory" was available, I would site N of these
in the (64b) address space of the host and let the host pick
and choose which parts of which images it wanted to examine...
without worrying about all of the bandwidth that would have been
consumed deserializing those N images into that memory (which is
a continuous process)
ISTM that the better solution is to develop algorithms that can
process portions of the scene, concurrently, on different "hosts".
Then, coordinate these "partial results" to form the desired result.
I already have a "camera module" (host+USB camera) that has adequate
processing power to handle a "single camera scene". But, these all
assume the scene can be easily defined to fit in that camera's field
of view.  E.g., point a camera across the path of a garage door and have
it "notice" any deviation from the "unobstructed" image.
And if one camera can't fit the full scene, you use two cameras, each with
their own processor, and they each process their own image.
That's the above approach, but...
The only problem is if your image processing algorithm needs to compare parts of
the images between the two cameras, which seems unlikely.
Consider watching a single room (e.g., a lobby at a business) and
tracking the movements of "visitors".  It's unlikely that an individual's
movements would always be constrained to a single camera field.  There will
be times when he/she is "half-in" a field (and possibly NOT in the other,
HALF in the other or ENTIRELY in the other).  You can't ignore cases where
the entire object (or, your notion of what that object's characteristics
might be) is not entirely in the field as that leaves a vulnerability.
For example, I watch our garage door with *four* cameras.  A camera is
positioned on each side ("door jamb"?) of the door "looking at" the other
camera.  This because a camera can't likely see the full height of the door
opening ON ITS SIDE OF THE DOOR (so, the opposing camera watches "my side"
and I'll watch *its* side!).
[The other two cameras are similarly positioned on the overhead *track*
onto which the door rolls, when open]
An object in (or near) the doorway can be visible in one (either) or
both cameras, depending on where it is located. Additionally, one of
those manifestations may be only "partial" as regards to where it is
located and intersects the cameras' fields of view.
The "cost" of watching the door is only the cost of the actual *cameras*.
The cost of the compute resources is amortized over the rest of the system
as those can be used for other, non-camera, non-garage related activities.
It does say that if trying to track something across the cameras, you need
enough overlap to allow them to hand off the object when it is in the overlap.
And, objects that consume large portions of a camera's field of view
require similar handling (unless you can always guarantee that cameras
and targets are "far apart")
When the scene gets too large to represent in enough detail in a single
camera's field of view, then there needs to be a way to coordinate
multiple cameras to a single (virtual?) host.  If those cameras were just
"chunks of memory", then the *imagery* would be easy to examine in a single
host -- though the processing power *might* need to increase geometrically
(depending on your current goal)
Yes, but your "chunks of memory" model just doesn't exist as a viable camera
model.
Apparently not -- in the COTS sense. But, that doesn't mean I can't
build a "camera memory emulator".
The downside is that this increases the cost of the "actual camera"
(see my above comment wrt ammortization).
And, it just moves the point at which a single host (of fixed capabilities)
can no longer handle the scene's complexity.  (when you have 10 cameras?)
The CMOS cameras with addressable pixels have "access times" significantly
slower than your typical memory (and are read once) so don't really meet that
model. Some of them do allow for sending multiple small regions of interest
and downloading just those regions, but this then starts to require moderate
processor overhead to be loading all these regions and updating the grabber to
put them where you want.
You would, instead, let the "camera memory emulator" capture the entire
image from the camera and place the entire image in a contiguous
region of memory (from the perspective of the host).  The cost of capturing
the portions that are not used is hidden *in* the cost of the "emulator".
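A minimal double-buffer sketch of such an "emulator": the capture side fills a back buffer while the host reads a stable, contiguous copy of the previous frame. This is a pure-Python illustration of the scheme only, not a hardware design:

```python
# Double-buffered frame store: the host always sees a complete, stable
# frame while the next one is being deserialized into the back buffer.

class FrameStore:
    def __init__(self, size):
        self.buffers = [bytearray(size), bytearray(size)]
        self.ready = 0                    # index the host may read

    def capture(self, frame):             # emulator side, once per frame
        back = 1 - self.ready
        self.buffers[back][:] = frame     # fill the hidden buffer
        self.ready = back                 # flip: expose the new frame

    def view(self):                       # host side: stable whole image
        return self.buffers[self.ready]

store = FrameStore(4)
store.capture(b"\x01\x02\x03\x04")
print(bytes(store.view()))                # b'\x01\x02\x03\x04'
```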
And yes, it does mean that there might be some cases where you need a core
module that has TWO cameras connected to a single processor, either to get a
wider field of view, or to combine two different types of camera (maybe a high
res black and white with a low res color if you need just minor color
information, or combine a visible camera with a thermal camera). These just
become another tool in your toolbox.
I *think* (uncharted territory) that the better investment is to develop
algorithms that let me distribute the processing among multiple
(single) "camera modules/nodes".  How would your "two camera" exemplar
address an application requiring *three* cameras?  etc.
I can, currently, distribute this processing by treating the
region of memory into which a (local) camera's imagery is
deserialized as a "memory object" and then exporting *access*
to that object to other similar "camera modules/nodes".
But, the access times of non-local memory are horrendous, given
that the contents are ephemeral (if accesses could be *cached*
on each host needing them, then these costs diminish).
So, I need to come up with algorithms that let me export abstractions
instead of raw data.
Moving the processing to a "host per camera" implementation gives you more
MIPS.  But, makes coordinating partial results tedious.
Depends on what sort of partial results you are looking at.
"Bob's *head* is at X,Y+H,W in my image -- but, his body is not visible"
"Ah! I was wondering whose legs those were in *my* image!"
It is unclear what your actual image requirements per camera are, so it is
hard to say what level camera and processor you will need.
My first feeling is you seem to be assuming a fairly cheap camera and then
doing some fairly simple processing over the partial image, in which case
you might even be able to live with a camera that uses a crude SPI interface
to bring the frame in, and a very simple processor.
I use A LOT of cameras. But, I should be able to swap the camera
(upgrade/downgrade) and still rely on the same *local* compute engine.
E.g., some of my cameras have Ir illuminators; it's not important
in others; some are PTZ; others fixed.
Doesn't sound reasonable. If you downgrade a camera, you can't count on it
being able to meet the same requirements, or you over-specced the initial camera.
Sorry, I was using up/down relative to "nominal camera", not "specific camera
previously selected for application".  I'd *really* like to just have a
single "camera module" (module = CPU+I/O) instead of one for camera type A
and another for camera type B, etc.
You put on a camera a processor capable of handling the tasks you expect out of
that set of hardware. One type of processor likely can handle a variety of
different camera setups with
Exactly.  If a particular instance has an Ir illuminator, then you include
controls for that in *the* "camera module".  If another instance doesn't have
this ability, then those controls go unused.
Watching for an obstruction in the path of a garage door (open/close)
has different requirements than trying to recognize a visitor at the front
door.  Or, identify the locations of the occupants of a facility.
Yes, so you don't want to "Pay" for the capability to recognize a visitor in
your garage door sensor, so you use different levels of sensor/processor.
Exactly. But, the algorithms that do the scene analysis can be the same;
you just parameterize the image and the objects within it that you seek.
There will likely be some combinations that exceed the capabilities of
the hardware to process in real-time. So, you fall back to lower
frame rates or let the algorithms drop targets ("You watch Bob, I'll
watch Tom!")
[Hope you are faring well... enjoying the COLD! ;) ]
Not very. Don't think I have your latest email.
On Fri, 30 Dec 2022 14:59:39 -0700, Don Y
<blockedofcourse@foo.invalid> wrote:
On 12/30/2022 11:02 AM, Richard Damon wrote:
So, my options are:
- reduce the overall frame rate such that N cameras can
be serviced by the USB (or whatever) interface *and*
the processing load
-  reduce the resolution of the cameras (a special case of the above)
-  reduce the number of cameras "per processor" (again, above)
- design a "camera memory" (frame grabber) that I can install
multiply on a single host
- develop distributed algorithms to allow more bandwidth to
effectively be applied
The fact that you are starting from the concept of using "USB Cameras" sort
of starts you with that sort of limit.
My personal thought on your problem is you want to put a "cheap" processor
right on each camera using a processor with a direct camera interface to
pull in the image and do your processing and send the results over some
comm-link to the center core.
If I went the frame-grabber approach, that would be how I would address the
hardware.  But, it doesn't scale well.  I.e., at what point do you throw in
the towel and say there are too many concurrent images in the scene to
pile them all onto a single "host" processor?
That's why I didn't suggest that method. I was suggesting each camera has its
own tightly coupled processor that handles the needs of THAT
My existing "module" handles a single USB camera (with a fairly heavy-weight
processor).
But, being USB-based, there is no way to look at *part* of an image.
And, I have to pay a relatively high cost (capturing the entire
image from the serial stream) to look at *any* part of it.
*If* a "camera memory" was available, I would site N of these
in the (64b) address space of the host and let the host pick
and choose which parts of which images it wanted to examine...
without worrying about all of the bandwidth that would have been
consumed deserializing those N images into that memory (which is
a continuous process)
That's the way all cameras work - at least low level. The camera
captures a field (or a frame, depending) on its CCD, and then the CCD
pixel data is read out serially by a controller.
What you are looking for is some kind of local frame buffering at the
camera. There are some "smart" cameras that provide that ... and also generally a bunch of image analysis functions that you may or may not
find useful. I haven't played with any of them in a long time, and
when I did the image functions were too primitive for my purpose, so I
really can't recommend anything.
ISTM that the better solution is to develop algorithms that can
process portions of the scene, concurrently, on different "hosts".
Then, coordinate these "partial results" to form the desired result.
I already have a "camera module" (host+USB camera) that has adequate
processing power to handle a "single camera scene". But, these all
assume the scene can be easily defined to fit in that camera's field
of view.  E.g., point a camera across the path of a garage door and have
it "notice" any deviation from the "unobstructed" image.
And if one camera can't fit the full scene, you use two cameras, each with
their own processor, and they each process their own image.
That's the above approach, but...
The only problem is if your image processing algorithm needs to compare parts of
the images between the two cameras, which seems unlikely.
Consider watching a single room (e.g., a lobby at a business) and
tracking the movements of "visitors".  It's unlikely that an individual's
movements would always be constrained to a single camera field.  There will
be times when he/she is "half-in" a field (and possibly NOT in the other,
HALF in the other or ENTIRELY in the other).  You can't ignore cases where
the entire object (or, your notion of what that object's characteristics
might be) is not entirely in the field as that leaves a vulnerability.
I've done simple cases following objects from one camera to another,
but not dealing with different angles/points of view - the cameras had contiguous views with a bit of overlap. That made it relatively easy.
Following a person, e.g., seen quarter-behind in one camera, and
tracking them to another camera that sees a side view - from the
/other/ side -
Just following a person is easy, but tracking a specific person,
particularly when multiple people are present, gets very complicated
very quickly.
On 12/31/2022 1:13 PM, Dimiter_Popoff wrote:
On 12/31/2022 20:16, Don Y wrote:
On 12/31/2022 4:15 AM, Dimiter_Popoff wrote:
Serial protocols inherently deliver data in a predefined pattern
(often intended for display). Scene analysis doesn't necessarily
conform to that same pattern.
Isn't there a camera doing a protocol which allows you to request
a specific area only to be transferred? RFB like, VNC does that
all the time.
That only makes sense if you know, a priori, which part(s) of the
image you might want to examine. E.g., it would work for
"exposing" just the portion of the field that "overlaps" some
other image. I can get fixed parts of partial frames from
*other* cameras just by ensuring the other camera puts that
portion of the image in a particular memory object and then
export that memory object to the node that wants it.
But, if a target can move into or out of the exposed area, then
you have to make a return trip to the camera to request MORE of
the field.
When your targets are "far away" (like a surveillance camera
monitoring a parking lot), targets don't move from their
previous noted positions considerably from one frame to the
next.
But, when the camera and targets are in close proximity,
there's greater (apparent) relative motion in the same
frame-interval.  So, knowing where (x,y,W,H) the portion of
the image of interest lay, previously, is less predictive
of where it may lie currently.
Having the entire image available means the software
can look <wherever> and <whenever>.
Well yes, obviously so, but this is valid whatever the interface.
Direct access to the sensor cells can't be double buffered so
you will have to transfer anyway to get the frame you are analyzing
static.
I would assume the devices would have evolved an "internal buffer"
(as I said, my experience with *DRAM* in this manner was 40+ years
ago).
Perhaps you could find a way to make yourself some camera module
using an existing one, MIPI or even USB, since you are looking for low
overall cost; and add some MCU board to it to do the buffering and
transfer areas on request. Or may be put enough CPU power together with
each camera to do most if not all of the analysis... Depending on
which achieves the lowest cost. But I can't say much on cost, that's
pretty far from me (as you know).
My current approach gives me that -- MIPS, size, etc. But, the cost
of transferring parts of the image (without adding a specific mechanism)
is a "shared page" (DSM). So, host (on node A) references part of
node *B*s frame buffer and the page (on B) containing that memory
address gets shipped back to node A and mapped into A's memory.
....
But, transport delays make this unsuitable for real-time work;
a megabyte of imagery would require 100ms to transfer, in "raw"
form. (I could encode it on the originating host; transfer it
and then decode it on the receiving host -- at the expense of MIPS.
This is how I "record" video without saturating the network)
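A sanity check on the quoted figure: one megabyte in 100 ms works out to roughly an 84 Mb/s sustained payload, i.e. essentially a saturated 100 Mb/s link once protocol overhead is added.

```python
# Payload rate implied by "a megabyte of imagery in 100 ms".

def link_mbps(payload_bytes, seconds):
    return payload_bytes * 8 / seconds / 1e6

print(round(link_mbps(1 * 1024 * 1024, 0.100), 1))   # 83.9
```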
So, you (B) want to "abstract" the salient features of the image
while it is on B and then transfer just those to A. *Use*
them, on A, and then move on to the next set of features
(that B has computed while A was busy chewing on the last set)
Or, give A direct access to the native data (without A having
to capture video streams from each of the cameras that it wants
to potentially examine)
On 1/1/2023 7:04 AM, Dimiter_Popoff wrote:
........
In RFB, the server can - and should - decide which parts of the
framebuffer have changed and send across only them. Which works
fine for computer generated images - plenty of single colour areas,
Yes, but if the receiving end has no interest in those areas
of the image, then you're just wasting effort (bandwidth)
transfering them -- esp if the areas of interest will need
that bandwidth!
no noise etc. In your case you might have to resort to jpeg
the image downgrading its quality so "small" changes would
disappear, I think those who write video encoders do something
like that (for my vnc server lossless RLE was plenty, but it
is not very efficient when the screen is some real life photo,
obviously).
I think the solution is to share abstractions. Design the
algorithms so they can address partial "objects of interest"
and report on those. Then, coordinate those partial results
to come up with a unified concept of what's happening in
the observed scene.
But, this is a fair bit harder than just trying to look at
a unified frame buffer and detect objects/motion!
OTOH, if it was easy, it would be boring ("What's to be learned
from doing something that's already been done?")
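A rough size argument for sharing abstractions instead of pixels: compare a frame against a short list of object reports. The record layout (id, box, class) and the 16-object count below are made-up illustrative assumptions:

```python
# Frame vs. feature-report sizes, per frame. Layout is hypothetical.
import struct

frame_bytes = 640 * 480 * 2                           # 16-bit pixels, assumed
report = struct.pack("<H4hB", 7, 40, 10, 20, 25, 3)   # id, x, y, w, h, class
scene_bytes = 16 * len(report)                        # 16 reported objects

print(frame_bytes, scene_bytes)                       # 614400 vs 176 bytes
```

Three to four orders of magnitude less traffic, which is why exporting abstractions beats exporting the frame buffer over a loaded network.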
Perhaps you could find a way to make yourself some camera module
using an existing one, MIPI or even USB, since you are looking for low
overall cost; and add some MCU board to it to do the buffering and
transfer areas on request. Or may be put enough CPU power together with
each camera to do most if not all of the analysis... Depending on
which achieves the lowest cost. But I can't say much on cost, that's
pretty far from me (as you know).
My current approach gives me that -- MIPS, size, etc. But, the cost
of transferring parts of the image (without adding a specific mechanism)
is a "shared page" (DSM). So, host (on node A) references part of
node *B*s frame buffer and the page (on B) containing that memory
address gets shipped back to node A and mapped into A's memory.
I assume A and B are connected over Ethernet via tcp/ip? Or are they
just two cores on the same chip or something?
But, transport delays make this unsuitable for real-time work;
a megabyte of imagery would require 100ms to transfer, in "raw"
form. (I could encode it on the originating host; transfer it
and then decode it on the receiving host -- at the expense of MIPS.
This is how I "record" video without saturating the network)
100 ms latency would be an issue if you face say A-Train
(for you and the rest who have not watched "The Boys" - he is a
super (a "sup" as they have it) who can run fast enough to not
be seen by normal humans...) :-).
So, you (B) want to "abstract" the salient features of the image
while it is on B and then transfer just those to A. *Use*
them, on A, and then move on to the next set of features
(that B has computed while A was busy chewing on the last set)
Or, give A direct access to the native data (without A having
to capture video streams from each of the cameras that it wants
to potentially examine)
In RFB, the server can - and should - decide which parts of the
framebuffer have changed and send across only them. Which works
fine for computer generated images - plenty of single colour areas,
no noise etc. In your case you might have to resort to jpeg
the image downgrading its quality so "small" changes would
disappear, I think those who write video encoders do something
like that (for my vnc server lossless RLE was plenty, but it
is not very efficient when the screen is some real life photo,
obviously).
On 1/1/2023 7:04 AM, Dimiter_Popoff wrote:
The bigger problem is throughput. You don't care if all of your
references are skewed 100ms in time; add enough buffering to
ensure every frame remains available for that full 100ms and
just expect the results to be "late".
The problem happens when there's another frame coming before
you've finished processing the current frame. And so on.
So, while it is "slick" and eliminates a lot of explicit remote
access code being exposed to the algorithm (e.g., "get me location
X,Y of the remote frame buffer"), it's just not practical for the
application.
In RFB, the server can - and should - decide which parts of the
framebuffer have changed and send across only them. Which works
fine for computer generated images - plenty of single colour areas,
Yes, but if the receiving end has no interest in those areas
of the image, then you're just wasting effort (bandwidth)
transferring them -- esp. if the areas of interest will need
that bandwidth!
no noise etc. In your case you might have to resort to JPEG-encoding
the image, downgrading its quality so "small" changes would
disappear; I think those who write video encoders do something
like that (for my vnc server lossless RLE was plenty, but it
is not very efficient when the screen is some real-life photo,
obviously).
I think the solution is to share abstractions. Design the
algorithms so they can address partial "objects of interest"
and report on those. Then, coordinate those partial results
to come up with a unified concept of what's happening in
the observed scene.
But, this is a fair bit harder than just trying to look at
a unified frame buffer and detect objects/motion!
OTOH, if it was easy, it would be boring ("What's to be learned
from doing something that's already been done?")
On 1/1/2023 23:28, Don Y wrote:
On 1/1/2023 7:04 AM, Dimiter_Popoff wrote:
........
In RFB, the server can - and should - decide which parts of the
framebuffer have changed and send across only them. Which works
fine for computer generated images - plenty of single colour areas,
Yes, but if the receiving end has no interest in those areas
of the image, then you're just wasting effort (bandwidth)
transferring them -- esp. if the areas of interest will need
that bandwidth!
But nothing is stopping the receiving end from requesting a particular
area, and the sending side from sending just the changed parts of it.
I am not suggesting you use RFB, I use it just as an example.
no noise etc. In your case you might have to resort to JPEG-encoding
the image, downgrading its quality so "small" changes would
disappear; I think those who write video encoders do something
like that (for my vnc server lossless RLE was plenty, but it
is not very efficient when the screen is some real-life photo,
obviously).
I think the solution is to share abstractions. Design the
algorithms so they can address partial "objects of interest"
and report on those. Then, coordinate those partial results
to come up with a unified concept of what's happening in
the observed scene.
Well I think this is the way to go, too. This implies enough
CPU horsepower per camera, which nowadays might be practical.
But, this is a fair bit harder than just trying to look at
a unified frame buffer and detect objects/motion!
Well yes, but you lose the framebuffer-transfer problem, no
need to do your "remote virtual machine" for that, etc.
OTOH, if it was easy, it would be boring ("What's to be learned
from doing something that's already been done?")
Not only that; if it were easy everyone else would be doing it :-).
On Sun, 1 Jan 2023 14:28:20 -0700, Don Y <blockedofcourse@foo.invalid>
wrote:
On 1/1/2023 7:04 AM, Dimiter_Popoff wrote:
The bigger problem is throughput. You don't care if all of your
references are skewed 100ms in time; add enough buffering to
ensure every frame remains available for that full 100ms and
just expect the results to be "late".
The problem happens when there's another frame coming before
you've finished processing the current frame. And so on.
So, while it is "slick" and eliminates a lot of explicit remote
access code being exposed to the algorithm (e.g., "get me location
X,Y of the remote frame buffer"), it's just not practical for the
application.
All cameras have a free-run "demand" mode in which (between resets)
the CCD is always accumulating - waiting to be read out. But many
also have a mode in which they do nothing until commanded.
In any event, without command the controller will just service the CCD
- it won't transfer the image anywhere unless asked.
Many "smart" cameras can do ersatz stream compression by double
buffering internally and performing image subtraction to remove
unchanging (to some threshold) images. In a motion activated
environment this can greatly cut down on the number of images YOU have
to process.
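The double-buffer subtraction described above can be sketched as follows. The threshold and minimum-pixel-count figures are hypothetical, and plain Python stands in for whatever the camera firmware actually does in hardware:

```python
def frame_changed(prev, curr, thresh=10, min_pixels=50):
    """Double-buffer subtraction: pass a frame upstream only when enough
    pixels moved by more than `thresh` grey levels (figures hypothetical)."""
    moved = sum(1 for rp, rc in zip(prev, curr)
                  for p, c in zip(rp, rc) if abs(c - p) > thresh)
    return moved >= min_pixels

prev = [[60] * 160 for _ in range(120)]   # last frame: flat grey scene
static = [row[:] for row in prev]
moving = [row[:] for row in prev]
for y in range(30, 50):                   # a 20x30-pixel blob appears
    for x in range(40, 70):
        moving[y][x] = 200

print(frame_changed(prev, static))  # False: drop it onboard
print(frame_changed(prev, moving))  # True: worth handing upstream
```

The `min_pixels` floor is what makes this tolerant of sensor noise: isolated flickering pixels never add up to a "change".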
Better ones also offer a suite of onboard image processing functions:
motion detection, contrast expansion, thresholding, line finding ...
some now even offer pattern/object recognition. If the functions they
provide are useful, it can pay to take advantage of them.
I know you are (thinking of) designing your own ... you should maybe
think hard about what smarts you want onboard.
In RFB, the server can - and should - decide which parts of the
framebuffer have changed and send across only them. Which works
fine for computer generated images - plenty of single colour areas,
Yes, but if the receiving end has no interest in those areas
of the image, then you're just wasting effort (bandwidth)
transferring them -- esp. if the areas of interest will need
that bandwidth!
That's true, but VNC's tile-based encodings (hextile, for instance)
essentially divide the image into a large checkerboard and transfer only
those "squares" where the underlying image has changed. (CopyRect is the
related trick of copying an unchanged rectangle from elsewhere in the
framebuffer.) What is considered a "change" could be further limited on
the sending side by pre-processing: erosion and/or thresholding.
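Erosion as a change-mask pre-process can be sketched with a hand-rolled 3x3 binary erosion; the mask sizes and the lone-pixel noise model below are illustrative assumptions:

```python
def erode3x3(mask):
    """3x3 binary erosion: a pixel survives only when its entire
    neighbourhood changed, which kills isolated (noise) pixels."""
    h, w = len(mask), len(mask[0])
    def hit(y, x):
        return all(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1))
    return [[hit(y, x) for x in range(w)] for y in range(h)]

noise = [[False] * 8 for _ in range(8)]
noise[3][3] = True                     # lone "changed" pixel: sensor noise
blob = [[False] * 8 for _ in range(8)]
for y in range(2, 6):
    for x in range(2, 6):
        blob[y][x] = True              # genuine 4x4 change

print(any(any(r) for r in erode3x3(noise)))  # False: noise suppressed
print(any(any(r) for r in erode3x3(blob)))   # True: real change survives
```

Applied to the boolean "pixel changed?" mask before counting, this keeps camera noise from marking every tile dirty.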
The biggest problem always is how much extra buffering you need for
as-yet-unprocessed images in the stream - while you're working on one
thing, you can easily lose something else.
no noise etc. In your case you might have to resort to JPEG-encoding
the image, downgrading its quality so "small" changes would
disappear; I think those who write video encoders do something
like that (for my vnc server lossless RLE was plenty, but it
is not very efficient when the screen is some real-life photo,
obviously).
And RLE or copyrect can be combined further with lossless LZ
compression.
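Combining RLE with a general LZ pass can be sketched as a two-stage pipeline. The naive byte-oriented RLE below is purely illustrative (not any particular VNC encoding), with Python's zlib standing in for the LZ stage:

```python
import zlib

def rle(data):
    """Naive run-length encode: a stream of (count, byte) pairs,
    with runs capped at 255 so the count fits in one byte."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)

# A frame-like buffer: a large flat area followed by some detail.
frame = bytes([0]) * 900 + bytes(range(100))
stage1 = rle(frame)            # RLE flattens the single-colour area
stage2 = zlib.compress(stage1) # LZ/Huffman squeezes what's left
print(len(frame), len(stage1), len(stage2))
```

The flat area collapses almost to nothing under RLE alone; the LZ stage then earns its keep on the structured remainder, which is why the two compose well for computer-generated screens.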
For really good results, wavelet compression is the best - it
basically reduces the whole image to a set of equation coefficients,
and you can preserve (or degrade) detail in the reconstructed image by altering how many coefficients are calculated from the original.
But it is compute intensive: you really need a DSP or SIMD CPU to do
it efficiently.
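The coefficient idea shows up already in the simplest wavelet, a one-level 1-D Haar transform: smooth regions push their energy into the average band, leaving the difference band nearly empty and cheap to quantise or drop. A minimal sketch (the signal values are made up for illustration):

```python
def haar_step(x):
    """One level of the 1-D Haar transform: pairwise averages + differences."""
    avg = [(a + b) / 2 for a, b in zip(x[0::2], x[1::2])]
    dif = [(a - b) / 2 for a, b in zip(x[0::2], x[1::2])]
    return avg, dif

def haar_inverse(avg, dif):
    """Exact reconstruction from the two coefficient bands."""
    out = []
    for a, d in zip(avg, dif):
        out += [a + d, a - d]
    return out

sig = [10, 10, 10, 10, 80, 80, 80, 80]   # piecewise-flat "scanline"
avg, dif = haar_step(sig)
print(avg)  # [10.0, 10.0, 80.0, 80.0] -- the information lives here
print(dif)  # [0.0, 0.0, 0.0, 0.0]     -- droppable at no cost
```

Real codecs recurse this over rows and columns for several levels, which is where the DSP/SIMD horsepower mentioned above gets spent.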
I think the solution is to share abstractions. Design the
algorithms so they can address partial "objects of interest"
and report on those. Then, coordinate those partial results
to come up with a unified concept of what's happening in
the observed scene.
But, this is a fair bit harder than just trying to look at
a unified frame buffer and detect objects/motion!
OTOH, if it was easy, it would be boring ("What's to be learned
from doing something that's already been done?")
As I said previously, smart cameras can do things like motion
detection onboard, and report the AOI along with the image.