On 1/24/2024 8:13 AM, Simon Clubley wrote:
On 2024-01-23, Dave Froble <davef@tsoft-inc.com> wrote:
What is really rude is talking about Linux on c.o.v ...
Unless you consider VMS to be perfect and not in need of any improvement,
other operating systems offer some good ideas that it would be nice to
see in VMS, especially around security and internal isolation in general.
Then discuss the ideas and concepts ...
On 2024-01-24, Dave Froble <davef@tsoft-inc.com> wrote:
On 1/24/2024 8:13 AM, Simon Clubley wrote:
On 2024-01-23, Dave Froble <davef@tsoft-inc.com> wrote:
What is really rude is talking about Linux on c.o.v ...
Unless you consider VMS to be perfect and not in need of any improvement,
other operating systems offer some good ideas that it would be nice to
see in VMS, especially around security and internal isolation in general.
Then discuss the ideas and concepts ...
OK.
A random sample of things from Linux/Unix I would like to see in VMS:
Mandatory Access Controls (my preference) or jails (Stephen's preference).
A shell with decent modern functionality such as:
Proper command history retention and merging from multiple sessions
Easy searching of command history
Tab completion
Editing long command lines
Globbing
Proper package management and management of updates.
Loadable and unloadable kernel modules, with device driver/filesystem/etc
functionality available from within these modules.
ASLR and KASLR support.
Proper timezone management. (Everything is always UTC based, and your
timezone is merely a local session property with no effect on the
on-disk timestamps).
The last one is policy-based, not technical:
A vendor that has proper security reporting mechanisms.
Does anyone have any others to add to the list ?
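To make the timezone item concrete, here is a minimal Python sketch of the "store UTC, display per session" model described in the list. It is only an illustration: the function names and the two session timezones are hypothetical, and fixed offsets are used to keep the example self-contained (a real system would consult a timezone database to handle daylight saving).

```python
from datetime import datetime, timedelta, timezone

# Hypothetical session timezones, modelled as fixed offsets to stay self-contained.
# A real system would use a tz database to handle daylight saving changes.
SESSION_TZ_BRUSSELS = timezone(timedelta(hours=1), "CET")
SESSION_TZ_NEW_YORK = timezone(timedelta(hours=-5), "EST")


def store_timestamp(moment: datetime) -> str:
    """Return the 'on-disk' form: always UTC, whatever zone the caller used."""
    if moment.tzinfo is None:
        raise ValueError("naive timestamps are ambiguous; attach a timezone first")
    return moment.astimezone(timezone.utc).isoformat()


def display_timestamp(stored: str, session_tz: timezone) -> str:
    """Render a stored UTC timestamp in the session's timezone (display only)."""
    local = datetime.fromisoformat(stored).astimezone(session_tz)
    return local.strftime("%Y-%m-%d %H:%M %Z")


# A file written at 09:30 local time by a user whose session is set to CET.
written = datetime(2024, 1, 25, 9, 30, tzinfo=SESSION_TZ_BRUSSELS)
on_disk = store_timestamp(written)

print(on_disk)                                           # stored value, in UTC
print(display_timestamp(on_disk, SESSION_TZ_BRUSSELS))   # same instant, CET view
print(display_timestamp(on_disk, SESSION_TZ_NEW_YORK))   # same instant, EST view
```

The stored value never changes when a session switches timezone; only the rendering does.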
A random sample of things from Linux/Unix I would like to see in VMS:
Mandatory Access Controls (my preference) or jails (Stephen's preference).
A shell with decent modern functionality such as:
Proper command history retention and merging from multiple sessions
Easy searching of command history
Tab completion
Editing long command lines
Globbing
Proper package management
and management of updates.
Loadable and unloadable kernel modules, with device driver/filesystem/etc functionality available from within these modules.
ASLR and KASLR support.
Proper timezone management. (Everything is always UTC based, and your timezone is merely a local session property with no effect on the
on-disk timestamps).
Some sort of userspace pluggable filesystem support.
FUSE, 9P + a mount driver, whatever.
On 1/25/2024 10:31 AM, Dan Cross wrote:
Some sort of userspace pluggable filesystem support.
FUSE, 9P + a mount driver, whatever.
That would also be nice.
But how many potential VMS users will consider "userspace
pluggable filesystem support" important in decision process?
On 1/25/2024 8:21 AM, Simon Clubley wrote:
A random sample of things from Linux/Unix I would like to see in VMS:
Mandatory Access Controls (my preference) or jails (Stephen's preference).
The market wants containers.
I suspect that means Hoff jails with a marketing label of "container"
instead of "jail".
A shell with decent modern functionality such as:
Proper command history retention and merging from multiple sessions
Easy searching of command history
Tab completion
Editing long command lines
Globbing
+better control structures
+better data types
But I doubt it makes sense business wise.
VMS got:
* DCL for backwards compatibility
* GNV bash for *nix compatibility
* Python and Perl for more programmatic scripting
Even though DCL2 or XDCL would be nice, I don't think it will
increase VMS sales.
Proper package management
Traditional Linux package management at the OS level would
be the wrong path. The result is a mess.
The right approach is package management at the application level.
maven, nuget, pypi, npm, composer etc. not yum, dnf etc..
For managing the truly OS stuff relatively little is needed. PCSI2 or XPCSI.
and management of updates.
An option for more automated updates of VMS would be nice.
Loadable and unloadable kernel modules, with device
driver/filesystem/etc functionality available from within these modules.
Nice.
But again I doubt it will increase VMS sales.
ASLR and KASLR support.
That would probably come as part of ongoing security enhancements at
some point in time.
Proper timezone management. (Everything is always UTC based, and your
timezone is merely a local session property with no effect on the
on-disk timestamps).
Nice but tricky to implement without breaking stuff.
On 2024-01-25 20:20:09 +0000, Arne Vajhøj said:
On 1/25/2024 8:21 AM, Simon Clubley wrote:
Mandatory Access Controls (my preference) or jails (Stephen's
preference).
The market wants containers.
I suspect that means Hoff jails with a marketing label of "container"
instead of "jail".
Jails / sandboxes can be used as a component of containers, but—as I've commented elsewhere—containers are far too reminiscent of licensing arbitrage. Which can somewhat dampen vendor enthusiasm.
Jails / sandboxes can be built upon some of the parts of mandatory
access controls, but I ~never want to have to use a system configured
for SEVMS-style MAC. Jails, sure. SEVMS-style MAC, not so much.
A shell with decent modern functionality such as:
    Proper command history retention and merging from multiple sessions
    Easy searching of command history
    Tab completion
    Editing long command lines
    Globbing
+better control structures
+better data types
But I doubt it makes sense business wise.
VMS got:
* DCL for backwards compatibility
* GNV bash for *nix compatibility
* Python and Perl for more programmatic scripting
Even though DCL2 or XDCL would be nice, I don't think it will
increase VMS sales.
Likely not perceived as increasing sales. Though as happened with DII
COE, sometimes major customers will establish requirements here.
There are a lot of things in this same general category too, which is
the other side of facilitating and encouraging new adoptions.
ASLR and KASLR support.
That would probably come as part of ongoing security enhancements at
some point in time.
Stack canaries might be easier.
Proper timezone management. (Everything is always UTC based, and your
timezone is merely a local session property with no effect on the
on-disk timestamps).
Nice but tricky to implement without breaking stuff.
That's been the compatibility hobgoblin ~forever. The quadword format
is embedded all over the place. For some sites, switching to UTC as the
base works fine.
I've run OpenVMS servers set to UTC at various installations, too, ("Oh, that? Yeah. The server is in England." usually suffices.)
Downside is that saved dates can be off by a day pending a rewrite,
which can absolutely be a non-starter for some sites.
On 26/01/2024 00:18, Arne Vajhøj wrote:
Very few people work at the command prompt today. I doubt "shell power"
will become a requirement.
Not sure that is true. MS Servers don't have a GUI, Most Linux servers are installed without a GUI
GUI is for userspace
On 1/25/2024 6:59 PM, Stephen Hoffman wrote:
Jails / sandboxes can be built upon some of the parts of mandatory
access controls, but I ~never want to have to use a system configured
for SEVMS-style MAC. Jails, sure. SEVMS-style MAC, not so much.
SEVMS-style MAC was targeting the 1980's requirements.
On 2024-01-25, Arne Vajhøj <arne@vajhoej.dk> wrote:
On 1/25/2024 6:59 PM, Stephen Hoffman wrote:
Jails / sandboxes can be built upon some of the parts of mandatory
access controls, but I ~never want to have to use a system configured
for SEVMS-style MAC. Jails, sure. SEVMS-style MAC, not so much.
SEVMS-style MAC was targeting the 1980's requirements.
When I talk about MAC, I am talking about SELinux style MAC, not SEVMS.
I've read the public SEVMS documentation and it is way too limiting for today's world. SELinux fits right in however. One of the things I like
about SELinux is just how fine-grained and how wide-ranging the control
is. For example, you can allow a service to make outgoing TCP connections
on some ports and deny it access to every other TCP port.
That way, even if the service gets compromised, the shellcode _still_
can't make an outgoing connection on any TCP port the service has been
denied access to.
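As a rough illustration of the default-deny idea described above, here is a small Python model. It is not SELinux policy syntax and opens no real sockets; the `ServicePolicy` class, the `connect` helper, the host names and the port numbers are all hypothetical stand-ins for a check the kernel would enforce.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServicePolicy:
    """Hypothetical per-service rule set: outgoing TCP ports the service may use."""
    name: str
    allowed_outgoing_tcp: frozenset = field(default_factory=frozenset)

    def may_connect(self, port: int) -> bool:
        # Default deny: anything not explicitly listed is refused.
        return port in self.allowed_outgoing_tcp


def connect(policy: ServicePolicy, host: str, port: int) -> str:
    """Stand-in for the kernel-side check on connect(); no real socket is opened."""
    if not policy.may_connect(port):
        raise PermissionError(f"{policy.name}: outgoing TCP port {port} denied by policy")
    return f"{policy.name} -> {host}:{port} permitted"


# Example: a web service allowed to reach a database and a directory server only.
web = ServicePolicy("webapp", frozenset({5432, 389}))

print(connect(web, "db.example.com", 5432))
try:
    # What injected shellcode might attempt after compromising the service.
    connect(web, "attacker.example.net", 4444)
except PermissionError as err:
    print(err)
```

The point of the model is that the rule is attached to the service and enforced outside it, so code running inside a compromised service cannot simply switch the check off.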
On 1/26/2024 8:16 AM, Simon Clubley wrote:
On 2024-01-25, Arne Vajhøj <arne@vajhoej.dk> wrote:
On 1/25/2024 6:59 PM, Stephen Hoffman wrote:
Jails / sandboxes can be built upon some of the parts of mandatory
access controls, but I ~never want to have to use a system configured
for SEVMS-style MAC. Jails, sure. SEVMS-style MAC, not so much.
SEVMS-style MAC was targeting the 1980's requirements.
When I talk about MAC, I am talking about SELinux style MAC, not SEVMS.
I've read the public SEVMS documentation and it is way too limiting for
today's world. SELinux fits right in however. One of the things I like
about SELinux is just how fine-grained and how wide-ranging the control
is. For example, you can allow a service to make outgoing TCP connections
on some ports and deny it access to every other TCP port.
That way, even if the service gets compromised, the shellcode _still_
can't make an outgoing connection on any TCP port the service has been
denied access to.
Is that even MAC? Elsewhere it is called a software firewall.
It is certainly a well known feature. Windows also got it.
In theory it does enhance security. With no other mitigations
in place it can prevent some problems. Like Log4Shell.
But I don't know about how much impact it has in real life.
Secure servers are already behind a firewall that by default
blocks, so outgoing traffic is blocked.
Be a bit reasonable Simon ...
On 1/25/2024 10:31 AM, Dan Cross wrote:
Some sort of userspace pluggable filesystem support.
FUSE, 9P + a mount driver, whatever.
That would also be nice.
But how many potential VMS users will consider "userspace
pluggable filesystem support" important in decision process?
On production systems, I don't think it's all that useful for users to
be able to mount and dismount filesystems, or to install their own new
filesystem drivers of their own design.
I -do- think that there is some security benefit in having the filesystem
support in user space, but I also think the performance penalty is usually
not worth it.
What -would- be useful would be the ability to plug new filesystems easily
into the kernel, along with ntfs and various fat drivers supplied as needed.
Do I need to be able to do this dynamically from user space? Not really.
... some [users] may never have seen a command prompt on any OS.
On 2024-01-26, Dave Froble <davef@tsoft-inc.com> wrote:
Be a bit reasonable Simon ...
Mandatory Access Controls or jails can absolutely be a _direct_ part of
a production environment.
I also notice you left out all the other things in my list. They can also
be a direct part of a production environment. :-)
I am being reasonable, and the list is a very reasonable list for
production environments.
Simon.
Simon Clubley formulated the question :
Does anyone have any others to add to the list ?
Yes. Some kind of automatic disaster recovery. That is, if a process,
or a set of processes, run on a system that crashes, those processes are automatically restarted on another cluster member, transparently, with
no manual intervention, and continue from the point they were at when
the system crashed. No transaction lost of any kind, and without having
to add anything in the code that those processes are running. The
operating system (or layered product) does all the work transparently.
Should work with code written 30 years ago, with ACMS applications,
anything.
On 1/28/2024 8:25 AM, Marc Van Dyck wrote:
Simon Clubley formulated the question :
Does anyone have any others to add to the list ?
Yes. Some kind of automatic disaster recovery. That is, if a process,
or a set of processes, run on a system that crashes, those processes are
automatically restarted on another cluster member, transparently, with
no manual intervention, and continue from the point they were at when
the system crashed. No transaction lost of any kind, and without having
to add anything in the code that those processes are running. The
operating system (or layered product) does all the work transparently.
Should work with code written 30 years ago, with ACMS applications,
anything.
Something like Tandem NonStop lock-step?
Arne
On 1/28/2024 8:32 AM, Arne Vajhøj wrote:
On 1/28/2024 8:25 AM, Marc Van Dyck wrote:
Simon Clubley formulated the question :
Does anyone have any others to add to the list ?
Yes. Some kind of automatic disaster recovery. That is, if a process,
or a set of processes, run on a system that crashes, those processes are
automatically restarted on another cluster member, transparently, with
no manual intervention, and continue from the point they were at when
the system crashed. No transaction lost of any kind, and without having
to add anything in the code that those processes are running. The
operating system (or layered product) does all the work transparently.
Should work with code written 30 years ago, with ACMS applications,
anything.
Something like Tandem NonStop lock-step?
Arne
Well, no, not really. What I'd envision would be what I'd call an application monitor, for lack of a better name, that would be able to know what the applications should be doing, to monitor that activity, and to do whatever necessary to continue the activity, should anything happen to that activity. Yeah, non-stop, but not the Tandem design.
Just a concept, and design and implementation might be "interesting".
I'd just note that the OSs would be included as applications, so re-starting them from where they were interrupted would be included in the concept. So, yeah, the monitor would be outside/over the OSs. Perhaps something like happens with VMs. Except VMs want to move the activity to another system, not recover on the same system.
Marc Van Dyck wrote on 29.01.2024 at 16:36:
Dave Froble formulated the question :
I'd just note that the OSs would be included as applications, so
re-starting them from where they were interrupted would be included
in the concept. So, yeah, the monitor would be outside/over the OSs.
Perhaps something like happens with VMs. Except VMs want to move the
activity to another system, not recover on the same system.
Whatever the design and implementation, this would be a really useful
and marketable addition to the OpenVMS cluster concept. Clusters were
invented 40 years ago to implement horizontal scalability, because
vertical scalability was impossible, technically or financially. This
issue has mostly disappeared today, current hardware being able to
deliver any power we might want. Today's clusters are essentially
put in place for redundancy or disaster recovery purposes ; the next
logical step should be to provide this redundancy in a transparent way
to the system user.
This should also be, as opposed to simple user niceties, something that
allows VSi to make money with.
Would OpenVMS Service Control cover your needs?
<https://vmssoftware.com/products/service-control/>
Service Control was originally developed by Wolfgang Burger at HP in
Vienna and later adopted by VSI. As far as I know it is (still) offered
as a service, not a product - but only VSI can tell.
Dave Froble formulated the question :
On 1/28/2024 8:32 AM, Arne Vajhøj wrote:
On 1/28/2024 8:25 AM, Marc Van Dyck wrote:
Simon Clubley formulated the question :
Does anyone have any others to add to the list ?
Yes. Some kind of automatic disaster recovery. That is, if a process,
or a set of processes, run on a system that crashes, those processes are
automatically restarted on another cluster member, transparently, with
no manual intervention, and continue from the point they were at when
the system crashed. No transaction lost of any kind, and without having
to add anything in the code that those processes are running. The
operating system (or layered product) does all the work transparently.
Should work with code written 30 years ago, with ACMS applications,
anything.
Something like Tandem NonStop lock-step?
Arne
Well, no, not really. What I'd envision would be what I'd call an
application monitor, for lack of a better name, that would be able to
know what the applications should be doing, to monitor that activity,
and to do whatever necessary to continue the activity, should anything
happen to that activity. Yeah, non-stop, but not the Tandem design.
Just a concept, and design and implementation might be "interesting".
I'd just note that the OSs would be included as applications, so
re-starting them from where they were interrupted would be included in
the concept. So, yeah, the monitor would be outside/over the OSs.
Perhaps something like happens with VMs. Except VMs want to move the
activity to another system, not recover on the same system.
Whatever the design and implementation, this would be a really useful
and marketable addition to the OpenVMS cluster concept. Clusters were invented 40 years ago to implement horizontal scalability, because
vertical scalability was impossible, technically or financially. This
issue has mostly disappeared today, current hardware being able to
deliver any power we might want. Today's clusters are essentially
put in place for redundancy or disaster recovery purposes ; the next
logical step should be to provide this redundancy in a transparent way
to the system user.
This should also be, as opposed to simple user niceties, something that allows VSi to make money with.
On 1/29/2024 6:21 PM, Hans Bachner wrote:
Marc Van Dyck wrote on 29.01.2024 at 16:36:
Dave Froble formulated the question :
I'd just note that the OSs would be included as applications, so re-starting
them from where they were interrupted would be included in the concept. So,
yeah, the monitor would be outside/over the OSs. Perhaps something like
happens with VMs. Except VMs want to move the activity to another system,
not recover on the same system.
Whatever the design and implementation, this would be a really useful
and marketable addition to the OpenVMS cluster concept. Clusters were
invented 40 years ago to implement horizontal scalability, because
vertical scalability was impossible, technically or financially. This
issue has mostly disappeared today, current hardware being able to
deliver any power we might want. Today's clusters are essentially
put in place for redundancy or disaster recovery purposes ; the next
logical step should be to provide this redundancy in a transparent way
to the system user.
This should also be, as opposed to simple user niceties, something that
allows VSi to make money with.
Would OpenVMS Service Control cover your needs?
<https://vmssoftware.com/products/service-control/>
Service Control was originally developed by Wolfgang Burger at HP in Vienna
and later adopted by VSI. As far as I know it is (still) offered as a service,
not a product - but only VSI can tell.
This indeed seems like the app<--->VMS equivalent of
VM<--->ESXi.
You define an app to be running on one node in the cluster
and if something happens then the software starts the app
on another node.
Like you define a VM to be running on one ESXi server in the
cluster and if something happens then VMware spins up
the VM on another ESXi server.
Arne
On 1/29/2024 7:00 PM, Arne Vajhøj wrote:
On 1/29/2024 6:21 PM, Hans Bachner wrote:
Marc Van Dyck wrote on 29.01.2024 at 16:36:
Dave Froble formulated the question :
I'd just note that the OSs would be included as applications, so re-starting
them from where they were interrupted would be included in the concept. So,
yeah, the monitor would be outside/over the OSs. Perhaps something like
happens with VMs. Except VMs want to move the activity to another system,
not recover on the same system.
Whatever the design and implementation, this would be a really useful
and marketable addition to the OpenVMS cluster concept. Clusters were
invented 40 years ago to implement horizontal scalability, because
vertical scalability was impossible, technically or financially. This
issue has mostly disappeared today, current hardware being able to
deliver any power we might want. Today's clusters are essentially
put in place for redundancy or disaster recovery purposes ; the next
logical step should be to provide this redundancy in a transparent way
to the system user.
This should also be, as opposed to simple user niceties, something that
allows VSi to make money with.
Would OpenVMS Service Control cover your needs?
<https://vmssoftware.com/products/service-control/>
Service Control was originally developed by Wolfgang Burger at HP in Vienna
and later adopted by VSI. As far as I know it is (still) offered as a service,
not a product - but only VSI can tell.
This indeed seems like the app<--->VMS equivalent of
VM<--->ESXi.
You define an app to be running on one node in the cluster
and if something happens then the software starts the app
on another node.
Like you define a VM to be running on one ESXi server in the
cluster and if something happens then VMware spins up
the VM on another ESXi server.
Arne
Well, there are apps, and then there are other apps ...
Ok, a web server handling connection requests. Perhaps one or more connections are disrupted before finishing. A re-start will begin to again handle connection requests. Perhaps reasonable.
Then, an example from one of my old customers:
Orders were built interactively, and the data was stored in an intermediate file. When done building, the intermediate file is then queued to a poster that processes the data and performs updates to all pertinent database files, then deletes the intermediate file.
Ok, what happens when the system crashes during processing of an order? Things are left incomplete and a nasty mess. Re-starting the poster will make things worse. So, just restarting is not such a good idea.
In the example, best not to process the order that was interrupted. Thankfully, this almost never happened. Thank you VMS and DEC hardware and battery backup UPS. But, it was still a possibility.
The partial solution was to build checkpoints into the design. At each specific point in the poster, a flag was set, and forced to disk, as each file update occurred. The poster was set up to respect the checkpoint flags.
Worked sort of well. There was still the possibility the checkpoint flags weren't written to disk. I didn't have an app that reviewed the information, and automatically re-queued it telling the poster where to re-start. That was a tedious manual task.
Hey, with most things, there is a point of diminishing returns on efforts. Just not worth the cost.
Please don't start ranting about a database with 2 stage commits. Didn't have one.
But my point is, just re-starting an application isn't always a solution.
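The checkpoint scheme described above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the original poster: the step names, the JSON checkpoint file and the `post_order` function are all hypothetical, and the "file update" is reduced to appending to a list.

```python
import json
import os
import tempfile
from typing import Optional

STEPS = ["inventory", "receivables", "order_history"]  # hypothetical file updates


def force_to_disk(path: str, data: dict) -> None:
    """Write the checkpoint to a temp file, flush it, then rename it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        json.dump(data, fh)
        fh.flush()
        os.fsync(fh.fileno())   # ask the OS to push the data to stable storage
    os.replace(tmp, path)       # readers see the old or the new checkpoint


def load_checkpoint(path: str) -> int:
    """Return the index of the next step to run (0 when no checkpoint exists)."""
    try:
        with open(path) as fh:
            return json.load(fh)["next_step"]
    except FileNotFoundError:
        return 0


def post_order(checkpoint_path: str, applied: list,
               crash_after: Optional[int] = None) -> None:
    """Apply each update in turn, recording progress after every step."""
    for i in range(load_checkpoint(checkpoint_path), len(STEPS)):
        applied.append(STEPS[i])                      # stand-in for the real update
        force_to_disk(checkpoint_path, {"next_step": i + 1})
        if crash_after is not None and i + 1 == crash_after:
            raise RuntimeError("simulated crash")
    os.remove(checkpoint_path)                        # order fully posted


with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "order-1001.ckpt")
    applied: list = []
    try:
        post_order(ckpt, applied, crash_after=2)      # crash after two updates
    except RuntimeError:
        pass
    post_order(ckpt, applied)                         # restart resumes at step 3
    print(applied)
```

The sketch has the same gap mentioned above: a crash after an update but before its checkpoint reaches disk would cause that update to be applied again on restart, which is why the updates would need to be idempotent or wrapped in a transaction to close it completely.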
Dave Froble wrote on 30/01/2024 :
Ok, a web server handling connection requests. Perhaps one or more
connections are disrupted before finishing. A re-start will begin to
again handle connection requests. Perhaps reasonable.
Then, an example from one of my old customers:
Orders were built interactively, and the data was stored in an
intermediate file. When done building, the intermediate file is then
queued to a poster that processes the data and performs updates to all
pertinent database files, then deletes the intermediate file.
Ok, what happens when the system crashes during processing of an
order? Things are left incomplete and a nasty mess. Re-starting the
poster will make things worse. So, just restarting is not such a good
idea.
In the example, best not to process the order that was interrupted.
Thankfully, this almost never happened. Thank you VMS and DEC
hardware and battery backup UPS. But, it was still a possibility.
The partial solution was to build checkpoints into the design. At
each specific point in the poster, a flag was set, and forced to disk,
as each file update occurred. The poster was set up to respect the
checkpoint flags.  Worked sort of well. There was still the
possibility the checkpoint flags weren't written to disk. I didn't
have an app that reviewed the information, and automatically re-queued
it telling the poster where to re-start. That was a tedious manual task.
Hey, with most things, there is a point of diminishing returns on
efforts. Just not worth the cost.
Please don't start ranting about a database with 2 stage commits.
Didn't have one.
But my point is, just re-starting an application isn't always a solution.
No, just restarting isn't the solution. And engineering the application
to support random restarts isn't either. Just select a process from a
system window, drag and drop it in another system window, and it
continues to run on the other system as if nothing happened. That's what
I'm after...
On 1/30/2024 5:20 AM, Marc Van Dyck wrote:
Dave Froble wrote on 30/01/2024 :
Ok, a web server handling connection requests. Perhaps one or more
connections are disrupted before finishing. A re-start will begin to again
handle connection requests. Perhaps reasonable.
Then, an example from one of my old customers:
Orders were built interactively, and the data was stored in an intermediate
file. When done building, the intermediate file is then queued to a poster
that processes the data and performs updates to all pertinent database files,
then deletes the intermediate file.
Ok, what happens when the system crashes during processing of an order?
Things are left incomplete and a nasty mess. Re-starting the poster will
make things worse. So, just restarting is not such a good idea.
In the example, best not to process the order that was interrupted.
Thankfully, this almost never happened. Thank you VMS and DEC hardware and
battery backup UPS. But, it was still a possibility.
The partial solution was to build checkpoints into the design. At each
specific point in the poster, a flag was set, and forced to disk, as each
file update occurred. The poster was set up to respect the checkpoint flags.
Worked sort of well. There was still the possibility the checkpoint flags
weren't written to disk. I didn't have an app that reviewed the information,
and automatically re-queued it telling the poster where to re-start. That
was a tedious manual task.
Hey, with most things, there is a point of diminishing returns on efforts.
Just not worth the cost.
Please don't start ranting about a database with 2 stage commits. Didn't
have one.
But my point is, just re-starting an application isn't always a solution.
No, just restarting isn't the solution. And engineering the application
to support random restarts isn't either. Just select a process from a
system window, drag and drop it in another system window, and it
continues to run on the other system as if nothing happened. That's what
I'm after...
There are different models for HA:
A) application managed - the application stores state somewhere where
a new instance can pick it up - this is not that hard to implement
but the application needs to be written for it
B) system managed - the system stores state somewhere where
a new instance can pick it up - this is hard to implement
but the application doesn't need to be written for it
A2) same as A with a feature where the system can move the application
from one node to another node - don't schedule the
processes/threads, copy memory content to other node, get various
files/network connections opened on the other node, schedule the
processes/threads on the new node, kill the instance on the
old node - harder than A but easier than B