Forum: >>> Magnum BBS <<<

Scheduling algorithms for "foreign" binaries

From Don Y@21:1/5 to All on Fri Jan 3 16:15:15 2020

I have an "open" system to which "uncertified" binaries can be added,
at any time. Once added, they can be invoked at will (even their own!).

[Core services are implemented at a different level -- different
resource controls, scheduling, etc. -- to ensure their continuous
availability and service guarantees. These services tend to be
either more demanding (e.g., multimedia) or essential (e.g., comms).
Letting foreign binaries compete with that closed system jeopardizes
those guarantees and the functionality of the system, as a whole]

The resources available in the system vary, over time. This is a
consequence of the dynamic nature of task invocations as well as
the dynamic nature of hardware availability (i.e., more nodes
on-line means more potential resources). Because the system is open,
none of these decisions can be made at design time.

Resource usage per task (lowercase T) is tracked and constrained.
So, the system can see who's using what and act on (against!)
abusers. This can be with or without prejudice (e.g., "Sorry,
I assume you're only using the resources that you NEED to perform
your job but we don't have those resources to spare, currently.
Too bad, so sad, good bye!")

The real-time scheduler makes scheduling decisions based on deadline data
(with input from the resource manager... some greedy task may be deemed
a poor choice to run even though it looks to be the most urgent!)

With no certifying authority in place, there's nothing to keep a task
from claiming it has urgent deadlines in an attempt to jump the queue.
This, of course, can work against it because the scheduler's job isn't
to ensure the most urgent deadlines are met but, rather, that the
most "work" gets done (because there is no independent authority
to indicate that task A's job is more meaningful than task B's!).
So, the scheduler may decide that the resource (i.e., time) spent
on that "urgent" task is better spent on helping some other task(s)
achieve *their* goals. (Too bad, so sad, good bye!)
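
One way to make "most work gets done" concrete is to maximize the number of tasks that finish by their deadlines on a single processor. The sketch below uses the classic Moore-Hodgson rule for that; equating "work" with a count of on-time tasks is an assumption, as are the function name and the (name, run_time, deadline) tuples. Nothing here is taken from the actual scheduler.

```python
import heapq


def max_on_time(tasks):
    """Pick the largest set of tasks that can all meet their deadlines.

    tasks: iterable of (name, run_time, deadline), times in seconds from now.
    Moore-Hodgson: take tasks in deadline order; whenever the running total
    overshoots a deadline, drop the longest task accepted so far.
    """
    kept = []        # max-heap on run_time (stored negated)
    elapsed = 0.0
    for name, run_time, deadline in sorted(tasks, key=lambda t: t[2]):
        heapq.heappush(kept, (-run_time, name))
        elapsed += run_time
        if elapsed > deadline:
            neg_longest, _dropped = heapq.heappop(kept)
            elapsed += neg_longest   # subtracts the dropped task's run time
    return sorted(name for _, name in kept)


# The task claiming the earliest deadline is also the most expensive one,
# so it is the one that gets dropped.
print(max_on_time([("greedy", 5.0, 5.0), ("a", 2.0, 6.0), ("b", 2.0, 7.0)]))
```

Under this rule an early deadline buys nothing by itself: a task that claims urgency and also needs a lot of time is the first candidate to be shed.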

Treating the expressed deadline as always "hard" -- after which, there
is no point in continuing the task's work -- is artificially draconian.
If a (foreign!) task knows that it faces termination if its deadline
*happens* to be too soon (for current conditions), then there is
pressure on it to delay that deadline -- perhaps too long.

There are occasions when a "softer" deadline is more appropriate and
could provide opportunity for the workload scheduler to bring more
resources on-line and rebalance the load.

This suggests a simple/intuitive approach is to allow these foreign tasks
to express TWO deadlines:
- the deadline that they would LIKE to meet
- the deadline after which their efforts are irrelevant

Noting, of course, that they might not meet *either* of them and, in
some cases, may never be granted admission if the workload scheduler
deems resources to be insufficient.
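
As an illustration only, the two deadlines can be read as a value curve: full value up to the "like to meet" deadline, no value after the "irrelevant" deadline. The linear fall-off in between is an assumption (the post does not specify a shape), and the names DualDeadline and value_at are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DualDeadline:
    """Hypothetical descriptor a foreign task might hand to the scheduler."""
    soft: float  # seconds after release: when the task would LIKE to finish
    hard: float  # seconds after release: after this the result is irrelevant

    def __post_init__(self):
        if not 0 < self.soft <= self.hard:
            raise ValueError("need 0 < soft <= hard")


def value_at(d: DualDeadline, t: float) -> float:
    """Fraction of the task's value left if it completes t seconds after release.

    Assumed shape: 1.0 up to the soft deadline, linear decay between soft
    and hard, 0.0 from the hard deadline onwards.
    """
    if t <= d.soft:
        return 1.0
    if t >= d.hard:
        return 0.0
    return (d.hard - t) / (d.hard - d.soft)
```

With a curve like this, a scheduler could compare what it loses by delaying one task against what it gains by running another, rather than treating every missed deadline as fatal.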

Two questions:
- what situation(s) does this approach fail to address?
- how could a hostile actor manipulate these parameters to game the system
(i.e., artificially inflate his significance)?

There's an AI that advises the workload scheduler so the system can
eventually adjust its bias towards (or against) the task in question.
But, the time for the AI to learn that can be long (in user time),
depending on how often the task is invoked and how often the system
needs to kill() it.

Email preferred (for those of you that have access to it)

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Richard Damon@21:1/5 to Don Y on Fri Jan 3 22:02:07 2020

On 1/3/20 6:15 PM, Don Y wrote:

> Email preferred (for those of you that have access to it)

(Hard to e-mail to an .invalid address)

Handling totally unverified programs is of course tough. I presume the
system sandboxes each application well enough that they can't really do
any harm besides using up resources, but that does sort of limit what
they can productively do.

My general approach to a system like this is to schedule based on some
performance vs. cost metric, with cost being a combination of resources
needed and urgency. Tasks that want resources NOW pay a premium for
them, and if they don't quickly produce some useful work, they quickly
starve themselves of resources since they can't afford them. You only
penalize tasks that start with short deadlines; if a task has been
around for a while and its deadline starts to loom, but resources
haven't been given to it, it gets some real priority.
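
A rough sketch of that pricing idea, under stated assumptions: each task has a credit account, a time slice costs its resource units multiplied by an urgency premium, and useful work earns credits back. The names (Account, price, run_slice) and the constants (a 10-second horizon, 2 credits per unit of work) are hypothetical, and the aging rule for long-waiting tasks is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical per-task credit account."""
    credits: float


def price(resource_units: float, time_to_deadline: float,
          horizon: float = 10.0) -> float:
    """Cost of one slice: resources scaled by an urgency premium.

    Assumption: the premium grows once the claimed deadline is closer than
    `horizon` seconds; more distant deadlines pay the base rate.
    """
    urgency = max(1.0, horizon / max(time_to_deadline, 1e-3))
    return resource_units * urgency


def run_slice(acct: Account, resource_units: float, time_to_deadline: float,
              work_done: float, reward_per_unit: float = 2.0) -> bool:
    """Charge for one slice and credit the useful work it produced.

    Returns False (slice refused) when the task cannot afford the price.
    """
    cost = price(resource_units, time_to_deadline)
    if cost > acct.credits:
        return False
    acct.credits += work_done * reward_per_unit - cost
    return True
```

A task that keeps claiming a one-second deadline while producing nothing burns through 20 credits in two slices and is then refused; a task with a distant deadline that produces work can end each slice with more credit than it started with.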


From Don Y@21:1/5 to Richard Damon on Sat Jan 4 18:47:21 2020

Hi Richard,

On 1/3/2020 8:02 PM, Richard Damon wrote:

> Handling totally unverified programs is of course tough. I presume the
> system sandboxes each application well enough that they can't really do
> any harm besides using up resources, but that does sort of limit what
> they can productively do.

Yes. The CM system requires each "app" to declare (in fine-grained detail)
its resource requirements along with the interface(s) and operations that
it expects to use as well as the interfaces/operations that it will provide.
The idea is that the user can SEE the "costs" and potential interactions
before agreeing to admit the app to their system. (I'll build an AI to help
advise the user as he typically won't be able to evaluate all of this
technical detail.)

Because this information is published, *other* folks can comment on the
relative merit of the app's "requirements" vs. functionality offered
(do you really want a "flashlight app" looking at your address book?).

The installer sets up the machinery to ensure these resources and interfaces
will be made available (when the task/app is invoked). The OS subsequently
ensures that nothing that hasn't been explicitly declared and "agreed upon"
(contract) is available to the app.

Additionally, if an app TRIES to do something that it hasn't declared
to the installer, then it is unceremoniously killed, marked as "hostile",
and blacklisted from running, thereafter. (There's no conceivable
explanation for trying to do something that you shouldn't; you're either
*malicious* or *buggy*!)
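
The declare-then-enforce rule could look roughly like the sketch below. The Installer class, its method names, and the operation strings are all hypothetical; adding an app to the blacklist stands in for the real kill and "hostile" marking.

```python
class Installer:
    """Hypothetical registry of what each app declared at install time."""

    def __init__(self):
        self.contracts = {}     # app name -> frozenset of declared operations
        self.blacklist = set()  # apps marked "hostile"

    def install(self, app, declared_ops):
        """Record the contract the user agreed to when admitting the app."""
        self.contracts[app] = frozenset(declared_ops)

    def request(self, app, op):
        """Return True if `op` may proceed.

        An undeclared operation gets the app blacklisted, after which every
        further request from it is refused.
        """
        if app in self.blacklist or app not in self.contracts:
            return False
        if op not in self.contracts[app]:
            self.blacklist.add(app)  # stands in for kill + "hostile" marking
            return False
        return True


inst = Installer()
inst.install("flashlight", {"led.on", "led.off"})
inst.request("flashlight", "led.on")            # allowed: declared
inst.request("flashlight", "addressbook.read")  # refused, app blacklisted
```

Note that the check is all-or-nothing: one undeclared request is enough to bar the app, which matches the "malicious or buggy" reasoning above.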

But, I can't prevent an app from using the resources that it *has* been
granted on admission -- unless the system's instantaneous needs (e.g.,
load) indicate a need for constraint.

The scheduling criteria are the means by which I specify the limits
on the app's need for "TIMELY work time".

> My general approach to a system like this is to schedule based on some
> performance vs. cost metric, with cost being a combination of resources
> needed and urgency. Tasks that want resources NOW pay a premium for
> them, and if they don't quickly produce some useful work, they quickly
> starve themselves of resources since they can't afford them. You only
> penalize tasks that start with short deadlines; if a task has been
> around for a while and its deadline starts to loom, but resources
> haven't been given to it, it gets some real priority.

The problem is that you can't easily deduce "importance" from any
scheduling metric. And if you let the task define its own importance,
you rely on the benevolence AND omniscience of the developer to
come up with a "correct" assessment of its self-worth; does he have
adequate understanding of the other tasks in the system to be able to
objectively rate THIS task in that continuum of importance?

[This is why static priority schemes are silly; all they do is provide
an /ex post facto/ mechanism to tweak the system's performance to compensate
for issues that should have been accommodated in the *design*. By extension,
quasi-dynamic schemes are equally so. Open the system to foreign binaries
and you may as well assume everything will "need" to run at MAX_PRIORITY!]

A task with a short deadline (lumping the "soft" and "hard" together, for
the moment) and minimal resource requirements may be totally insignificant.
E.g., a task that winks an indicator/annunciator (short deadline, minimal
resources) can likely be elided without impacting system performance (esp.
if its dismissal is only transient). OTOH, a task with a distant deadline
might be VERY important, regardless of the amount of resources used.

I'm looking for something that is easy for a developer to "grok" (a very
deliberate choice of words!) and relatively easy for him to quantify and
express (I don't want him to have to compute some "work metric" for his
application in order to determine its viability as a competing app).

*And*, I want it easy/apparent for the potentially hostile/exploitive
developer to understand how his "greed" can backfire on his application's
performance/utility ("You want all these resources? Well, then I guess
YOU are the prime candidate to delay or kill-off when the system load
increases! <smile>")

The dual deadline *feels* like it should be easy to understand AND
express. E.g., you'd *like* that "1_Hz_indicator_wink" to be invoked
at a very specific time/deadline (perhaps 0.5 seconds?). And, it surely
wouldn't make any sense to let it run much later than 0.9999999 seconds;
it's "worthless" beyond that point! You'd *like* that button press to be
acted upon within ~200ms -- and definitely wouldn't want the action to
be more than a second or two delayed (lest the user "forget" that he
pressed the button). OTOH, if you're tagging "commercials" in a media
stream, you only need to get it done before the user is likely going to
want to view that edited stream (hours? days??)!
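
Written down as data, those three examples might look like the table below, with a helper that classifies a completion time against the pair of deadlines. The wink and button figures come from the paragraph above (taking 2 seconds for "a second or two"); the 4-hour and 2-day figures for commercial tagging are assumed stand-ins for "hours? days??", and the names EXAMPLES and outcome are hypothetical.

```python
# (preferred deadline, last useful deadline), in seconds after the request
EXAMPLES = {
    "1_Hz_indicator_wink": (0.5, 0.9999999),
    "button_press": (0.2, 2.0),
    "tag_commercials": (4 * 3600.0, 2 * 86400.0),  # assumed: 4 hours, 2 days
}


def outcome(task: str, finished_after: float) -> str:
    """Classify a completion time against the task's two deadlines."""
    soft, hard = EXAMPLES[task]
    if finished_after <= soft:
        return "on time"
    if finished_after <= hard:
        return "late but still useful"
    return "worthless"


print(outcome("button_press", 0.15))          # on time
print(outcome("button_press", 1.0))           # late but still useful
print(outcome("1_Hz_indicator_wink", 1.2))    # worthless
```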

Furthermore, these criteria remain valid regardless of *where* the task
is executing; they don't need to be "adjusted" based on the "priorities"
of any co-resident tasks if it migrates to a different node! (though its
RELATIVE "importance" may change, based on the actors that it's competing
with in its new execution environment). And, the criteria don't change
when an entirely new set of ADDITIONAL tasks is added to the system!

I can't think of a situation that the dual-deadline criteria don't
address -- even if only suboptimally. Nor can I think of a hack that
allows it to be exploited /with a likelihood of success/! (you're just
as likely to end up shooting yourself in the foot if you try to game
the system)

