Forum: >>> Magnum BBS <<<

Terminate milter processing early.

From G.W. Haywood@21:1/5 to All on Thu Jun 3 15:56:49 2021

This is the situation which I face right now.

Suppose I run two milters to filter out unwanted mail.

One has a large memory footprint, but it executes quickly.

The other has a small memory footprint, but can take a long time to
execute - because it does things like lookups in DNSBLs which can
easily take ten seconds.

Both milters allocate resources, and so both need to hook the 'abort'
and 'close' callbacks so they can clean up after themselves.

The large, fast milter may know that it has nothing left to do for any
particular message, but since Sendmail doesn't call its close callback
until after the second milter has finished its processing (even if the
first milter returns 'SMFIS_ACCEPT' from the connect callback) it must
stick around doing nothing but waste resources until the second milter
has terminated.

There's room in memory for half a dozen copies of the large milter, a
hundred or more of the small one. If the average execution times for
the two milters differ by a factor of twenty, then running five copies
of the large milter to a hundred of the small one seems about right.
But there's no point in running more copies of the smaller, slower
milter than of the larger, faster one because as things are the slow
ones just hold things up for the fast ones.
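
As a rough illustration of the arithmetic above, here is a minimal sketch. The ten-second figure comes from the DNSBL remark; the 0.5 s figure is simply ten seconds divided by the stated factor of twenty; and the simplification that a held copy is tied up for about as long as the slow milter runs is an assumption, not a measurement.

```python
# Rough capacity arithmetic; all figures are illustrative assumptions.
SLOW_SECONDS = 10.0                # "can easily take ten seconds"
FAST_SECONDS = SLOW_SECONDS / 20   # factor of twenty => 0.5 s
LARGE_COPIES = 5
SMALL_COPIES = 100


def throughput(copies: int, seconds_held: float) -> float:
    """Messages per second a pool sustains if each copy is busy
    (or merely held) for seconds_held per message."""
    return copies / seconds_held


# If each large copy were released as soon as it had finished its work:
large_if_released = throughput(LARGE_COPIES, FAST_SECONDS)

# The pool of small, slow milters:
small_pool = throughput(SMALL_COPIES, SLOW_SECONDS)

# As things are: each large copy is held until the slow milter finishes.
large_if_held = throughput(LARGE_COPIES, SLOW_SECONDS)

print(large_if_released, small_pool, large_if_held)
```

Under these assumptions the two pools balance only if the large milter is released early; when it is held, the five large copies cap the whole system well below what the hundred small ones could handle.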

Ideally, I should like to be able to tell Sendmail to call the close
callback for the first milter when the first milter knows it's done
all it needs to do, but let the second milter continue processing.
The first milter can then see the next message in the queue and get on
with things.

My search skills don't seem to be up to finding how to do this, or if
it's even possible. Thanks in advance for any thoughts. Suggestions
of the "buy more memory" variety will be sent to /dev/null.

Ged.

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Claus Aßmann@21:1/5 to G.W. Haywood on Thu Jun 3 19:28:36 2021

G.W. Haywood wrote:

The large, fast milter may know that it has nothing left to do for any
particular message, but since Sendmail doesn't call its close callback
until after the second milter has finished its processing (even if the
first milter returns 'SMFIS_ACCEPT' from the connect callback) it must
stick around doing nothing but waste resources until the second milter
has terminated.

Why doesn't the "large" milter invoke its cleanup function as soon
as it "know[s] that it has nothing left to do"? The cleanup function
can set a flag in the milter context that it was called and did its
work (release all the memory), so further calls can just check that.
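
A minimal sketch of the flag idea, written in Python purely for brevity (the thread concerns C and Perl milters). The class, the function names and the string return values are all hypothetical stand-ins; nothing here is the libmilter or Sendmail::PMilter API.

```python
class MilterContext:
    """Hypothetical per-connection state; stands in for the private
    data a real milter would attach to its context."""

    def __init__(self) -> None:
        self.buffers = {"headers": bytearray(1024), "body": bytearray(4096)}
        self.cleaned_up = False


def cleanup(ctx: MilterContext) -> bool:
    """Release resources at most once. Returns True if this call did the work."""
    if ctx.cleaned_up:
        return False          # already done; nothing to release
    ctx.buffers.clear()       # drop the references so the memory can be reclaimed
    ctx.cleaned_up = True
    return True


def on_connect(ctx: MilterContext) -> str:
    # The milter decides it has nothing left to do, so it cleans up now.
    cleanup(ctx)
    return "ACCEPT"           # placeholder for SMFIS_ACCEPT


def on_close(ctx: MilterContext) -> str:
    # Called later by the library; harmless if cleanup already ran.
    cleanup(ctx)
    return "CONTINUE"         # placeholder for SMFIS_CONTINUE
```

Because the flag makes cleanup idempotent, the later close (or abort) call needs no special handling.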

--
Note: please read the netiquette before posting. I will almost never
reply to top-postings which include a full copy of the previous
article(s) at the end because it's annoying, shows that the poster
is too lazy to trim his article, and it's wasting the time of all readers.

From G.W. Haywood@21:1/5 to Claus Assmann on Fri Jun 4 16:15:09 2021

On Thu, 3 Jun 2021, Claus Assmann wrote:

G.W. Haywood wrote:

The large, fast milter may know that it has nothing left to do for any
particular message, but since Sendmail doesn't call its close callback
until after the second milter has finished its processing (even if the
first milter returns 'SMFIS_ACCEPT' from the connect callback) it must
stick around doing nothing but waste resources until the second milter
has terminated.

Why doesn't the "large" milter invoke its cleanup function as soon
as it "know[s] that it has nothing left to do"? The cleanup function
can set a flag in the milter context that it was called and did its
work (release all the memory), so further calls can just check that.

Thanks, but it's not quite as simple as that.

Yes it could clean up as it goes along, but there's a *lot* of state
information. Some of it is per message, some is per connection, some
is per recipient, some per sender, some is persistent... In a milter
with ten or more callbacks, each having not only its 'personal' set of
state information but also with some overlaps between them, it quickly
gets ugly, difficult to manage - and so, error-prone. New code would
carry the risks of any code which would otherwise not be needed. I've
spent over five years working on the large milter. It now does what I
want, and it's more or less ready for others to try using it. They're
queueing. Doing what you suggest would mean another significant chunk
of work coding and testing before I'd feel comfortable releasing it (I
don't have the energy to maintain two versions of it, so releasing one
with 'faster' resource management and one without isn't an option).

The 'close' and 'abort' callbacks are there for exactly this purpose.
They exist anyway and have been exercised for years; it would be much
easier, cleaner and more efficient if they could be used as I suggest.

Is there no way, then, short of hacking the Sendmail source? As you
know I'm not above trying my hand at that. I'd have thought it would
be a good _FFR_ to have in Sendmail anyway. I'm probably not the best
qualified around here to take it on, although I routinely run a couple
of other patches (which would very likely make your toes curl).

Ged.

From Claus Aßmann@21:1/5 to G.W. Haywood on Sat Jun 5 07:15:08 2021

G.W. Haywood wrote:

On Thu, 3 Jun 2021, Claus Assmann wrote:

Why doesn't the "large" milter invoke its cleanup function as soon
as it "know[s] that it has nothing left to do"? The cleanup function
can set a flag in the milter context that it was called and did its
work (release all the memory), so further calls can just check that.

Thanks, but it's not quite as simple as that.

Maybe someone else understands the problem because I don't.

! Ideally, I should like to be able to tell Sendmail to call the close
! callback for the first milter when the first milter knows it's done

Why does it make a difference between sendmail "call[ing] the close
callback" and your own milter doing that?

! The large, fast milter may know that it has nothing left to do for any
! particular message, but since Sendmail doesn't call its close callback
! until after the second milter has finished its processing (even if the
! first milter returns 'SMFIS_ACCEPT' from the connect callback)

If your "milter returns 'SMFIS_ACCEPT' from the connect callback"
then why would it even allocate resources which it doesn't need?
If your milter returns SMFIS_ACCEPT from a transaction callback
then sendmail has no way to know that your milter doesn't even want
the subsequent transactions, i.e., you would have to come up with
another return code.

Anyway, you got the source, so you can hack it to do what you
consider necessary for your special situation.

--
Note: please read the netiquette before posting. I will almost never
reply to top-postings which include a full copy of the previous
article(s) at the end because it's annoying, shows that the poster
is too lazy to trim his article, and it's wasting the time of all readers.

From G.W. Haywood@21:1/5 to Claus Assmann on Sat Jun 5 12:36:13 2021

On Sat, 5 Jun 2021, Claus Assmann wrote:

Maybe someone else understands the problem because I don't.

I guess I haven't explained it very well. Let me try again.

I have two milters. One is big (huge) but fast, the other is small
but it can be very slow. Often, the small one holds up the big one so
that the big one consumes a lot of resources while doing *nothing* but
wait for the MTA to release it from duty. That can be a nuisance: it
can hold up the flow of mail because there's no milter child process
available for Sendmail to connect to when a new connection comes in.
All the existing child processes are busy, some of them doing nothing.

! Ideally, I should like to be able to tell Sendmail to call the close
! callback for the first milter when the first milter knows it's done

Why does it make a difference between sendmail "call[ing] the close
callback" and your own milter doing that?

Until I read your question above, I had the impression that callbacks
are for the MTA to call. It never occurred to me to call them myself
and I'd have expected doing that to break the communications between
Sendmail and the milter. Does Sendmail not care if I call the close
callback myself? If it's allowed by the milter protocol, then the
problem might be solved. :) Granted I haven't read that it isn't, I
guess I just ass-u-me-d it isn't. I'll experiment with the idea.

If your "milter returns 'SMFIS_ACCEPT' from the connect callback"
then why would it even allocate resources which it doesn't need?

As I said, some of the state is persistent and it allocates quite a
variety of resources, some of which it allocates at 'connect' in the
expectation of using it both there and in later callbacks. When it
deallocates, it has to choose what it frees (garbage collection) and
when it frees it. On top of all this it's written in Perl so before
it even allocates any storage the thing starts off pretty big anyway
and there's not a lot I can do about that - short of rewriting parts
of it in C, which is on the cards when some of my outlandish theory
is either justified or proved not to be useful. It generally looks
like I'm on the right track, at least other people seem to see much
more spam and malicious stuff in their INBOX than I do.

If your milter returns SMFIS_ACCEPT from a transaction callback
then sendmail has no way to know that your milter doesn't even want
the subsequent transactions, i.e., you would have to come up with
another return code.

Hmmm, that doesn't sound so good for my situation and it's not what
I've taken home from the docs. The milter docs state as follows:

[quote source=file:///.../sendmail-8.17.0.0/libmilter/docs/api.html]

SMFIS_ACCEPT

For a connection-oriented routine, accept this connection without
further filter processing; ***CALL XXFI_CLOSE***. <=== my emphasis.

For a message- or recipient-oriented routine, accept this message
without further filtering.

[/quote]
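
The two cases in the quoted text can be modelled roughly as follows. This is only a sketch of one reading of the documentation: the stage names are shortened forms of the libmilter callback names, the grouping into connection and message stages is an assumption, and the helper functions are hypothetical.

```python
# Grouping of callbacks is an assumption based on the libmilter callback names.
CONNECTION_STAGES = ("connect", "helo")
MESSAGE_STAGES = ("envfrom", "envrcpt", "header", "eoh", "body", "eom")


def accept_scope(stage: str) -> str:
    """What an ACCEPT returned at this stage covers, per the quoted text."""
    if stage in CONNECTION_STAGES:
        return "connection"   # no further filtering on this connection
    if stage in MESSAGE_STAGES:
        return "message"      # no further filtering of this message only
    raise ValueError(f"unknown stage: {stage}")


def filtered_again(stage: str, new_message_same_connection: bool) -> bool:
    """Would the milter be consulted again after returning ACCEPT at stage?"""
    return accept_scope(stage) == "message" and new_message_same_connection
```

On this reading, an accept from a message-level callback still leaves the milter in play for any later message on the same connection, whereas an accept from connect leaves only the close to come.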

There it says "call xxfi_close", which at first I took to mean that
the MTA would call xxfi_close immediately. Unfortunately it doesn't
call the callback until after completing all other milter processing
which is why I'm in this, er, bind. My naive thought processes ran
along the lines of

"If the milter replies SMFIS_ACCEPT, then that must tell Sendmail that
the only subsequent transaction will be close (or perhaps abort)".

Is that wrong? If so, why?

Anyway, you got the source, so you can hack it to do what you
consider necessary for your special situation.

Quite so. Thanks again for your help, and for your dedication in the
face of all the stupid questions.

Ged.

From Claus Aßmann@21:1/5 to G.W. Haywood on Sun Jun 6 19:33:58 2021

G.W. Haywood wrote:

On Sat, 5 Jun 2021, Claus Assmann wrote:

Why does it make a difference between sendmail "call[ing] the close
callback" and your own milter doing that?

Until I read your question above, I had the impression that callbacks
are for the MTA to call. It never occurred to me to call them myself

In C they are functions which are called by libmilter,
using a documented API and required return values.

and I'd have expected doing that to break the communications between
Sendmail and the milter. Does Sendmail not care if I call the close
callback myself? If it's allowed by the milter protocol, then the

In C the callbacks do not communicate with the MTA,
obviously that's only handled by libmilter.

I don't know anything about whatever "libmilter" perl uses,
so you would have to check its docs.

--
Note: please read the netiquette before posting. I will almost never
reply to top-postings which include a full copy of the previous
article(s) at the end because it's annoying, shows that the poster
is too lazy to trim his article, and it's wasting the time of all readers.

From G.W. Haywood@21:1/5 to Claus Assmann on Tue Jun 8 16:00:21 2021

On Sun, 6 Jun 2021, Claus Assmann wrote:

G.W. Haywood wrote:

On Sat, 5 Jun 2021, Claus Assmann wrote:

Why does it make a difference between sendmail "call[ing] the close
callback" and your own milter doing that?

Until I read your question above, I had the impression that callbacks
are for the MTA to call. It never occurred to me to call them myself

In C they are functions which are called by libmilter,
using a documented API and required return values.

Sorry, I wasn't clear. I was treating libmilter as part of Sendmail
itself, because libmilter is normally compiled together with Sendmail
(at least that's how I've always done it for milters written in C).

and I'd have expected doing that to break the communications between
Sendmail and the milter. Does Sendmail not care if I call the close
callback myself? If it's allowed by the milter protocol, then the

In C the callbacks do not communicate with the MTA,
obviously that's only handled by libmilter.

Yes, I understand that of course. But the question stands, or maybe
phrased in a different way: Does Sendmail not care if close callbacks
are called *other* than because Sendmail has for example seen the end
of the message plus the client's 'QUIT' command and told the milters
(via libmilter) about that?

I don't know anything about whatever "libmilter" perl uses,
so you would have to check its docs.

I'm the maintainer of the Perl equivalent of libmilter which I use. I
took it over in a sadly unmaintained state a few years ago, fixed the
many reported bugs and overhauled the docs. The interface is called
Sendmail::PMilter. (For anyone else reading that's not to be confused
with Sendmail::Milter, which is also unmaintained and seems abandoned
by its author. I tried to take over maintainership of that too, but
the author was most unhelpful.)

Sendmail::PMilter does for Perl milters more or less what libmilter
does for C milters. It closely mimics the libmilter communications.
But now you have me asking if it makes a difference whether libmilter
is built or not. That also hadn't occurred to me. Thanks! It looks
like I have a bit more work to do.

PS: Sorry I have to keep changing the Eszett to 'ss'. My news client
(Alpine) chokes on that character and refuses to send the message.
