• Terminate milter processing early.

    From G.W. Haywood@21:1/5 to All on Thu Jun 3 15:56:49 2021
    This is the situation which I face right now.

    Suppose I run two milters to filter out unwanted mail.

    One has a large memory footprint, but it executes quickly.

    The other has a small memory footprint, but can take a long time to
    execute - because it does things like lookups in DNSBLs which can
    easily take ten seconds.

    Both milters allocate resources, and so both need to hook the 'abort'
    and 'close' callbacks so they can clean up after themselves.

    The large, fast milter may know that it has nothing left to do for any
    particular message, but since Sendmail doesn't call its close callback
    until after the second milter has finished its processing (even if the
    first milter returns 'SMFIS_ACCEPT' from the connect callback) it must
    stick around doing nothing but waste resources until the second milter
    has terminated.

    There's room in memory for half a dozen copies of the large milter, a
    hundred or more of the small one. If the average execution times for
    the two milters differ by a factor of twenty, then running five copies
    of the large milter to a hundred of the small one seems about right.
    But there's no point in running more copies of the smaller, slower
    milter than of the larger, faster one because as things are the slow
    ones just hold things up for the fast ones.
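
    As a rough check on the arithmetic above (a sketch, not a measurement):
    assume the fast milter needs about 0.5 s per message and the slow one
    about 10 s, which matches the factor of twenty and the ten-second DNSBL
    lookups mentioned earlier. The function name and both timing figures are
    illustrative assumptions, not values from a real installation.

```python
def max_throughput(copies: int, seconds_held_per_message: float) -> float:
    """Upper bound on messages per second that a pool of milter copies
    can serve when each copy is tied up for the given time per message."""
    return copies / seconds_held_per_message


FAST_SECONDS = 0.5   # assumed work time of the large, fast milter
SLOW_SECONDS = 10.0  # assumed work time of the small, slow milter (DNSBL lookups)

# If each milter were released as soon as its own work was done,
# 5 large copies and 100 small copies would be evenly matched.
ideal_large = max_throughput(5, FAST_SECONDS)
ideal_small = max_throughput(100, SLOW_SECONDS)

# As things are, a large copy stays tied up until the slow milter has
# finished too, so it is held for roughly the slow milter's time.
actual_large = max_throughput(5, SLOW_SECONDS)

print(f"ideal:  large {ideal_large:.1f}/s, small {ideal_small:.1f}/s")
print(f"actual: large {actual_large:.1f}/s (the bottleneck)")
```

    Under these assumed figures, five large copies keep pace with a hundred
    small ones only if they are released early; held for the full ten
    seconds, they cap the whole system at about one message every two
    seconds.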

    Ideally, I should like to be able to tell Sendmail to call the close
    callback for the first milter when the first milter knows it's done
    all it needs to do, but let the second milter continue processing.
    The first milter can then see the next message in the queue and get on
    with things.

    My search skills don't seem to be up to finding how to do this, or if
    it's even possible. Thanks in advance for any thoughts. Suggestions
    of the "buy more memory" variety will be sent to /dev/null.

    Ged.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Claus Aßmann@21:1/5 to G.W. Haywood on Thu Jun 3 19:28:36 2021
    G.W. Haywood wrote:

    The large, fast milter may know that it has nothing left to do for any
    particular message, but since Sendmail doesn't call its close callback
    until after the second milter has finished its processing (even if the
    first milter returns 'SMFIS_ACCEPT' from the connect callback) it must
    stick around doing nothing but waste resources until the second milter
    has terminated.

    Why doesn't the "large" milter invoke its cleanup function as soon
    as it "know[s] that it has nothing left to do"? The cleanup function
    can set a flag in the milter context that it was called and did its
    work (release all the memory), so further calls can just check that.

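    The suggestion above might be sketched as follows. This is a plain
    Python model, not libmilter or Sendmail::PMilter code: the class
    MilterContext, its fields, and the functions on_connect and on_close
    are hypothetical stand-ins for a milter's private context and its
    callbacks, and the strings "ACCEPT" and "CONTINUE" merely stand in for
    SMFIS_ACCEPT and SMFIS_CONTINUE.

```python
class MilterContext:
    """Hypothetical per-connection state held by the large milter."""

    def __init__(self):
        # Stands in for the milter's large allocations.
        self.big_state = {"tables": list(range(1000))}
        self.cleaned_up = False
        self.cleanup_runs = 0  # only here so the example can show what happened

    def cleanup(self):
        """Release everything once; later calls are harmless no-ops."""
        if self.cleaned_up:
            return
        self.big_state = None
        self.cleaned_up = True
        self.cleanup_runs += 1


def on_connect(ctx, nothing_left_to_do):
    """Model of a connect callback: clean up early when the milter is done."""
    if nothing_left_to_do:
        ctx.cleanup()
        return "ACCEPT"
    return "CONTINUE"


def on_close(ctx):
    """Model of the close callback, which the library still calls later."""
    ctx.cleanup()
    return "CONTINUE"


ctx = MilterContext()
verdict = on_connect(ctx, nothing_left_to_do=True)  # resources released here
on_close(ctx)                                       # finds the flag set, does nothing
print(verdict, ctx.cleaned_up, ctx.cleanup_runs)
```

    In this model the flag makes the cleanup safe to call from any callback.
    It only frees the milter's own state early; it does not by itself change
    when Sendmail calls the close callback.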

    --
    Note: please read the netiquette before posting. I will almost never
    reply to top-postings which include a full copy of the previous
    article(s) at the end because it's annoying, shows that the poster
    is too lazy to trim his article, and it's wasting the time of all readers.

  • From G.W. Haywood@21:1/5 to Claus Assmann on Fri Jun 4 16:15:09 2021
    On Thu, 3 Jun 2021, Claus Assmann wrote:

    G.W. Haywood wrote:

    The large, fast milter may know that it has nothing left to do for any
    particular message, but since Sendmail doesn't call its close callback
    until after the second milter has finished its processing (even if the
    first milter returns 'SMFIS_ACCEPT' from the connect callback) it must
    stick around doing nothing but waste resources until the second milter
    has terminated.

    Why doesn't the "large" milter invoke its cleanup function as soon
    as it "know[s] that it has nothing left to do"? The cleanup function
    can set a flag in the milter context that it was called and did its
    work (release all the memory), so further calls can just check that.

    Thanks, but it's not quite as simple as that.

    Yes it could clean up as it goes along, but there's a *lot* of state
    information. Some of it is per message, some is per connection, some
    is per recipient, some per sender, some is persistent... In a milter
    with ten or more callbacks, each having not only its 'personal' set of
    state information but also with some overlaps between them, it quickly
    gets ugly, difficult to manage - and so, error-prone. New code would
    carry the risks of any code which would otherwise not be needed. I've
    spent over five years working on the large milter. It now does what I
    want, and it's more or less ready for others to try using it. They're
    queueing. Doing what you suggest would mean another significant chunk
    of work coding and testing before I'd feel comfortable releasing it (I
    don't have the energy to maintain two versions of it, so releasing one
    with 'faster' resource management and one without isn't an option).

    The 'close' and 'abort' callbacks are there for exactly this purpose.
    They exist anyway and have been exercised for years; it would be much
    easier, cleaner and more efficient if they could be used as I suggest.

    Is there no way, then, short of hacking the Sendmail source? As you
    know I'm not above trying my hand at that. I'd have thought it would
    be a good _FFR_ to have in Sendmail anyway. I'm probably not the best
    qualified around here to take it on, although I routinely run a couple
    of other patches (which would very likely make your toes curl).

    Ged.

  • From Claus Aßmann@21:1/5 to G.W. Haywood on Sat Jun 5 07:15:08 2021
    G.W. Haywood wrote:
    On Thu, 3 Jun 2021, Claus Assmann wrote:

    Why doesn't the "large" milter invoke its cleanup function as soon
    as it "know[s] that it has nothing left to do"? The cleanup function
    can set a flag in the milter context that it was called and did its
    work (release all the memory), so further calls can just check that.

    Thanks, but it's not quite as simple as that.

    Maybe someone else understands the problem because I don't.

    ! Ideally, I should like to be able to tell Sendmail to call the close
    ! callback for the first milter when the first milter knows it's done

    Why does it make a difference between sendmail "call[ing] the close
    callback" and your own milter doing that?

    ! The large, fast milter may know that it has nothing left to do for any
    ! particular message, but since Sendmail doesn't call its close callback
    ! until after the second milter has finished its processing (even if the
    ! first milter returns 'SMFIS_ACCEPT' from the connect callback)

    If your "milter returns 'SMFIS_ACCEPT' from the connect callback"
    then why would it even allocate resources which it doesn't need?
    If your milter returns SMFIS_ACCEPT from a transaction callback
    then sendmail has no way to know that your milter doesn't even want
    the subsequent transactions, i.e., you would have to come up with
    another return code.

    Anyway, you got the source, so you can hack it to do what you
    consider necessary for your special situation.

  • From G.W. Haywood@21:1/5 to Claus Assmann on Sat Jun 5 12:36:13 2021
    On Sat, 5 Jun 2021, Claus Assmann wrote:

    Maybe someone else understands the problem because I don't.

    I guess I haven't explained it very well. Let me try again.

    I have two milters. One is big (huge) but fast, the other is small
    but it can be very slow. Often, the small one holds up the big one so
    that the big one consumes a lot of resources while doing *nothing* but
    wait for the MTA to release it from duty. That can be a nuisance, it
    can hold up the flow of mail because there's no milter child process
    available for Sendmail to connect to when a new connection comes in.
    All the existing child processes are busy, some of them doing nothing.

    ! Ideally, I should like to be able to tell Sendmail to call the close
    ! callback for the first milter when the first milter knows it's done

    Why does it make a difference between sendmail "call[ing] the close
    callback" and your own milter doing that?

    Until I read your question above, I had the impression that callbacks
    are for the MTA to call. It never occurred to me to call them myself
    and I'd have expected doing that to break the communications between
    Sendmail and the milter. Does Sendmail not care if I call the close
    callback myself? If it's allowed by the milter protocol, then the
    problem might be solved. :) Granted I haven't read that it isn't, I
    guess I just ass-u-me-d it isn't. I'll experiment with the idea.

    If your "milter returns 'SMFIS_ACCEPT' from the connect callback"
    then why would it even allocate resources which it doesn't need?

    As I said, some of the state is persistent and it allocates quite a
    variety of resources, some of which it allocates at 'connect' in the
    expectation of using it both there and in later callbacks. When it
    deallocates, it has to choose what it frees (garbage collection) and
    when it frees it. On top of all this it's written in Perl so before
    it even allocates any storage the thing starts off pretty big anyway
    and there's not a lot I can do about that - short of rewriting parts
    of it in C, which is on the cards when some of my outlandish theory
    is either justified or proved not to be useful. It generally looks
    like I'm on the right track, at least other people seem to see much
    more spam and malicious stuff in their INBOX than I do.

    If your milter returns SMFIS_ACCEPT from a transaction callback
    then sendmail has no way to know that your milter doesn't even want
    the subsequent transactions, i.e., you would have to come up with
    another return code.

    Hmmm, that doesn't sound so good for my situation and it's not what
    I've taken home from the docs. The milter docs state as follows:

    [quote source=file:///.../sendmail-8.17.0.0/libmilter/docs/api.html]

    SMFIS_ACCEPT

    For a connection-oriented routine, accept this connection without
    further filter processing; ***CALL XXFI_CLOSE***. <=== my emphasis.

    For a message- or recipient-oriented routine, accept this message
    without further filtering.

    [/quote]

    There it says "call xxfi_close", which at first I took to mean that
    the MTA would call xxfi_close immediately. Unfortunately it doesn't
    call the callback until after completing all other milter processing
    which is why I'm in this, er, bind. My naive thought processes ran
    along the lines of

    "If the milter replies SMFIS_ACCEPT, then that must tell Sendmail that
    the only subsequent transaction will be close (or perhaps abort)".

    Is that wrong? If so, why?

    Anyway, you got the source, so you can hack it to do what you
    consider necessary for your special situation.

    Quite so. Thanks again for your help, and for your dedication in the
    face of all the stupid questions.

    Ged.

  • From Claus Aßmann@21:1/5 to G.W. Haywood on Sun Jun 6 19:33:58 2021
    G.W. Haywood wrote:
    On Sat, 5 Jun 2021, Claus Assmann wrote:

    Why does it make a difference between sendmail "call[ing] the close
    callback" and your own milter doing that?

    Until I read your question above, I had the impression that callbacks
    are for the MTA to call. It never occurred to me to call them myself

    In C they are functions which are called by libmilter,
    using a documented API and required return values.

    and I'd have expected doing that to break the communications between
    Sendmail and the milter. Does Sendmail not care if I call the close
    callback myself? If it's allowed by the milter protocol, then the

    In C the callbacks do not communicate with the MTA,
    obviously that's only handled by libmilter.

    I don't know anything about whatever "libmilter" perl uses,
    so you would have to check its docs.

  • From G.W. Haywood@21:1/5 to Claus Assmann on Tue Jun 8 16:00:21 2021
    On Sun, 6 Jun 2021, Claus Assmann wrote:

    G.W. Haywood wrote:
    On Sat, 5 Jun 2021, Claus Assmann wrote:

    Why does it make a difference between sendmail "call[ing] the close
    callback" and your own milter doing that?

    Until I read your question above, I had the impression that callbacks
    are for the MTA to call. It never occurred to me to call them myself

    In C they are functions which are called by libmilter,
    using a documented API and required return values.

    Sorry, I wasn't clear. I was treating libmilter as part of Sendmail
    itself, because libmilter is normally compiled together with Sendmail
    (at least that's how I've always done it for milters written in C).

    and I'd have expected doing that to break the communications between
    Sendmail and the milter. Does Sendmail not care if I call the close
    callback myself? If it's allowed by the milter protocol, then the

    In C the callbacks do not communicate with the MTA,
    obviously that's only handled by libmilter.

    Yes, I understand that of course. But the question stands, or maybe
    phrased in a different way: Does Sendmail not care if close callbacks
    are called *other* than because Sendmail has for example seen the end
    of the message plus the client's 'QUIT' command and told the milters
    (via libmilter) about that?

    I don't know anything about whatever "libmilter" perl uses,
    so you would have to check its docs.

    I'm the maintainer of the Perl equivalent of libmilter which I use. I
    took it over in a sadly unmaintained state a few years ago, fixed the
    many reported bugs and overhauled the docs. The interface is called
    Sendmail::PMilter. (For anyone else reading that's not to be confused
    with Sendmail::Milter, which is also unmaintained and seems abandoned
    by its author. I tried to take over maintainership of that too, but
    the author was most unhelpful.)

    Sendmail::PMilter does for Perl milters more or less what libmilter
    does for C milters. It closely mimics the libmilter communications.
    But now you have me asking if it makes a difference whether libmilter
    is built or not. That also hadn't occurred to me. Thanks! It looks
    like I have a bit more work to do.

    PS: Sorry I have to keep changing the Eszett to 'ss'. My news client
    (Alpine) chokes on that character and refuses to send the message.
