• Fastest way to inject a lot of mail?

    From John Levine@21:1/5 to All on Tue Mar 5 21:14:44 2024
    One of my clients has an application that builds several hundred customized messages reporting what changed in the past day, and sends each one to a
    list of people who have subscribed to it. (This isn't spam, they complain
    when they don't get it.)

    We currently send the mail by putting all the recipients on the bcc:
    line and running /usr/sbin/sendmail -t and feeding it the message
    through a pipe. By the time all the messages are done this takes a
    while. Is there a faster way to do it? SMTP to 127.0.0.1? LMTP?

    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Grant Taylor@21:1/5 to John Levine on Tue Mar 5 19:31:36 2024
    Hi John,

    On 3/5/24 15:14, John Levine wrote:
    One of my clients has an application that builds several hundred
    customized messages reporting what changed in the past day, and sends
    each one to a list of people who have subscribed to it. (This isn't
    spam, they complain when they don't get it.)

    ;-)

    We currently send the mail by putting all the recipients on the bcc:
    line and running /usr/sbin/sendmail -t and feeding it the message
    through a pipe. By the time all the messages are done this takes a
    while. Is there a faster way to do it? SMTP to 127.0.0.1? LMTP?

    Please clarify -- I'm trying to understand / confirm -- are you sending
    multiple envelope recipients per customized report? Or are you sending
    individual messages per recipient?

    Also, what delivery mode are you using? Queued / interactive? (That
    might not be the proper nomenclature.)



    --
    Grant. . . .

  • From Marco Moock@21:1/5 to All on Wed Mar 6 10:51:37 2024
    On 05.03.2024 um 21:14 Uhr John Levine wrote:

    We currently send the mail by putting all the recipients on the bcc:
    line and running /usr/sbin/sendmail -t and feeding it the message
    through a pipe. By the time all the messages are done this takes a
    while. Is there a faster way to do it? SMTP to 127.0.0.1? LMTP?

    Can you find the reason for that?
    You can check the logs when the mail from the MSP reached the MTA.
    Are milters in use?

    --
    kind regards
    Marco

    Send spam to 1709669684muell@cartoonies.org

  • From John Levine@21:1/5 to mm+usenet-es@dorfdsl.de on Wed Mar 6 20:59:04 2024
    It appears that Marco Moock <mm+usenet-es@dorfdsl.de> said:
    On 05.03.2024 um 21:14 Uhr John Levine wrote:

    We currently send the mail by putting all the recipients on the bcc:
    line and running /usr/sbin/sendmail -t and feeding it the message
    through a pipe. By the time all the messages are done this takes a
    while. Is there a faster way to do it? SMTP to 127.0.0.1? LMTP?

    Can you find the reason for that?

    Yeah, because it's doing a lot of work sending tens of thousands of
    messages. There are no milters on the system where they're injected.
    They go through a smarthost with a DKIM signing milter but it seems
    plenty fast.

    In answer to another question, the number of recipients per message
    varies but is typically between 10 and 50. At some point we should
    redo it to do individual deliveries so we can customize them more
    but not any time soon.



    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

  • From Grant Taylor@21:1/5 to John Levine on Wed Mar 6 19:30:17 2024
    On 3/5/24 15:14, John Levine wrote:
    By the time all the messages are done this takes a while.

    Please quantify "all the messages" and "takes a while".

    How many messages (SMTP envelopes)?

    How long does it take (seconds / minutes / hours / days)?



    --
    Grant. . . .

  • From Claus Aßmann@21:1/5 to John Levine on Thu Mar 7 01:06:35 2024
    John Levine wrote:

    Is there a faster way to do it? SMTP to 127.0.0.1? LMTP?

    First you need to identify the bottleneck(s),
    then you can work on solutions.

    BTW: did you read the fine documentation?
    (hint: "TUNING"...)

    --
    Note: please read the netiquette before posting. I will almost never
    reply to top-postings which include a full copy of the previous
    article(s) at the end because it's annoying, shows that the poster
    is too lazy to trim his article, and it's wasting the time of all readers.

  • From HQuest@21:1/5 to All on Fri Mar 8 00:59:38 2024
    Aside from the "TUNING" reading recommended by Claus, your workflow as I read it seems inefficient. You prep the mail, fire up a sendmail MSA instance (just to wrap the message with proper mail headers), hand it to another sendmail MTA or whatever (to add DKIM headers),
    and move it forward... If you already have a trusted MTA elsewhere, why don't you just deliver the message right into that MTA from the application itself?

    If you are lazy (as I am), a rudimentary, poorly written, very insecure and extremely lazy bash script can do the job:
    echo "${mail_message_complete_with_envelope_headers_and_ehlo}" > "/dev/tcp/$smtpsrv/$smtpport"

    Assuming you trust it enough to cut the authentication and STARTTLS pieces to save precious CPU cycles. The DKIM header will still be added by the MTA, but by not spinning up multiple MSAs, each loading its dynamic libraries, you save quite some time
    getting messages moving.

    Or if you really want to use sendmail as MSA because authentication/TLS/reasons, keep one running, and deliver the email message to it via TCP/IP. A fork() is always much faster than a full load.

  • From Claus Aßmann@21:1/5 to HQuest on Fri Mar 8 00:36:21 2024
    HQuest wrote:

    echo "${mail_message_complete_with_envelope_headers_and_ehlo}" > "/dev/tcp/$smtpsrv/$smtpport"

    That might fail due to "unauthorized PIPELINING".
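
The concern is that a blind `echo` pushes every SMTP command at once without waiting for the server's replies. A minimal sketch of the lock-step alternative follows, using Python's standard-library `smtplib` (a choice made for this illustration, not something used in the thread). Each reply code is checked before the next command is sent, so a 421 or a rejected recipient is noticed rather than ignored. The function name `submit_lockstep` is hypothetical.

```python
import smtplib


def submit_lockstep(smtp, sender, recipients, message):
    """Run one SMTP transaction, checking every reply before continuing.

    `smtp` is an already-connected smtplib.SMTP object. Returns the list
    of recipients the server accepted (empty if none were accepted).
    Raises smtplib.SMTPResponseException if MAIL FROM or the message
    body is rejected.
    """
    smtp.ehlo_or_helo_if_needed()  # greet once per session, not per message

    code, reply = smtp.mail(sender)
    if code != 250:
        smtp.rset()  # abandon the transaction, keep the session usable
        raise smtplib.SMTPResponseException(code, reply)

    accepted = []
    for rcpt in recipients:
        code, reply = smtp.rcpt(rcpt)
        if code in (250, 251):
            accepted.append(rcpt)

    if not accepted:
        smtp.rset()  # nobody to deliver to, so do not send DATA
        return accepted

    # data() sends DATA, waits for the 354 reply, then sends the message.
    code, reply = smtp.data(message)
    if code != 250:
        raise smtplib.SMTPResponseException(code, reply)
    return accepted
```

It would be called with a connection such as `smtplib.SMTP("127.0.0.1", 25)`; the host and port are assumptions about the local setup.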

  • From HQuest@21:1/5 to All on Fri Mar 8 12:48:58 2024
    Claus Aßmann wrote:
    That might fail due to "unauthorized PIPELINING".

    My cron scripts and/or sendmail.cf would beg to differ, but my point is that it is much superior to deliver a message to a running MSA/MTA than spinning up a new copy of sendmail for every message to be delivered.

  • From Claus Aßmann@21:1/5 to HQuest on Fri Mar 8 15:04:43 2024
    HQuest wrote:
    Claus Aßmann wrote:
    That might fail due to "unauthorized PIPELINING".

    My cron scripts and/or sendmail.cf would beg to differ, but my point is

    Do you run 8.18? Did you disable the extra checks?
    What happens if your MTA is "too busy" (421) or replies with some
    error to one of the commands?

    that it is much superior to deliver a message to a running MSA/MTA than spinning up a new copy of sendmail for every message to be delivered.

    That might be the case, but AFAICT it does not apply to the way the
    OP is submitting mails (a single mail with a large list of recipients).

    --
    Note: please read the netiquette before posting. I will almost never
    reply to top-postings which include a full copy of the previous
    article(s) at the end because it's annoying, shows that the poster
    is too lazy to trim his article, and it's wasting the time of all readers.

  • From John Levine@21:1/5 to All on Sat Mar 9 00:25:28 2024
    According to Claus Aßmann <INVALID_NO_CC_REMOVE_IF_YOU_DO_NOT_POST_ml+sendmail(-no-copies-please)@esmtp.org>:
    John Levine wrote:

    Is there a faster way to do it? SMTP to 127.0.0.1? LMTP?

    First you need to identify the bottleneck(s),
    then you can work on solutions.

    Well, yeah, that's why I was wondering whether running the sendmail program
    is likely to be slow.

    BTW: did you read the fine documentation?
    (hint: "TUNING"...)

    I did and unless I missed something, it says nothing about injecting
    mail via the sendmail command other than the obvious thing that you
    want to queue rather than delivering synchronously.

    So here's a question: I have on the order of 10,000 messages, each
    with a dozen or so recipients. It's currently running the sendmail
    command for each one. If I opened a connection to 127.0.0.1 and
    did a sequence of MAIL FROM/RCPT TO/DATA, would that be faster? How
    about if I did it with N processes in parallel for some modest N? It
    currently takes about 6 hours on a moderately fast VPS.

    If nobody has any idea, OK, but it's hard to believe I'm the first person
    ever to wonder about this.

    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

  • From Grant Taylor@21:1/5 to John Levine on Fri Mar 8 19:39:50 2024
    On 3/8/24 18:25, John Levine wrote:
    So here's a question: I have on the order of 10,000 messages, each
    with a dozen or so recipients.

    That's quite a few discrete messages.

    It currently takes about 6 hours on a moderately fast VPS.

    Rough math, that's a little over 2 seconds per message.
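
The rough figure can be checked from the numbers given in the thread (about 6 hours, on the order of 10,000 messages). Both inputs are approximations, so the result is only a ballpark:

```python
total_seconds = 6 * 3600   # "about 6 hours"
messages = 10_000          # "on the order of 10,000 messages"

per_message = total_seconds / messages
print(f"{per_message:.2f} s per message")  # prints "2.16 s per message"
```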

    On one hand that seems a little slow, but on the other hand, maybe not.

    How big are the messages? There's a big difference if it's a few kB of
    text vs multiple MB of attachments.

    Depending on the VPS and the disk(s) backing it, I could see how this
    may be a disk I/O performance issue. This seems especially germane on a
    VPS which is likely shared and may have disk I/O throttling.

    I'd suggest looking at this from an OS performance perspective.

    If it's Linux, `iostat -x 1` or `sar` or `nmon` are good candidates.

    I don't remember, are there any milters in Sendmail?

    What are you using for the DNS server? Is it local to the system or are
    you dependent on something across the network. If it's across the
    network, how far across the network is it?

    Are there any errors in any logs?

    I would naively think that Sendmail itself could handle messages quite a
    bit faster. But I'm probably thinking about SMTP interface vs command
    line forking.



    --
    Grant. . . .

  • From Grant Taylor@21:1/5 to John Levine on Fri Mar 8 21:32:01 2024
    On 3/8/24 21:19, John Levine wrote:
    Not large, plain text, maybe 10K.

    ACK

    I'll have to check but I believe there's a local cache on the LAN.
    It's sending it all to a smarthost so I wouldn't expect a lot of
    DNS traffic.

    I'm inclined to agree with you. But I don't know what type of DNS
    queries Sendmail might be doing. I think that a sniffer would answer
    that in short order.

    No, it all works, just not terribly fast.

    ACK

    Have you considered wrapping your call to the sendmail binary in time
    and seeing how long things are taking?

    Is there a chance that the vast majority are very fast and after some
    threshold something slows down considerably for a period of time?

    Dare I say it, this is where more data tends to help.
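
One way to gather that data, as a minimal sketch: time each injection from the calling application and compare the median with the maximum, which shows whether most runs are fast and a few are very slow. The `/usr/sbin/sendmail -t` command comes from the original post; the function names `timed_inject` and `summarize` are hypothetical, and Python is used here only for illustration.

```python
import statistics
import subprocess
import time


def timed_inject(message, command=("/usr/sbin/sendmail", "-t")):
    """Pipe one message (bytes) into `command` and return elapsed seconds."""
    start = time.perf_counter()
    subprocess.run(list(command), input=message, check=True)
    return time.perf_counter() - start


def summarize(durations):
    """Return (median, maximum) so a few very slow injections stand out."""
    return statistics.median(durations), max(durations)
```

Collecting `timed_inject(msg)` for every message into a list and passing it to `summarize` gives two numbers to compare. A maximum far above the median would point at intermittent stalls rather than a uniformly slow per-message cost.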

    Right. Hey, here's a question: if I injected the mail via SMTP to
    127.0.0.1, would that be faster than forking and running sendmail?
    Slower? Or am I the first person in sendmail's 35 year history to
    ask this question?

    I would be flabbergasted if you are the first to ask this. I suspect
    many -> most that have are not paying attention to this newsgroup to answer.



    --
    Grant. . . .

  • From John Levine@21:1/5 to All on Sat Mar 9 03:19:52 2024
    According to Grant Taylor <gtaylor@tnetconsulting.net>:
    On 3/8/24 18:25, John Levine wrote:
    So here's a question: I have on the order of 10,000 messages, each
    with a dozen or so recipients.

    That's quite a few discrete messages.

    It currently takes about 6 hours on a moderately fast VPS.

    Rough math, that's a little over 2 seconds per message.

    On one hand that seems a little slow, but on the other hand, maybe not.

    How big are the messages? There's a big difference if it's a few kB of
    text vs multiple MB of attachments.

    Not large, plain text, maybe 10K.

    If it's Linux, `iostat -x 1` or `sar` or `nmon` are good candidates.

    I don't remember, are there any milters in Sendmail?

    Not on this machine.

    What are you using for the DNS server? Is it local to the system or are
    you dependent on something across the network. If it's across the
    network, how far across the network is it?

    I'll have to check but I believe there's a local cache on the LAN. It's sending it all to a smarthost so I wouldn't expect a lot of DNS traffic.

    Are there any errors in any logs?

    No, it all works, just not terribly fast.

    I would naively think that Sendmail itself could handle messages quite a
    bit faster. But I'm probably thinking about SMTP interface vs command
    line forking.

    Right. Hey, here's a question: if I injected the mail via SMTP to
    127.0.0.1, would that be faster than forking and running sendmail?
    Slower? Or am I the first person in sendmail's 35 year history to ask
    this question?

    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

  • From Claus Aßmann@21:1/5 to John Levine on Sat Mar 9 01:22:03 2024
    John Levine wrote:
    According to Grant Taylor <gtaylor@tnetconsulting.net>:

    What are you using for the DNS server? Is it local to the system or are

    I'll have to check but I believe there's a local cache on the LAN. It's sending it all to a smarthost so I wouldn't expect a lot of DNS traffic.

    Seems like a wrong expectation - or did you turn off DNS lookups?
    It's explained in the fine documentation mentioned earlier:
    * DNS Lookups

    If it's one mail with lots of addresses: all of this is done
    sequentially in one process - so hopefully all the data is in a
    local cache.


    --
    Note: please read the netiquette before posting. I will almost never
    reply to top-postings which include a full copy of the previous
    article(s) at the end because it's annoying, shows that the poster
    is too lazy to trim his article, and it's wasting the time of all readers.

  • From Andrzej Adam Filip@21:1/5 to John Levine on Sat Mar 9 08:29:22 2024
    John Levine <johnl@taugh.com> wrote:
    According to Claus Aßmann <INVALID_NO_CC_REMOVE_IF_YOU_DO_NOT_POST_ml+sendmail(-no-copies-please)@esmtp.org>:
    John Levine wrote:

    Is there a faster way to do it? SMTP to 127.0.0.1? LMTP?

    First you need to identify the bottleneck(s),
    then you can work on solutions.

    Well, yeah, that's why I was wondering whether running the sendmail program is likely to be slow.

    BTW: did you read the fine documentation?
    (hint: "TUNING"...)

    I did and unless I missed something, it says nothing about injecting
    mail via the sendmail command other than the obvious thing that you
    want to queue rather than delivering synchronously.

    So here's a question: I have on the order of 10,000 messages, each
    with a dozen or so recipients. It's currently running the sendmail
    command for each one. If I opened a connection to 127.0.0.1 and
    did a sequence of MAIL FROM/RCPT TO/DATA, would that be faster? How
    about if I did it with N processes in parallel for some modest N? It currently takes about 6 hours on a moderately fast VPS.

    If nobody has any idea, OK, but it's hard to believe I'm the first person ever to wonder about this.

    Hints about injecting a few thousand emails into a local sendmail:
    1. Inject multiple messages via the same SMTP connection
       to avoid needless forking of local sendmail processes.
    2. Group recipients by recipient domain
       to reduce the number of outgoing SMTP sessions.
    3. Use parallel SMTP injections with the VERB SMTP command
       to avoid merely queuing due to high local system load.
       VERB turns on "sequential delivery" with delivery progress reporting,
       so your parallel submissions won't merely put messages in the queue for
       later delivery. It may require allowing VERB from 127.0.0.1 *ONLY*.
       Start with a few SMTP sessions. Consider slowly increasing to a few
       dozen.
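
The first two hints and the parallel part of the third can be sketched as follows, using Python's standard-library `smtplib` and `concurrent.futures` (an illustrative choice, not something from the thread). The function names, the `connect` callable, and the default of 4 workers are all assumptions. The sketch does not attempt the VERB step, whose behaviour depends on the local sendmail configuration.

```python
import smtplib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_domain(recipients):
    """Hint 2: bucket recipient addresses by the domain after the last '@'."""
    groups = defaultdict(list)
    for addr in recipients:
        groups[addr.rsplit("@", 1)[-1].lower()].append(addr)
    return dict(groups)


def inject_batch(batch, connect):
    """Hint 1: send every (sender, recipients, message) tuple in `batch`
    over a single SMTP session instead of starting one process per message."""
    smtp = connect()
    try:
        for sender, recipients, message in batch:
            smtp.sendmail(sender, recipients, message)
    finally:
        smtp.quit()
    return len(batch)


def inject_parallel(jobs, connect, workers=4):
    """Hint 3, parallel part only: spread `jobs` over a few SMTP sessions."""
    batches = [jobs[i::workers] for i in range(workers)]
    batches = [b for b in batches if b]  # drop empty slices
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda b: inject_batch(b, connect), batches))
```

A caller might pass `connect=lambda: smtplib.SMTP("127.0.0.1", 25)`, where the host and port are assumptions about the local setup. Starting with a small `workers` value and raising it gradually follows the advice above.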

    AFAIR Sympa mail list manager provides some useful hints.
    Queueing fast and sending out fast are not the same thing in mass mailing.

    Anyway: expect a few surprises from the anti-spam measures of receiving
    servers, including fresh ones. Advice in a short Usenet message is
    necessarily incomplete, so do expect a few nasty surprises.

    About the lack of recipes: people tend to avoid publishing recipes that
    are too easy, because (incompetent) spammers could use them too.

    --
    [Andrew] Andrzej A. Filip

  • From Claus Aßmann@21:1/5 to John Levine on Sat Mar 9 03:38:01 2024
    John Levine wrote:

    (hint: "TUNING"...)

    I did and unless I missed something, it says nothing about injecting

    You are looking for an answer to one specific question - but maybe
    your question does not address the actual problem?

    As others have told you: if you don't know what's "slow", you won't
    be able to solve the problem -- except maybe by trying different things.

    So you could just try your alternative (one SMTP session, multiple transactions) to see what happens.

    PS: someone once complained that some MTA was slow sending mail
    until I asked them about the actual data... which showed their
    (outgoing) internet bandwidth was completely used by the MTA (because
    they sent one RCPT per TA with "lots" of RCPTs and large mails).
    That is, unless you know which bottleneck is actually "hit"
    it's hard to tell what to do differently...

    --
    Note: please read the netiquette before posting. I will almost never
    reply to top-postings which include a full copy of the previous
    article(s) at the end because it's annoying, shows that the poster
    is too lazy to trim his article, and it's wasting the time of all readers.

  • From John Levine@21:1/5 to All on Sat Mar 9 19:38:25 2024
    According to Claus Aßmann <INVALID_NO_CC_REMOVE_IF_YOU_DO_NOT_POST_ml+sendmail(-no-copies-please)@esmtp.org>:
    I'll have to check but I believe there's a local cache on the LAN. It's
    sending it all to a smarthost so I wouldn't expect a lot of DNS traffic.

    Seems like a wrong expectation - or did you turn off DNS lookups?
    It's explained in the fine documentation mentioned earlier:
    * DNS Lookups

    Ah, it's hiding in the TUNING file. I suppose I can turn off the
    canonify stuff. The DNS caches are on the same LAN with ping times
    under a millisecond so cache location is not likely to be a problem,
    but the addresses in the list should all be real ones.

    Does sendmail really replace CNAMEs in recipient host names? That's
    been deprecated for 25 years.

    I must say it's pretty impressive that sendmail's internal structure is so opaque that nobody has any idea whether running the sendmail program is
    likely to be faster or slower than TCP submission.

    --
    Regards,
    John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly
