• Request for review of performance advice

    From Victoria Risk@21:1/5 to bind-users on Tue Jul 7 18:57:18 2020
    A while ago we created a KB article with tips on how to improve your performance with our Kea DHCP server. The tips were fairly obvious to our developers and this was pretty successful. We would like to do something similar for BIND and provide a dozen or
    so tips for how to maximize your throughput with BIND. However, as usual, everything is more complicated with BIND.

    Can those of you who care about performance, who have worked to improve your performance, share some of your suggestions that have the most impact? Please also comment if you think any of the ideas below are stupid or dangerous. I have combined advice
    for resolvers and for authoritative servers; I hope it is clear which is which...

    The ideas we have fall into four general categories:

    System design
    1a) Use a load balancer to specialize your resolvers and maximize your cache hit ratio. A load balancer is traditionally designed to spread the traffic out evenly among a pool of servers, but it can also be used to concentrate related queries on one
    server to make its cache as hot as possible. For example, if all queries for domains in .info are sent to one server in a pool, there is a better chance that an answer will be in the cache there.

    1b) If you have a large authoritative system with many servers, consider dedicating some machines to propagate transfers. These machines, called transfer servers, would not answer client queries, but just send notifies and process IXFR requests.

    1c) Deploy ghost secondaries. If you store copies of authoritative zones on resolvers (resolvers as undelegated secondaries), you can avoid querying those authoritative zones. The most obvious uses of this would be mirroring the root zone locally or
    mirroring your own authoritative zones on your resolver.
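
    A minimal named.conf sketch of 1c, assuming BIND 9.14 or later (where the mirror zone type is available) and DNSSEC validation enabled on the resolver. The zone name example.com, the file path, and the address 192.0.2.1 are placeholders, not part of the advice above:

    ```
    // Mirror the root zone locally. With no primaries listed, named
    // falls back to its built-in list of root zone transfer sources.
    zone "." {
        type mirror;
    };

    // Ghost (undelegated) secondary for one of your own zones.
    // example.com and 192.0.2.1 are hypothetical placeholders.
    zone "example.com" {
        type secondary;              // spelled "slave" in older releases
        primaries { 192.0.2.1; };    // spelled "masters" in older releases
        file "secondary/example.com.db";
    };
    ```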

    We have other system design ideas that we suspect would help, but we are not sure, so I will wait to see if anyone suggests them.

    OS settings and the system environment
    2a) Run on bare metal if possible, not on virtual machines or in the cloud. (Any idea how much difference this makes? The only reference we can cite is pretty out of date: https://indico.dns-oarc.net/event/19/contributions/234/attachments/217/411/DNS_perf_OARC_Apr_14.pdf)

    2b) Consider building with --with-tuning=large (https://kb.isc.org/docs/aa-01314). This is a compile-time option, so not something you can switch on and off during production.

    2c) Consider which R/W lock choice you want to use (https://kb.isc.org/docs/choosing-a-read-write-lock-implementation-to-use-with-named). For the highest tested query
    rates (> 100,000 queries per second), pthreads read-write locks with hyper-threading enabled seem to be the best-performing choice by far.

    2d) Pay attention to your choice of NICs. We have found wide variations in their performance. (Can anyone suggest what specifically to look for?)

    2e) Make sure your socket send buffers are big enough. (not sure if this is obsolete advice, do we need to tell people how to tell if their buffers are causing delays?)

    2f) When the number of CPUs is very large (32 or more), the increase in UDP listeners may not provide any performance improvement and might actually reduce throughput slightly due to the overhead of the additional structures and tasks. We suggest trying
    different values of -U to find the optimal one for your production environment.


    named Features
    3a) Minimize logging. Query logging is expensive (can cost you 20% or more of your throughput) so don’t do it unless you are using the logs for something. Logging with dnstap is lower impact, but still fairly expensive. Don’t run in debug mode
    unless necessary.

    3b) Use the named.conf option minimal-responses yes; to reduce the amount of work that named needs to do to assemble the query response, as well as the amount of outbound traffic.

    3c) Disable synth-from-dnssec. While this seemed like a good idea, it turns out that in practice it does not improve performance.

    3d) Tune your zone transfers (https://kb.isc.org/docs/aa-00726).
    When tuning the behavior of the primary, there are several factors that you can control:

    - The rate of notifications of changes to secondary servers (serial-query-rate and notify-delay)

    - Limits on concurrent zone transfers (transfers-out, tcp-clients, tcp-listen-queue, reserved-sockets)

    - Efficiency/management options (max-transfer-time-out, max-transfer-idle-out, transfer-format)

    The most important options to focus on are transfers-out, serial-query-rate, tcp-clients and tcp-listen-queue.

    4e) If you use RPZ, consider using qname-wait-recurse. We have had issues with RPZ transfers impacting query performance in resolvers. In general, several smaller RPZ zones will transfer faster than a few very large RPZ zones.

    4f) Consider enabling prefetch on your resolver, unless you are running 9.10 (which is EOL): https://kb.isc.org/docs/aa-01122
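
    The named settings in this section could be collected into a fragment like the one below. This is only a sketch: the numeric values are illustrative placeholders to show the syntax, not tested recommendations, and rpz.example is a hypothetical policy zone that would need its own zone statement. Setting qname-wait-recurse to no is an assumption about what 4e intends.

    ```
    options {
        querylog no;              // 3a: leave query logging off unless the logs are used
        minimal-responses yes;    // 3b
        synth-from-dnssec no;     // 3c

        // 3d: zone transfer tuning on the primary (placeholder values)
        transfers-out 50;
        serial-query-rate 100;
        tcp-clients 500;
        tcp-listen-queue 32;

        // 4e: apply QNAME policy without waiting for recursion to finish
        // (rpz.example is a hypothetical zone name)
        response-policy { zone "rpz.example"; } qname-wait-recurse no;

        // 4f: prefetch <trigger> <eligibility>, both in seconds
        prefetch 2 9;
    };
    ```

    Running named-checkconf on the edited file before reloading will catch syntax errors and options that your BIND version does not support.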

    Fix your transport network.
    Transport network issues cause BIND to keep retrying, which is a performance drain.
    4a) Disable (in some cases, completely remove in order to prevent ongoing interference) outbound firewalls/packet-filters, particularly those that maintain state on connections. These are a frequent cause of problems in the DNS that can cause your DNS server
    to do a lot of extra work.

    4b) Set an appropriate MTU for your network. Ensure that your network infrastructure supports EDNS and large UDP responses up to 4096 bytes. Ensure that your network infrastructure allows transit for and reassembly of fragmented UDP packets (these will be
    large query responses if you are DNSSEC signing).

    4c) Ensure that your network infrastructure allows DNS over TCP.

    4d) Check for and eliminate any incomplete IPv6 interface set-up. (What can go wrong here is that BIND thinks that it can use IPv6 authoritative servers, but the sends actually fail silently, leaving named waiting unnecessarily for responses.)

    Any further suggestions, corrections or warnings are very welcome.

    Thank you!
    Vicky

    ---------

    Victoria Risk
    Product Manager
    Internet Systems Consortium
    vicky@isc.org







    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Browne, Stuart@21:1/5 to Victoria Risk on Wed Jul 8 02:41:39 2020
    To: bind-users@lists.isc.org (bind-users)

    Just one quick one before I run off to lunch with regards to section 2:

    - Try to avoid crossing NUMA boundaries. At high throughput, the context switching and far memory calls kills performance.

    Stuart

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Thurston@21:1/5 to Victoria Risk on Wed Jul 8 08:39:00 2020
    On 7/7/2020 5:57 PM, Victoria Risk wrote:
    A while ago we created a KB article with tips on how to improve your
    performance with our Kea dhcp server. The tips were fairly obvious to
    our developers and this was pretty successful. We would like to do
    something similar for BIND, provide a dozen or so tips for how to
    maximize your throughput with BIND. However, as usual, everything is
    more complicated with BIND.

    This is an excellent idea.

    If it comes to fruition, I ask there be some guidance offered on when
    such optimizations are useful. I've seen places where such a guide-sheet
    is followed when the guidelines were suitable for a business with 10X or
    100X the traffic the customer sees.

    That is, a configuration which benefits an organization seeing 100,000
    qps may be excessively complex (or brittle) for one seeing 100 qps.

    --
    Do things because you should, not just because you can.

    John Thurston 907-465-8591
    John.Thurston@alaska.gov
    Department of Administration
    State of Alaska



    Can those of you who care about performance, who have worked to improve
    your performance, share some of your suggestions that have the most
    impact?  Please also comment if you think any of these ideas below are
    stupid or dangerous. I have combined advice for resolvers and for
    authoritative servers, I hope it is clear which is which...

    The ideas we have fall into four general categories:

    System design
    1a) Use a load balancer to specialize your resolvers and maximize your
    cache hit ratio.  A load balancer is traditionally designed to spread
    the traffic out evenly among a pool of servers, but it can also be used
    to concentrate related queries on one server to make its cache as hot
    as possible. For example, if all queries for domains in .info are sent
    to one server in a pool, there is a better chance that an answer will
    be in the cache there.
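
    As a rough illustration of the idea in 1a, the sketch below maps each
    query name to a resolver by hashing its top-level domain, so that all
    .info queries land on the same pool member. The pool addresses and the
    function name pick_resolver are hypothetical, and a real load balancer
    would express this in its own configuration rather than in Python.

```python
import hashlib

# Hypothetical resolver pool (documentation addresses); use your own.
RESOLVER_POOL = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]


def pick_resolver(qname: str, pool=RESOLVER_POOL) -> str:
    """Pick a resolver based on the top-level domain of qname.

    Every name under the same TLD maps to the same pool member, which
    concentrates related queries on one cache.
    """
    labels = qname.rstrip(".").lower().split(".")
    tld = labels[-1] if labels[-1] else "."  # a root query maps to "."
    digest = hashlib.sha256(tld.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(pool)
    return pool[index]


# Two different names under .info are sent to the same server.
print(pick_resolver("example.info") == pick_resolver("www.other.info."))
```

    Hashing on the TLD alone will skew load toward whichever server gets
    the busiest TLD, so this is only a starting point for experiments.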

    1b) If you have a large authoritative system with many servers, consider
    dedicating some machines to propagate transfers. These machines, called
    transfer servers, would not answer client queries, but just send
    notifies and process IXFR requests.

    1c) Deploy ghost secondaries.  If you store copies of authoritative
    zones on resolvers (resolvers as undelegated secondaries), you can avoid
    querying those authoritative zones. The most obvious uses of this would
    be mirroring the root zone locally or mirroring your own authoritative
    zones on your resolver.

    We have other system design ideas that we suspect would help, but we are
    not sure, so I will wait to see if anyone suggests them.

    OS settings and the system environment
    2a) Run on bare metal if possible, not on virtual machines or in the
    cloud. (Any idea how much difference this makes? The only reference we
    can cite is pretty out of date:
    https://indico.dns-oarc.net/event/19/contributions/234/attachments/217/411/DNS_perf_OARC_Apr_14.pdf)

    2b) Consider building with --with-tuning=large
    (https://kb.isc.org/docs/aa-01314). This is a compile-time option, so
    not something you can switch on and off during production.

    2c) Consider which R/W lock implementation you want to use:
    https://kb.isc.org/docs/choosing-a-read-write-lock-implementation-to-use-with-named
    For the highest tested query rates (> 100,000 queries per second),
    pthreads read-write locks with hyper-threading /enabled/ seem to be the
    best-performing choice by far.

    2d) Pay attention to your choice of NIC cards. We have found wide
    variations in their performance. (Can anyone suggest what specifically
    to look for?)

    2e) Make sure your socket send buffers are big enough. (not sure if this
    is obsolete advice, do we need to tell people how to tell if their
    buffers are causing delays?)

    2f) When the number of CPUs is very large (32 or more), the increase in
    UDP listeners may not provide any performance improvement and might
    actually reduce throughput slightly due to the overhead of the
    additional structures and tasks. We suggest trying different values of
    -U to find the optimal one for your production environment.


    named Features
    3a) Minimize logging. Query logging is expensive (can cost you 20% or
    more of your throughput) so don’t do it unless you are using the logs
    for something. Logging with dnstap is lower impact, but still fairly
    expensive. Don’t run in debug mode unless necessary.

    3b) Use the named.conf option minimal-responses yes; to reduce the
    amount of work that named needs to do to assemble the query response,
    as well as the amount of outbound traffic.

    3c) Disable synth-from-dnssec. While this seemed like a good idea, it
    turns out that in practice it does not improve performance.

    3d) Tune your zone transfers. (https://kb.isc.org/docs/aa-00726)

    When tuning the behavior of the primary, there are several factors that
    you can control:

    - The rate of notifications of changes to secondary servers (serial-query-rate and notify-delay)

    - Limits on concurrent zone transfers (transfers-out, tcp-clients, tcp-listen-queue, reserved-sockets)

    - Efficiency/management options (max-transfer-time-out, max-transfer-idle-out, transfer-format)

    The most important options to focus on are transfers-out,
    serial-query-rate, tcp-clients and tcp-listen-queue.

    4e) If you use RPZ, consider using qname-wait-recurse. We have had
    issues with RPZ transfers impacting query performance in resolvers. In
    general, many smaller RPZ zones will transfer faster than a few very
    large RPZ zones.

    4f) Consider enabling prefetch on your resolver, unless you are running
    9.10 (which is EOL): https://kb.isc.org/docs/aa-01122

    Fix your transport network.
    Transport network issues cause BIND to keep retrying, which is a
    performance drain.
    4a) Disable (in some cases, completely remove in order to prevent
    ongoing interference) outbound firewalls/packet-filters (particularly
    those that maintain state on connections). These are a frequent cause
    of problems in the DNS that can cause your DNS server to do a lot of
    extra work.

    4b) Set an appropriate MTU for your network. Ensure that your network
    infrastructure supports EDNS and large UDP responses up to 4096 bytes.
    Ensure that your network infrastructure allows transit for and
    reassembly of fragmented UDP packets (these will be large query
    responses if you are DNSSEC signing).

    4c) Ensure that your network infrastructure allows DNS over TCP.

    4d) Check for, and eliminate, any incomplete IPv6 interface set-up (what
    can go wrong here is that BIND thinks that it can use IPv6 authoritative
    servers, but actually the sends silently fail, leaving named waiting
    unnecessarily for responses).
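
    One cheap first check for 4d is to ask the kernel whether it has any
    route for an IPv6 destination. The sketch below does that with a UDP
    "connect", which sends no packets. The function name and the default
    target (a documentation-prefix address) are hypothetical. It only
    catches a host with no usable IPv6 route at all; it cannot detect
    sends that are silently dropped further along the path.

```python
import socket


def ipv6_looks_usable(target: str = "2001:db8::1", port: int = 53) -> bool:
    """Return True if the kernel accepts a UDP 'connect' to an IPv6 target.

    Connecting a UDP socket sends nothing on the wire; it only makes the
    kernel choose a route and source address, so an error here means this
    host has no usable IPv6 path to the target.
    """
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as sock:
            sock.connect((target, port))
        return True
    except OSError:
        return False


print("IPv6 route available:", ipv6_looks_usable())
```

    If this reports a route but IPv6 queries still time out, the next step
    would be a real query to an IPv6-only authoritative server.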

    Any further suggestions, corrections or warnings are very welcome.

    Thank you!
    Vicky

    ---------

    Victoria Risk
    Product Manager
    Internet Systems Consortium
    vicky@isc.org

    _______________________________________________
    Please visit https://lists.isc.org/mailman/listinfo/bind-users to
    unsubscribe from this list

    ISC funds the development of this software with paid support
    subscriptions. Contact us at https://www.isc.org/contact/ for more
    information.

    bind-users mailing list
    bind-users@lists.isc.org
    https://lists.isc.org/mailman/listinfo/bind-users

  • From Chuck Aurora@21:1/5 to Victoria Risk on Wed Jul 8 13:06:05 2020
    On 2020-07-07 20:57, Victoria Risk wrote:
    A while ago we created a KB article with tips on how to improve your
    performance with our Kea dhcp server. The tips were fairly obvious to
    our developers and this was pretty successful. We would like to do
    something similar for BIND, provide a dozen or so tips for how to
    maximize your throughput with BIND. However, as usual, everything is
    more complicated with BIND.
    [big snip]
    Any further suggestions, corrections or warnings are very welcome.

    Vicky, I'd suggest separating these performance tips into two separate articles: authoritative and recursive. Lumping both together is going
    to create more confusion.

  • From Havard Eidnes@21:1/5 to All on Thu Jul 9 22:25:05 2020
    Copy: bind-users@lists.isc.org

    OS settings and the system environment
    ...
    2e) Make sure your socket send buffers are big enough. (not
    sure if this is obsolete advice, do we need to tell people how
    to tell if their buffers are causing delays?)

    2e#1) Make sure your UDP socket *receive* buffers are big enough.
    If on BSD, monitor for "dropped due to full socket buffers"
    count in "netstat -s" output, and tune accordingly. Note that
    this may be a symptom of mis-tuning of other parts of BIND,
    causing excessive CPU usage, which may contribute to this
    problem.
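
    A minimal sketch of the monitoring step described above: it scans the
    text of "netstat -s" for the "dropped due to full socket buffers"
    counter. The function name and the sample text are hypothetical, and
    the exact wording of the counter line differs between systems, so
    treat the pattern as an assumption to check against your own output.

```python
import re

# Assumed BSD-style wording of the counter line; verify on your system.
DROP_PATTERN = re.compile(
    r"^\s*(\d+)\s+dropped due to full socket buffers", re.MULTILINE
)


def full_buffer_drops(netstat_output: str) -> int:
    """Return the sum of all 'full socket buffers' drop counters found."""
    return sum(int(count) for count in DROP_PATTERN.findall(netstat_output))


# Illustrative sample text only, not real measurements.
SAMPLE = """udp:
        1000 datagrams received
        7 dropped due to full socket buffers
"""
print(full_buffer_drops(SAMPLE))
```

    In practice you would feed it the output of
    subprocess.run(["netstat", "-s"], capture_output=True, text=True).stdout
    and compare two samples taken some time apart. A counter that keeps
    growing suggests the receive buffers are too small, or that named is
    not draining them fast enough.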

    BTW, unbound has configuration options ("so-rcvbuf" / "so-sndbuf")
    to tune these for only the name server; when I earlier looked for
    something similar in BIND I could not find a corresponding option,
    so had to do a system-wide tuning via sysctl, which isn't ideal, but
    solved the problem in my case.
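
    To show what a per-socket setting of this kind does at the API level,
    here is a small sketch using the standard SO_RCVBUF socket option. It
    is not a BIND feature and the function name is hypothetical. The
    kernel may round or clamp the requested size to a system-wide limit,
    which is why a sysctl change can still be needed.

```python
import socket


def request_rcvbuf(size: int) -> int:
    """Ask for a UDP receive buffer of `size` bytes; return what we got.

    The value reported back is whatever the kernel actually granted,
    which may differ from the request (Linux, for example, doubles it
    and caps it at a system-wide maximum).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


print("granted receive buffer:", request_rcvbuf(65536))
```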

    named Features
    3a) Minimize logging. Query logging is expensive (can cost you
    20% or more of your throughput) so don't do it unless you
    are using the logs for something. Logging with dnstap is
    lower impact, but still fairly expensive. Don't run in
    debug mode unless necessary.

    3a#1) Do not configure BIND with --enable-querytrace. It most
    probably doesn't do what you might think it does, and is a
    major drag on performance.

    See above under the new "2e#1" for a possible symptom...

    4b) Set an appropriate MTU for your network. Ensure that your
    network infrastructure supports EDNS and large UDP responses up
    to 4096. Ensure that your network infrastructure allows transit
    for and reassembly of fragmented UDP packets (these will be
    large query responses if you are DNSSEC signing)

    Well, isn't the major goal of DNS Flag Day 2020 to eliminate
    fragmentation for various reasons (some of them security-related),
    and doesn't it recommend setting the EDNS buffer size to 1232 instead
    of leaving it at BIND's present default of 4096?
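
    For reference, the 1232 figure follows from the IPv6 minimum MTU:
    1280 bytes minus the 40-byte IPv6 header and the 8-byte UDP header.
    The short sketch below just does that arithmetic; the constant names
    are illustrative.

```python
IPV6_MIN_MTU = 1280  # minimum link MTU that every IPv6 path must support
IPV6_HEADER = 40     # fixed IPv6 header, no extension headers
UDP_HEADER = 8

# Largest DNS payload that fits in one unfragmented packet at that MTU.
edns_buffer_size = IPV6_MIN_MTU - IPV6_HEADER - UDP_HEADER
print(edns_buffer_size)  # 1232
```

    In named.conf the related options are edns-udp-size and max-udp-size.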

    Best regards,

    - Håvard

  • From Timothe Litt@21:1/5 to Victoria Risk on Fri Jul 10 08:01:46 2020
    These suggestions - like most performance articles - are oriented toward
    achieving the highest performance with large configurations.  E.g. "How
    big can/should you go to support big loads?"

    That's useful for many users.  But there are also many people who run
    smaller operations, where the goal is to provide adequate (or even
    exceptional) performance with a minimum footprint. When BIND is one of
    many services, overall performance can be improved by minimizing BIND's
    resource requirements.  This is also true in embedded applications,
    where footprint matters.

    So a discussion about how to optimize for the smaller cases - what do
    you trade off?  What knobs can one turn down - and how far? - would be a
    useful part of or complement to the proposed article.  E.g. "How small
    can/should you go when your loads are smaller?"

    FWIW, a wizard - even just a spreadsheet - that encapsulates known
    performance results might also be useful.  E.g. given a processor,
    number/size of zones, query rate, & type, produce a memory size, disk &
    network I/O rates, and starting configuration parameters... Obviously,
    this could become arbitrarily complicated, but a simple spreadsheet with
    configuration (hardware & software) and performance data that's
    searchable would give people a good starting point.  Especially if it's
    real-world. (It can be challenging to map artificial
    "performance"/stress tests done in a development/verification
    environment to the real world...)  While full automation can be fun,
    it's amazing how much one can get out of a spreadsheet with autofilter.
    (For the next level, pivot tables and/or charts...)

    Timothe Litt
    ACM Distinguished Engineer
    --------------------------
    This communication may not represent the ACM or my employer's views,
    if any, on the matters discussed.

    --------------0191682D136EA53E2FBCF6BC
    Content-Type: text/html; charset=utf-8
    Content-Transfer-Encoding: quoted-printable

    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    </head>
    <body>
    <p>These suggestions - like most performance articles - are oriented
    toward achieving the highest performance with large
    configurations.  E.g. "How big can/should you go to support big
    loads?"<br>
    </p>
    <p>That's useful for many users.  But there are also many people who
    run smaller operations, where the goal is to provide adequate (or
    even exceptional) performance with a minimum footprint. When BIND
    is one of many services, overall performance can be improved by
    minimizing BIND's resource requirements.  This is also true in
    embedded applications, where footprint matters.<br>
    </p>
    <p>So a discussion about how to optimize for the smaller cases -
    what do you trade-off?  What knobs can one turn down - and how
    far? would be a useful part of or complement to the proposed
    article.  E.g. "How small can/should you go when your loads are
    smaller?"</p>
    <p>FWIW, a wizard - even just a spreadsheet - that encapsulates
    known performance results might also be useful.  E.g. Given a
    processor, number/size of zones, query rate, &amp; type, produce a
    memory size, disk &amp; network I/O rates, and starting
    configuration parameters... Obviously, this could become
    arbitrarily complicated, but a simple spreadsheet with
    configuration (hardware &amp; software) and performance data
    that's searchable would give people a good starting point. 
    Especially if it's real-world. (It can be challenging to map
    artificial "performance"/stress tests done in a
    development/verification environment to the real world...)  While
    full automation can be fun, it's amazing how much one can get out
    of a spreadsheet with/autofilter.  (For the next level, pivot
    tables and/or charts...)<br>
    </p>
    <pre class="moz-signature" cols="72">Timothe Litt
    ACM Distinguished Engineer
    --------------------------
    This communication may not represent the ACM or my employer's views,
    if any, on the matters discussed.
    </pre>
    <div class="moz-cite-prefix">On 07-Jul-20 21:57, Victoria Risk
    wrote:<br>
    </div>
    <blockquote type="cite"
    cite="mid:%3C3A0A6DF0-828F-49A5-83DF-8118FD663522@isc.org%3E">
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    A while ago we created a KB article with tips on how to improve
    your performance with our Kea dhcp server. The tips were fairly
    obvious to our developers and this was pretty successful. We would
    like to do something similar for BIND, provide a dozen or so tips
    for how to maximize your throughput with BIND. However, as usual,
    everything is more complicated with BIND.
    <div class=""><br class="">
    </div>
    <div class="">Can those of you who care about performance, who
    have worked to improve your performance, share some of your
    suggestions that have the most impact?  Please also comment if
    you think any of these ideas below are stupid or dangerous. I
    have combined advice for resolvers and for authoritative
    servers, I hope it is clear which is which...<br class="">
    <div class=""><br class="">
    </div>
    <div class="">The ideas we have fall into four general
    categories:</div>
    <div class=""><br class="">
    </div>
    <div class="">System design</div>
    <div class=""><span
    id="docs-internal-guid-8bd01d59-7fff-de6c-6b62-d43b75bc5624"
    class=""><span style="font-variant-ligatures: normal; font-variant-east-asian: normal; font-variant-position: normal; vertical-align: baseline; white-space: pre-wrap;" class="">1a) Use a load balancer</span><span style="font-style: italic;
    font-variant-ligatures: normal; font-variant-east-asian: normal; font-variant-position: normal; vertical-align: baseline; white-space: pre-wrap;" class=""> </span><span style="font-variant-ligatures: normal; font-variant-east-asian: normal; font-variant-
    position: normal; vertical-align: baseline; white-space: pre-wrap;" class="">to specialize your resolvers and maximize your cache hit ratio.  A load balancer is traditionally designed to spread the traffic out evenly among a pool of servers, but it can
    also be used to concentrate related queries on one server to make its cache as hot as possible. For example, if all queries for domains in .info are sent to one server in a pool, there is a better chance that an answer will be in the cache there.</span></
    span></div>
    <div class=""><br class="">
    </div>
    <div class=""><span
    id="docs-internal-guid-a7429f5d-7fff-21f2-d35c-7c59e291531b"
    class=""><span style="font-variant-ligatures: normal; font-variant-east-asian: normal; font-variant-position: normal; vertical-align: baseline; white-space: pre-wrap;" class="">1b) If you have a large authoritative system with many servers,
    consider dedicating some machines to propagate transfers. These machines, called transfer servers, would not answer client queries, but just send notifies and process IXFR requests.</span></span></div>
    <div class=""><span class=""><span style="font-variant-ligatures: normal; font-variant-east-asian: normal; font-variant-position: normal; vertical-align: baseline; white-space: pre-wrap;" class="">
    </span></span></div>
    <div class=""><span class=""><span style="font-variant-ligatures: normal; font-variant-east-asian: normal; font-variant-position: normal; vertical-align: baseline; white-space: pre-wrap;" class="">1c) Deploy </span></span><span style="white-space:
    pre-wrap;" class="">ghost secondaries.  If you store copies of authoritative zones on resolvers (resolvers as undelegated secondaries), you can avoid querying those authoritative zones. The most obvious uses of this would be mirroring the root zone
    locally or mirroring your own authoritative zones on your resolver.</span></div>
    <div class=""><br class="">
    </div>
    <div class="">we have other system design ideas that we suspect
    would help, but we are not sure, so I will wait to see if
    anyone suggests them.</div>
    <div class=""><br class="">
    </div>
    <div class=""><span class=""><span style="font-variant-ligatures: normal; font-variant-east-asian: normal; font-variant-position: normal; vertical-align: baseline; white-space: pre-wrap;" class="">OS settings and the system environment</span></
    span></div>
    <div class="">2a) Run on bare metal if possible, not on virtual
    machines or in the cloud. (any idea how much difference this
    makes? the only reference we can cite is pretty out of date - <span style="white-space: pre-wrap;" class=""><a href="https://indico.dns-oarc.net/event/19/contributions/234/attachments/217/411/DNS_perf_OARC_Apr_14.pdf" class="" moz-do-not-send="
    true">https://indico.dns-oarc.net/event/19/contributions/234/attachments/217/411/DNS_perf_OARC_Apr_14.pdf</a> )</span></div>
    <div class=""><br class="">
    </div>
    <div class="">2b) Consider using with-tuning-large. (<span style="white-space: pre-wrap;" class=""><a href="https://kb.isc.org/docs/aa-01314" class="" moz-do-not-send="true">https://kb.isc.org/docs/aa-01314</a>) </span>This
    is a compile time option, so not something you can switch on
    and off during production. </div>
    <div class=""><br class="">
    </div>
    <div class="">2c) Consider which <span style="font-variant-ligatures: normal; font-variant-east-asian: normal; font-variant-position: normal; vertical-align: baseline; white-space: pre-wrap;" class="">R/W lock choice you want to use - </span><
    span style="text-decoration: underline; color: rgb(17, 85, 204); font-variant-ligatures: normal; font-variant-east-asian: normal; font-variant-position: normal; text-decoration-skip: none; vertical-align: baseline; white-space: pre-wrap;" class=""><a
    href="https://kb.isc.org/docs/choosing-a-read-write-lock-implementation-to-use-with-named" class="" moz-do-not-send="true">https://kb.isc.org/docs/choosing-a-read-write-lock-implementation-to-use-with-named</a> </span><span
    style="caret-color: rgb(34, 34, 34); color: rgb(34, 34,
    34);" class="">For the highest tested query rates (&gt;
    100,000 queries per second), pthreads read-write locks with
    hyper-threading </span><em style="caret-color: rgb(34, 34,
    34); color: rgb(34, 34, 34); box-sizing: border-box;"
    class="">enabled</em><span style="caret-color: rgb(34, 34,
    34); color: rgb(34, 34, 34);" class=""> </span><span
    style="caret-color: rgb(34, 34, 34); color: rgb(34, 34,
    34);" class="">seem to be the best-performing choice by far.</span></div>
    <div class=""><span style="caret-color: rgb(34, 34, 34); color:
    rgb(34, 34, 34);" class=""><br class="">
    </span></div>
    <div class=""><span style="caret-color: rgb(34, 34, 34); color:
    rgb(34, 34, 34);" class="">2d) Pay attention to your choice
    of NIC cards. We have found wide variations in their
    performance. (Can anyone suggest what specifically to look
    for?)</span></div>
    <div class=""><span style="caret-color: rgb(34, 34, 34); color:
    rgb(34, 34, 34);" class=""><br class="">
    </span></div>
    <div class=""><span style="caret-color: rgb(34, 34, 34); color:
    rgb(34, 34, 34);" class="">2e) Make sure your socket send
    buffers are big enough. (not sure if this is obsolete
    advice, do we need to tell people how to tell if their
    buffers are causing delays?)</span></div>
    <div class=""><br class="">
    </div>
    <div class="">2f) <span
    id="docs-internal-guid-8d50db57-7fff-f45a-7f4d-9bbec5aebc28"
    class=""><span style="font-variant-ligatures: normal; font-variant-east-asian: normal; font-variant-position: normal; vertical-align: baseline; white-space: pre-wrap;" class="">When the number of CPUs is very large (32 or more), the increase
    in UDP listeners may not provide any performance improvement and might actually reduce throughput slightly due to the overhead of the additional structures and tasks. We suggest trying different values of -U to find the optimal one for your production
    environment.</span></span></div>
    <div class=""><span style="white-space: pre-wrap;" class=""> </span></div>
    <div class=""><span style="white-space: pre-wrap;" class=""> </span></div>
    <div class=""><span style="white-space: pre-wrap;" class="">named Features</span></div>
    <div class=""><span style="white-space: pre-wrap;" class="">3a) Minimize logging. Query logging is expensive (can cost you 20% or more of your throughput) so don’t do it unless you are using the logs for something. Logging with dnstap is lower
    impact, but still fairly expensive. </span><span style="white-space: pre-wrap;" class="">Don’t run in debug mode unless necessary. </span></div>
    <div class=""><span style="white-space: pre-wrap;" class=""> </span></div>
    <div class=""><span style="white-space: pre-wrap;" class="">3b) </span><span style="color: rgb(34, 34, 34); white-space: pre-wrap;" class="">Use the named.conf option minimal-responses yes; to reduce the work that named does to
    assemble each query response and to reduce the amount of outbound traffic.</span></div>
    <div class=""><span style="white-space: pre-wrap;" class=""> </span></div>
    <div class=""><span style="white-space: pre-wrap;" class="">3c) </span><span style="white-space: pre-wrap;" class="">Disable synth-from-dnssec. Although it seemed like a good idea, in practice it does not improve performance.</span></div>

    <div class=""><span style="white-space: pre-wrap;" class=""> </span></div>
    <div class=""><span style="white-space: pre-wrap;" class="">3d) Tune your zone transfers. </span><span style="white-space: pre-wrap;" class=""> (</span><a href="https://kb.isc.org/docs/aa-00726" style="white-space: pre-wrap;" class=""
    moz-do-not-send="true">https://kb.isc.org/docs/aa-00726</a><span style="white-space: pre-wrap;" class="">)</span></div>
    <div class="">
    <p style="box-sizing: border-box; margin: 0px 0px 1rem;
    padding: 0px; caret-color: rgb(34, 34, 34); color: rgb(34,
    34, 34);" class="">When tuning the behavior of the primary,
    there are several factors that you can control:</p>
    <p style="box-sizing: border-box; margin: 0px 0px 1rem;
    padding: 0px; caret-color: rgb(34, 34, 34); color: rgb(34,
    34, 34);" class="">- The rate of notifications of changes to
    secondary servers (serial-query-rate and notify-delay)</p>
    <p style="box-sizing: border-box; margin: 0px 0px 1rem;
    padding: 0px; caret-color: rgb(34, 34, 34); color: rgb(34,
    34, 34);" class="">- Limits on concurrent zone transfers
    (transfers-out, tcp-clients, tcp-listen-queue,
    reserved-sockets)</p>
    <p style="box-sizing: border-box; margin: 0px 0px 1rem;
    padding: 0px; caret-color: rgb(34, 34, 34); color: rgb(34,
    34, 34);" class="">- Efficiency/management options
    (max-transfer-time-out, max-transfer-idle-out,
    transfer-format)</p>
    <p style="box-sizing: border-box; margin: 0px 0px 1rem;
    padding: 0px; caret-color: rgb(34, 34, 34); color: rgb(34,
    34, 34);" class="">The most important options to focus on
    are transfers-out, serial-query-rate, tcp-clients and
    tcp-listen-queue.</p>
    </div>
    <div class="">4e) If you use RPZ, consider using
    qname-wait-recurse. We have had issues with RPZ transfers
    impacting query performance in resolvers. In general, several
    smaller RPZ zones will transfer faster than a few very large
    RPZ zones. </div>
    <div class=""><br class="">
    </div>
    <div class="">4f) Consider enabling prefetch on your resolver,
    unless you are running 9.10 (which is EOL) <a
    href="https://kb.isc.org/docs/aa-01122" class=""
    moz-do-not-send="true">https://kb.isc.org/docs/aa-01122</a></div>
    <div class=""><br class="">
    </div>
    <div class=""><span style="white-space: pre-wrap;" class="">Fix your transport network. </span></div>
    <div class=""><span style="white-space: pre-wrap;" class="">Transport network issues cause BIND to keep retrying, which is a performance drain.</span></div>
    <div class=""><span
    id="docs-internal-guid-86e034a7-7fff-6820-9bb1-bcad17499827"
    class=""><span style="color: rgb(34, 34, 34); font-variant-ligatures: normal; font-variant-east-asian: normal; font-variant-position: normal; vertical-align: baseline; white-space: pre-wrap;" class="">4a) Disable (in some cases, completely
    remove in order to prevent ongoing interference) outbound firewalls/packet-filters (particularly those that maintain state on connections). These are a frequent cause of problems in the DNS that can cause your DNS server to do a lot of extra work. </span></span></div>
    <div class=""><span class=""><span style="color: rgb(34, 34, 34); font-variant-ligatures: normal; font-variant-east-asian: normal; font-variant-position: normal; vertical-align: baseline; white-space: pre-wrap;" class="">
    </span></span></div>
    <div class=""><span
    id="docs-internal-guid-a2400cb3-7fff-8adf-a4da-1d499f82fd2f"
    class=""><span style="color: rgb(34, 34, 34); font-variant-ligatures: normal; font-variant-east-asian: normal; font-variant-position: normal; vertical-align: baseline; white-space: pre-wrap;" class="">4b) Set an appropriate MTU for your
    network. Ensure that your network infrastructure supports EDNS and large UDP responses up to 4096 bytes. </span></span><span style="color: rgb(34, 34, 34); white-space: pre-wrap;" class="">Ensure that your network infrastructure allows transit for and
    reassembly of fragmented UDP packets (these will be large query responses if you are DNSSEC signing).</span></div>
    <div class=""><span style="color: rgb(34, 34, 34); white-space: pre-wrap;" class="">
    </span></div>
    <div class=""><span style="color: rgb(34, 34, 34); white-space: pre-wrap;" class="">4c) </span><span style="color: rgb(34, 34, 34); white-space: pre-wrap;" class="">Ensure that your network infrastructure allows DNS over TCP.</span></div>
    <div class=""><span style="color: rgb(34, 34, 34); white-space: pre-wrap;" class="">
    </span></div>
    <div class=""><span style="color: rgb(34, 34, 34); white-space: pre-wrap;" class="">4d) </span><span style="color: rgb(34, 34, 34); font-variant-ligatures: normal; font-variant-east-asian: normal; font-variant-position: normal; vertical-align:
    baseline; white-space: pre-wrap;" class="">Check for and eliminate any incomplete IPv6 interface setup (what can go wrong here is that BIND </span><span style="color: rgb(34, 34, 34); font-style: italic; font-variant-ligatures: normal;
    font-variant-east-asian: normal; font-variant-position: normal; vertical-align: baseline; white-space: pre-wrap;" class="">thinks</span><span style="color: rgb(34, 34, 34); font-variant-ligatures: normal; font-variant-east-asian: normal; font-variant-position: normal;
    vertical-align: baseline; white-space: pre-wrap;" class=""> that it can use IPv6 authoritative servers, but the sends actually fail silently, leaving named waiting unnecessarily for responses).</span></div>
    <div class="">
    <div class=""><br class="">
    </div>
    </div>
    <div class=""><span style="white-space: pre-wrap;" class="">Any further suggestions, corrections or warnings are very welcome. </span></div>
    <div class=""><span style="white-space: pre-wrap;" class=""> </span></div>
    <div class=""><span style="white-space: pre-wrap;" class="">Thank you!</span></div>
    <div class=""><span style="white-space: pre-wrap;" class="">Vicky</span></div>
    <div class=""><span style="white-space: pre-wrap;" class=""> </span></div>
    <div class=""><span style="white-space: pre-wrap;" class="">---------</span></div>
    <div class=""><span style="white-space: pre-wrap;" class="">
    </span>
    <div class="">
    <div style="color: rgb(0, 0, 0); letter-spacing: normal;
    text-align: start; text-indent: 0px; text-transform: none;
    white-space: normal; word-spacing: 0px;
    -webkit-text-stroke-width: 0px; word-wrap: break-word;
    -webkit-nbsp-mode: space; -webkit-line-break:
    after-white-space;" class="">
    <div style="color: rgb(0, 0, 0); letter-spacing: normal;
    text-align: start; text-indent: 0px; text-transform:
    none; white-space: normal; word-spacing: 0px;
    -webkit-text-stroke-width: 0px; word-wrap: break-word;
    -webkit-nbsp-mode: space; -webkit-line-break:
    after-white-space;" class="">
    <div class="">Victoria Risk</div>
    <div class="">Product Manager</div>
    <div class="">Internet Systems Consortium</div>
    <div class=""><a href="mailto:vicky@isc.org" class=""
    moz-do-not-send="true">vicky@isc.org</a></div>
    <div class=""><br class="">
    </div>
    </div>
    <br class="Apple-interchange-newline">
    </div>
    <br class="Apple-interchange-newline">
    <br class="Apple-interchange-newline">
    </div>
    <br class="">
    </div>
    </div>
    </blockquote>
    </body>
    </html>

    --------------0191682D136EA53E2FBCF6BC--

    --lEbwKlqmQVLxTa0uYvWHusXYlE3Nimweh--

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Niall O'Reilly@21:1/5 to Havard Eidnes on Wed Jul 29 10:55:48 2020
    Copy: bind-users@lists.isc.org

    On 9 Jul 2020, at 21:25, Havard Eidnes via bind-users wrote:

    > 2e#1) Make sure your UDP socket *receive* buffers are big enough.
    > If on BSD, monitor for "dropped due to full socket buffers"
    > count in "netstat -s" output, and tune accordingly. Note that
    > this may be a symptom of mis-tuning of other parts of BIND,
    > causing excessive CPU usage, which may contribute to this
    > problem.

    I'm seeing some instances of "dropped due to no socket" on my FreeBSD
    systems where my resolvers run.

    I'm wondering

    - whether and how I can address this with tuning, and also
    - whether I'm wandering out of scope for this list.

    Thanks in anticipation and/or apologies.
    Niall
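
For anyone who wants to watch the two counters mentioned above over time, here is a minimal Python sketch. It assumes the BSD-style layout of "netstat -s" output (an unindented "udp:" header followed by indented "<count> <label>" lines); the function names parse_udp_drops and drop_deltas are made up for illustration and are not part of BIND or any system tool. It only reads counters; it does not diagnose the cause of the drops.

```python
import re

# Counter labels as quoted in this thread; matched exactly so that longer
# labels containing the same words are not picked up by accident.
DROP_LABELS = (
    "dropped due to no socket",
    "dropped due to full socket buffers",
)

COUNTER_LINE = re.compile(r"\s*(\d+)\s+(.*\S)\s*$")


def parse_udp_drops(netstat_output):
    """Return {label: count} for the UDP drop counters found in the text.

    Assumes each protocol section starts with an unindented header such
    as "udp:" and that counters are indented "<count> <label>" lines.
    """
    counts = {}
    section = None
    for line in netstat_output.splitlines():
        # An unindented line ending in ":" starts a new protocol section.
        if not line[:1].isspace() and line.rstrip().endswith(":"):
            section = line.strip()[:-1]
            continue
        if section != "udp":
            continue
        match = COUNTER_LINE.match(line)
        if match and match.group(2) in DROP_LABELS:
            counts[match.group(2)] = int(match.group(1))
    return counts


def drop_deltas(before, after):
    """Return how much each counter grew between two snapshots."""
    return {label: after[label] - before.get(label, 0) for label in after}
```

One way to use it: save the output of "netstat -s" (or "netstat -s -p udp" where that option is available) at two points in time, pass each through parse_udp_drops, and compare the results with drop_deltas. A counter that keeps growing under load is the one worth chasing.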
