• File descriptor hard limit is now bumped to the kernel max

    From Luca Boccassi@21:1/5 to All on Thu Jun 6 12:40:01 2024
    Hi,

    PSA: as of systemd/256~rc3-3 the open file descriptors hard limit is
    bumped early at boot from 1048576 to the max value that the kernel
    allows, which on amd64 is currently 1073741816.

    (I meant to send this last week but it fell off the wagon and
    languished in the draft folder, sorry!)

    This allows modern applications to use as many file descriptors as they
    want, since they are an extremely cheap resource nowadays, and it's
    more important than ever given that, for example, process tracking is
    switching over to PID FDs for security reasons. Please note that the
    soft limit still is 1024, as that's what legacy syscalls like select()
    can handle.

    The last time this was tried some packages were still not ready, so it
    was patched out to let them be fixed. Enough time has passed now, and
    it's time to let any unknown leftover just break and be fixed. In all
    known cases, the buggy pattern was to manually iterate up to the hard
    limit and close every FD one by one. That has been completely
    unnecessary since kernel 5.9 (bullseye/oldstable), which added the
    close_range() syscall to do it in one fell swoop. Any packages still
    doing the iteration manually need to switch to close_range(), which is
    very simple to use and documented at:

    https://man7.org/linux/man-pages/man2/close_range.2.html
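
    As an illustration of the pattern described above (a sketch, not part
    of the original post): the helper below closes a range of descriptors
    with a single close_range() call through ctypes, and otherwise scans
    /proc/self/fd so that only descriptors that are actually open are
    visited. The name close_fd_range and the /proc fallback are
    assumptions made for this example; it assumes Linux.

```python
import ctypes
import os


def close_fd_range(first: int, last: int) -> str:
    """Close every open descriptor in [first, last]; return the method used."""
    libc = ctypes.CDLL(None, use_errno=True)
    # close_range() needs Linux 5.9+ and a libc that exports the wrapper;
    # if either is missing we fall through to the /proc scan below.
    func = getattr(libc, "close_range", None)
    if func is not None:
        func.argtypes = [ctypes.c_uint, ctypes.c_uint, ctypes.c_int]
        func.restype = ctypes.c_int
        if func(first, last, 0) == 0:
            return "close_range"
    # Fallback: visit only descriptors that are actually open, instead of
    # counting from `first` up to the hard limit.
    for name in os.listdir("/proc/self/fd"):
        fd = int(name)
        if first <= fd <= last:
            try:
                os.close(fd)
            except OSError:
                pass  # e.g. the descriptor listdir() itself used
    return "procfs"
```

    A typical call before exec'ing a child would be
    close_fd_range(3, 0xFFFFFFFF). Either path costs the same whether the
    hard limit is 1M or 1G.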

    Please enjoy your extra file descriptors responsibly.

    --
    Kind regards,
    Luca Boccassi

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Marco d'Itri@21:1/5 to Luca Boccassi on Thu Jun 6 15:30:01 2024
    On Jun 06, Luca Boccassi <bluca@debian.org> wrote:

    > The last time this was tried some packages were still not ready, so it
    > was patched out to let them be fixed. Enough time has passed now, and
    > it's time to let any unknown leftover just break and be fixed. In all
    > known cases, the buggy pattern was to manually iterate over the hard
    > limit and close every FD one by one, which is completely unnecessary
    > since kernel 5.9 (bullseye/oldstable) since the close_range() syscall
    > is available, that can do it in one fell swoop. Any packages still

    I missed the venerable inn 1.x at the time, and I never noticed that it
    allocates some data structures for all available fds. Apparently this
    worked well enough for 1M file descriptors, but not for 1G. :-)

    The solution was easy enough: https://salsa.debian.org/md/inn/-/blob/master/debian/patches/limit_getfdcount
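
    For illustration only (this is not the content of the linked patch,
    and the cap of 4096 is a hypothetical value): one way to keep
    per-descriptor tables bounded is to clamp whatever the limit reports
    before sizing anything from it.

```python
import resource

MAX_TRACKED_FDS = 4096  # hypothetical cap chosen for this sketch


def bounded_fd_count(cap: int = MAX_TRACKED_FDS) -> int:
    """Return the soft RLIMIT_NOFILE, clamped to `cap`."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return cap
    return min(soft, cap)


# Size per-descriptor tables from the clamped value, not the raw limit.
fd_table = [None] * bounded_fd_count()
```

    With a clamp like this, a limit of 1G no longer turns into a 1G-entry
    allocation.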

    --
    ciao,
    Marco

  • From Simon McVittie@21:1/5 to Marco d'Itri on Thu Jun 6 16:50:02 2024
    On Thu, 06 Jun 2024 at 15:21:22 +0200, Marco d'Itri wrote:
    > On Jun 06, Luca Boccassi <bluca@debian.org> wrote:
    >> The last time this was tried some packages were still not ready, so it
    >> was patched out to let them be fixed.
    >
    > I missed the venerable inn 1.x at the time, and I never noticed that it
    > allocates some data structures for all available fds. Apparently this
    > worked well enough for 1M file descriptors, but not for 1G. :-)
    >
    > The solution was easy enough: https://salsa.debian.org/md/inn/-/blob/master/debian/patches/limit_getfdcount

    I believe the change Luca describes is increasing rlim_max (hard limit)
    but not rlim_cur (soft limit), and the code touched by that patch is
    looking at rlim_cur, so it should be unaffected anyway - unless some larger component is raising rlim_cur.

    Raising rlim_cur is also a problem for anything that relies on select(2),
    which can only represent the first 1024 fds (based on FD_SETSIZE).

    In (sufficiently) legacy-free code that can promise that it only uses more scalable mechanisms like poll/epoll, close_range, and dynamically-sized
    data structures, getting the benefit of this change requires locally
    raising rlim_cur to match rlim_max, like e.g. dbus-daemon does.

    However, if programs that raise the soft limit run subprocesses that are outside their control, then they should also drop rlim_cur back to 1024
    for those subprocesses, like dbus-daemon does for activated services.
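
    A minimal sketch of the opt-in and drop-back described above, assuming
    Linux. The helper names are hypothetical and this is not how
    dbus-daemon itself is implemented. Note that preexec_fn is documented
    as unsafe in multi-threaded programs.

```python
import resource
import subprocess
import sys

LEGACY_SOFT_LIMIT = 1024  # what select()-based programs can cope with


def raise_soft_limit() -> int:
    """Opt in to the full descriptor range; return the new soft limit."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    except (ValueError, OSError):
        pass  # keep the current soft limit if the kernel refuses
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


def _restore_legacy_limit() -> None:
    """Runs in the child between fork() and exec()."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard == resource.RLIM_INFINITY or hard > LEGACY_SOFT_LIMIT:
        soft = LEGACY_SOFT_LIMIT
    else:
        soft = hard
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))


def run_untrusted(argv):
    """Run a subprocess that sees the legacy soft limit again."""
    return subprocess.run(argv, preexec_fn=_restore_legacy_limit,
                          capture_output=True, text=True, check=True)
```

    The hard limit is left untouched in the child, so a subprocess that
    knows what it is doing can still raise its own soft limit.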

    smcv

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Marco d'Itri@21:1/5 to Simon McVittie on Thu Jun 6 18:40:01 2024
    On Jun 06, Simon McVittie <smcv@debian.org> wrote:

    > I believe the change Luca describes is increasing rlim_max (hard limit)
    > but not rlim_cur (soft limit), and the code touched by that patch is
    > looking at rlim_cur, so it should be unaffected anyway - unless some
    > larger component is raising rlim_cur.

    Something did, because inn would start reporting ~1G available fds and
    then explode, and that patch solved the issue. :-)

    --
    ciao,
    Marco

  • From Russ Allbery@21:1/5 to Simon McVittie on Thu Jun 6 21:10:01 2024
    Simon McVittie <smcv@debian.org> writes:
    > On Thu, 06 Jun 2024 at 18:39:15 +0200, Marco d'Itri wrote:
    >
    >> Something did, because inn would start reporting ~1G available fds and
    >> then explode, and that patch solved the issue. :-)
    >
    > It might be worthwhile to try to track down what larger component did
    > this, because inheriting a larger rlim_cur without opt-in can also break
    > users of select(2) as described in
    > <https://0pointer.net/blog/file-descriptor-limits.html>.

    I took a quick look at the old INN source and didn't see anything obvious.
    I was half-expecting it to do something like set the soft limit to the
    hard limit (that sounds like a very INN sort of thing to do), but if so,
    I couldn't find it in a quick search.

    --
    Russ Allbery (rra@debian.org) <https://www.eyrie.org/~eagle/>

  • From Simon McVittie@21:1/5 to Marco d'Itri on Thu Jun 6 20:40:01 2024
    On Thu, 06 Jun 2024 at 18:39:15 +0200, Marco d'Itri wrote:
    > On Jun 06, Simon McVittie <smcv@debian.org> wrote:
    >> I believe the change Luca describes is increasing rlim_max (hard limit)
    >> but not rlim_cur (soft limit), and the code touched by that patch is
    >> looking at rlim_cur, so it should be unaffected anyway - unless some
    >> larger component is raising rlim_cur.
    >
    > Something did, because inn would start reporting ~1G available fds and
    > then explode, and that patch solved the issue. :-)

    It might be worthwhile to try to track down what larger
    component did this, because inheriting a larger rlim_cur
    without opt-in can also break users of select(2) as described in
    <https://0pointer.net/blog/file-descriptor-limits.html>.

    smcv

  • From Mike Hommey@21:1/5 to Marco d'Itri on Thu Jun 6 21:40:01 2024
    On Thu, Jun 06, 2024 at 06:39:15PM +0200, Marco d'Itri wrote:
    > On Jun 06, Simon McVittie <smcv@debian.org> wrote:
    >
    >> I believe the change Luca describes is increasing rlim_max (hard limit)
    >> but not rlim_cur (soft limit), and the code touched by that patch is
    >> looking at rlim_cur, so it should be unaffected anyway - unless some
    >> larger component is raising rlim_cur.
    >
    > Something did, because inn would start reporting ~1G available fds and
    > then explode, and that patch solved the issue. :-)

    There are conditions under which the 1024 limit doesn't apply, like
    running in a docker container.

    Mike

  • From Mourad De Clerck@21:1/5 to All on Fri Jun 14 14:40:01 2024
    > PSA: as of systemd/256~rc3-3 the open file descriptors hard limit is
    > bumped early at boot from 1048576 to the max value that the kernel
    > allows, which on amd64 is currently 1073741816.

    Hi,

    It seems some proprietary software (the JetBrains IDEs) has some
    problems with this change.

    See for instance: https://youtrack.jetbrains.com/issue/IJPL-156522

    While I wait for them to fix this on their end, what's the best way to
    revert this to the original behaviour on my machine?

    I would think:

    echo "fs.nr_open = 1048576" > /etc/sysctl.d/99-max-fds.conf

    … would do the trick, but "ulimit -Hn" reports 524288.

    Something to do with DefaultLimitNOFILE=1024:524288 maybe? But
    overriding that didn't work.
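
    When chasing a mismatch like this, it can help to print the values
    side by side. fs.nr_open is only the kernel ceiling for
    RLIMIT_NOFILE, while the limits a shell reports are inherited from
    whatever started it. The sketch below is a diagnostic only, with a
    hypothetical function name; it changes nothing and assumes Linux.

```python
import resource
from pathlib import Path


def describe_nofile_limits() -> dict:
    """Collect this process's NOFILE limits and the kernel ceiling."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    nr_open_path = Path("/proc/sys/fs/nr_open")
    # fs.nr_open is absent outside Linux or when /proc is not mounted.
    nr_open = int(nr_open_path.read_text()) if nr_open_path.exists() else None
    return {"soft": soft, "hard": hard, "fs.nr_open": nr_open}


if __name__ == "__main__":
    for name, value in describe_nofile_limits().items():
        print(f"{name:>10}: {value}")
```

    Running it from a login shell, from a systemd service, and from the
    affected application's launcher shows where the limits start to
    differ.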

    Thanks,

    --
    Mourad De Clerck

  • From Luca Boccassi@21:1/5 to Mourad De Clerck on Fri Jun 14 14:40:01 2024
    On Fri, 14 Jun 2024 at 13:21, Mourad De Clerck <mourad@aquazul.com> wrote:

    >> PSA: as of systemd/256~rc3-3 the open file descriptors hard limit is
    >> bumped early at boot from 1048576 to the max value that the kernel
    >> allows, which on amd64 is currently 1073741816.
    >
    > Hi,
    >
    > It seems some proprietary software (the JetBrains IDEs) has some
    > problems with this change.
    >
    > See for instance: https://youtrack.jetbrains.com/issue/IJPL-156522
    >
    > While I wait for them to fix this on their end, what's the best way to
    > revert this to the original behaviour on my machine?
    >
    > I would think:
    >
    > echo "fs.nr_open = 1048576" > /etc/sysctl.d/99-max-fds.conf
    >
    > … would do the trick, but "ulimit -Hn" reports 524288.
    >
    > Something to do with DefaultLimitNOFILE=1024:524288 maybe? But
    > overriding that didn't work.

    For user instances the link you shared has a workaround, it has to do
    with PAM limits, that should work.
    Please keep the pressure on the upstream project to fix that bug as
    well. Thanks.
