• disk space on ullmann.d.o / UDD cron disabled

    From Adam D. Barratt@21:1/5 to All on Sat Aug 14 12:30:01 2021
    Hi,

    ullmann.debian.org, the host for udd.d.o, filled its PostgreSQL
    partition overnight. DSA added another 5GB of space, but within a few
    hours around 3GB of that had already been used.

    Looking at the PostgreSQL logs, it appears that auto-cleanup of deleted
    records in some of the tables hasn't been working correctly for a
    while, which may be adding to the space usage.

    For the moment, I've disabled all of the "udd" user's crontab, to try
    and stop the rate of disk increase, and allow the auto-cleanup a chance
    to run and hopefully free some more space.

    Regards,

    Adam
    for DSA

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mattia Rizzolo@21:1/5 to Adam D. Barratt on Thu Aug 19 11:40:02 2021
    On Sat, Aug 14, 2021 at 11:20:47AM +0100, Adam D. Barratt wrote:
    > ullmann.debian.org, the host for udd.d.o, filled its PostgreSQL
    > partition overnight. DSA added another 5GB of space, but within a few
    > hours around 3GB of that had already been used.

    Uh.

    > Looking at the PostgreSQL logs, it appears that auto-cleanup of deleted
    > records in some of the tables hasn't been working correctly for a
    > while, which may be adding to the space usage.

    I see (also through the #debian-admin logs) that the space got under
    control after a few runs of the autovacuum.

    > For the moment, I've disabled all of the "udd" user's crontab, to try
    > and stop the rate of disk increase, and allow the auto-cleanup a chance
    > to run and hopefully free some more space.

    I haven't had a chance to investigate anything yet, so I'm not
    re-enabling rudd.
    But I've re-cronned some bits that I'm positive should not cause any
    trouble.

    --
    regards,
    Mattia Rizzolo

    GPG Key: 66AE 2B4A FCCF 3F52 DA18 4D18 4B04 3FCD B944 4540
    More about me: https://mapreri.org
    Launchpad user: https://launchpad.net/~mapreri
    Debian QA page: https://qa.debian.org/developer.php?login=mattia

  • From Adam D. Barratt@21:1/5 to Mattia Rizzolo on Thu Aug 19 17:40:01 2021
    On Thu, 2021-08-19 at 11:37 +0200, Mattia Rizzolo wrote:
    > On Sat, Aug 14, 2021 at 11:20:47AM +0100, Adam D. Barratt wrote:
    > > ullmann.debian.org, the host for udd.d.o, filled its PostgreSQL
    > > partition overnight. DSA added another 5GB of space, but within a
    > > few hours around 3GB of that had already been used.
    >
    > Uh.

    That's similar to the reaction I had. :-)

    > > Looking at the PostgreSQL logs, it appears that auto-cleanup of
    > > deleted records in some of the tables hasn't been working correctly
    > > for a while, which may be adding to the space usage.
    >
    > I see (also through the #debian-admin logs) that the space got under
    > control after a few runs of the autovacuum.

    It's certainly better, but still a fair way above where things were a
    week ago.

    Judging from
    https://munin.debian.org/debian.org/ullmann.debian.org/df.html , the
    partition was around 50% used for the past month or so (~25-30GB).
    Starting on Thursday night, there was a gradual but sustained increase
    in usage until it filled in the early hours of Saturday morning and the
    additional 5GB was added.

    We're now at ~65%, _after_ the rounds of cleanup (both manual and
    automatic).

    > > For the moment, I've disabled all of the "udd" user's crontab, to
    > > try and stop the rate of disk increase, and allow the auto-cleanup a
    > > chance to run and hopefully free some more space.
    >
    > I haven't had a chance to investigate anything yet, so I'm not
    > re-enabling rudd.
    > But I've re-cronned some bits that I'm positive should not cause any
    > trouble.

    Thanks for the update.

    Regards,

    Adam

  • From Lucas Nussbaum@21:1/5 to Adam D. Barratt on Mon Aug 23 15:10:02 2021
    On 19/08/21 at 16:29 +0100, Adam D. Barratt wrote:
    > On Thu, 2021-08-19 at 11:37 +0200, Mattia Rizzolo wrote:
    > > On Sat, Aug 14, 2021 at 11:20:47AM +0100, Adam D. Barratt wrote:
    > > > ullmann.debian.org, the host for udd.d.o, filled its PostgreSQL
    > > > partition overnight. DSA added another 5GB of space, but within a
    > > > few hours around 3GB of that had already been used.
    > >
    > > Uh.
    >
    > That's similar to the reaction I had. :-)
    >
    > > > Looking at the PostgreSQL logs, it appears that auto-cleanup of
    > > > deleted records in some of the tables hasn't been working
    > > > correctly for a while, which may be adding to the space usage.
    > >
    > > I see (also through the #debian-admin logs) that the space got under
    > > control after a few runs of the autovacuum.
    >
    > It's certainly better, but still a fair way above where things were a
    > week ago.
    >
    > Judging from
    > https://munin.debian.org/debian.org/ullmann.debian.org/df.html , the
    > partition was around 50% used for the past month or so (~25-30GB).
    > Starting on Thursday night, there was a gradual but sustained increase
    > in usage until it filled in the early hours of Saturday morning and the
    > additional 5GB was added.
    >
    > We're now at ~65%, _after_ the rounds of cleanup (both manual and
    > automatic).

    My guess would be that it's a combination of:
    1/ the release of bullseye (and the addition of bookworm), which
    explains the bump mid-August (there's a similar bump for the release of
    Ubuntu 21.04)

    2/ the usage pattern of PostgreSQL by UDD, with large transactions that
    mass-delete/insert: maybe the autovacuum runs are not sufficient in
    that case, and some fragmentation remains.
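
    To illustrate point 2/, here is a toy model of a table that is
    refreshed by deleting every row and re-inserting it. It is a
    deliberate simplification and not how PostgreSQL lays out pages: the
    ToyTable class, its slot states and the row counts are all made up for
    the example. It only shows why space freed late by a plain VACUUM
    tends to stay allocated, while a full rewrite gives it back.

```python
# Toy model only: each slot is "live", "dead" or "free".
# Names and numbers are hypothetical, chosen for illustration.

class ToyTable:
    def __init__(self):
        self.slots = []

    def insert(self, n):
        """Insert n rows, reusing free slots before growing the file."""
        for i, state in enumerate(self.slots):
            if n == 0:
                break
            if state == "free":
                self.slots[i] = "live"
                n -= 1
        self.slots.extend(["live"] * n)

    def delete_all(self):
        """Mark every live row dead; the space is not reusable yet."""
        self.slots = ["dead" if s == "live" else s for s in self.slots]

    def vacuum(self):
        """Plain VACUUM: dead slots become free, trailing free slots are trimmed."""
        self.slots = ["free" if s == "dead" else s for s in self.slots]
        while self.slots and self.slots[-1] == "free":
            self.slots.pop()

    def vacuum_full(self):
        """VACUUM FULL: rewrite the table keeping only the live rows."""
        self.slots = [s for s in self.slots if s == "live"]

    def size(self):
        return len(self.slots)


def refresh(table, rows):
    """One importer run: delete everything, then reload it."""
    table.delete_all()
    table.insert(rows)


# Case A: cleanup never runs between refreshes, so the file keeps growing.
no_vacuum = ToyTable()
no_vacuum.insert(100)
for _ in range(10):
    refresh(no_vacuum, 100)
size_before_cleanup = no_vacuum.size()

# A late plain VACUUM frees the dead slots, but the live rows sit at the
# end of the file, so nothing can be trimmed.
no_vacuum.vacuum()
size_after_plain_vacuum = no_vacuum.size()

# A full rewrite compacts the table back to its live rows.
no_vacuum.vacuum_full()
size_after_vacuum_full = no_vacuum.size()

# Case B: cleanup runs after every refresh, so the size stays bounded.
with_vacuum = ToyTable()
with_vacuum.insert(100)
peak_with_vacuum = with_vacuum.size()
for _ in range(10):
    refresh(with_vacuum, 100)
    peak_with_vacuum = max(peak_with_vacuum, with_vacuum.size())
    with_vacuum.vacuum()
```

    In this model the table that is never cleaned grows with every
    refresh, and only the full rewrite shrinks it again, whereas regular
    cleanup keeps the size within about twice the live data.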

    I ran a VACUUM FULL ANALYZE to clean up all stale data, and got from
    ~41 GB used down to 13 GB. (for example, bugs_usertags went from 10 GB to
    25 MB).

    I think that the lesson here is that from time to time, a VACUUM FULL
    ANALYZE is needed...
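
    As a rough way to decide when that is needed, one could look at the
    dead-tuple counters PostgreSQL keeps in the pg_stat_user_tables view.
    The sketch below is only an assumption about how such a check might
    look: the query uses standard columns of that view, but the helper
    names, the 50% threshold and the sample numbers are made up, and the
    query is shown as a string rather than run against ullmann.

```python
# Query against the standard pg_stat_user_tables statistics view.
# It is only defined here; running it needs a database connection.
DEAD_TUPLE_QUERY = """
SELECT relname,
       n_live_tup,
       n_dead_tup,
       pg_total_relation_size(relid) AS total_bytes,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
"""


def dead_fraction(n_live, n_dead):
    """Share of tuples in a table that are dead (0.0 for an empty table)."""
    total = n_live + n_dead
    return n_dead / total if total else 0.0


def bloat_candidates(rows, threshold=0.5):
    """Return names of tables whose dead-tuple share exceeds the threshold.

    rows is an iterable of (relname, n_live_tup, n_dead_tup) tuples.
    The threshold is an arbitrary, hypothetical cut-off.
    """
    return [
        name
        for name, n_live, n_dead in rows
        if dead_fraction(n_live, n_dead) > threshold
    ]


# Hypothetical sample rows; the counts are invented, not measured.
sample_rows = [
    ("bugs_usertags", 50_000, 4_950_000),
    ("example_small_table", 1_000, 10),
]
candidates = bloat_candidates(sample_rows)
```

    Note that VACUUM FULL takes an exclusive lock on each table and writes
    a new copy of it before dropping the old one, so it needs free disk
    space and a quiet moment.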



    Regarding disk usage of /srv (the question was raised on IRC):

    The breakdown is:
    15G   /srv
    9.8G  /srv/udd.debian.org
    3.4G  /srv/udd.debian.org/udd
    3.2G  /srv/udd.debian.org/udd/web/dumps/: DB dumps to export to
          udd-mirror. Already compressed.
    2.9G  /srv/udd.debian.org/testing-status: I don't remember the role
          of those files. I xz'ed those that weren't. Now down to 1.6G.
    1.6G  /srv/udd.debian.org/mirrors: local temporary mirrors, no
          problem here
    other: home directories?
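
    The size of that "other" part follows from the figures above. A small
    sketch of the arithmetic, assuming the sizes are `du -h` style values
    in binary units and that the decimal separator may be a comma or a
    dot depending on locale; the function names are hypothetical, and du
    rounds its output, so the result is approximate:

```python
# Multipliers for du -h style suffixes (binary units assumed).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Parse a size such as '9.8G' or '3,4G' into a number of bytes."""
    text = text.strip().replace(",", ".")
    unit = text[-1].upper()
    if unit in UNITS:
        return float(text[:-1]) * UNITS[unit]
    return float(text)


def unaccounted(total, children):
    """Bytes in `total` that are not covered by the listed children."""
    return parse_size(total) - sum(parse_size(c) for c in children)


GIB = 1024**3

# /srv minus /srv/udd.debian.org: what is left for everything else.
other_gib = unaccounted("15G", ["9.8G"]) / GIB

# Inside /srv/udd.debian.org: udd, testing-status and mirrors are listed.
udd_rest_gib = unaccounted("9.8G", ["3,4G", "2,9G", "1,6G"]) / GIB
```

    With the numbers from the breakdown this leaves roughly 5.2G of /srv
    outside /srv/udd.debian.org, and about 1.9G inside it that the list
    does not itemise.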

    I suspect that there's someone with a large homedir, but I can't explore
    this.

    Someone made backups on Aug 11th in /srv/udd.debian.org/udd/web/dumps/,
    they could be removed.



    All in all, I think that we can re-cron everything and watch how things
    go. Adam, Mattia, what do you think?

    Lucas

  • From Lucas Nussbaum@21:1/5 to All on Mon Aug 23 21:00:04 2021
    For those not following IRC:
    the cron jobs have been re-enabled, and everything looks back in order.

    Lucas
