• [gentoo-user] Jobs and load-average

    From Peter Humphrey@21:1/5 to All on Wed Feb 15 11:00:01 2023
    Hello list,

    Not long ago I read that we should allow 2GB RAM for every emerge job - that is, we should divide our RAM size by 2 to get the maximum number of simultaneous jobs. I'm trying to get that right, but I'm not there yet.

    I have these entries in make.conf:
    EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet- unmerge-warn --ke>
    MAKEOPTS="-j16"

    Today, though, I saw load averages going up to 72. Can anyone suggest better values to suit my 24 threads and 64GB RAM?

    --
    Regards,
    Peter.

From Peter Böhm@21:1/5 to All on Wed Feb 15 12:40:01 2023
On Wednesday, 15 February 2023, 10:56:22 CET, Peter Humphrey wrote:
    Hello list,

    Not long ago I read that we should allow 2GB RAM for every emerge job - that is, we should divide our RAM size by 2 to get the maximum number of simultaneous jobs. I'm trying to get that right, but I'm not there yet.

    I have these entries in make.conf:
    EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet- unmerge-warn --ke>
    MAKEOPTS="-j16"

    Today, though, I saw load averages going up to 72. Can anyone suggest better values to suit my 24 threads and 64GB RAM?

    Maybe you are interested in this wiki article:

    https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Optimize_compile_times

    Regards,
    Peter

  • From Peter Humphrey@21:1/5 to All on Wed Feb 15 12:40:01 2023
    On Wednesday, 15 February 2023 09:56:22 GMT Peter Humphrey wrote:

    EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet- unmerge-warn --ke>

    That should have been:
EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet-unmerge-warn --keep-going --nospinner"

    --
    Regards,
    Peter.

  • From Michael@21:1/5 to All on Wed Feb 15 13:12:17 2023
    On Wednesday, 15 February 2023 11:31:49 GMT Peter Böhm wrote:
On Wednesday, 15 February 2023, 10:56:22 CET, Peter Humphrey wrote:
    Hello list,

    Not long ago I read that we should allow 2GB RAM for every emerge job - that is, we should divide our RAM size by 2 to get the maximum number of simultaneous jobs. I'm trying to get that right, but I'm not there yet.

    I have these entries in make.conf:
    EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet- unmerge-warn --ke>

The above determines how many ebuilds will be emerged in parallel. If you are rebuilding your whole system, with hundreds of packages stacking up to be emerged, then having as many as 16 packages being emerged in parallel could be advantageous.


    MAKEOPTS="-j16"

    This determines how many MAKE jobs will run in parallel in any one emerge. Large packages like chromium will benefit from maximising the number of jobs here, as long as you have enough RAM.

    Given you have 24 threads and your RAM is 64GB, you should be able to ratchet this up to -j24, but not if you specify a high EMERGE_DEFAULT_OPTS at the same time, or if the compiler is eating up more than 2G per process.


    Today, though, I saw load averages going up to 72. Can anyone suggest better values to suit my 24 threads and 64GB RAM?

    Since you have specified up to 16 parallel emerges and each one could run up to 16 MAKE jobs, you can understand why you would soon find loads escalating and your machine becoming unresponsive.

    You should consider what is more important for you, emerging as many packages in parallel as possible, or emerging any one large package as fast as possible.

    Two extreme examples would be setting EMERGE_DEFAULT_OPTS at "--jobs 24", with MAKEOPTS at "-j1", or conversely setting EMERGE_DEFAULT_OPTS at "--jobs 1", with MAKEOPTS at "-j24".

On my old and slow laptop with only 4 threads and 16G of RAM, my priority is to finish large packages faster. I leave EMERGE_DEFAULT_OPTS unset, while specifying MAKEOPTS="-j5 -l4.8". I arrived at that job count by trial and error, building ffmpeg repeatedly while progressively increasing -j from 1 to 12. The two fastest times were achieved with -j5 and -j10, which aligns with the old myth of using CPU+1.

Regarding the ~2G of RAM used per MAKE job, this fluctuates with successive compiler versions. I have seen up to 3.4G of RAM per process while emerging chromium. For such huge packages, which cause excessive swapping, unresponsiveness and thrashing of the disk, I limit MAKEOPTS to -j3 in package.env (a sketch follows below).
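
    A minimal sketch of that package.env arrangement (the file name and package atom are only examples): put the reduced setting in an env file and point the package at it.

    In /etc/portage/env/jobs-3.conf:
    MAKEOPTS="-j3"

    In /etc/portage/package.env:
    www-client/chromium jobs-3.conf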


    Maybe you are interested in this wiki article:

    https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Optimize_compile_times

    Regards,
    Peter

I'd start by reading the suggestions in this article, which is a good introduction to the concepts involved:

    https://wiki.gentoo.org/wiki/MAKEOPTS

  • From Rich Freeman@21:1/5 to peter@prh.myzen.co.uk on Wed Feb 15 14:20:02 2023
    On Wed, Feb 15, 2023 at 4:56 AM Peter Humphrey <peter@prh.myzen.co.uk> wrote:

    Not long ago I read that we should allow 2GB RAM for every emerge job - that is, we should divide our RAM size by 2 to get the maximum number of simultaneous jobs. I'm trying to get that right, but I'm not there yet.

    I have these entries in make.conf:
    EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet- unmerge-warn --ke>
    MAKEOPTS="-j16"

    Today, though, I saw load averages going up to 72. Can anyone suggest better values to suit my 24 threads and 64GB RAM?

    First, keep in mind that --jobs=16 + -j16 can result in up to 256
    (16*16) tasks running at once. Of course, that is worst case and most
    of the time you'll have way less than that.

    Keep in mind that you need to consider available RAM and not just
    total RAM. Run free under the conditions where you typically run
    emerge and see how much available memory it displays. Depending on
    what you have running it could be much lower than 64GB.
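    For example (with the usual procps free, where -g reports in GiB):

    free -g

    The 'available' column is the figure to watch, not 'free'.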

    Beyond that, unfortunately this is hard to deal with beyond just
    figuring out what needs more RAM and making exceptions in package.env.

    Also, RAM pressure could also come from the build directory if it is
    on tmpfs, which of course many of us use.

Some packages that I build with either a greatly reduced -j setting or
a non-tmpfs build directory (see the sketch after this list) are:
    sys-cluster/ceph
    dev-python/scipy
    dev-python/pandas
    app-office/calligra
    net-libs/nodejs
    dev-qt/qtwebengine
    dev-qt/qtwebkit
    dev-lang/spidermonkey
    www-client/chromium
    app-office/libreoffice
    sys-devel/llvm
    dev-lang/rust (I use the rust binary these days as this has gotten
    really out of hand)
    x11-libs/gtk+

    These are just packages I've had issues with at some point, and it is
    possible that some of these packages no longer use as much memory
    today.
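
    To illustrate the non-tmpfs build directory mentioned above, the same package.env mechanism can be used; the names and paths here are only examples, and the directory must exist and be writable by portage:

    In /etc/portage/env/no-tmpfs.conf:
    PORTAGE_TMPDIR="/var/tmp/notmpfs"

    In /etc/portage/package.env:
    dev-qt/qtwebengine no-tmpfs.conf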

    --
    Rich

  • From Peter Humphrey@21:1/5 to All on Wed Feb 15 15:40:01 2023
    On Wednesday, 15 February 2023 13:18:24 GMT Rich Freeman wrote:
    On Wed, Feb 15, 2023 at 4:56 AM Peter Humphrey <peter@prh.myzen.co.uk>
    wrote:
    Not long ago I read that we should allow 2GB RAM for every emerge job - that is, we should divide our RAM size by 2 to get the maximum number of simultaneous jobs. I'm trying to get that right, but I'm not there yet.

    I have these entries in make.conf:
    EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet- unmerge-warn --ke>
    MAKEOPTS="-j16"

    Today, though, I saw load averages going up to 72. Can anyone suggest better values to suit my 24 threads and 64GB RAM?

    First, keep in mind that --jobs=16 + -j16 can result in up to 256
    (16*16) tasks running at once. Of course, that is worst case and most
    of the time you'll have way less than that.

    Keep in mind that you need to consider available RAM and not just
    total RAM. Run free under the conditions where you typically run
    emerge and see how much available memory it displays. Depending on
    what you have running it could be much lower than 64GB.

    Beyond that, unfortunately this is hard to deal with beyond just
    figuring out what needs more RAM and making exceptions in package.env.

    Also, RAM pressure could also come from the build directory if it is
    on tmpfs, which of course many of us use.

    Some packages that I build with either a greatly reduced -j setting or
    a non-tmpfs build directory are:
    sys-cluster/ceph
    dev-python/scipy
    dev-python/pandas
    app-office/calligra
    net-libs/nodejs
    dev-qt/qtwebengine
    dev-qt/qtwebkit
    dev-lang/spidermonkey
    www-client/chromium
    app-office/libreoffice
    sys-devel/llvm
    dev-lang/rust (I use the rust binary these days as this has gotten
    really out of hand)
    x11-libs/gtk+

    These are just packages I've had issues with at some point, and it is possible that some of these packages no longer use as much memory
    today.

    Thank you all. I can see what I'm doing better now. (Politicians aren't the only ones who can be ambiguous!)

    I'll start by picking up the point I'd missed - putting MAKEOPTS in package.env.

    --
    Regards,
    Peter.

  • From Peter Humphrey@21:1/5 to All on Wed Feb 15 16:40:01 2023
    On Wednesday, 15 February 2023 15:12:28 GMT Michael wrote:

    You can have both a generic MAKEOPTS in make.conf, which suits your base
    case of emerge operations and will not cause your PC to explode when
    combined with EMERGE_DEFAULT_OPTS, as well as package specific MAKEOPTS in package.env to finely tune individual package requirements.

    Yes, I assumed so, and I've now set it up that way.

    --
    Regards,
    Peter.

  • From Michael@21:1/5 to All on Wed Feb 15 15:12:28 2023
    On Wednesday, 15 February 2023 14:31:25 GMT Peter Humphrey wrote:
    On Wednesday, 15 February 2023 13:18:24 GMT Rich Freeman wrote:
    On Wed, Feb 15, 2023 at 4:56 AM Peter Humphrey <peter@prh.myzen.co.uk>

    wrote:
    Not long ago I read that we should allow 2GB RAM for every emerge job - that is, we should divide our RAM size by 2 to get the maximum number of simultaneous jobs. I'm trying to get that right, but I'm not there yet.

    I have these entries in make.conf:
    EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet- unmerge-warn --ke>
    MAKEOPTS="-j16"

    Today, though, I saw load averages going up to 72. Can anyone suggest better values to suit my 24 threads and 64GB RAM?

    First, keep in mind that --jobs=16 + -j16 can result in up to 256
    (16*16) tasks running at once. Of course, that is worst case and most
    of the time you'll have way less than that.

    Keep in mind that you need to consider available RAM and not just
    total RAM. Run free under the conditions where you typically run
    emerge and see how much available memory it displays. Depending on
    what you have running it could be much lower than 64GB.

    Beyond that, unfortunately this is hard to deal with beyond just
    figuring out what needs more RAM and making exceptions in package.env.

    Also, RAM pressure could also come from the build directory if it is
    on tmpfs, which of course many of us use.

    Some packages that I build with either a greatly reduced -j setting or
    a non-tmpfs build directory are:
    sys-cluster/ceph
    dev-python/scipy
    dev-python/pandas
    app-office/calligra
    net-libs/nodejs
    dev-qt/qtwebengine
    dev-qt/qtwebkit
    dev-lang/spidermonkey
    www-client/chromium
    app-office/libreoffice
    sys-devel/llvm
    dev-lang/rust (I use the rust binary these days as this has gotten
    really out of hand)
    x11-libs/gtk+

    These are just packages I've had issues with at some point, and it is possible that some of these packages no longer use as much memory
    today.

    Thank you all. I can see what I'm doing better now. (Politicians aren't the only ones who can be ambiguous!)

    I'll start by picking up the point I'd missed - putting MAKEOPTS in package.env.

    You can have both a generic MAKEOPTS in make.conf, which suits your base case of emerge operations and will not cause your PC to explode when combined with EMERGE_DEFAULT_OPTS, as well as package specific MAKEOPTS in package.env to finely tune individual package requirements.

  • From J. Roeleveld@21:1/5 to All on Thu Feb 16 08:20:01 2023
    On Wednesday, February 15, 2023 10:56:22 AM CET Peter Humphrey wrote:
    Hello list,

    Not long ago I read that we should allow 2GB RAM for every emerge job - that is, we should divide our RAM size by 2 to get the maximum number of simultaneous jobs. I'm trying to get that right, but I'm not there yet.

    I have these entries in make.conf:
    EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet- unmerge-warn --ke>
    MAKEOPTS="-j16"

    Today, though, I saw load averages going up to 72. Can anyone suggest better values to suit my 24 threads and 64GB RAM?

    One other item I missed in the replies:
    "--load-average" is also a valid option for make.

    If you want to keep the load down, I would suggest adding this to MAKEOPTS as well:

    MAKEOPTS="--jobs=16 --load-average=32"

I write the options out in full because I had some weird errors in the past when the "-j" short form wasn't handled correctly.
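
    A combined make.conf along those lines, using the numbers already mentioned in this thread (adjust to taste), would look something like:

    EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet-unmerge-warn --keep-going --nospinner"
    MAKEOPTS="--jobs=16 --load-average=32"

    With the limit in both places, both emerge and make hold off starting new work while the load average is above 32.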

    --
    Joost

  • From Peter Humphrey@21:1/5 to All on Thu Feb 16 11:00:01 2023
    On Wednesday, 15 February 2023 13:18:24 GMT Rich Freeman wrote:

    First, keep in mind that --jobs=16 + -j16 can result in up to 256
    (16*16) tasks running at once. Of course, that is worst case and most
    of the time you'll have way less than that.

    Yes, I was aware of that, but why didn't --load-average=32 take precedence?

[snip]

    Some packages that I build with either a greatly reduced -j setting or
    a non-tmpfs build directory are:
    sys-cluster/ceph
    dev-python/scipy
    dev-python/pandas
    app-office/calligra
    net-libs/nodejs
    dev-qt/qtwebengine
    dev-qt/qtwebkit
    dev-lang/spidermonkey
    www-client/chromium
    app-office/libreoffice
    sys-devel/llvm
    dev-lang/rust (I use the rust binary these days as this has gotten
    really out of hand)
    x11-libs/gtk+

    Thanks for the list, Rich.

    --
    Regards,
    Peter.

  • From Andreas Fink@21:1/5 to Peter Humphrey on Thu Feb 16 11:40:01 2023
    On Thu, 16 Feb 2023 09:53:30 +0000
    Peter Humphrey <peter@prh.myzen.co.uk> wrote:

    On Wednesday, 15 February 2023 13:18:24 GMT Rich Freeman wrote:

    First, keep in mind that --jobs=16 + -j16 can result in up to 256
    (16*16) tasks running at once. Of course, that is worst case and most
    of the time you'll have way less than that.

    Yes, I was aware of that, but why didn't --load-average=32 take precedence?
This only means that emerge will not schedule an additional package job
(where a package job means something like `emerge gcc`) when the load
average is above 32; however, once a job has been scheduled it keeps
running, independently of the current load.
If you put --load-average in MAKEOPTS instead, it is handled by make,
which schedules the individual build jobs and stops launching
additional jobs when the load is too high.

Extreme case:
emerge chromium firefox qtwebengine
--> suppose the load when you start this is pretty much 0, so all 3
packages are merged simultaneously and each is built with -j16.
For a long time you will therefore have about 3*16 = 48 individual
build jobs running in parallel, i.e. you should see the load heading
towards 48 if you have no --load-average in your MAKEOPTS.

    Cheers
    Andreas

  • From Rich Freeman@21:1/5 to finkandreas@web.de on Thu Feb 16 13:30:02 2023
    On Thu, Feb 16, 2023 at 5:32 AM Andreas Fink <finkandreas@web.de> wrote:

    On Thu, 16 Feb 2023 09:53:30 +0000
    Peter Humphrey <peter@prh.myzen.co.uk> wrote:

    Yes, I was aware of that, but why didn't --load-average=32 take precedence?
This only means that emerge will not schedule an additional package job
(where a package job means something like `emerge gcc`) when the load
average is above 32; however, once a job has been scheduled it keeps
running, independently of the current load.
If you put --load-average in MAKEOPTS instead, it is handled by make,
which schedules the individual build jobs and stops launching
additional jobs when the load is too high.

Extreme case:
emerge chromium firefox qtwebengine
--> suppose the load when you start this is pretty much 0, so all 3
packages are merged simultaneously and each is built with -j16.
For a long time you will therefore have about 3*16 = 48 individual
build jobs running in parallel, i.e. you should see the load heading
towards 48 if you have no --load-average in your MAKEOPTS.

TL;DR - the load-average option results in underdamping, because the
load-average measurement lags behind the actual load.

    Keep in mind that load averages are averages and have a time lag, and
    compilers that are swapping like crazy can run for a fairly long time.
    So you will probably have fairly severe oscillation in the load if
swapping is happening. If your load is under 32, your 16 parallel
makes, even with the limit in MAKEOPTS, will feel free to launch up to
256 jobs between them, because it takes a while for the 1-minute load
average to creep above 32. At that point you have WAY
    more than 32 tasks running and if they're swapping then half of the
    processes on your system are probably going to start blocking. So now
    make (if configured in MAKEOPTS) will hold off on launching anything,
    but it could take minutes for those swapping compiler jobs to complete
    the amount of work that would normally take a few seconds. Then as
    those processes eventually start terminating (assuming you don't get
    OOM killing or PANICs) your load will start dropping, until eventually
    it gets back below 32, at which point all those make processes that
    are just sitting around will wake up and fire off another 50 gcc
    instances or whatever they get up to before the brakes come back on.

    The load average setting is definitely useful and I would definitely
    set it, but when the issue is swapping it doesn't go far enough. Make
    has no idea how much memory a gcc process will require. Since that is
    the resource likely causing problems it is hard to efficiently max out
    your cores without actually accounting for memory use. The best I've
    been able to do is just set things conservatively so it never gets out
    of control, and underutilizes CPU in the process. Often it is only
    parts of a build that even have issues - something big like chromium
    might have 10,000 tasks that would run fine with -j16 or whatever, but
    then there is this one part where the jobs all want a ton of RAM and
    you need to run just that one part at a lower setting.

    --
    Rich

  • From Peter Humphrey@21:1/5 to All on Thu Feb 16 14:40:01 2023
    On Thursday, 16 February 2023 12:23:52 GMT Rich Freeman wrote:

[snip] Much useful detail.

    That all makes perfect sense, and is what I'd assumed, but it's good to have
    it confirmed.

    The load average setting is definitely useful and I would definitely
    set it, but when the issue is swapping it doesn't go far enough. Make
    has no idea how much memory a gcc process will require. Since that is
    the resource likely causing problems it is hard to efficiently max out
    your cores without actually accounting for memory use. The best I've
    been able to do is just set things conservatively so it never gets out
    of control, and underutilizes CPU in the process. Often it is only
    parts of a build that even have issues - something big like chromium
    might have 10,000 tasks that would run fine with -j16 or whatever, but
    then there is this one part where the jobs all want a ton of RAM and
    you need to run just that one part at a lower setting.

    I've just looked at 'man make', from which it's clear that -j = --jobs, and that both those and --load-average are passed to /usr/bin/make, presumably untouched unless portage itself has identically named variables. So I wonder how feasible it might be for make to incorporate its own checks to ensure that the load average is not exceeded. I am not a programmer (not for at least 35 years, anyway), so I have to leave any such suggestion to the experts.

    --
    Regards,
    Peter.

  • From Rich Freeman@21:1/5 to peter@prh.myzen.co.uk on Thu Feb 16 15:30:16 2023
    On Thu, Feb 16, 2023 at 8:39 AM Peter Humphrey <peter@prh.myzen.co.uk> wrote:

    I've just looked at 'man make', from which it's clear that -j = --jobs, and that both those and --load-average are passed to /usr/bin/make, presumably untouched unless portage itself has identically named variables. So I wonder how feasible it might be for make to incorporate its own checks to ensure that
    the load average is not exceeded. I am not a programmer (not for at least 35 years, anyway), so I have to leave any such suggestion to the experts.


    Well, if we just want to have a fun discussion here are my thoughts.
    However, the complexity vs usefulness outside of Gentoo is such that I
    don't see it happening.

    For the most typical use case - a developer building the same thing
    over and over (which isn't Gentoo), then make could cache info on
    resources consumed, and use that to make more educated decisions about
    how many tasks to launch. That wouldn't help us at all, but it would
    help the typical make user. However, the typical make user can just
    tune things in other ways.

    It isn't going to be possible for make to estimate build complexity in
    any practical way. Halting problem aside maybe you could build in
    some smarts looking at the program being executed and its arguments,
    but it would be a big mess.

    Something make could do is tune the damping a bit. It could gradually
    increase the number of jobs it runs and watch the load average, and
    gradually scale it up appropriately, and gradually scale down if CPU
    is the issue, or rapidly scale down if swap is the issue. If swapping
    is detected it could even suspend most of the tasks it has spawned and
    then gradually continue them as other tasks finish to recover from
    this condition. However, this isn't going to work as well if portage
    is itself spawning parallel instances of make - they'd have to talk to
    each other or portage would somehow need to supervise things.

    A way of thinking about it is that when you have portage spawning
    multiple instances of make, that is a bit like adding gain to the --load-average MAKEOPTS. So each instance of make independently looks
    at load average and takes action. So you have an output (compilers
    that create load), then you sample that load with a time-weighted
    average, and then you apply gain to this average, and then use that as feedback. That's basically a recipe for out of control oscillation.
    You need to add damping and get rid of the gain.

    Disclaimer: I'm not an engineer and I suspect a real engineer would be
    able to add a bit more insight.

    Really though the issue is that this is the sort of thing that only
    impacts Gentoo and so nobody else is likely to solve this problem for
    us.

    --
    Rich

  • From Andreas Fink@21:1/5 to Rich Freeman on Thu Feb 16 16:20:01 2023
    On Thu, 16 Feb 2023 09:24:08 -0500
    Rich Freeman <rich0@gentoo.org> wrote:

    On Thu, Feb 16, 2023 at 8:39 AM Peter Humphrey <peter@prh.myzen.co.uk> wrote:

    I've just looked at 'man make', from which it's clear that -j = --jobs, and that both those and --load-average are passed to /usr/bin/make, presumably untouched unless portage itself has identically named variables. So I wonder
    how feasible it might be for make to incorporate its own checks to ensure that
    the load average is not exceeded. I am not a programmer (not for at least 35
    years, anyway), so I have to leave any such suggestion to the experts.


    Well, if we just want to have a fun discussion here are my thoughts.
    However, the complexity vs usefulness outside of Gentoo is such that I
    don't see it happening.

    For the most typical use case - a developer building the same thing
    over and over (which isn't Gentoo), then make could cache info on
    resources consumed, and use that to make more educated decisions about
    how many tasks to launch. That wouldn't help us at all, but it would
    help the typical make user. However, the typical make user can just
    tune things in other ways.

    It isn't going to be possible for make to estimate build complexity in
    any practical way. Halting problem aside maybe you could build in
    some smarts looking at the program being executed and its arguments,
    but it would be a big mess.

    Something make could do is tune the damping a bit. It could gradually increase the number of jobs it runs and watch the load average, and
    gradually scale it up appropriately, and gradually scale down if CPU
    is the issue, or rapidly scale down if swap is the issue. If swapping
    is detected it could even suspend most of the tasks it has spawned and
    then gradually continue them as other tasks finish to recover from
    this condition. However, this isn't going to work as well if portage
    is itself spawning parallel instances of make - they'd have to talk to
    each other or portage would somehow need to supervise things.

    A way of thinking about it is that when you have portage spawning
    multiple instances of make, that is a bit like adding gain to the --load-average MAKEOPTS. So each instance of make independently looks
    at load average and takes action. So you have an output (compilers
    that create load), then you sample that load with a time-weighted
    average, and then you apply gain to this average, and then use that as feedback. That's basically a recipe for out of control oscillation.
    You need to add damping and get rid of the gain.

    Disclaimer: I'm not an engineer and I suspect a real engineer would be
    able to add a bit more insight.

    Really though the issue is that this is the sort of thing that only
    impacts Gentoo and so nobody else is likely to solve this problem for
    us.


Given all your explanation, and my own annoyance a couple of years ago, I
hacked up a little helper that sits between make and the spawned build jobs.
Basically what annoyed me is that chromium would compile for hours and
then fail because it needed more memory than was available, and that
would fail the whole build.
    One possible solution is to reduce the number of build jobs to e.g. -j1
    for chromium, but this is stupid because 99% of the time -j16 would
    work just fine.

So I hacked around a bit and came up with a little helper & watcher. The
helper limits the spawning of new jobs to SOME_LIMIT, and also holds off
when the load is too high (e.g. when I am doing other work on the PC
that is not under emerge's control). The watcher kills memory-hungry
build jobs once memory usage goes above 90%, tells the helper to stop
spawning new jobs, waits until the helper reports that no more build
jobs are running, and then respawns the memory-hungry build job (i.e.
that job will run essentially as if -j1 had been specified).

    This way I can mix emerge --jobs=HIGH_NUMBER and make
    -jOTHER_HIGH_NUMBER, and it wouldn't affect the system, because the
    total number of actual build jobs is controlled by the helper, and would
    never go beyond SOME_LIMIT, even if HIGH_NUMBER*OTHER_HIGH_NUMBER > SOME_LIMIT.

I never published this anywhere, but if there's interest in it I can
probably upload it somewhere; I had the feeling that it's quite hacky
and not worth publishing. Also I was never sure whether I might break
emerge in some way, because it sits at a very low level, but it has now
been running for more than a year without any emerge failure due to this hijacking.

  • From Peter Humphrey@21:1/5 to All on Tue Apr 11 13:10:01 2023
    On Wednesday, 15 February 2023 09:56:22 BST Peter Humphrey wrote:
    Hello list,

    Not long ago I read that we should allow 2GB RAM for every emerge job - that is, we should divide our RAM size by 2 to get the maximum number of simultaneous jobs. I'm trying to get that right, but I'm not there yet.

    I have these entries in make.conf:
    EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet- unmerge-warn --ke>
    MAKEOPTS="-j16"

    Today, though, I saw load averages going up to 72. Can anyone suggest better values to suit my 24 threads and 64GB RAM?

    Thanks all for your contributions.

    I've settled on the following, after some experimenting:

    EMERGE_DEFAULT_OPTS="--jobs --autounmask=n --quiet-unmerge-warn --keep-going --nospinner"
    MAKEOPTS="-j24"

    I've stopped using disk space for /var/tmp/portage, even for the biggest packages, because (a) it causes a huge increase in compilation time, even on a SATA SSD, and (b) I've never seen an OOM anyway.
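
    For reference, the tmpfs arrangement referred to here is usually an fstab entry along the following lines; the size and mount options are only illustrative:

    tmpfs   /var/tmp/portage   tmpfs   size=48G,uid=portage,gid=portage,mode=775,noatime   0 0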

So what if the system load goes high? It's only a measure of how many processes are ready to run (or waiting on I/O) at any instant. I imagine the kernel is effective at guarding its own memory space.

    --
    Regards,
    Peter.
