At 18:27 on 27/09/21, Marco Moock wrote:
> Does it also happen with the nouveau driver?
> Does it happen in the live system?
> Does it happen with another graphics card or with that card in
> another computer?

I can't use the nouveau driver because I need the computer (a laptop)
for AI deep learning with tensorflow GPU.
On Mon, 27 Sep 2021 18:35:37 +0100,
Paulo da Silva <p_d_a_s_i_l_v_a_ns@nonetnoaddress.pt> wrote:
> At 18:27 on 27/09/21, Marco Moock wrote:
>> Does it also happen with the nouveau driver?
>> Does it happen in the live system?
>> Does it happen with another graphics card or with that card in
>> another computer?
> I can't use the nouveau driver because I need the computer (a laptop)
> for AI deep learning with tensorflow GPU.

You can check whether it still happens with nouveau, maybe in the live
system; it doesn't have nvidia-470 installed.
Does it also happen with the nouveau driver?
Does it happen in the live system?
Does it happen with another graphics card or with that card in another computer?
> From time to time - may be a month or a couple of hours - my computer
> completely freezes. Everything stops. The screen shows the last image.
> Not even the cursor moves. No keyboard key works including the
> Alt-PrtScreen keys, like REISUB.
> I need to press the power on/off button for 5 secs to restart it.
> After restart the journalctl -b -b1 shows nothing at the freeze time.
At 18:38 on 27/09/21, Marco Moock wrote:
...
I wonder if there is some kind of script or configuration that forces
the logs not to be buffered. I'll search on the internet ...
On 27/09/2021 18:22, Paulo da Silva wrote:
> From time to time - may be a month or a couple of hours - my computer
> completely freezes. Everything stops. The screen shows the last image.
> Not even the cursor moves. No keyboard key works including the
> Alt-PrtScreen keys, like REISUB.
> I need to press the power on/off button for 5 secs to restart it.
> After restart the journalctl -b -b1 shows nothing at the freeze time.

Sounds as though it might be hardware.
Maybe run a memcheck, and an fsck of the entire disk surface?
On 27/09/2021 20.02, Paulo da Silva wrote:
> At 18:38 on 27/09/21, Marco Moock wrote:
> ...
> I wonder if there is some kind of script or configuration that forces
> the logs not to be buffered. I'll search on the internet ...

Yes. You can send kernel logs directly to another machine via ethernet
or, even better if available, a serial port.
Directly from the kernel, mind.
I may be able to locate information later, if you are interested. It is
hidden deep in my bug reports somewhere.
machine I used for this, it is on another city.
At 19:36 on 27/09/21, Java Jive wrote:
> On 27/09/2021 18:22, Paulo da Silva wrote:
>> From time to time - may be a month or a couple of hours - my computer
>> completely freezes. Everything stops. The screen shows the last image.
>> Not even the cursor moves. No keyboard key works including the
>> Alt-PrtScreen keys, like REISUB.
>> I need to press the power on/off button for 5 secs to restart it.
>> After restart the journalctl -b -b1 shows nothing at the freeze time.
> Sounds as though it might be hardware.

Almost for sure ...

> Maybe run a memcheck,

How? In my boot menu there is no such option :-(

> and an fsck of the entire disk surface?

I am running btrfs and I use scrub after the freezes. Never had an error
on my SSD.
Also, smartctl -a has reported only one error for a long time:
Error Information Log Entries: 1
Paulo da Silva wrote:
> At 18:27 on 27/09/21, Marco Moock wrote:
>> Does it also happen with the nouveau driver?
>> Does it happen in the live system?
>> Does it happen with another graphics card or with that card in
>> another computer?
> I can't use the nouveau driver because I need the computer (a laptop)
> for AI deep learning with tensorflow GPU.

Do you correlate the failure with any particular
activity on the machine?

Paulo da Silva replied: Certainly no. For example the last time I just left the computer

For example, a more mundane activity on a computer,
is the usage of modern Firefox. While the user is
not viewing a web page, Firefox seems to leak memory
until all available memory in Ring 3 is used up.
But Linux has Out of Memory (OOM) killer, for the
handling of memory exhaustion that way. The system
should not freeze because Firefox happens to be
running.
Whereas, I don't know what happens, if a GPU that
uses shared memory, happens to request more and
more RAM for some GPU activity. An NVidia GPU is
more likely to have its own memory chips, and be
less likely to cause resource exhaustion on its own.
Try running "nvidia-smi" in a terminal window,
selecting the option to have it update the
screen repetitively (like "top" in a sense), and
watch resource consumption listed there. If you're
running the NVidia driver, that program should be
installed for you.
You could run "top" in one terminal window (using
the information near the top of top, for resource info).
And run "nvidia-smi" in a second window, to watch
for dwindling NVidia resources.
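
The two-window watch described above could also be scripted so that the numbers end up in a terminal or a file. What follows is only a minimal sketch in Python, assuming nvidia-smi is on the PATH and that the three queried fields are supported by the card (some cards report "[N/A]" for a field, which this sketch does not handle). The function names are made up for illustration.

```python
import subprocess
import time

# Properties passed to nvidia-smi's --query-gpu option.
QUERY = "memory.used,memory.total,temperature.gpu"


def parse_gpu_line(line):
    """Parse one CSV line such as '1234, 8192, 65' into a dict of ints."""
    used, total, temp = (int(field.strip()) for field in line.split(","))
    return {"mem_used_mib": used, "mem_total_mib": total, "temp_c": temp}


def sample_gpu():
    """Run nvidia-smi once and return one dict per GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]


def watch(interval_s=5, count=None):
    """Print a timestamped sample every interval_s seconds (forever if count is None)."""
    done = 0
    while count is None or done < count:
        print(time.strftime("%H:%M:%S"), sample_gpu(), flush=True)
        time.sleep(interval_s)
        done += 1
```

Calling watch() in one terminal while the training job runs in another would show whether GPU memory or temperature is creeping up before a freeze.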
On 27/09/2021 21.29, Paulo da Silva wrote:
> At 19:36 on 27/09/21, Java Jive wrote:
>> and an fsck of the entire disk surface?
> I am running btrfs and I use scrub after the freezes. Never had an error
> on my SSD.
> Also smartctl -a only reports one error for a long time
> Error Information Log Entries: 1

You should do a smartctl short test, then a long test.
At 23:58 on 27/09/21, Carlos E. R. wrote:
> On 27/09/2021 21.29, Paulo da Silva wrote:
>> I am running btrfs and I use scrub after the freezes. Never had an error
>> on my SSD.
>> Also smartctl -a only reports one error for a long time
>> Error Information Log Entries: 1
> You should do a smartctl short test, then a long test.

It doesn't work for SSD, at least for mine.
Only smartctl -a /dev/...

# smartctl -t long /dev/nvme0n1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-88-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

NVMe device successfully opened

Use 'smartctl -a' (or '-x') to print SMART (and more) information
On 27/09/2021 19.22, Paulo da Silva wrote:
> I am using kubuntu 20.04.
> From time to time - may be a month or a couple of hours - my computer
> completely freezes. Everything stops. The screen shows the last image.
> Not even the cursor moves. No keyboard key works including the
> Alt-PrtScreen keys, like REISUB.

Does your HDD LED flash a lot?
If so, I would bet my money that Plasma5 has leaked memory; in that
case the following bug could be of interest to you:
https://bugs.kde.org/show_bug.cgi?id=436061
There isn't much you can do about this: the machine is so occupied with
swapping that you won't be able to ssh into it. It could be wise
to disable swap and thus get the kernel to kill a random process, and
hopefully it is plasmashell. I have had times when plasmashell has taken
58G of RAM and there was no other option than to reboot the computer.

> After restart the journalctl -b -b1 shows nothing at the freeze time.

It tends to be difficult to write to a file when the system is under heavy load.

> Is there a way to get some information on what this is happening?

For me it was more about being notified before it gets too bad, for
example by logging the output from top* once every five minutes, so as
to be able to see the memory usage.

* for example use: top -b -n 1 >> /path/to/file/where/you/want/to/log
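
A variation on the periodic top logging above, as a rough Python sketch rather than a tested tool: it reads /proc/meminfo, appends one timestamped line to a log file and calls fsync so the line has a better chance of reaching the disk before a freeze. The function names and the choice of fields (MemAvailable, SwapFree) are my own assumptions; scheduling it every five minutes (cron, a systemd timer, a loop) is left out.

```python
import os
import time


def parse_meminfo(text):
    """Return the values found in /proc/meminfo-style text, keyed by field name."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[name.strip()] = int(parts[0])  # kB for most fields
    return values


def log_line(info, now=None):
    """Format one log line; now is seconds since the epoch (current time if None)."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return (f"{stamp} MemAvailable={info.get('MemAvailable', 0)}kB "
            f"SwapFree={info.get('SwapFree', 0)}kB")


def log_once(path):
    """Append the current memory figures to path and force them to disk."""
    with open("/proc/meminfo") as src:
        info = parse_meminfo(src.read())
    with open(path, "a") as dst:
        dst.write(log_line(info) + "\n")
        dst.flush()
        os.fsync(dst.fileno())  # do not leave the line sitting in a buffer
```

After a freeze, the last lines of the file show whether available memory was shrinking in the minutes before the hang.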
On 2021-09-27, J.O. Aho <user@example.net> wrote:
> On 27/09/2021 19.22, Paulo da Silva wrote:
>> I am using kubuntu 20.04.
>> From time to time - may be a month or a couple of hours - my computer
>> completely freezes. Everything stops. The screen shows the last image.
>> Not even the cursor moves. No keyboard key works including the
>> Alt-PrtScreen keys, like REISUB.

I would bet on a hardware problem. No warning. Random. E.g., the power
supply voltage could drop briefly. The system has no way of recording
it.
Buy a new computer.
> At 19:34 on 27/09/21, Carlos E. R. wrote:
>> On 27/09/2021 20.02, Paulo da Silva wrote:
>>> At 18:38 on 27/09/21, Marco Moock wrote:
>>> ...
>>> I wonder if there is some kind of script or configuration that forces
>>> the logs not to be buffered. I'll search on the internet ...
>> Yes. You can send kernel logs directly to another machine via ethernet,
>> or even better if available, serial port.
>> Directly from the kernel, mind.
>> I may be able to locate information later, if you are interested.
>> Hidden deep in my bug reports somewhere.
> I would thank you very much if you could find them.
> I am searching the internet for this stuff but so far I only found
> trivial suggestions about logs.

Try the section "dynamic configuration" of the netconsole.txt doc.
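
On the receiving machine something has to listen for the UDP packets that netconsole sends. Any UDP listener will do; the following is just a minimal Python sketch, assuming the laptop is configured to send to port 6666 (the netconsole default target port) and that the log path is writable. The function names are hypothetical.

```python
import socket
import time


def format_record(data, addr, now=None):
    """Turn one received datagram into a timestamped log line."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    text = data.decode("utf-8", errors="replace").rstrip("\n")
    return f"{stamp} {addr[0]} {text}"


def receive(path, port=6666, host="0.0.0.0"):
    """Append every datagram arriving on host:port to path until interrupted."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock, \
            open(path, "a", buffering=1) as log:  # line-buffered text file
        sock.bind((host, port))
        while True:
            data, addr = sock.recvfrom(65535)
            log.write(format_record(data, addr) + "\n")
```

Running receive("netconsole.log") on the second machine and leaving it running means the last kernel messages before a freeze are stored away from the laptop's own disk.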
On 09/27/2021 12:32 PM, Paulo da Silva wrote:
> I would thank you very much if you could find them.
> I am searching the internet for this stuff but so far I only found
> trivial suggestions about logs.

It's fairly trivial to set up another computer as a remote syslog server
and ship your laptop logs to that.
William Unruh wrote:
> On 2021-09-27, J.O. Aho <user@example.net> wrote:
>> On 27/09/2021 19.22, Paulo da Silva wrote:
>>> I am using kubuntu 20.04.
>>> From time to time - may be a month or a couple of hours - my computer
>>> completely freezes. Everything stops. The screen shows the last image.
>>> Not even the cursor moves. No keyboard key works including the
>>> Alt-PrtScreen keys, like REISUB.
> I would bet on a hardware problem. No warning. Random. E.g., the power
> supply voltage could drop briefly. The system has no way of recording
> it.
> Buy a new computer.

Among enthusiasts, it is popular to stock a spare
power supply. You can fit your spare supply and
retest, and see if that theory holds water.
Right now, the junk room sports a Seasonic S12
as the "designated hitter".
Running Prime95 (statically compiled Linux version
in "Just Testing" mode), while using the existing
supply, is an acceptance test. It tests machine
cooling is adequate (run something lmsensors based,
to see whether temp overshoots, while you're waiting
for the machine to shut off on CPU THERMTRIP). It draws
max CPU power. My machine, wall power climbs to 180W
while running that CPU integrity test.
https://www.mersenne.org/download/
If you have NVidia driver, you can add in a graphics
test if you want, but I don't have anything for that
in mind. I have a CUDA app, but it would be a pig
to set up due to libs and so on. On my machine, running
the graphics test case while Prime95 is running, raises
machine power to 360W (on a 550W PSU). Modern video
cards have a power limiter, and they also have a
status indicator in software, indicating which limiter is limiting
GPU performance. Running NVENC or NVDEC for example,
the card won't use more than 1/3rd of max power.
Normally, my machine power level doesn't go past 200W
without testing assistance like that. 360W to 400W loading,
is via synthetic (unlikely) tests.
*******
Haswell CPUs, at the time, some power supplies would
become unstable at low load, leading to "Haswell certified"
power supplies. But the most likely reason for that
to happen, was the existence of some older supplies
that have (on the label), a row of numbers for
"minimum consumption". No supply created in at least
the last ten years, has that row of numbers on the label.
The absolute worst situation of that type, is there
existed one supply, where the 12V rail needed 25% loading
to remain stable. So if the rail was 40 amps, the label would
read:

    Ancient supply label      Modern supply label (zero amps is OK)
    ...   +12V                ...   +12V
    Min   10A
    Max   40A                 Max   40A

Naturally, I was careful to never buy a supply
with the two-row MIN/MAX labeling, as it's an admission
of "stupid" in design. You would always be looking over
your shoulder, if you bought the one on the left.
With lots of computer hardware today, such a guarantee
could not be met in the form of min loading. The idle current
could easily drop below 10A for example. Some modern supplies
have met the "0 amps" requirement, by having a 5W or 10W
load inside the PSU for the purpose of meeting open circuit
stability requirements. It's unlikely an 80+ supply is
doing that.
And here, stability does not mean "oscillation",
stability means remaining in regulation, 12V +/- 5%. If
unloaded, a "MIN/MAX" supply might deviate past 5% by a bit.
12V only gets in trouble, if it drops below 11V, as an example of
how far it can be pushed on overload. Burning might result
(hard drive clamp device activates) at around +15V or so.
There's a bit of headroom on +12V on the high side. Some
other rails don't have that luxury.
A multimeter is recommended, if checking voltages. Do not
trust the ACPI-calibrated voltage readouts for this. The
multimeter might be accurate to around 2% or so. And be careful
with the multimeter probes - one of those modern 1200W supplies,
if you happened to short +12V, it would not be pretty. They
live for the chance to melt wiring. While in theory, individual
wire looms have 20A limiters (PSU shuts off), you don't
want to be testing the cheapness of the company making
the supply, even if you've paid $150 for it. In some ways,
the behavior of the supply, is not adequately captured in
the affixed labeling scheme (specifically, OC protection).
There's been at least one, where it didn't appear
there was adequate loom protection.
In terms of noise patterns, supplies have "ripple". This might
be in the 0.02 to 0.05V range or so. The output capacitors
determine how fast the rail can change instantaneously.
This is a really old schematic now, for PSU education,
but it still illustrates the design principles.
There's 1000uF on the +12V rail for example. Supplies
typically can have 4000-5000 more uF added to the rail
at the load, before it affects oscillation stability.
Precise information of that nature, is hard to get
from a manufacturer, but the designer is aware of
the issue. You can't put 250,000uF across a PC PSU.
http://www.pavouk.org/hw/en_atxps.html
The ATX supply "pushes" but does not "pull". It is
not an op amp or linear amplifier. If the supply
deviates due to transient loading, it likely does
not respond well to energy dumped back into the
supply. Motherboards don't generally do that.
Only one regulator in the whole PC is push/pull. And
that's the regulator for the DIMM terminator resistors,
where the current flow magnitude can be in the +2 amps
to -2 amps range (bus all 0's, bus all 1's). The
regulator must sink the -2 amps, in order to precisely
maintain the terminators at the correct voltage
(otherwise, your PC may suffer the "Photoshop bug").
Most other regulators are the "push only" variety.
A 7805 is a push only regulator. It's not intended
to sink backward current flow.
Summary: I doubt it is the PSU, but... that's why we
test stuff.
Later in the thread I believe OP said it was a laptop. Swapping the PSU
is not an option. One thing that is most likely the cause, especially on
a laptop, is a heat-related hardware issue. Laptops make this a more
difficult issue to deal with, but depending on the laptop I would open
'er up and at least blow out all the dust. On the Dell and Lenovo I have
it is a simple process; on some other brands, not so much. Look at the
mb caps for crusty corrosion... My old Latitude D-820 was a lap-roaster
with an nVidia GPU that was notorious for GPU meltdowns. I cleaned and
remounted the heat pipes several times and avoided that fate.
Jonathan N. Little wrote:
> Later in the thread I believe OP said it was a laptop. Swapping the PSU
> is not an option. One thing that is most likely the cause, especially on
> a laptop, is a heat-related hardware issue. Laptops make this a more
> difficult issue to deal with, but depending on the laptop I would open
> 'er up and at least blow out all the dust. On the Dell and Lenovo I have
> it is a simple process; on some other brands, not so much. Look at the
> mb caps for crusty corrosion... My old Latitude D-820 was a lap-roaster
> with an nVidia GPU that was notorious for GPU meltdowns. I cleaned and
> remounted the heat pipes several times and avoided that fate.

Laptops are less debug-able.
The posts I read referred to a "computer".
Setting up a serial port, is the best way
to determine if it is really frozen. I prefer
the SuperIO serial port type, to USB serial.
I use this on the boot line of my newest computer:
console=ttyS0,57600n8
I have a serial cable that runs from the other
machine, over to this machine, where I can monitor it.
The nice thing about ttyS0, is it never moves,
whereas if you use USB serial adapters, you
don't know what the identifier for it is. Maybe
plugging in some other stuff, upsets your debug port.
Not that most people like serial ports, but
I like it. Gets the job done. Works good when
the HID stops working on a setup.
Hi all!
From time to time - it may take a month or just a couple of hours - my
computer completely freezes. Everything stops. The screen shows the last image.
Not even the cursor moves. No keyboard key works, including the
Alt-PrtScreen key combinations, like REISUB.
I need to press the power on/off button for 5 secs to restart it.
After the restart, journalctl -b -b1 shows nothing at the freeze time.
I changed my NVIDIA driver to 470. I also tried to put the driver in
ondemand status. No success. Sooner or later it freezes.
Is there a way to get some information on why this is happening?
I am using kubuntu 20.04.
Thank you.
On Mon, 27 Sep 2021 20:32:19 +0100, Paulo da Silva wrote:
> At 19:34 on 27/09/21, Carlos E. R. wrote:
>> On 27/09/2021 20.02, Paulo da Silva wrote:
>>> At 18:38 on 27/09/21, Marco Moock wrote:
>>> ...
>>> I wonder if there is some kind of script or configuration that forces
>>> the logs not to be buffered. I'll search on the internet ...
>> Yes. You can send kernel logs directly to another machine via ethernet,
>> or even better if available, serial port.
>> Directly from the kernel, mind.
...
Found the bug report :-)
On 01/10/2021 05.05, Paulo da Silva wrote:
> If I reboot the computer:
> Then it seems OK.
> I put it in Fast Mode, execute a fullcpu job and zone0 temp. keeps
> stable at 97C!
> If I suspend the computer, when restarting the problem starts again!!! I
> tried this few times last couple of days.
> Notice that all these problems are relatively recent.

If you mean the thermal issue, maybe you need to restart thermald after
you wake up from suspension. It's not unknown that some programs do not
work well with suspension.
I would keep an eye on how much memory plasmashell uses; if you
see it creep over 1G, then it's time to restart it with
"plasmashell --replace". Running top/htop once in a while should be ok.
At 06:29 on 01/10/21, J.O. Aho wrote:
> On 01/10/2021 05.05, Paulo da Silva wrote:
>> If I reboot the computer:
>> Then it seems OK.
>> I put it in Fast Mode, execute a fullcpu job and zone0 temp. keeps
>> stable at 97C!
>> If I suspend the computer, when restarting the problem starts again!!! I
>> tried this few times last couple of days.
>> Notice that all these problems are relatively recent.
> If you mean the thermal issue, maybe you need to restart thermald after
> you wake up from suspension. It's not unknown that some programs do not
> work well with suspension.

I tried that. No luck!
I also played with some configurations, namely giving priority to freqs.,
because I know that lowering them causes the zone0 temp. to drop quickly.
BTW, in the meantime I remembered that the freeze problem also
occurred, at least once, with the system in "Slow Mode", that is, half of
max. freq. and powersave governor. That's why I suspect something
related to the GPU - HW or SW.

> I would keep an eye open for how much memory plasmashell uses, if you
> see it creep over 1G, then it's time to restart it with "plasmashell
> --replace". Running top/htop once in a while should be ok.

Yes, I had several issues with plasmashell in all my computers :-( . I
have a script to handle them. I don't remember now what it does. Just
keeps working :-)
Thanks
At 18:22 on 27/09/21, Paulo da Silva wrote:
> Hi all!
> From time to time - may be a month or a couple of hours - my computer
> completely freezes. Everything stops. The screen shows the last image.
> Not even the cursor moves. No keyboard key works including the
> Alt-PrtScreen keys, like REISUB.
> I need to press the power on/off button for 5 secs to restart it.
> After restart the journalctl -b -b1 shows nothing at the freeze time.
> I changed my NVIDIA driver to 470. I also tried to put the driver in
> ondemand status. No success. Sooner or later it freezes.
> Is there a way to get some information on what this is happening?
> I am using kubuntu 20.04.
> Thank you.

First let me explain how I use this computer.
I have a startup (boot) command - cpupower - to set the max. freq. to
2.5 GHz. Its max. value used to be 5.1GHz. It also sets the governor to
powersave. Let me call this Slow Mode.
This makes my computer quiet, with very low RPM on both fans. When
unplugged from the charger they are at 0 RPM most of the time. Since the
machine is quite fast, the lower frequency isn't noticeable.
When I need power, which rarely happens - training AIs or processing
large amounts of data - I set it to the max. freq. and the governor to
performance. I also do this for my FS :-) Let me call this Fast Mode.
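
To make the two modes concrete, here is a small Python sketch that builds the cpupower calls for each of them. It is only an illustration: the 2500MHz and powersave/performance values come from the description above, 5100MHz for Fast Mode is my assumption based on the 5.1GHz figure, the helper names are made up, and cpupower frequency-set has to be run as root.

```python
import subprocess

# Slow Mode values are taken from the post; the Fast Mode max. freq. is an assumption.
MODES = {
    "slow": {"max_freq": "2500MHz", "governor": "powersave"},
    "fast": {"max_freq": "5100MHz", "governor": "performance"},
}


def cpupower_commands(mode):
    """Return the cpupower command lines (as argv lists) for the given mode."""
    cfg = MODES[mode]
    return [
        ["cpupower", "frequency-set", "-u", cfg["max_freq"]],  # -u: upper freq. limit
        ["cpupower", "frequency-set", "-g", cfg["governor"]],  # -g: governor
    ]


def apply_mode(mode):
    """Run the commands for the given mode; needs root privileges."""
    for argv in cpupower_commands(mode):
        subprocess.run(argv, check=True)
```

apply_mode("slow") at boot and apply_mode("fast") before a heavy job would reproduce the setup described; cpupower frequency-info can be used to check the result.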
Now, about this problem ...
1. I configured nvidia to ondemand. The freeze problem has not occurred
since. But since it can stay away for a month or more, this is
inconclusive so far. Anyway, from lots of things I have been reading, it is
very likely that the BIOS, for some reason, can't cool something that is
going wrong and just freezes the computer. So, no logs. Once again, inconclusive.
2. A new problem
When in Fast Mode, running a fullcpu job causes a shutdown.
This time there is a log entry:
"thermal thermal_zone0: critical temperature reached (110 C), shutting down"
So, I tried to analyze the problem.
I monitored the zone0 temperature and could see it go up to 102C. Then
the computer initiates the emergency shutdown. So, the monitor probably
gets killed. Notice that the critical temp. for this zone is 100C.
I tried again, but when the temp. of zone0 reached 99C I put the fans in
boost mode (max. speed); the temperature dropped and stabilized at 97C.
I tried this again, but this time I just put the computer in Slow Mode. The
temp. drops to 40-50C!
So, why does neither thermald nor even the BIOS use these resources to
drop the temperature? In fact the fans rotate at a higher speed but do not
reach the 6k RPM of boost mode. I tried several configurations for
thermald, including giving priority to acting on freqs. No success. It
seems that thermald doesn't care at all about its configs.
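
The zone0 monitoring described above can be done by reading the sysfs file for that thermal zone, which holds the temperature in thousandths of a degree Celsius. A minimal Python sketch, assuming the zone of interest is thermal_zone0 as in the log entry; the 90C warning threshold is an arbitrary choice of mine, the 100C critical value is the one quoted in the post, and the function names are hypothetical.

```python
from pathlib import Path

# Assumed location of the zone mentioned in the log entry.
ZONE = Path("/sys/class/thermal/thermal_zone0")


def millidegrees_to_c(raw):
    """Convert the sysfs string (e.g. '97000\\n') to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_temp_c(zone=ZONE):
    """Read the current temperature of the zone in degrees Celsius."""
    return millidegrees_to_c((zone / "temp").read_text())


def classify(temp_c, warn_c=90.0, crit_c=100.0):
    """Label a temperature as 'ok', 'warn' or 'critical'."""
    if temp_c >= crit_c:
        return "critical"
    if temp_c >= warn_c:
        return "warn"
    return "ok"
```

A loop that calls classify(read_temp_c()) every second and switches to Slow Mode or boosts the fans on "warn" would do by hand what thermald is expected to do here.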
Finally
=======
If I reboot the computer:
Then it seems OK.
I put it in Fast Mode, execute a fullcpu job and zone0 temp. keeps
stable at 97C!
If I suspend the computer, the problem starts again after resuming!!! I
tried this a few times over the last couple of days.
Notice that all these problems are relatively recent.
By the way ... in windows this problem does not occur.
So:
SW problem, after an upgrade perhaps? HW problem? Both?
I feel lost ...
As soon as I get some time, I'm thinking of installing a new distro in a
different partition to see what happens there.
Until then, I need to reboot before I start a CPU intensive job.
Not bad ... :-)
Thank you to all who responded and for any further comments or suggestions.
Paulo
I am getting an occasional freeze as well. Yesterday, in the midst of a
Google Meet seminar I was delivering! Almost complete freeze on my end. No
keys worked, screen frozen. Except that the people watching could still
hear me and see me and I could hear them. Alt-Ctrl-F2 worked, so Linux was
still running in the background. I could not figure out how to unfreeze
the Google Meet full screen and had to do the power button thingy. Of
course then another bug showed up. I sometimes run my laptop with a desktop
monitor attached. Often the second or third time I reboot, the system
seems to get completely confused and looks for that second monitor as the
default after it has run the "new hardware" search. It then times out (90
sec) on starting up akonadi(?), then there is another 30 sec pause starting
up something else, and it spews out many pages of error/warning stuff before
the boot process finishes. So it took almost 5 min to reboot in the
midst of my seminar. Sheesh.
(Dell XPS13-9360 machine, onboard Intel video)
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 02)
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 620 (rev 02)
Mageia 8, kernel
Linux planet 5.10.37-server-2.mga8 #1 SMP Mon May 17 17:44:38 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
So yes, something in Linux is having problems and freezing the system. I
suspect the video driver in my case.
> On 10/1/2021 1:42 AM, Paulo da Silva wrote:
>> I tried that. No luck!
>> I also played with some configurations, namely giving priority to freqs.
>> because I know that lowering them causes the zone0 temp. to drop quickly.
>> BTW, in the meanwhile I remembered that the freeze problem also
>> occurred, at least once, with the system in "Slow Mode" this half of
>> max. freq. and powersave governor. That's why I suspect of something
>> related with the GPU - HW or SW.
>> Yes, I had several issues with plasmashell in all my computers :-( . I
>> have a script to handle them. I don't remember now what it does. Just
>> keeps working :-)
>> Thanks
> From a hardware perspective, some subsystems share power envelope
> because they're in the same package (Intel CPU and Intel HD 630).
> Or, they can share a common heatpipe, which means if one gets
> hot, both get hot (Intel CPU and NVidia GPU chip share heatpipe).

Ah, this explains why in ondemand the GPU temperature still rises when

> The NVidia chip, should have an NVidia driver which controls
> frequency and voltage as a function of "what limit you're hitting".
> On something like Furmark, you would be power limited. Maybe
> the GPU driver throttles (turns down clock) when the chip gets
> too warm. And this means, you could even be in a situation where
> a railed or turboed CPU causes the GPU to slow down.

It's supposed that thermald takes actions to lower the temperature. Per

> It's beyond my pay scale, to balance all these things, but from
> the looks of it, some feedback loop in your laptop is not working
> as expected. When the CPU goes above 100C, it should start throttling.
> The NVidia chip should have a throttle temperature too. And the
> NVidia throttle point should take the GPU temperature measurement
> error into account.
> https://wiki.archlinux.org/title/NVIDIA/Tips_and_tricks

As I said before, thermald should have taken actions at 90C. This is the
At 14:16 on 01/10/21, William Unruh wrote:
> I am getting an occasional freeze as well. Yesterday, in the midst of a
> Google Meet seminar I was delivering!.
> ... Almost complete freeze on my end. No
> keys worked, screen frozen. Except that the people watching could still
> hear me and see me and I could hear them. Alt-ctrl-F2 worked, so Linux was
> still running in
> the background. I could not figure out how to unfreeze the google-meet
> full screen and had to do the power button thingy. Of course then
> another bug showed up. -- I sometimes run my laptop with a desktop
> monitor attached. Often the second or third time I reboot, the system
> seems to get completely confused and look for that second monitor as the
> default after it is run the "new hardware" search. It then times out (90
> sec) on starting up akonidia(?) and then another 30 sec pause.starting
> up something else, and spew out many pages of error/waring stuff befor
> the boot process finished. So it took almost 5 min to reboot in the
> midst of my seminar. Sheesh.
> (Dell XPS13- 9360 machine, onboard Intel video
> 00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 02)
> 00:02.0 VGA compatible controller: Intel Corporation HD Graphics 620 (rev 02)
> Mageia 8, kernel
> Linux planet 5.10.37-server-2.mga8 #1 SMP Mon May 17 17:44:38 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
> So yes, something in Linux is having problems freezing the system. I
> suspect the video driver in my case.

I never had any problem until recently - a couple of months or so. My
computer's 2 yrs warranty expired in June :-)
Unfortunately in my case nothing works after the freezes.
Even when it happens while listening to music, the sound gets stuck in a
loop of about 1 second.
On 10/1/21 9:12 AM, Paulo da Silva wrote:
> On 01/10/21 at 14:16, William Unruh wrote:
>> I am getting an occasional freeze as well. Yesterday, in the midst of a
>> Google Meet seminar I was delivering!.
Interesting. I also had that happen yesterday, on MX-Linux 19. Luckily
it was 20 minutes before the meeting.
My tiny ARM-based Arch-Linux box also has the occasional failure. It
just freezes and the CPU gets hot. Sometimes it happens after a few
days, sometimes after a month. No clues in the log. So I gave that one up.
> I never had any problem until recently - a couple of months or so. My
> computer expired the 2 yrs warranty in June :-)
It would not help you anyhow with an OS crash problem.
Regarding the overtemp I assume you have looked whether there is one
particular software that is very wasteful with processor resources. I
had that with a morse code reading software so I no longer use it, and
don't need it anymore.
If something reaches a temperature limit with the fan fully blasting
that is suspicious. I had that about two years ago and then I found the
reason. We had adopted a dog and his fine hair got in there. So I had to
reduce my PC fan cleaning intervals.
> Unfortunately in my case there is nothing working after the freezes.
> Even when it happens while listen to music, the sound entered in a +-1
> second loop.
That almost cannot be hardware.
On 01/10/21 at 23:44, Joerg wrote:
> Regarding the overtemp I assume you have looked whether there is one
> particular software that is very wasteful with processor resources. I
> had that with a morse code reading software so I no longer use it, and
> don't need it anymore.
> If something reaches a temperature limit with the fan fully blasting
> that is suspicious. I had that about two years ago and then I found the
> reason. We had adopted a dog and his fine hair got in there. So I had to
> reduce my PC fan cleaning intervals.
Here I got that CPU situation lots of times.
I have lots of very CPU/GPU intensive tasks.
Anyway, as soon as I put the PC in Fast Mode (max freqs and governor
performance) almost anything I do, sometimes even scrolling a browser
page like Fb, causes the fans to raise their RPM. They also come back to
almost idle relatively fast when I just stop.
>> Unfortunately in my case there is nothing working after the freezes.
>> Even when it happens while listen to music, the sound entered in a +-1
>> second loop.
> That almost cannot be hardware.
Hopefully not. There is one occurrence which doesn't allow me to discard
HW: from time to time, the fans go up to high RPM (noisy) for about 1
to 5 seconds and then drop abruptly. The PC is doing nothing. This
also began to occur lately. As far as I know, it is the BIOS that controls
the fans.
Thanks Joerg.
On 27/09/21 at 18:22, Paulo da Silva wrote:
> Hi all!
> From time to time - may be a month or a couple of hours - my computer
> completely freezes. Everything stops. The screen shows the last image.
> Not even the cursor moves. No keyboard key works including the
> Alt-PrtScreen keys, like REISUB.
> I need to press the power on/off button for 5 secs to restart it.
> After restart the journalctl -b -b1 shows nothing at the freeze time.
> I changed my NVIDIA driver to 470. I also tried to put the driver in
> ondemand status. No success. Sooner or later it freezes.
> Is there a way to get some information on what this is happening?
> I am using kubuntu 20.04.
> Thank you.
First let me explain how I use this computer.
I have a startup (boot) command - cpupower - to set the max. freq. to
2.5 GHz. Its max. value used to be 5.1 GHz. It also sets the governor to
powersave. Let me call this Slow Mode.
This keeps my computer quiet, with very low RPM on both fans. When
unplugged from the charger they are at 0 RPM most of the time. Since the
machine is quite fast, the lower clock isn't noticeable.
When I need power, which rarely happens - training AIs or processing
large amounts of data - I set it to the max. freq. and the governor to
performance. I also do this for my FS :-) Let me call this Fast Mode.
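For readers who want to reproduce the two modes, here is a minimal sketch that only builds the `cpupower frequency-set` command lines. The mode names and the 2.5 GHz / 5.1 GHz limits come from the post; the helper names (`MODES`, `mode_commands`, `apply_mode`) and writing the limits in MHz are assumptions made for illustration, not the poster's actual boot script.

```python
import subprocess

# Limits and governors as described in the post; the MHz spelling is an assumption.
MODES = {
    "slow": {"max_freq": "2500MHz", "governor": "powersave"},
    "fast": {"max_freq": "5100MHz", "governor": "performance"},
}


def mode_commands(mode):
    """Return the cpupower invocations for a mode as a list of argv lists."""
    cfg = MODES[mode]
    return [
        ["cpupower", "frequency-set", "-u", cfg["max_freq"]],  # upper frequency limit
        ["cpupower", "frequency-set", "-g", cfg["governor"]],  # cpufreq governor
    ]


def apply_mode(mode, dry_run=True):
    """Print each command; execute it only when dry_run is False (needs root)."""
    for argv in mode_commands(mode):
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

Calling `apply_mode("slow")` just prints the two commands; `apply_mode("fast", dry_run=False)` would actually run them and has to be started as root.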
Now, about this problem ...
1. I configured nvidia to ondemand. The freeze has not occurred since.
But since it can go a month or more without occurring, this is still
inconclusive. Anyway, from lots of things I have been reading, it is
very likely that the BIOS, for some reason, can't cool something that is
going wrong and just freezes the computer. So, no logs. Once again, inconclusive.
2. A new problem
When in Fast Mode, running a job that uses the full CPU causes a shutdown.
This time there is a log entry:
"thermal thermal_zone0: critical temperature reached (110 C), shutting down"
So, I tried to analyze the problem.
I monitored the zone0 temperature and could see it go up to 102C. Then
the computer initiates the emergency shutdown. So, the monitor probably
gets killed. Notice that the critical temp. for this zone is 100C.
I tried again, but when the temp. of zone0 reached 99C I put the fans in
boost mode (max. speed) and the temperature dropped and stabilized at 97C.
I tried this again, but this time I just put the computer in Slow Mode. The
temp. drops to 40-50C!
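The kind of zone0 monitoring described above can be sketched as follows. It assumes the standard Linux sysfs thermal layout (`/sys/class/thermal/thermal_zone0` with a `temp` file and `trip_point_N_type` / `trip_point_N_temp` pairs, values in millidegrees Celsius); the function names are made up for this example and this is not the script used in the post.

```python
from pathlib import Path


def read_zone(zone_dir):
    """Return (temp_c, trips) for one thermal zone directory.

    sysfs reports millidegrees Celsius. trips maps a trip type such as
    "critical" to its temperature; if a type occurs twice, the last one wins.
    """
    zone = Path(zone_dir)
    temp_c = int((zone / "temp").read_text()) / 1000.0
    trips = {}
    for type_file in sorted(zone.glob("trip_point_*_type")):
        temp_file = type_file.with_name(type_file.name.replace("_type", "_temp"))
        if temp_file.exists():
            trips[type_file.read_text().strip()] = int(temp_file.read_text()) / 1000.0
    return temp_c, trips


def margin_to_critical(zone_dir):
    """Degrees Celsius left before the critical trip point, or None if there is none."""
    temp_c, trips = read_zone(zone_dir)
    if "critical" not in trips:
        return None
    return trips["critical"] - temp_c
```

Polling `margin_to_critical("/sys/class/thermal/thermal_zone0")` once a second during the full-CPU job would show how quickly the zone approaches its critical trip point.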
So, why do neither thermald nor even the BIOS use these resources to drop
the temperature? In fact the fans rotate at a higher speed but do not
reach the 6k RPM of boost mode. I tried several configurations for
thermald, including giving priority to acting on freqs. No success. It
seems that thermald doesn't care at all about its configs.
Finally
=======
If I reboot the computer:
Then it seems OK.
I put it in Fast Mode, execute a fullcpu job and the zone0 temp. stays
stable at 97C!
If I suspend the computer, the problem starts again after resuming!!! I
tried this a few times over the last couple of days.
Notice that all these problems are relatively recent.
By the way ... in Windows this problem does not occur.
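One way to look for what changes across a suspend/resume cycle is to snapshot a few power-management files before suspending and compare them after waking up. The sketch below is only a starting point: the file list in `WATCHED` is an assumption (which files exist depends on the CPU and driver, and some may need root), and the function names are hypothetical.

```python
from pathlib import Path

# Assumed list of files worth comparing; adjust it to what the machine exposes.
WATCHED = [
    "sys/devices/system/cpu/cpu0/cpufreq/scaling_governor",
    "sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq",
    "sys/devices/system/cpu/intel_pstate/no_turbo",
    "sys/class/powercap/intel-rapl:0/constraint_0_power_limit_uw",
]


def snapshot(root="/"):
    """Read every watched file under root; missing or unreadable files map to None."""
    values = {}
    for rel in WATCHED:
        try:
            values[rel] = (Path(root) / rel).read_text().strip()
        except OSError:
            values[rel] = None
    return values


def changed(before, after):
    """Return {file: (old, new)} for every entry that differs between two snapshots."""
    keys = sorted(set(before) | set(after))
    return {k: (before.get(k), after.get(k)) for k in keys if before.get(k) != after.get(k)}
```

Take one `snapshot()` right after a reboot, suspend and resume, take a second one, and print `changed(first, second)`. An empty result would suggest that the difference lies somewhere this list does not cover.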
So:
SW problem, after an upgrade perhaps? HW problem? Both?
I feel lost ...
As soon as I get some time, I'm thinking of installing a new distro in a
different partition to see what happens there.
Until then, I need to reboot before I start a CPU intensive job.
Not bad ... :-)
Thank you to all who responded and for any further comments or suggestions.
Paulo
On 10/1/21 5:52 PM, Paulo da Silva wrote:
> I have lots of tasks very CPU/GPU intensive.
> Anyway, as soon as I put the PC in Fast mode (max freqs and governor
> performance) almost anything I do, sometimes even scrolling a browser
> page like Fb, causes the fans to rise RPM. ...
Can you watch the CPU load percentage when that happens? I keep that
reading on the task bar so I can see when something becomes a MIPS
burner. I do the same with memory usage (mainly to see when Firefox has
reached too much memory leakage).
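Without a task bar widget, the overall CPU load can be estimated from two readings of the first line of `/proc/stat`. This is a rough sketch under the usual field layout (user, nice, system, idle, iowait, irq, softirq, steal); the function names are invented for this example.

```python
def parse_cpu_line(line):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = [int(x) for x in line.split()[1:]]
    # Count idle and iowait as not busy; guest time is already part of user/nice.
    idle = fields[3] + (fields[4] if len(fields) > 4 else 0)
    total = sum(fields[:8])
    return total - idle, total


def load_percent(first, second):
    """CPU load in percent between two 'cpu' lines read some time apart."""
    busy0, total0 = parse_cpu_line(first)
    busy1, total1 = parse_cpu_line(second)
    elapsed = total1 - total0
    return 0.0 if elapsed <= 0 else 100.0 * (busy1 - busy0) / elapsed


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate 'cpu' line (the first line of the file)."""
    with open(path) as stat_file:
        return stat_file.readline()
```

Read one line with `read_cpu_line()`, wait a second, read another, and pass both to `load_percent` while the fans spin up. A low figure together with loud fans would point away from a runaway process.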
> ... Also they come back to almost
> idle relatively fast when I just stop.
That is strange. When I do lengthy SPICE simulations where the CPU goes
to almost 100% workload the fans remain on full for half a minute or so.
But anyhow, if this huge increase and then decay happens with much less
than 100% CPU load that would point to a mechanical problem. Pet hair in
the fan path, thermal paste under the heatsink dried up, something like
that.
>>> Unfortunately in my case there is nothing working after the freezes.
>>> Even when it happens while listen to music, the sound entered in a +-1
>>> second loop.
>> That almost cannot be hardware.
> Hopefully not. There is one occurrence which doesn't allow me to discard
> HW: From times to times, the fans go up to big RPM (noisy) for about 1
> to 5 seconds and then follow down abruptly. The PC is doing nothing. This
> also began to occur lately. As much as I know, is the BIOS that controls
> the fans.
I don't know much about Ubuntu flavors (using MX-Linux myself) but the
fan speed can also be controlled by the OS, depending on how your
Kubuntu is configured:
https://askubuntu.com/questions/22108/how-to-control-fan-speed
Sometimes hardware (or a BIOS) does this on purpose. For example, my
DOCSIS modem for internet access has a fan that never needs to come on
because we never stream movies and stuff like that. Very little work for
the processor. Sometimes the fan still goes to full blast for a few
seconds, then off. I guess they programmed it that way to avoid the fan
becoming "caked up" and stuck. Just like with a power generator, you
have to run it once a month or it might not start in a crisis situation.
On 01/10/2021 05.05, Paulo da Silva wrote:
> So, why neither thermald or even the BIOS use these resources to drop
> the temperature? In fact the fans rotate at higher speed but do not
> reach the 6k RPM of boost mode. I tried several configurations for
> thermald, including give priority to acting on freqs. No success. It
> seems that thermald doesn't seem to care at all with its configs.
I have used two machines with limited cooling; one is a fanless mini
computer box (the idea is to put it in the sitting room by the TV). When
it is doing something intense, it overheats and throttles the CPU down.
Another is a laptop I prepared for another person, with a relatively
fast processor that can overheat if you demand some job for minutes, and
then it throttles down.
Both seem to be designed for this: running normally with a small
load, but sprinting on demand if the user needs to run something. But they
cannot keep up the load for a long time because they have no fan, or too
small a fan.
Now, I did not install any daemon or configure anything; it was the
kernel itself doing it all, out of the box.
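On Intel x86 machines the kernel usually keeps counters of how often it has throttled each core, which gives a quick way to confirm that this out-of-the-box throttling is actually happening. The sketch below assumes those counters are exposed as `thermal_throttle/*_throttle_count` files under `/sys/devices/system/cpu/cpuN`; on other platforms the directory may simply be absent, in which case the result is empty.

```python
from pathlib import Path


def throttle_counts(cpu_root="/sys/devices/system/cpu"):
    """Map each CPU name to its throttle counters, e.g. {"cpu0": {"core_throttle_count": 12}}."""
    result = {}
    pattern = "cpu[0-9]*/thermal_throttle/*_throttle_count"
    for counter in sorted(Path(cpu_root).glob(pattern)):
        cpu_name = counter.parent.parent.name
        result.setdefault(cpu_name, {})[counter.name] = int(counter.read_text())
    return result
```

Comparing `throttle_counts()` before and after a heavy job, once right after boot and once after a suspend/resume cycle, would show whether the kernel throttles in one case and not in the other.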
Both have only Intel graphics.
The minipc is a "msi CubiN Mini-PC" (I can't find exact model), cpu is
"Intel(R) Pentium(R) CPU N3710 @ 1.60GHz" (4 cores)
The laptop is "Lenovo ThinkPad E15 Intel Core i5-10210U/8GB/512GB SSD/15.6"
In both cases I installed openSUSE Leap 15
On 17/10/21 at 12:07, Carlos E.R. wrote:
> Both seem to be designed for this; be running normally with a small
> load, but sprint on demand if the user needs to run something. But they
> can not keep up the load for a long time because they have no fan, or a
> too small fan.
> Now, I did not install any daemon or configure anything, it was the
> kernel itself doing it all, out of the box.
That's the main point, Carlos. Why is my PC (kernel, BIOS,
whatever) unable to control the temperature after suspend/wake?
Besides, why does thermald also seem to do nothing to stop the temp rising?
At least the sensors are working - I can monitor them and, at least,
lowering the CPU's freqs results in lower temps. Also the fans are
able to go to higher RPM. If I manually put them in boost mode, they are
able to stop the temp rising!
Immediately after (re)boot the system never goes above 97ºC!
About openSUSE ... that was the best and most stable distro I have ever
used. I dropped it because of the problem of installing certain types of SW -
lack of information or packages, and the unavailability of some library
sources for development. In Debian-likes I just need to install <lib
name>-dev. One example was libgcrypt20.
On Mon, 27 Sep 2021 18:22:21 +0100
Paulo da Silva <p_d_a_s_i_l_v_a_ns@nonetnoaddress.pt> wrote:
> From time to time - may be a month or a couple of hours - my computer
> completely freezes. Everything stops. The screen shows the last image.
> Not even the cursor moves. No keyboard key works including the
> Alt-PrtScreen keys, like REISUB.
Sounds like something to do with the RAM. Disable XMP.
On 17/10/2021 20.15, Paulo da Silva wrote:
> That's the main point, Carlos. Why doesn't my PC (kernel, bios,
> whatever) is unable to control the temperature after suspend/wake?
> Besides, why thermald also seems to do anything to stop temp rising?
> At least the sensors are working - I can monitor them and, at least,
> lowering the CPU's freqs result in temps lowering. Also the fans are
> able to go to higher RPM. If I manually put them in boost mode, they are
> able to stop the temp rising!
> Immediately after (re)boot the system never goes above 97ºC!
Isengard:~ # ps afx | grep thermal
615 ? I< 0:00 \_ [acpi_thermal_pm]
23830 pts/23 S+ 0:00 \_ grep --color=auto thermal
Isengard:~ #
I'm not running thermald.
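The same check can be scripted without `ps` by scanning the `comm` files under `/proc`. This is a small sketch with a hypothetical helper name; unlike the `grep thermal` above it matches the exact command name only, so a kernel thread such as `acpi_thermal_pm` does not count as thermald.

```python
from pathlib import Path


def processes_named(name, proc_root="/proc"):
    """Return the sorted PIDs whose command name (the comm file) equals name."""
    pids = []
    for comm in Path(proc_root).glob("[0-9]*/comm"):
        try:
            if comm.read_text().strip() == name:
                pids.append(int(comm.parent.name))
        except OSError:
            continue  # the process exited while we were scanning
    return sorted(pids)
```

An empty list from `processes_named("thermald")` means no thermald process is running, so only the kernel's own thermal handling is active.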
> About Opensuse ... that was the best and more stable distro I have ever
> used. I dropped it because the problem of install certain type of SW -
> lack of information or packages, and the unavailability of some library
> sources for development. In debian likes I just need to install <lib
> name>-dev. One example was libgcrypt20.
What? All sources are available in openSUSE.
http://download.opensuse.org/source/distribution/leap/15.2/repo/oss/src/libgcrypt-1.8.2-lp152.16.8.src.rpm
You just need to activate the sources repo in YaST. If some particular
package is missing the source, report a bug.
If you just need the files to compile some other thing, you need the
libname-devel package instead.
http://download.opensuse.org/distribution/leap/15.2/repo/oss/x86_64/libgcrypt-devel-1.8.2-lp152.16.8.x86_64.rpm
On 17/10/2021 20.15, Paulo da Silva wrote:
> That's the main point, Carlos. Why doesn't my PC (kernel, bios,
> whatever) is unable to control the temperature after suspend/wake?
> Besides, why thermald also seems to do anything to stop temp rising?
> At least the sensors are working - I can monitor them and, at least,
> lowering the CPU's freqs result in temps lowering. Also the fans are
> able to go to higher RPM. If I manually put them in boost mode, they are
> able to stop the temp rising!
> Immediately after (re)boot the system never goes above 97ºC!
I know I told you to try reloading the thermald service and you
said it didn't make any difference. What about:
- stop thermald
- rmmod the cpu temp module
- modprobe the cpu temp module
- start thermald
I'm not even sure if you can remove the module.
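The four steps above can be written down as a command plan. The post does not name the module, so `coretemp` below is only a placeholder assumption, and `thermald` as the systemd unit name is assumed as well; check `lsmod` for the thermal modules actually loaded before trying this.

```python
import subprocess


def reload_plan(module="coretemp", service="thermald"):
    """Return the stop / rmmod / modprobe / start sequence as argv lists.

    Both defaults are assumptions for illustration, not values from the thread.
    """
    return [
        ["systemctl", "stop", service],
        ["rmmod", module],
        ["modprobe", module],
        ["systemctl", "start", service],
    ]


def run_plan(plan, dry_run=True):
    """Print each step; execute it only when dry_run is False (needs root)."""
    for argv in plan:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

`run_plan(reload_plan())` only prints the sequence. With `dry_run=False` the run stops at the first failing step, for example when the module is built into the kernel and cannot be removed.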
> About Opensuse ... that was the best and more stable distro I have ever
> used. I dropped it because the problem of install certain type of SW -
> lack of information or packages, and the unavailability of some library
> sources for development.
I ran openSUSE at my two previous jobs. Sure, there were shortcomings
with getting packages, but as Carlos already pointed out, the dev
packages are in a different repository. And of course you can get hold
of all the SRPMs too in case you want to make some changes to a package.
It's not the distro I would use at home; metadistributions have
been more to my taste, except for the time it takes to build all the packages.
On 18/10/21 at 13:26, Carlos E.R. wrote:
> Isengard:~ # ps afx | grep thermal
> 615 ? I< 0:00 \_ [acpi_thermal_pm]
> 23830 pts/23 S+ 0:00 \_ grep --color=auto thermal
> Isengard:~ #
> I'm not running thermald.
Yes! The BIOS and/or the kernel should be enough to avoid temperature
problems. thermald should at least be a last-resort protection.
None of them prevents the temperature from rising after suspension!
At least one of them does before any suspension has occurred. The
temperature never rises above 97ºC.
On 18/10/21 at 14:47, J.O. Aho wrote:
On 17/10/2021 20.15, Paulo da Silva wrote:
That's the main point, Carlos. Why is my PC (kernel, BIOS,
whatever) unable to control the temperature after suspend/wake?
Besides, why does thermald also seem to do nothing to stop the temperature rising?
At least the sensors are working - I can monitor them and, at least,
lowering the CPU's freqs results in the temps lowering. Also the fans are
able to go to higher RPM. If I manually put them in boost mode, they are
able to stop the temp rising!
Immediately after (re)boot the system never goes above 97ºC!
I know I told you to try reloading the thermald service and you
said it didn't make any difference. What about:
- stop thermald
- rmmod the cpu temp module
- modprobe the cpu temp module
- start thermald
I'm not even sure if you can remove the module.
Good idea, but unfortunately it didn't work!
I managed to remove all thermal related modules and load them
again. No success! Temp keeps rising until I kill the full cpu test script!
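The stop / unload / reload / start sequence suggested above can be sketched as a small Python helper that only builds and prints the commands for review. This is a minimal sketch and not what was actually run here: the module name `coretemp` and the use of `systemctl` are assumptions, since the thread does not say which modules were removed or how the service is managed.

```python
import shlex

def thermal_reload_plan(module="coretemp", service="thermald"):
    """Return the commands for the stop / unload / reload / start sequence.

    Both defaults are assumptions for illustration; the thread names neither
    the exact module nor the service manager.
    """
    return [
        ["systemctl", "stop", service],
        ["modprobe", "-r", module],   # unload the module (the rmmod step)
        ["modprobe", module],         # load it again
        ["systemctl", "start", service],
    ]

def format_plan(plan):
    """Render the plan as shell lines so it can be checked before running as root."""
    return "\n".join(shlex.join(cmd) for cmd in plan)

if __name__ == "__main__":
    print(format_plan(thermal_reload_plan()))
```

Printing the plan rather than executing it keeps the sketch harmless; the lines would still have to be run by hand as root.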
On 19/10/2021 03.14, Paulo da Silva wrote:
On 18/10/21 at 14:47, J.O. Aho wrote:
On 17/10/2021 20.15, Paulo da Silva wrote:
That's the main point, Carlos. Why is my PC (kernel, BIOS,
whatever) unable to control the temperature after suspend/wake?
Besides, why does thermald also seem to do nothing to stop the temperature rising?
At least the sensors are working - I can monitor them and, at least,
lowering the CPU's freqs results in the temps lowering. Also the fans are
able to go to higher RPM. If I manually put them in boost mode, they are
able to stop the temp rising!
Immediately after (re)boot the system never goes above 97ºC!
I know I told you to try reloading the thermald service and you
said it didn't make any difference. What about:
- stop thermald
- rmmod the cpu temp module
- modprobe the cpu temp module
- start thermald
I'm not even sure if you can remove the module.
Good idea, but unfortunately it didn't work!
I managed to remove all thermal related modules and load them
again. No success! Temp keeps rising until I kill the full cpu test
script!
Take a look at this thread on GitHub: https://github.com/intel/thermal_daemon/issues/268
The comment https://github.com/intel/thermal_daemon/issues/268#issuecomment-788709112 mentions that thermald works after suspend once a patched version is used.
As I understand it, you can increase the debug level to get more information
about what thermald is doing; that might help while trying to
figure it out.
On 19/10/21 at 06:55, J.O. Aho wrote:
On 19/10/2021 03.14, Paulo da Silva wrote:
On 18/10/21 at 14:47, J.O. Aho wrote:
On 17/10/2021 20.15, Paulo da Silva wrote:
That's the main point, Carlos. Why is my PC (kernel, BIOS,
whatever) unable to control the temperature after suspend/wake?
Besides, why does thermald also seem to do nothing to stop the temperature rising?
At least the sensors are working - I can monitor them and, at least,
lowering the CPU's freqs results in the temps lowering. Also the fans are
able to go to higher RPM. If I manually put them in boost mode, they are
able to stop the temp rising!
Immediately after (re)boot the system never goes above 97ºC!
I know I told you to try reloading the thermald service and you
said it didn't make any difference. What about:
- stop thermald
- rmmod the cpu temp module
- modprobe the cpu temp module
- start thermald
I'm not even sure if you can remove the module.
Good idea, but unfortunately it didn't work!
I managed to remove all thermal related modules and load them
again. No success! Temp keeps rising until I kill the full cpu test
script!
Take a look at this thread on GitHub:
https://github.com/intel/thermal_daemon/issues/268
The comment
https://github.com/intel/thermal_daemon/issues/268#issuecomment-788709112
mentions that thermald works after suspend once a patched
version is used.
As I understand it, you can increase the debug level to get more information
about what thermald is doing; that might help while trying to
figure it out.
I'll try that. Not much hope, however.
The patch is included in the latest version.
With the version in kubuntu 20.04:
- I have tried --adaptive and --ignore-cpuid-check. It didn't
complain, but I could not determine whether they are both active.
One would expect the patch to have been backported to kubuntu 20.04.
Anyway ... I'll try the latest version again, but this time with both
switches active, to see what happens.
On 18/10/21 at 13:26, Carlos E.R. wrote:
On 17/10/2021 20.15, Paulo da Silva wrote:
On 17/10/21 at 12:07, Carlos E.R. wrote:
I have used two machines with limited cooling; one is a mini computer
box, fanless (idea is to be put on sitting room by the TV). When it is
doing something intense, it overheats and it throttles the CPU down.
Another is a laptop I prepared for another person, with a relatively
fast processor that can overheat if you demand some job for minutes, and
then it throttles down.
Both seem to be designed for this; be running normally with a small
load, but sprint on demand if the user needs to run something. But they
can not keep up the load for a long time because they have no fan, or a
too small fan.
Now, I did not install any daemon or configure anything, it was the
kernel itself doing it all, out of the box.
Both have only Intel graphics.
The minipc is a "msi CubiN Mini-PC" (I can't find exact model), cpu is
"Intel(R) Pentium(R) CPU N3710 @ 1.60GHz" (4 cores)
The laptop is "Lenovo ThinkPad E15 Intel Core i5-10210U/8GB/512GB
SSD/15.6"
In both cases I installed openSUSE Leap 15
That's the main point, Carlos. Why is my PC (kernel, BIOS,
whatever) unable to control the temperature after suspend/wake?
Besides, why does thermald also seem to do nothing to stop the temperature rising?
At least the sensors are working - I can monitor them and, at least,
lowering the CPU's freqs results in the temps lowering. Also the fans are
able to go to higher RPM. If I manually put them in boost mode, they are
able to stop the temp rising!
Immediately after (re)boot the system never goes above 97ºC!
Isengard:~ # ps afx | grep thermal
615 ? I< 0:00 \_ [acpi_thermal_pm]
23830 pts/23 S+ 0:00 \_ grep --color=auto thermal
Isengard:~ #
I'm not running thermald.
Yes! The BIOS and/or the kernel should be enough to avoid temperature problems. thermald should at least be a last-resort protection.
None of them prevents the temperature from rising after suspension!
At least one of them does before any suspension has occurred. The
temperature never rises above 97ºC.
About openSUSE ... that was the best and most stable distro I have ever
used. I dropped it because of problems installing certain types of software -
lack of information or packages, and the unavailability of some library
sources for development. On Debian-like distros I just need to install <lib
name>-dev. One example was libgcrypt20.
What? All sources are available in openSUSE.
http://download.opensuse.org/source/distribution/leap/15.2/repo/oss/src/libgcrypt-1.8.2-lp152.16.8.src.rpm
You just need to activate the sources repo in YaST. If some particular
package is missing the source, file a bug report.
If you just need the files to compile some other thing, you need the
libname-devel package instead.
http://download.opensuse.org/distribution/leap/15.2/repo/oss/x86_64/libgcrypt-devel-1.8.2-lp152.16.8.x86_64.rpm
Yes, now they are. But they weren't when I needed them.
Maybe I'll give openSUSE another try.
On 19/10/2021 02.49, Paulo da Silva wrote:
On 18/10/21 at 13:26, Carlos E.R. wrote:
On 17/10/2021 20.15, Paulo da Silva wrote:
On 17/10/21 at 12:07, Carlos E.R. wrote:
About openSUSE ... that was the best and most stable distro I have ever
used. I dropped it because of problems installing certain types of software -
lack of information or packages, and the unavailability of some library
sources for development. On Debian-like distros I just need to install <lib
name>-dev. One example was libgcrypt20.
What? All sources are available in openSUSE.
http://download.opensuse.org/source/distribution/leap/15.2/repo/oss/src/libgcrypt-1.8.2-lp152.16.8.src.rpm
You just need to activate the sources repo in YaST. If some particular
package is missing the source, file a bug report.
If you just need the files to compile some other thing, you need the
libname-devel package instead.
http://download.opensuse.org/distribution/leap/15.2/repo/oss/x86_64/libgcrypt-devel-1.8.2-lp152.16.8.x86_64.rpm
Yes, now they are. But they weren't when I needed them.
Maybe I'll give openSUSE another try.
If a source package is missing, just file a bug report.
Yesterday I saw this zypper command:
source-install (si) name...
Install specified source packages and their build dependencies. If the name of a binary package is given, the
corresponding source package is looked up and installed instead.
This command will try to find the newest available versions
of the source packages and uses rpm -i to install them, optionally
together with all the packages that are required to build the source
package. The default location where rpm installs source packages to is /usr/src/packages/{SPECS,SOURCES}, but the values can be changed in your local rpm configuration. In case of doubt try executing rpm --eval "%{_specdir} and %{_sourcedir}".
Note that the source packages must be available in repositories you are using. You can check whether a repository contains
any source packages using the following command:
$ zypper search -t srcpackage -r alias|name|#|URI
On 20/10/21 at 13:21, Carlos E.R. wrote:
OK, let's say I want to give openSUSE a try.
Let's say I install it and it still cannot handle my temperature
problem. I need to check this before I install and configure all the
software I use, which takes a couple of weeks.
How do I delete it afterwards?
I know I did it in the past, but just to be sure ... is it:
1. Boot into my current system.
2. Run grub-install or grub-install /dev/nvme0n1 (disk)?
3. efibootmgr -B -b <bootnum>?
4. Do I need any further cleanup in /boot/efi?
OK, let's say I want to give opensuse a try.
On 20/10/21 at 23:19, Carlos E.R. wrote:
...
On 20/10/2021 19.22, Paulo da Silva wrote:
On 20/10/21 at 13:21, Carlos E.R. wrote:
Maybe you could try one of the live versions, put it under load, and see
what happens with the temps and the fans. It is not fully reliable, but
it is faster.
Is there a simple way to prepare a pen drive with r/w permissions from the
iso? I remember using unetbootin, or something like that, to do it, but
it stopped working at some point. Since then I have been using dd,
but this makes the pen drive read-only.
I would like to update the system, make some trivial config changes, and install
some software, and it would be nice to make them permanent.
Don't spend time on this if you don't know. In the meantime I'll
search the net and test on a VM.
Just one more question I forgot ...
On 20/10/2021 19.22, Paulo da Silva wrote:
On 20/10/21 at 13:21, Carlos E.R. wrote:
OK, let's say I want to give openSUSE a try.
Let's say I install it and it still cannot handle my temperature
problem. I need to check this before I install and configure all the
software I use, which takes a couple of weeks.
How do I delete it afterwards?
I know I did it in the past, but just to be sure ... is it:
1. Boot into my current system.
2. Run grub-install or grub-install /dev/nvme0n1 (disk)?
I don't think you need that one.
Are you sure? What if I remove that partition's content? Doesn't grub need
it? I am asking because I always believed (without foundation) that there
is always a main system for boot.
3. efibootmgr -B -b <bootnum>?
Yes.
4. Do I need any further cleanup in /boot/efi?
You can erase the directory /boot/efi/EFI/opensuse, and of course the
root partition.
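To find the <bootnum> for step 3, the entry can be picked out of the efibootmgr listing. A minimal sketch in Python: the sample text below is made up for illustration (it only imitates the usual `BootNNNN* label` layout), and the label `opensuse` is an assumption about how the entry is named on a given machine.

```python
import re

# Matches lines such as "Boot0003* opensuse-secureboot ..."; the asterisk marks an active entry.
ENTRY_RE = re.compile(r"^Boot([0-9A-Fa-f]{4})(\*?)\s+(.*)$")

def find_boot_entries(listing, label):
    """Return (bootnum, description) pairs whose description contains label."""
    matches = []
    for line in listing.splitlines():
        m = ENTRY_RE.match(line.strip())
        if m and label.lower() in m.group(3).lower():
            matches.append((m.group(1), m.group(3)))
    return matches

# Illustrative listing only, not output captured from the machine in this thread.
SAMPLE = """BootCurrent: 0001
BootOrder: 0001,0003
Boot0001* ubuntu
Boot0003* opensuse-secureboot
"""

if __name__ == "__main__":
    for bootnum, description in find_boot_entries(SAMPLE, "opensuse"):
        # Printed for review; deleting an entry stays a separate, manual step.
        print(f"efibootmgr -B -b {bootnum}   # {description}")
```

The helper only prints the command it would suggest, so a wrong match cannot delete a boot entry by accident.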
Maybe you could try one of the live versions, put it under load, and see
what happens with the temps and the fans. It is not fully reliable, but
it is faster.
Is there a simple way to prepare a pen drive with r/w permissions from the
iso? I remember using unetbootin, or something like that, to do it, but
it stopped working at some point. Since then I have been using dd,
but this makes the pen drive read-only.
I would like to update the system, make some trivial config changes, and install
some software, and it would be nice to make them permanent.
Don't spend time on this if you don't know. In the meantime I'll
search the net and test on a VM.
Just one more question I forgot ...
Is it the same to install from the live image or is it better to
download the installer image? I'm asking because I never found a distro
with both images.
On Wed, 20 Oct 2021 21:25:58 -0400, Paulo da Silva wrote:
Is there a simple way to prepare a pen drive with r/w permissions from the
iso? I remember using unetbootin, or something like that, to do it, but
it stopped working at some point. Since then I have been using dd,
but this makes the pen drive read-only.
I would like to update the system, make some trivial config changes, and install
some software, and it would be nice to make them permanent.
Don't spend time on this if you don't know. In the meantime I'll
search the net and test on a VM.
For Mageia, there is the isodumper program/package from the Mageia repos. When
writing an image to a USB stick with the option to add a persistent partition
selected, it uses dd to write the image, then adds an ext4 partition in the
remaining space with the label mgalive-persist. The Mageia live iso images look
for that partition and, if it is found, mount it as an overlayfs so all changes
made, including installing additional packages, are stored for later use.
This is good. I don't know if openSUSE does the same. Most likely not.
Just one more question I forgot ...
Is it the same to install from the live image or is it better to
download the installer image? I'm asking because I never found a distro
with both images.
When installing from a live iso, the contents of the iso (all files seen when
it's booted, not the iso file itself) are copied to the selected/mounted file
systems. If installing while running in live mode, and selecting the install
from the running live system, the changes made in live mode, including those
stored in the mgalive-persist file system, are included.
I expect other distros that support persistence use similar packages and methods.
At least my network wifi configuration goes to the newly installed system.
On 20/10/21 at 23:19, Carlos E.R. wrote:
On 20/10/2021 19.22, Paulo da Silva wrote:
On 20/10/21 at 13:21, Carlos E.R. wrote:
OK, let's say I want to give openSUSE a try.
Let's say I install it and it still cannot handle my temperature
problem. I need to check this before I install and configure all the
software I use, which takes a couple of weeks.
How do I delete it afterwards?
I know I did it in the past, but just to be sure ... is it:
1. Boot into my current system.
2. Run grub-install or grub-install /dev/nvme0n1 (disk)?
I don't think you need that one.
Are you sure? What if I remove that partition's content? Doesn't grub need
it? I am asking because I always believed (without foundation) that there
is always a main system for boot.
3. efibootmgr -B -b <bootnum>?
Yes.
4. Do I need any further cleanup in /boot/efi?
You can erase the directory /boot/efi/EFI/opensuse, and of course the
root partition.
Maybe you could try one of the live versions, put it under load, and see
what happens with the temps and the fans. It is not fully reliable, but
it is faster.
Is there a simple way to prepare a pen drive with r/w permissions from the
iso? I remember using unetbootin, or something like that, to do it, but
it stopped working at some point. Since then I have been using dd,
but this makes the pen drive read-only.
I would like to update the system, make some trivial config changes, and install
some software, and it would be nice to make them permanent.
Don't spend time on this if you don't know. In the meantime I'll
search the net and test on a VM.
Thanks
Paulo
On 20/10/2021 18:22, Paulo da Silva wrote:
OK, let's say I want to give opensuse a try.
People just use a live Flash drive to try things. They don't install anything.
On 10/20/2021 7:04 PM, Ordinary Poster wrote:
On 20/10/2021 18:22, Paulo da Silva wrote:
OK, let's say I want to give opensuse a try.
People just use a live Flash drive to try things. They don't install
anything.
Downloaded the 900MB "LiveDVD" one.
https://sjc.edge.kernel.org/opensuse/tumbleweed/iso/openSUSE-Tumbleweed-KDE-Live-x86_64-Snapshot20211016-Media.iso
Shows an install icon, but it will probably
be doing some sort of network install, with
some delays while it gets stuff from the network.
Whereas the 4GB version will at least have a few
files onboard.
For a one-off install, the 900MB might be the answer.
If you think you'll be installing more than once,
then it might be more important to get a larger
piece of media.
This is what I see in a VM, when clicking the Install
icon in the 900MB one.
[Picture]
https://i.postimg.cc/R085hy8s/900-MB-disc-has-install-icon.gif
On 21/10/2021 06.56, Paul wrote:
On 10/20/2021 7:04 PM, Ordinary Poster wrote:
On 20/10/2021 18:22, Paulo da Silva wrote:
OK, let's say I want to give opensuse a try.
People just use a live Flash drive to try things. They don't install
anything.
Downloaded the 900MB "LiveDVD" one.
https://sjc.edge.kernel.org/opensuse/tumbleweed/iso/openSUSE-Tumbleweed-KDE-Live-x86_64-Snapshot20211016-Media.iso
That one is intended to be run as is, on the USB stick, without
installation, although installation is possible. There should be a KDE version, another for GNOME, another for XFCE, and another dedicated to rescue
work (the latter two might be the same one).
All of them are intended to be copied with dd from the image to the USB
device (say, /dev/sdb), destroying all the partitions (it creates new
ones). On the first run they create a read/write partition where you can
save files. It is possible to add some packages with zypper (not the
kernel, though).
Don't try to "make them bootable", that would destroy them. Just copy to
the stick, unmodified, with dd or dedicated programs (as described in
the openSUSE wiki).
Then there are two other images, one of about 4GB (the DVD) and another
small one for network install. Those are the pure installation images and
can not be "run". That is, of course they boot and run, but what you get
has only the purpose of installation.
Shows an install icon, but it will probably
be doing some sort of network install, with
some delays while it gets stuff from the network.
Whereas the 4GB version will at least have a few
files onboard.
For a one-off install, the 900MB might be the answer.
If you think you'll be installing more than once,
then it might be more important to get a larger
piece of media.
This is what I see in a VM, when clicking the Install
icon in the 900MB one.
[Picture]
https://i.postimg.cc/R085hy8s/900-MB-disc-has-install-icon.gif
If that's the "Tumbleweed-KDE-Live" you can just cancel the install and
use the system as is, no installation.
On 10/21/2021 9:10 AM, Carlos E. R. wrote:
On 21/10/2021 06.56, Paul wrote:
This is what I see in a VM, when clicking the Install
icon in the 900MB one.
[Picture]
https://i.postimg.cc/R085hy8s/900-MB-disc-has-install-icon.gif
If that's the "Tumbleweed-KDE-Live" you can just cancel the install and use the system as is, no installation.
When a person wants to run a specific graphics driver,
an install comes in handy for that case. Even a USB stick
with persistence would do, but persistence easily exhausts
the 4GB formulation, and it helps to have a larger
casper-rw than that. I think Rufus can do that (rufus.ie).
You might need a specific graphics driver, to get a machine
hot enough to tip over.
Paul
Hi all!
From time to time - maybe after a month, maybe after a couple of hours - my computer completely freezes. Everything stops. The screen shows the last image.
Not even the cursor moves. No keyboard key works, including the
Alt-PrtScreen keys, like REISUB.
I need to press the power on/off button for 5 secs to restart it.
After restart, journalctl -b -b1 shows nothing at the freeze time.
I changed my NVIDIA driver to 470. I also tried to put the driver in
ondemand status. No success. Sooner or later it freezes.
Is there a way to get some information on why this is happening?
I am using kubuntu 20.04.
Thank you.
On 27/09/21 at 18:22, Paulo da Silva wrote:
Hi all!
From time to time - maybe after a month, maybe after a couple of hours - my computer
completely freezes. Everything stops. The screen shows the last image.
Not even the cursor moves. No keyboard key works, including the
Alt-PrtScreen keys, like REISUB.
I need to press the power on/off button for 5 secs to restart it.
After restart, journalctl -b -b1 shows nothing at the freeze time.
I changed my NVIDIA driver to 470. I also tried to put the driver in
ondemand status. No success. Sooner or later it freezes.
Is there a way to get some information on why this is happening?
I am using kubuntu 20.04.
Thank you.
The current situation:
1. The sporadic "fan jets" are from the normal fan. Not the GPU one.
2. The origin of the "freezes" is still unknown. Now I am almost sure that it does
not come from the BIOS. In fact, during a freeze, there was one
occurrence of several continuous "fan jets".
3. A couple of "fan jets" also occurred once while in the grub menu!
4. The uncontrolled temperature rise reported by
/sys/devices/virtual/thermal/thermal_zone0/temp at full CPU load after
suspend/wake also occurs with openSUSE Leap 15.3 live. Before
suspending, that temperature is held stable at 97ºC.
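The sysfs file named in point 4 can be polled directly while the load test runs. A minimal sketch in Python; the kernel reports thermal zone temperatures in thousandths of a degree Celsius, and the helper names here are made up for illustration.

```python
from pathlib import Path

# Path taken from point 4 above.
ZONE_TEMP = Path("/sys/devices/virtual/thermal/thermal_zone0/temp")

def millidegrees_to_celsius(raw):
    """Convert the raw sysfs text (e.g. "97000\n") to degrees Celsius."""
    return int(raw.strip()) / 1000.0

def read_zone_celsius(path=ZONE_TEMP):
    """Read one thermal zone; returns None if the file is missing or unreadable."""
    try:
        return millidegrees_to_celsius(path.read_text())
    except (OSError, ValueError):
        return None

if __name__ == "__main__":
    temp = read_zone_celsius()
    print("thermal_zone0:", "unavailable" if temp is None else f"{temp:.1f} C")
```

Logging this value once before suspending and again after waking would make the before/after difference easy to compare.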
I tried to use several kernels available in kubuntu, including an intel version 5.13, but I was unable to get them to boot in graphic mode with nvidia
470. Given some more time, I'll try it without the Nvidia drivers.
Thanks for your attention.
Paulo
On 10/29/21 16:36, Paulo da Silva wrote:
On 27/09/21 at 18:22, Paulo da Silva wrote:
Hi all!
From time to time - maybe after a month, maybe after a couple of hours - my computer
completely freezes. Everything stops. The screen shows the last image.
Not even the cursor moves. No keyboard key works, including the
Alt-PrtScreen keys, like REISUB.
I need to press the power on/off button for 5 secs to restart it.
After restart, journalctl -b -b1 shows nothing at the freeze time.
I changed my NVIDIA driver to 470. I also tried to put the driver in
ondemand status. No success. Sooner or later it freezes.
Is there a way to get some information on why this is happening?
I am using kubuntu 20.04.
Thank you.
The current situation:
1. The sporadic "fan jets" are from the normal fan. Not the GPU one.
2. The origin of the "freezes" is still unknown. Now I am almost sure that it does
not come from the BIOS. In fact, during a freeze, there was one
occurrence of several continuous "fan jets".
3. A couple of "fan jets" also occurred once while in the grub menu!
4. The uncontrolled temperature rise reported by
/sys/devices/virtual/thermal/thermal_zone0/temp at full CPU load after
suspend/wake also occurs with openSUSE Leap 15.3 live. Before
suspending, that temperature is held stable at 97ºC.
I tried to use several kernels available in kubuntu, including an intel
version 5.13, but I was unable to get them to boot in graphic mode with nvidia
470. Given some more time, I'll try it without the Nvidia drivers.
Thanks for your attention.
Paulo
Have you opened the case and used compressed air to get the dust out?
How long has the CPU been in place under the heat sink? The grease or thermal paste used can dry out and lose heat conductivity.
On 30/10/21 at 01:33, Bobbie Sellers wrote:
On 10/29/21 16:36, Paulo da Silva wrote:
On 27/09/21 at 18:22, Paulo da Silva wrote:
Hi all!
From time to time - maybe after a month, maybe after a couple of hours - my computer
completely freezes. Everything stops. The screen shows the last image.
Not even the cursor moves. No keyboard key works, including the
Alt-PrtScreen keys, like REISUB.
I need to press the power on/off button for 5 secs to restart it.
After restart, journalctl -b -b1 shows nothing at the freeze time.
I changed my NVIDIA driver to 470. I also tried to put the driver in
ondemand status. No success. Sooner or later it freezes.
Is there a way to get some information on why this is happening?
I am using kubuntu 20.04.
Thank you.
The current situation:
1. The sporadic "fan jets" are from the normal fan. Not the GPU one.
2. The origin of the "freezes" is still unknown. Now I am almost sure that it does
not come from the BIOS. In fact, during a freeze, there was one
occurrence of several continuous "fan jets".
3. A couple of "fan jets" also occurred once while in the grub menu!
4. The uncontrolled temperature rise reported by
/sys/devices/virtual/thermal/thermal_zone0/temp at full CPU load after
suspend/wake also occurs with openSUSE Leap 15.3 live. Before
suspending, that temperature is held stable at 97ºC.
I tried to use several kernels available in kubuntu, including an intel
version 5.13, but I was unable to get them to boot in graphic mode with nvidia
470. Given some more time, I'll try it without the Nvidia drivers.
Thanks for your attention.
Paulo
Have you opened the case and used compressed air to get the dust out?
How long has the CPU been in place under the heat sink? The grease or
thermal paste used can dry out and lose heat conductivity.
Of course it is very likely there are some problems with the
sensors/cooling system. But what I do not understand is why the
temperature gets controlled, by the kernel perhaps, before the first
suspension and not after waking from suspension!
I have tried openSUSE and Clear Linux. All have the same problem.
I have written a small script that successfully controls the temperature
just by changing the CPU's freqs. I don't know how to act on the other
cooling systems. thermald, which was supposed to do this, fails miserably.
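The script itself was not posted, so the following is only a guess at the general idea, written in Python: read the zone temperature and lower or raise the cpufreq ceiling accordingly. The thresholds, step size and function names are assumptions. `scaling_max_freq`, `cpuinfo_min_freq` and `cpuinfo_max_freq` are the standard cpufreq sysfs files (values in kHz), and writing to them needs root.

```python
import glob
import time

TEMP_FILE = "/sys/devices/virtual/thermal/thermal_zone0/temp"
CPUFREQ_GLOB = "/sys/devices/system/cpu/cpu[0-9]*/cpufreq"

def next_cap(temp_c, cap_khz, lo_khz, hi_khz,
             hot_c=90.0, cool_c=80.0, step_khz=200_000):
    """Pick the next frequency cap: step down when hot, step up when cool."""
    if temp_c >= hot_c:
        cap_khz -= step_khz
    elif temp_c <= cool_c:
        cap_khz += step_khz
    # Never leave the range the hardware advertises.
    return max(lo_khz, min(hi_khz, cap_khz))

def read_int(path):
    with open(path) as f:
        return int(f.read().strip())

def control_loop(interval_s=2.0):
    """Poll the temperature and write the cap to every CPU (needs root)."""
    policies = glob.glob(CPUFREQ_GLOB)
    if not policies:
        raise RuntimeError("no cpufreq directories found")
    lo = read_int(policies[0] + "/cpuinfo_min_freq")
    hi = read_int(policies[0] + "/cpuinfo_max_freq")
    cap = hi
    while True:
        temp_c = read_int(TEMP_FILE) / 1000.0  # sysfs reports millidegrees
        cap = next_cap(temp_c, cap, lo, hi)
        for policy in policies:
            with open(policy + "/scaling_max_freq", "w") as f:
                f.write(str(cap))
        time.sleep(interval_s)

# To run it, call control_loop() as root; it is left uncalled here on purpose.
```

The gap between the two thresholds keeps the cap from flapping up and down when the temperature hovers around a single limit.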
Regards.
Paulo