I have a disturbing system which freezes every once in a while. The
screen on the monitor shows some X scene (usually with Chrome running, but
then again that is often the case), but the keyboard and mouse do nothing.
Trying to log on from the network fails with no response from the
machine. Alt-Ctrl-Del does nothing. The only way to recover is via the
power switch on the back.
Afterwards, looking at /var/log/syslog or /var/log/messages shows nothing
significant that I can see just before the freeze.
Updated Mga8. Kernel 5.15.88-desktop-1.mga8.
The only thing in the journalctl logs from just before is
-------------------------------
Jul 24 23:22:34 tunnel.physics.ubc.ca dnf[2573748]: teams 8.9 kB/s | 1.5 kB 00:00
Jul 24 23:22:35 tunnel.physics.ubc.ca dnf[2573748]: Metadata cache created.
Jul 24 23:22:35 tunnel.physics.ubc.ca systemd[1]: dnf-makecache.service: Succeeded.
Jul 24 23:22:35 tunnel.physics.ubc.ca systemd[1]: Finished dnf makecache.
Jul 24 23:22:35 tunnel.physics.ubc.ca systemd[1]: dnf-makecache.service: Consumed 1.527s CPU time.
Jul 24 23:28:00 tunnel.physics.ubc.ca kernel: Shorewall:sshd-fw:DROP:IN=eno1 OUT= MAC=4c:ed:fb:c2:2a:f3:a0:ab:1b:88:6e:58:08:00 SRC=183.106.205.242 DST=192.168.0.3 LEN=40 TOS=0x00 PREC=0x00 TTL=56 ID=64660 PROTO=TCP SPT=31651 DPT=22 WINDOW=16988 RES=0x00 SYN URGP=0
Jul 24 23:35:31 tunnel.physics.ubc.ca kernel: Shorewall:net-fw:DROP:IN=eno1 OUT= MAC=4c:ed:fb:c2:2a:f3:a0:ab:1b:88:6e:58:08:00 SRC=185.225.74.53 DST=192.168.0.3 LEN=40 TOS=0x00 PREC=0x00 TTL=244 ID=54321 PROTO=TCP SPT=33231 DPT=22 WINDOW=65535 RES=0x00 SYN URGP=0
-- Reboot --
---------------------------------------
I recently replaced the power supply, thinking it might be the cause.
But although the freezes seem to be occurring less frequently, it is still
occasionally freezing. (It seems to be occurring every two or three
months.)
On 7/28/23 03:16, William Unruh wrote:
I have a disturbing system, which every once in a while freezes. The
sceen on the monitor is some X scene (usually has Chrome running, but
they again that is often the case) but the keyboard,mouse, do nothing.
Trying to log on from the network fails with no response from the
machine. Alt-ctrl-del does nothing. The only way to recover is via the
power switch on the back.
Afterwards, looking at /var/log/syslog, or /var/log/messages shows
nothingsignificant that I can see just before the freeze.
This reminds me of an issue I have had when running two plasma5
sessions: after a while they eat up the memory, and when it is all
gone everything freezes. You can't log in remotely, and a local
login takes so long between entering the username and the password that
the login is canceled.
I would run a memory check once in a while, for example using crontab to
run this one-liner:
{ date; ps -Ao user,uid,comm,pid,pcpu,pmem --sort=-pmem | head -n 11; echo "----"; } >> /var/log/memusage.log
Then you can see if there is a process that grows.
On Fri, 28 Jul 2023 10:27:47 +0200, J.O. Aho wrote:
On 7/28/23 03:16, William Unruh wrote:
I have a disturbing system, which every once in a while freezes. The
sceen on the monitor is some X scene (usually has Chrome running, but
they again that is often the case) but the keyboard,mouse, do nothing.
Trying to log on from the network fails with no response from the
machine. Alt-ctrl-del does nothing. The only way to recover is via the
power switch on the back.
Afterwards, looking at /var/log/syslog, or /var/log/messages shows
nothingsignificant that I can see just before the freeze.
This reminds me of the issue I have had with running two plasma5
sessions, after a while they will just eat up the memory and when all is
gone, then everything freezes, you can't login from remote and local
login takes too long from entering username to entering password that
the login is canceled.
I would run a memory check once in a while, for example use crontab to
run this one liner:
date && ps -Ao user,uid,comm,pid,pcpu,pmem --sort=-pmem | head -n 11 &&
echo "----" >> /var/log/memusage.log
Then you can see if there is a process that grows.
Cute, and with just a little bit of scripting/coding you could automate
the check by flagging any line with value above some watermark for cpu
and memory percentage.
If it was me, I would print any greater than 1.0 for either one.
Since values have a decimal I would guess that I would have to
use bc to test greater than watermark value if using bash.
In a multi-node set up I send an email to LAN admin.
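The decimal comparison mentioned above can be sketched without bc, since awk compares decimals natively. This is only a minimal illustration; the function name is_above is made up here, and the 1.0 watermark is simply the value suggested in the post.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption): succeed when $1 is
# strictly greater than $2, comparing both as decimal numbers.
# bash's own [ ] and (( )) tests only handle integers, so the
# comparison is delegated to awk.
is_above() {
    awk -v value="$1" -v mark="$2" 'BEGIN { exit !(value + 0 > mark + 0) }'
}

# Possible use while looping over %CPU / %MEM values:
#   if is_above "$_cpu" 1.0 || is_above "$_mem" 1.0; then
#       echo "over the watermark: cpu=$_cpu mem=$_mem"
#   fi
```

bc would work as well; awk is used here only because it is already part of most ps pipelines.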
On Fri, 28 Jul 2023 01:16:54 -0000 (UTC), William Unruh wrote:
I have a disturbing system, which every once in a while freezes. The
sceen on the monitor is some X scene (usually has Chrome running, but
they again that is often the case) but the keyboard,mouse, do nothing.
Trying to log on from the network fails with no response from the
machine. Alt-ctrl-del does nothing. The only way to recover is via the
power switch on the back.
Afterwards, looking at /var/log/syslog, or /var/log/messages shows
nothingsignificant that I can see just before the freeze.
Updated Mga8. Kernel 5.15.88-desktop-1.mga8.
The only thing in the journalctl logs from just before is
-------------------------------
Jul 24 23:22:34 tunnel.physics.ubc.ca dnf[2573748]: teams 8.9 kB/s | 1.5 kB 00:00
Jul 24 23:22:35 tunnel.physics.ubc.ca dnf[2573748]: Metadata cache created.
Jul 24 23:22:35 tunnel.physics.ubc.ca systemd[1]: dnf-makecache.service: Succeeded.
Jul 24 23:22:35 tunnel.physics.ubc.ca systemd[1]: Finished dnf makecache.
Jul 24 23:22:35 tunnel.physics.ubc.ca systemd[1]: dnf-makecache.service: Consumed 1.527s CPU time.
Jul 24 23:28:00 tunnel.physics.ubc.ca kernel: Shorewall:sshd-fw:DROP:IN=eno1 OUT= MAC=4c:ed:fb:c2:2a:f3:a0:ab:1b:88:6e:58:08:00 SRC=183.106.205.242 DST=192.168.0.3 LEN=40 TOS=0x00 PREC=0x00 TTL=56 ID=64660 PROTO=TCP SPT=31651 DPT=22 WINDOW=16988 RES=0x00 SYN URGP=0
Jul 24 23:35:31 tunnel.physics.ubc.ca kernel: Shorewall:net-fw:DROP:IN=eno1 OUT= MAC=4c:ed:fb:c2:2a:f3:a0:ab:1b:88:6e:58:08:00 SRC=185.225.74.53 DST=192.168.0.3 LEN=40 TOS=0x00 PREC=0x00 TTL=244 ID=54321 PROTO=TCP SPT=33231 DPT=22 WINDOW=65535 RES=0x00 SYN URGP=0
-- Reboot --
---------------------------------------
I recently replaced the power supply thinking it might be the cause.
But although freezes seem to be occuring less frequently, it is still
occasionally freezing. (It seems to be occuring every two or three
months).
Off hand it seems like the cpu gets into a tight loop and quits processing interrupts.
I would install lm_sensors, configure/run lm_sensors, sensord and enable core dump.
I have also modified ~/.bash_profile to check for core dump files and
use xmessage to pop up a message if any are found.
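A rough sketch of what such a ~/.bash_profile check might look like follows. The function name, the search depth and the file name patterns are all assumptions; the real core file name depends on /proc/sys/kernel/core_pattern. Note also that core files are written for crashing user processes, so a hard kernel hang would probably not leave one behind.

```shell
#!/bin/bash
# Hypothetical helper (name and patterns are assumptions): list files that
# look like core dumps in a directory and its immediate subdirectories.
# Defaults to $HOME when no directory is given.
find_core_files() {
    find "${1:-$HOME}" -maxdepth 2 -type f \
        \( -name 'core' -o -name 'core.[0-9]*' \) 2>/dev/null
}

# Possible use in ~/.bash_profile (commented out so that sourcing this
# block has no side effects):
#   ulimit -c unlimited                 # allow core files to be written
#   cores=$(find_core_files "$HOME")
#   [ -n "$cores" ] && xmessage "Core files found: $cores"
```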
On 2023-07-28, Bit Twister <BitTwister@mouse-potato.com> wrote:
On Fri, 28 Jul 2023 01:16:54 -0000 (UTC), William Unruh wrote:
I have a disturbing system, which every once in a while freezes. The
sceen on the monitor is some X scene (usually has Chrome running, but
they again that is often the case) but the keyboard,mouse, do nothing.
Trying to log on from the network fails with no response from the
machine. Alt-ctrl-del does nothing. The only way to recover is via the
power switch on the back.
Afterwards, looking at /var/log/syslog, or /var/log/messages shows
nothingsignificant that I can see just before the freeze.
Updated Mga8. Kernel 5.15.88-desktop-1.mga8.
The only thing in the journalctl logs from just before is
-------------------------------
Jul 24 23:22:34 tunnel.physics.ubc.ca dnf[2573748]: teams 8.9 kB/s | 1.5 kB 00:00
Jul 24 23:22:35 tunnel.physics.ubc.ca dnf[2573748]: Metadata cache created.
Jul 24 23:22:35 tunnel.physics.ubc.ca systemd[1]: dnf-makecache.service: Succeeded.
Jul 24 23:22:35 tunnel.physics.ubc.ca systemd[1]: Finished dnf makecache.
Jul 24 23:22:35 tunnel.physics.ubc.ca systemd[1]: dnf-makecache.service: Consumed 1.527s CPU time.
Jul 24 23:28:00 tunnel.physics.ubc.ca kernel: Shorewall:sshd-fw:DROP:IN=eno1 OUT= MAC=4c:ed:fb:c2:2a:f3:a0:ab:1b:88:6e:58:08:00 SRC=183.106.205.242 DST=192.168.0.3 LEN=40 TOS=0x00 PREC=0x00 TTL=56 ID=64660 PROTO=TCP SPT=31651 DPT=22 WINDOW=16988 RES=0x00 SYN URGP=0
Jul 24 23:35:31 tunnel.physics.ubc.ca kernel: Shorewall:net-fw:DROP:IN=eno1 OUT= MAC=4c:ed:fb:c2:2a:f3:a0:ab:1b:88:6e:58:08:00 SRC=185.225.74.53 DST=192.168.0.3 LEN=40 TOS=0x00 PREC=0x00 TTL=244 ID=54321 PROTO=TCP SPT=33231 DPT=22 WINDOW=65535 RES=0x00 SYN URGP=0
-- Reboot --
---------------------------------------
I recently replaced the power supply thinking it might be the cause.
But although freezes seem to be occuring less frequently, it is still
occasionally freezing. (It seems to be occuring every two or three
months).
Off hand it seems like the cpu gets into a tight loop and quits processing interrupts.
I would install lm_sensors, configure/run lm_sensors, sensord and enable core dump.
I have also modified ~/.bash_profile to check for core dump files and
uses xmessage to provide a pop up if any are found.
OK, here are the last two sensord reports from just before the freeze.
I do not see anything out of the ordinary here. The last one was logged at
23:28:55. The freeze (as inferred from the last entry in /var/log/syslog)
occurred around 23:35:31:
--------------------------------
Jul 24 23:35:31 tunnel kernel: [1181287.585321] Shorewall:net-fw:DROP:IN=eno1 OUT= MAC=4c:ed:fb:c2:2a:f3:a0:ab:1b:88:6e:58:08:00 SRC=185.225.74.53 DST=192.168.0.3 LEN=40 TOS=0x00 PREC=0x00 TTL=244 ID=54321 PROTO=TCP SPT=33231 DPT=22 WINDOW=65535 RES=0x00 SYN URGP=0
Jul 25 09:13:25 tunnel kernel: [ 0.000000] microcode: microcode updated early to revision 0xf0, date = 2021-11-12
--------------------------------------
----------------------------------
Jul 24 23:08:55 tunnel sensord: Chip: nvme-pci-0600
Jul 24 23:08:55 tunnel sensord: Adapter: PCI adapter
Jul 24 23:08:55 tunnel sensord: Composite: 24.9 C (min = -40.1 C, max = 83.8 C)
Jul 24 23:08:55 tunnel sensord: Sensor 2: 24.9 C (min = -40.1 C, max = 83.8 C)
Jul 24 23:08:55 tunnel sensord: Chip: coretemp-isa-0000
Jul 24 23:08:55 tunnel sensord: Adapter: ISA adapter
Jul 24 23:08:55 tunnel sensord: Package id 0: 29.0 C
Jul 24 23:08:55 tunnel sensord: Core 0: 28.0 C
Jul 24 23:08:55 tunnel sensord: Core 1: 27.0 C
Jul 24 23:08:55 tunnel sensord: Core 2: 28.0 C
Jul 24 23:08:55 tunnel sensord: Core 3: 27.0 C
Jul 24 23:08:55 tunnel sensord: Chip: acpitz-acpi-0
Jul 24 23:08:55 tunnel sensord: Adapter: ACPI interface
Jul 24 23:08:55 tunnel sensord: temp1: 27.8 C
Jul 24 23:28:55 tunnel sensord: Chip: nvme-pci-0600
Jul 24 23:28:55 tunnel sensord: Adapter: PCI adapter
Jul 24 23:28:55 tunnel sensord: Composite: 24.9 C (min = -40.1 C, max = 83.8 C)
Jul 24 23:28:55 tunnel sensord: Sensor 2: 24.9 C (min = -40.1 C, max = 83.8 C)
Jul 24 23:28:55 tunnel sensord: Chip: coretemp-isa-0000
Jul 24 23:28:55 tunnel sensord: Adapter: ISA adapter
Jul 24 23:28:55 tunnel sensord: Package id 0: 29.0 C
Jul 24 23:28:55 tunnel sensord: Core 0: 28.0 C
Jul 24 23:28:55 tunnel sensord: Core 1: 27.0 C
Jul 24 23:28:55 tunnel sensord: Core 2: 27.0 C
Jul 24 23:28:55 tunnel sensord: Core 3: 27.0 C
Jul 24 23:28:55 tunnel sensord: Chip: acpitz-acpi-0
Jul 24 23:28:55 tunnel sensord: Adapter: ACPI interface
Jul 24 23:28:55 tunnel sensord: temp1: 27.8 C
----------------------------------------------
I cannot find any core files, and I do not think I am suppressing them.
On 7/28/23 11:56, Bit Twister wrote:
On Fri, 28 Jul 2023 10:27:47 +0200, J.O. Aho wrote:
On 7/28/23 03:16, William Unruh wrote:
I have a disturbing system, which every once in a while freezes. The
sceen on the monitor is some X scene (usually has Chrome running, but
then again that is often the case) but the keyboard and mouse do nothing.
Trying to log on from the network fails with no response from the
machine. Alt-ctrl-del does nothing. The only way to recover is via the
power switch on the back.
Afterwards, looking at /var/log/syslog, or /var/log/messages shows
nothingsignificant that I can see just before the freeze.
This reminds me of the issue I have had with running two plasma5
sessions, after a while they will just eat up the memory and when all is
gone, then everything freezes, you can't login from remote and local
login takes too long from entering username to entering password that
the login is canceled.
I would run a memory check once in a while, for example use crontab to
run this one liner:
date && ps -Ao user,uid,comm,pid,pcpu,pmem --sort=-pmem | head -n 11 &&
echo "----" >> /var/log/memusage.log
Then you can see if there is a process that grows.
Cute, and with just a little bit of scripting/coding you could automate
the check by flagging any line with value above some watermark for cpu
and memory percentage.
If it was me, I would print any greater than 1.0 for ether one.
Since values have a decimal I would guess that I would have to
use bc to test greater than watermark value if using bash.
You could use awk; I always have to point out this great tool, as it bears
my name:
date && ps -Ao user,uid,comm,pid,pcpu,pmem --sort=-pmem | awk -v min="0.3" '$6 >= min || $6=="%MEM"' && echo "----"
Just change the -v min="0.3" to the minimum value a process must reach to
be displayed. The column names are still displayed as well. I am sure there
are better ways of doing this, but my skills are not that great.
On 2023-07-28, J.O. Aho <user@example.net> wrote:
On 7/28/23 03:16, William Unruh wrote:
I have a disturbing system, which every once in a while freezes. The
sceen on the monitor is some X scene (usually has Chrome running, but
they again that is often the case) but the keyboard,mouse, do nothing.
Trying to log on from the network fails with no response from the
machine. Alt-ctrl-del does nothing. The only way to recover is via the
power switch on the back.
Afterwards, looking at /var/log/syslog, or /var/log/messages shows
nothingsignificant that I can see just before the freeze.
This reminds me of the issue I have had with running two plasma5
sessions, after a while they will just eat up the memory and when all is
gone, then everything freezes, you can't login from remote and local
login takes too long from entering username to entering password that
the login is canceled.
Except in my case, nothing I type does anything, the mouse cursor does
not move, the Alt-Ctrl-F keys do nothing, Alt-Ctrl-Del or Alt-Ctrl-Bksp
do nothing, and gkrellm stops updating. Usually when I have run out of
memory, some things still work, and the machine slows down drastically as
it starts to swap. This is just a complete sudden freeze. (Unlike this
past time, sometimes this has happened while I am working on the
machine. This time it happened while I was asleep.)
*/5 * * * * root (date && ps -Ao user,uid,comm,pid,pcpu,pmem --sort=-pmem | awk -v min="10.0" '$6 >= min || $6=="\%MEM"' && echo "----") >> /var/log/mem.log
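One thing to watch with a crontab entry like this: cron treats an unescaped % in the command field as a newline, so a filter that mentions %MEM either needs the % escaped or is easier to keep in a small script that cron calls. A minimal sketch of such a script follows, assuming the same column layout as the line above. The function names and the script path are made up for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper script, e.g. saved as /usr/local/sbin/memlog.sh
# (the path and both function names are assumptions).

# Keep the header line plus every row whose 6th field (%MEM) is at
# least the minimum given as $1. Reads ps output on stdin.
filter_by_mem() {
    awk -v min="$1" '$6 == "%MEM" || $6 + 0 >= min + 0'
}

# Append one timestamped snapshot to the log file given as $2.
# The braces make the redirect cover all three commands, not only
# the trailing echo.
log_snapshot() {
    {
        date
        ps -Ao user,uid,comm,pid,pcpu,pmem --sort=-pmem | filter_by_mem "$1"
        echo "----"
    } >> "$2"
}

# The script would end with a call such as:
#   log_snapshot 10.0 /var/log/mem.log
# and the crontab entry then contains no % at all:
#   */5 * * * * root /usr/local/sbin/memlog.sh
```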
On Sat, 29 Jul 2023 04:18:32 -0400, J.O. Aho <user@example.net> wrote:
*/5 * * * * root (date && ps -Ao user,uid,comm,pid,pcpu,pmem
--sort=-pmem | awk -v min="10.0" '$6 >= min || $6=="%MEM"' && echo
"----" >> /var/log/mem.log)
That is not reliable. /proc/$PID/comm may contain spaces, so by the time awk
gets it the column number is wrong.
[dave@x3 ~]$ ps -Ao user,uid,comm,pid,pcpu,pmem --sort=-pmem | awk -v min="3.0" '$6 >= min || $6=="%MEM"'
USER UID COMMAND PID %CPU %MEM
dave 500 opera 6969 2.8 4.7
ddclient 468 ddclient - slee 5691 0.0 0.0
[dave@x3 ~]$ cat /proc/5691/comm
ddclient - slee
[dave@x3 ~]$ cat /proc/5691/cmdline
ddclient - sleeping for 90 seconds[dave@x3 ~]$
I don't see a reliable field separator.
Regards, Dave Hodgins
min="3.0" '$5 >= min || $5=="%MEM"'
William Unruh <unruh@invalid.ca> writes:
Except in my case, nothing I type does anything, the mouse cursor does
not move, Teh Alt-ctrl-F keys do nothing, alt-ctrl-det or alt-ctrl-bksp
do nothing, gkrellm stops updating. Usually when I have run out of
memory, somethings still work, and the machine slows down drastically as
it starts to swap. This is just a complete sudden freeze. (Unlike this
past time, sometimes this has happened while I am working on the
machine. This time it happened while I was asleep)
Yes. Normal behavior for running out of RAM is swapping, and for running
out of RAM+swap is to start killing user processes.
In this case:
| Trying to log on from the network fails with no response from the
| machine.
Does it respond to ping?
If so then the kernel’s still working, at least a bit (the lack of
kernel logs suggests everything above that is dead).
If it does not ping then the kernel has crashed, either due to a kernel
bug or a hardware fault.
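Since the freezes are months apart, the ping test could be left running from a second machine on the LAN, so the answer is already on record the next time it happens. A minimal sketch follows. The function names and the PING_CMD override are assumptions added for illustration, and the -W timeout flag is the one from the common Linux iputils ping.

```shell
#!/bin/bash
# Hypothetical helper: succeed when the host in $1 answers a single
# echo request within 2 seconds. PING_CMD is an assumption added here
# so that the probe command can be swapped out.
host_is_up() {
    "${PING_CMD:-ping}" -c 1 -W 2 "$1" >/dev/null 2>&1
}

# Hypothetical helper: append a timestamped "up" or "down" line for
# host $1 to the log file $2.
log_reachability() {
    local state=down
    if host_is_up "$1"; then
        state=up
    fi
    printf '%s %s %s\n' "$(date '+%F %T')" "$1" "$state" >> "$2"
}

# Possible use from cron on another machine, once a minute:
#   log_reachability 192.168.0.3 /var/log/tunnel-ping.log
```

The last "up" line before a run of "down" lines then brackets the time of the freeze independently of the frozen machine's own logs.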
On Sat, 29 Jul 2023 09:09:32 -0400, David W. Hodgins wrote:
On Sat, 29 Jul 2023 04:18:32 -0400, J.O. Aho <user@example.net> wrote:
*/5 * * * * root (date && ps -Ao user,uid,comm,pid,pcpu,pmem
--sort=-pmem | awk -v min="10.0" '$6 >= min || $6=="%MEM"' && echo
"----" >> /var/log/mem.log)
That is not reliable. /proc/$PID/comm may contain spaces so by time awk gets it
the column number is wrong.
[dave@x3 ~]$ ps -Ao user,uid,comm,pid,pcpu,pmem --sort=-pmem | awk -v min="3.0" '$6 >= min || $6=="%MEM"'
USER UID COMMAND PID %CPU %MEM
dave 500 opera 6969 2.8 4.7
ddclient 468 ddclient - slee 5691 0.0 0.0
[dave@x3 ~]$ cat /proc/5691/comm
ddclient - slee
[dave@x3 ~]$ cat /proc/5691/cmdline
ddclient - sleeping for 90 seconds[dave@x3 ~]$
I don't see a reliable field separator.
My bash solution
while read -r line ; do
set -- $line
_bin=$3
shift $(( $# - 2 ))
_cpu=$1
_mem=$2
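The loop above is cut off in this copy of the thread. One way it might be completed is sketched below, keeping the same idea of taking %CPU and %MEM from the last two words of each line so that spaces in the command name do not matter. The function name, the watermark argument and the output format are assumptions, not the poster's actual code.

```shell
#!/bin/bash
# Hypothetical completion of the truncated loop. Reads
# `ps -Ao user,uid,comm,pid,pcpu,pmem` output on stdin and prints the
# entries whose %CPU or %MEM is above the watermark given as $1.
flag_busy() {
    local mark=$1 line _bin _cpu _mem
    while read -r line; do
        set -f                      # no glob expansion while word-splitting
        set -- $line
        set +f
        [ "$#" -ge 6 ] || continue  # skip blank or short lines
        [ "$1" = "USER" ] && continue   # skip the header
        _bin=$3                     # first word of the command name only
        shift $(( $# - 2 ))         # keep just the last two words
        _cpu=$1
        _mem=$2
        # bash cannot compare decimals, so awk does the test
        if awk -v c="$_cpu" -v m="$_mem" -v w="$mark" \
               'BEGIN { exit !(c + 0 > w + 0 || m + 0 > w + 0) }'; then
            printf '%s cpu=%s mem=%s\n' "$_bin" "$_cpu" "$_mem"
        fi
    done
}

# Possible use:
#   ps -Ao user,uid,comm,pid,pcpu,pmem --sort=-pmem | flag_busy 1.0
```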
On 29/07/2023 11:44, Richard Kettlewell wrote:
William Unruh <unruh@invalid.ca> writes:
Except in my case, nothing I type does anything, the mouse cursor does
not move, Teh Alt-ctrl-F keys do nothing, alt-ctrl-det or alt-ctrl-bksp
do nothing, gkrellm stops updating. Usually when I have run out of
memory, some things still work, and the machine slows down drastically as
it starts to swap. This is just a complete sudden freeze. (Unlike this
past time, sometimes this has happened while I am working on the
machine. This time it happened while I was asleep)
Yes. Normal behavior for running out of RAM is swapping, and for running
out of RAM+swap is to start killing user processes.
The problem is that swapping Xorg and the desktop environment in and out
makes things extremely slow, and the tradition in Linux is to
kill a random process (not sure if that has changed nowadays). The
likelihood that the right process is killed is slim, and if the wrong
one is killed, the swapping will not end and will keep the
system slow. Even if it kills the right process, it will in reality
take hours before anything happens because of the swapping in and out.
My experience is that things do not get better even if you disable swap
(you seldom really need swap when you have 64GB RAM); for some reason the
disk seems to be as active as during swapping in and out. Could it
have to do with zram?
Something that makes a difference is setting a hard memory limit on the
process; then it will be killed when it tries to use more RAM than the limit.
Of course at this point we don't know what the issue is for the OP, and he
doesn't use plasma5, so I doubt it's plasmashell that is his issue.
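The post does not say how the hard memory limit was set. One simple possibility is the shell's ulimit builtin, sketched below with a made-up function name. Note that ulimit -v caps the virtual address space rather than resident RAM, so the process tends to get allocation failures (and usually aborts) instead of being killed outright. On systems with cgroup v2, a systemd MemoryMax= setting would be the other likely candidate.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption): run a command with a cap
# on its virtual address space. $1 is the limit in KiB, as `ulimit -v`
# expects; the remaining arguments are the command to run.
# The subshell keeps the limit from affecting the calling shell.
run_with_limit() {
    (
        ulimit -v "$1" || exit 125
        shift
        exec "$@"
    )
}

# Possible use (8 GiB cap, value chosen only as an example):
#   run_with_limit 8388608 plasmashell
```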
On Sat, 29 Jul 2023 14:52:19 -0400, Bit Twister <BitTwister@mouse-potato.com> wrote:
On Sat, 29 Jul 2023 09:09:32 -0400, David W. Hodgins wrote:
On Sat, 29 Jul 2023 04:18:32 -0400, J.O. Aho <user@example.net> wrote:
*/5 * * * * root (date && ps -Ao user,uid,comm,pid,pcpu,pmem
--sort=-pmem | awk -v min="10.0" '$6 >= min || $6=="%MEM"' && echo
"----" >> /var/log/mem.log)
That is not reliable. /proc/$PID/comm may contain spaces so by time awk gets it
the column number is wrong.
[dave@x3 ~]$ ps -Ao user,uid,comm,pid,pcpu,pmem --sort=-pmem | awk -v min="3.0" '$6 >= min || $6=="%MEM"'
USER UID COMMAND PID %CPU %MEM
dave 500 opera 6969 2.8 4.7
ddclient 468 ddclient - slee 5691 0.0 0.0
[dave@x3 ~]$ cat /proc/5691/comm
ddclient - slee
[dave@x3 ~]$ cat /proc/5691/cmdline
ddclient - sleeping for 90 seconds[dave@x3 ~]$
I don't see a reliable field separator.
My bash solution
while read -r line ; do
set -- $line
_bin=$3
shift $(( $# - 2 ))
_cpu=$1
_mem=$2
The solution of putting the comm field last works fine.
$ ps -Ao user,uid,pid,pcpu,pmem,comm --sort=-pmem | awk -v min="2.0" '$5 >= min || $5=="%MEM"'
USER UID PID %CPU %MEM COMMAND
dave 500 6969 3.7 4.9 opera
dave 500 33667 7.5 4.0 firefox
dave 500 6186 0.7 2.9 plasmashell
dave 500 34910 1.3 2.3 Isolated Web Co
dave 500 34975 0.8 2.3 Isolated Web Co
dave 500 53719 2.1 2.0 Isolated Web Co
Nice solution.
On 2023-07-29, Richard Kettlewell <invalid@invalid.invalid> wrote:[...]
Does it respond to ping?
No.
If it does not ping then the kernel has crashed, either due to a kernel
bug or a hardware fault.
That is sure what it looks like. Weird thing is that the video card is
still sending out the last image, so it is running.
On 2023-07-29, J.O. Aho <user@example.net> wrote:
On 29/07/2023 11:44, Richard Kettlewell wrote:
William Unruh <unruh@invalid.ca> writes:
Except in my case, nothing I type does anything, the mouse cursor does
not move, the Alt-Ctrl-F keys do nothing, Alt-Ctrl-Del or Alt-Ctrl-Bksp
do nothing, and gkrellm stops updating. Usually when I have run out of
memory, some things still work, and the machine slows down drastically as
it starts to swap. This is just a complete sudden freeze. (Unlike this
past time, sometimes this has happened while I am working on the
machine. This time it happened while I was asleep.)
Yes. Normal behavior for running out of RAM is swapping, and for running
out of RAM+swap is to start killing user processes.
The problem is that the swapping in and out Xorg and the desktop
environment makes things extremely slow and the tradition in Linux is to
kill a random process (not sure if it's changed nowadays), the
likelihood that the right process is killed is slim and if not the right
one is killed, the swapping will not end and the swap will keep the
system slow and even if it kills the right process it will in reality
take hours before anything happens due of the swap in and out.
My experience is that things do not get better even if you disable swap
(seldom you really need swap when having 64GB RAM), for some reason it
seems to be the disk is as active as during the swap in / out. Could it
have to do with zram?
Something that makes a difference is setting a hard memory limit on the
process, then it will be killed when it tries to use more RAM than the
limit.
Of course at this point we don't know what the issues is for OP and he
don't use plasma5, so I doubt it's plasmashell that is his issue.
Actually I do use Plasma and sddm.
But as I said, there is little indication beforehand that it is swapping
that is the problem (in the times that it happened to me while I was
working on the system; as I said, this last time it occurred just before
bedtime and I discovered it the next morning).
William Unruh <unruh@invalid.ca> writes:
On 2023-07-29, Richard Kettlewell <invalid@invalid.invalid> wrote:[...]
Does it respond to ping?
No.
If it does not ping then the kernel has crashed, either due to a kernel
bug or a hardware fault.
That is sure what it looks like. Weird thing is that the video card is
still sending out the last image, so it is running.
The video card keeps transmitting to the display under its own steam; it
doesn’t need the kernel to tell it to do so.
On 7/30/23 00:39, William Unruh wrote:....
On 2023-07-29, J.O. Aho <user@example.net> wrote:
On 29/07/2023 11:44, Richard Kettlewell wrote:
William Unruh <unruh@invalid.ca> writes:
Running two Xorg servers, each with its own plasma5, one with the weather
applet and directory display and the other one with just the directory display.
Fixed image background on both.
The memory leak starts quite randomly, but at the latest 2 days after
plasmashell starts it will have begun to eat up a lot of memory, first
slowly and then faster and faster. When using strace it seems
plasmashell is trying to connect somewhere and gets a timeout; during
the timeout period it manages to queue 3-4 more requests (I haven't figured
out where it tries to connect). Usually it's just one of the
plasmashells that hogs a lot of memory, like 30-40GB, while the other may
be at just around 10GB (which one uses the most is random). When the RAM
runs out, the disk activity starts (mind you, I do not have
swap), and once the disk activity has begun, Xorg will be so slow that
if you move the mouse pointer you will see it move a small distance and
then freeze. It will now be impossible to ssh to the machine; if you
have a running ssh connection to the machine you can still type
something, but it will be slow, like a 120 baud modem. If you are already
root ('su -' before it happens), then you can run 'killall -9
plasmashell' and after a short while the machine will be somewhat usable
again. Now you just need to start plasmashell for each user; this
tends to be a two-step process: first "su - username" and run
"DISPLAY=:X.0 plasmashell --replace", then switch to that user's
VT and run "plasmashell --replace" from konsole, and then repeat
the process with the next user.
Nowadays I don't use plasma5 at all. I will give plasma6 a try when
it's in the distro's repository, but until then I will keep on using lxqt,
and I suggest you also look for another desktop environment to use
until plasma6 is out for your distro.
On 2023-07-30, J.O. Aho <user@example.net> wrote:
On 7/30/23 00:39, William Unruh wrote:...
On 2023-07-29, J.O. Aho <user@example.net> wrote:
On 29/07/2023 11:44, Richard Kettlewell wrote:
William Unruh <unruh@invalid.ca> writes:
Running two Xorg servers, each with its own plasma5, one with the weather
applet and directory display and the other one with just the directory display.
Fixed image background on both.
The memory leak starts quite randomly, but at the latest 2 days after
plasmashell starts it will have begun to eat up a lot of memory, first
slowly and then faster and faster. When using strace it seems
plasmashell is trying to connect somewhere and gets a timeout; during
the timeout period it manages to queue 3-4 more requests (I haven't figured
out where it tries to connect). Usually it's just one of the
plasmashells that hogs a lot of memory, like 30-40GB, while the other may
be at just around 10GB (which one uses the most is random). When the RAM
How much RAM do you have?
My plasmashell right now is 2GB VSZ
I run Intel onboard graphics. There is no slowdown beforehand. It is just
as if someone threw a switch to shut everything down, except that the
image keeps showing (i.e. the graphics card is still sending out a
signal).
No disk activity, no slowdown. It just stops.
J.O. Aho wrote:
Nowadays I don't use plasma5 at all. I will give plasma6 a try
when it's in the distro's repository, but until then I will keep on
using lxqt, and I suggest you also look for another desktop
environment to use until plasma6 is out for your distro.
I realize that we are talking about an 'unknown' diagnosis, but it
sounds like it is closely associated with the KDE desktop. Is it that
theoretically anyone using plasma5 could spring a memory leak, or is
there potentially something much more limited going on here?
Like some 'specific' KDE 'gear', such that if someone didn't have that KDE
app, they wouldn't have a leak?
On 2023-07-30 at 14:59 ADT, William Unruh <unruh@invalid.ca> wrote:
How much RAM do you have?
My plasmashell right now is 2GB VSZ
I don't think the VSZ is all that significant.
What is the RSS?
On 7/30/23 19:59, William Unruh wrote:
My plasmashell right now is 2GB VSZ
I think it usually took ~700MB when it started; once it was up to
2GB I knew it would start to eat more memory, so a "plasmashell
--replace" was the only thing that kept it from going bad for a
while. In the best case you buy yourself another day.
On 7/30/23 19:59, William Unruh wrote:....
On 2023-07-30, J.O. Aho <user@example.net> wrote:
On 7/30/23 00:39, William Unruh wrote:
On 2023-07-29, J.O. Aho <user@example.net> wrote:
On 29/07/2023 11:44, Richard Kettlewell wrote:
William Unruh <unruh@invalid.ca> writes:
How much RAM do you have?
64GB. I used to have 32GB, but I ran into the issue and hoped that
doubling the amount of RAM would make it less of a problem; gosh, I was wrong.
My plasmashell right now is 2GB VSZ
I think it usually took ~700MB when it started; once it was up to
2GB I knew it would start to eat more memory, so a "plasmashell
--replace" was the only thing that kept it from going bad for a
while. In the best case you buy yourself another day.
I run Intel onboard graphics. There is no slowdown beforehand. It is just
as if someone threw a switch to shut everything down, except that the
image keeps showing (i.e. the graphics card is still sending out a
signal).
You will have a desktop on your screen all the way until you reboot. So
far I have never seen the plasmashell process get auto-killed; if it were,
your screen would turn black, but with a mouse pointer that you can move around.
No disk activity, no slowdown. It just stops.
If you had been there an hour earlier, I think you might have caught the
slowdown, as it's related to the amount of free RAM you have left; when
you have none, it's freeze time.
On 2023-07-30, J.O. Aho <user@example.net> wrote:
On 7/30/23 19:59, William Unruh wrote:
...
How much RAM do you have?
64GB. I used to have 32GB, but I ran into the issue and hoped that
doubling the amount of RAM would make it less of a problem; gosh, I was wrong.
I have 8 GB. ps says plasmashell right now has a VSZ of 1.9GB and %MEM of
3.1% (which should be about 250MB, I guess).
If you had been there an hour earlier, I think you might have caught the
slowdown, as it's related to the amount of free RAM you have left; when
you have none, it's freeze time.
I have about 8GB of swap, so it should slow down when it starts to use
swap.
Then your plasmashell is using 250MB at the moment. The VSZ is the
theoretical amount of RAM the process would use if it had to load
everything at once, but as you aren't using all the features of
plasmashell this will not happen.
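The arithmetic above (3.1% of 8 GB is roughly 250 MB) and the gap between VSZ and RSS can be checked directly. A small sketch, assuming Linux, where `/proc/<pid>/status` reports `VmSize` (what ps calls VSZ) and `VmRSS` (resident memory) in kB; the function names are made up for illustration.

```python
def parse_status(text):
    """Extract VmSize and VmRSS (in kB) from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            fields[key] = int(rest.split()[0])
    return fields


def read_process_memory(pid="self"):
    """Read VmSize/VmRSS for a pid; "self" means the calling process."""
    with open(f"/proc/{pid}/status") as f:
        return parse_status(f.read())


def percent_of_ram(rss_kb, total_ram_kb):
    """Resident memory as a percentage of physical RAM (ps's %MEM)."""
    return 100.0 * rss_kb / total_ram_kb
```

`read_process_memory(pid)` with the pid of plasmashell gives both numbers side by side; the `VmRSS` value is the one that corresponds to the %MEM column.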
Jim Diamond <JimDiamond@jdvb.ca> writes:
On 2023-07-30 at 14:59 ADT, William Unruh <unruh@invalid.ca> wrote:
How much RAM do you have?
My plasmashell right now is 2GB VSZ
I don't think the VSZ is all that significant.
Indeed VSZ tells you very little - for instance it includes a 2MB dead
page in the middle of every shared library, which (in a complex process)
scales up VSZ to something huge, but consumes almost no real resources.
What is the RSS?
I think the theories about user space memory leaks are a red herring.
The reported behavior is just not consistent with them.
On 2023-07-31 at 04:11 ADT, Richard Kettlewell <invalid@invalid.invalid> wrote:
Jim Diamond <JimDiamond@jdvb.ca> writes:
On 2023-07-30 at 14:59 ADT, William Unruh <unruh@invalid.ca> wrote:
How much RAM do you have?
My plasmashell right now is 2GB VSZ
I don't think the VSZ is all that significant.
Indeed VSZ tells you very little - for instance it includes a 2MB dead
page in the middle of every shared library, which (in a complex process)
scales up VSZ to something huge, but consumes almost no real resources.
What is the RSS?
I think the theories about user space memory leaks are a red herring.
The reported behavior is just not consistent with them.
I have not followed the thread too closely, but I suspect you are right.
But when I saw someone reporting VSZ, I thought that they should know that
it wasn't meaningful, just in case they keep barking up that tree.
On 2023-07-31, Jim Diamond <JimDiamond@jdvb.ca> wrote:
On 2023-07-31 at 04:11 ADT, Richard Kettlewell <invalid@invalid.invalid> wrote:
Jim Diamond <JimDiamond@jdvb.ca> writes:
On 2023-07-30 at 14:59 ADT, William Unruh <unruh@invalid.ca> wrote:
How much RAM do you have?
My plasmashell right now is 2GB VSZ
I don't think the VSZ is all that significant.
Indeed VSZ tells you very little - for instance it includes a 2MB dead
page in the middle of every shared library, which (in a complex process)
scales up VSZ to something huge, but consumes almost no real resources.
What is the RSS?
I think the theories about user space memory leaks are a red herring.
The reported behavior is just not consistent with them.
I have not followed the thread too closely, but I suspect you are right.
But when I saw someone reporting VSZ, I thought that they should know that
it wasn't meaningful, just in case they keep barking up that tree.
Yes, thanks for that. VSZ seemed much too large to be reasonable, but I
could not see anything else in the ps aux report that could be the
actual memory used (except %MEM, and what it was a percentage of I had no
idea.)
On 7/31/2023 12:47 PM, William Unruh wrote:
On 2023-07-31, Jim Diamond <JimDiamond@jdvb.ca> wrote:
On 2023-07-31 at 04:11 ADT, Richard Kettlewell <invalid@invalid.invalid> wrote:
Jim Diamond <JimDiamond@jdvb.ca> writes:
On 2023-07-30 at 14:59 ADT, William Unruh <unruh@invalid.ca> wrote:
How much RAM do you have?
My plasmashell right now is 2GB VSZ
I don't think the VSZ is all that significant.
Indeed VSZ tells you very little - for instance it includes a 2MB dead
page in the middle of every shared library, which (in a complex process)
scales up VSZ to something huge, but consumes almost no real resources.
What is the RSS?
I think the theories about user space memory leaks are a red herring.
The reported behavior is just not consistent with them.
I have not followed the thread too closely, but I suspect you are right.
But when I saw someone reporting VSZ, I thought that they should know that
it wasn't meaningful, just in case they keep barking up that tree.
Yes, thanks for that. VSZ seemed much too large to be reasonable, but I
could not see anything else in the ps aux report that could be the
actual memory used (except %MEM, and what it was a percentage of I had no
idea.)
Only you know the particulars of your system.
You say you're using Plasma, and "plasma freezes" shows up in Google searches.
You could check for rising RAM consumption (say, leave top running
so it is on screen when the freeze occurs). In the example here, one post
mentions that it may be a video card driver issue: the graphics stack
is producing frames, but there is no consumer.
https://bugzilla.redhat.com/show_bug.cgi?id=1399396
In some cases, the Nvidia driver has a watchdog enabled, while Nouveau
does not. This makes the Nvidia driver capable of doing a VPU recover
and restoring service before it is too late.
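One way to act on the "leave top running" idea is to append a memory sample to a file at intervals and force each line to disk, so the last sample before a hard freeze survives the power cycle. A rough sketch, assuming Linux's `/proc/meminfo`; the function names and the choice of fields are illustrative, not something recommended in the thread.

```python
import os
import time


def parse_meminfo(text):
    """Return selected /proc/meminfo fields as {name: kB}."""
    wanted = ("MemTotal", "MemAvailable", "SwapFree")
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            values[key] = int(rest.split()[0])
    return values


def log_sample(log_path, meminfo_path="/proc/meminfo"):
    """Append one timestamped sample to log_path and fsync it to disk."""
    with open(meminfo_path) as f:
        values = parse_meminfo(f.read())
    line = time.strftime("%Y-%m-%d %H:%M:%S") + " " + " ".join(
        f"{name}={kb}kB" for name, kb in sorted(values.items())
    )
    with open(log_path, "a") as out:
        out.write(line + "\n")
        out.flush()
        # Push the line past the page cache so it survives a hard reset.
        os.fsync(out.fileno())
    return line
```

Calling `log_sample()` in a loop with a `time.sleep()` of some seconds between samples, writing to a log file of your choosing, leaves a record to read after the reboot. If `MemAvailable` is still large in the last line before the freeze, memory exhaustion is an unlikely cause.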