Hi. So, foolish me, I decided to go from a working 5.10.155 system to
the latest LTS release of 5.15, which is 5.15.93. Compile and install
went well, but the system keeps rebooting. It gets all the way through
boot and even starts the local services. Here are the last few lines,
which may or may not be relevant:
Feb 14 06:36:31 ccs.covici.com systemd[1]: Starting local.service...
Feb 14 06:36:31 ccs.covici.com systemd[1]: Starting systemd-update-utmp-runlevel.service...
Feb 14 06:36:31 ccs.covici.com bash[5753]: rm: cannot remove '/etc/ppp/provider_is_up': No such file or directory
Feb 14 06:36:31 ccs.covici.com systemd[1]: systemd-update-utmp-runlevel.service: Deactivated successfully.
Feb 14 06:36:31 ccs.covici.com systemd[1]: Finished systemd-update-utmp-runlevel.service.
-- Boot 5c394be675854680a9cb616208f374f3 --
Any troubleshooting suggestions as to what is making the system
reboot?
On Tue, Feb 14, 2023 at 9:08 AM John Covici <covici@ccs.covici.com> wrote:
Any trouble shooting suggestions as to what is making the system
reboot?
Where are you getting this from, the system log/journal? This doesn't
seem like a clean shutdown, so if it is a kernel PANIC I wouldn't
expect the most critical info to be in the log (since it will stop
syncing to protect the filesystem). The details you need probably
will be displayed on the console briefly. You can also enable a
network console, which will send the dmesg output continuously over
UDP to another device. This won't be interrupted by a PANIC unless
there is some issue with the hardware or networking stack.
If you can get the final messages on dmesg and the panic core dump
that would help.
The other thing you can do is try to capture a kernel core dump, but
that is a bit more complicated to set up.
Otherwise your log is just going to say that everything was fine until
it wasn't.
On Tue, 14 Feb 2023 14:08:34 -0500,
Rich Freeman wrote:
will be displayed on the console briefly. You can also enable a
network console, which will send the dmesg output continuously over
UDP to another device.
OK, how would I set up logging to a network and what would I have to
do on another computer -- which in my case is Windows?
On Tue, Feb 14, 2023 at 2:54 PM John Covici <covici@ccs.covici.com> wrote:
Sounds great -- I notice you omitted the IP address, my network
OK, how would I set up logging to a network and what would I have to
do on another computer -- which in my case is Windows?
The docs are at: https://www.kernel.org/doc/Documentation/networking/netconsole.txt
(you can also google for linux netconsole for some wiki articles on it)
I have on my command line: netconsole=@/,6666@10.1.0.52
That IP is the host I want the log traffic to go to. (Read the docs
if you have a more complicated networking setup - I assume that will
just run ARP and send stuff out without using a gateway/etc.)
Then on a receiving Linux host I'd run (I think - it has been a while):
nc -u -l -p 6666
Now, you mentioned Windows. I've never used it, but nmap has a
program called ncat, available in a Windows version, that might do the
job: https://nmap.org/ncat/
You just want to make sure you have it listening on port 6666 for UDP.
Make sure you use UDP or you won't receive anything.
If it is working you should get a ton of log spam when your host boots
- anything that shows up in dmesg will show up in the network console.
It is sent in realtime.
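If no netcat build is handy on the receiving box (Windows included), the same listening step can be sketched in Python. This is only an illustration and not something from the thread: the function names `make_listener`, `iter_messages` and `main` are made up here, and it assumes the sender targets UDP port 6666 as in the command line above.

```python
import socket


def make_listener(bind_ip="0.0.0.0", port=6666):
    """Create a UDP socket bound to bind_ip:port (0.0.0.0 = all local addresses)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_ip, port))
    return sock


def iter_messages(sock):
    """Yield (sender_ip, text) for every datagram that arrives on sock."""
    while True:
        data, (src_ip, _src_port) = sock.recvfrom(65535)
        # Kernel log lines are mostly ASCII; replace anything undecodable.
        yield src_ip, data.decode("utf-8", errors="replace")


def main():
    """Print incoming log lines until interrupted with Ctrl-C."""
    listener = make_listener()
    try:
        for src_ip, text in iter_messages(listener):
            print(f"[{src_ip}] {text}", end="" if text.endswith("\n") else "\n")
    except KeyboardInterrupt:
        pass
    finally:
        listener.close()
```

Calling `main()` from a Python prompt on the receiving box starts the listener. A local firewall still has to allow inbound UDP on that port.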
On 2023-02-14, Rich Freeman <rich0@gentoo.org> wrote:
Where are you getting this from, the system log/journal? This doesn't
seem like a clean shutdown, so if it is a kernel PANIC I wouldn't
expect the most critical info to be in the log (since it will stop
syncing to protect the filesystem). The details you need probably
will be displayed on the console briefly. You can also enable a
network console, which will send the dmesg output continuously over
UDP to another device. This won't be interrupted by a PANIC unless
there is some issue with the hardware or networking stack.
If you've got a serial port[1], you could also set up serial
logging. Though using serial ports has become a bit of a lost art,
the serial console code in the kernel is pretty carefully designed to
be the last man standing when things start to die. It's possible
(though I wouldn't say probable) that a serial console will be able to
show you stuff closer to the event horizon than a network console can.
Anyway, since I'm still in the serial port business (yes, there are
still plenty of people using serial ports in industrial settings) I
had to mention it...
[1] For this purpose you want a plain old UART on the motherboard type
serial port. You'd be surprised how many motherboards still have
them. Even though they're never brought out to a DB9 connector on
the back panel, there's often an 8-pin header on the edge of the
board somewhere, so you'd need one of these:
https://www.amazon.com/C2G-27550-Adapter-Bracket-Motherboards/dp/B0002J27R8/
The sending computer has two NICs: eno1 for the internal network and
eno2 on the internet. So my netconsole stanza said:
netconsole=@192.168.0.1/eno1,@192.168.0.2
The box at 192.168.0.2 has netcat (Windows version), and I
tried the following:
netcat -u -v -l 192.168.0.2 6666
I also tried 192.168.0.1 6666, which is the IP address of the Linux
machine I am trying to debug.
I also tried 0.0.0.0 6666, which did not work either, but I think the
Windows firewall was blocking it. I did fix that, but did not try
0.0.0.0 again afterwards.
On Thu, Feb 16, 2023 at 6:50 AM John Covici <covici@ccs.covici.com> wrote:
The sending computer has two nics, eno1 for the internal network and
eno2 is on the internet. So, my netconsole stanza said netconsole=@192.168.0.1/eno1,@192.168.0.2
Is CONFIG_NETCONSOLE enabled for your kernel?
I'm not sure if the kernel will assign the names eno1/2 to interfaces
- I think those might be assigned by udev, which probably won't have
run before the kernel parses this instruction. You might need to use
eth0/1 - and your guess is as good as mine which one corresponds to
which.
If it isn't one of those it might not hurt to put the target MAC
address in there just to be safe. I haven't needed that but maybe
there are situations where ARP won't work (it would be needed if you
are crossing subnets, in which case you'd need the gateway MAC). Keep
in mind that this is a low-level function that doesn't use any
routing/userspace/etc. It was designed to be robust in the event of a
PANIC and to be able to be enabled fairly early during boot, so it
can't rely on the sorts of things we just take for granted with
networking.
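To make the pieces of that boot parameter easier to see, here is a small sketch that assembles one from its fields. It follows the layout given in the kernel's netconsole documentation, `[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]`, where omitted fields fall back to the kernel's defaults. The helper name `netconsole_param` and its argument names are made up for illustration. The documented examples keep the slash after the target IP even when no MAC follows, so the helper always emits it. Whether the parser accepts the form without it is not something this sketch checks.

```python
def netconsole_param(target_ip, target_port=None, target_mac=None,
                     source_ip=None, source_port=None, device=None):
    """Build a netconsole= kernel parameter string.

    Only target_ip is required; every other field is left empty when
    not given, which leaves the choice to the kernel's defaults.
    """
    src = f"{source_port or ''}@{source_ip or ''}/{device or ''}"
    tgt = f"{target_port or ''}@{target_ip}/{target_mac or ''}"
    return f"netconsole={src},{tgt}"


# Target only, explicit port (the simple single-subnet case).
print(netconsole_param("10.1.0.52", target_port=6666))

# Explicit source address and kernel-assigned interface name (eth1 is a guess).
print(netconsole_param("192.168.0.2", source_ip="192.168.0.1", device="eth1"))
```

The interface name has to be whatever the kernel itself calls the device at the time netconsole initializes, which may differ from the name udev assigns later.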
The box which is at 192.168.0.2 has netcat (windows version) and I
tried the following:
netcat -u -v -l 192.168.0.2 6666 and I also tried 192.168.0.1 6666
which is the ip address of the linux console which I am trying to
debug.
I also tried 0.0.0.0 6666 which did not work either, but I think the windows firewall was blocking, and I did fix that, but did not try the 0.0.0.0 after that.
So I'm pretty sure that netcat requires listing the destination IP,
since it has to open a socket to listen on that IP. You can
optionally set a source address/port in which case it will ignore
anything else, but by default it will accept packets from any source.
I was definitely going to suggest making sure that the Windows firewall
wasn't blocking the inbound connections. That's fairly default
behavior on Windows.
Hmmm, but what should I use for the source IP? I only assign addresses
when I bring the interface up -- I have something like this:
[Unit]
Description=Network Connectivity for %i
...
So, before I run this, I don't think the card has any IP address, does
it?
-----Original Message-----
From: John Covici <covici@ccs.covici.com>
Sent: Wednesday, February 15, 2023 7:20 AM
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Re: my 5.15.93 kernel keeps rebooting
On Wed, 15 Feb 2023 09:50:27 -0500,
Grant Edwards wrote:
If you've got a serial port[1], you could also set up serial logging.
Though using serial ports has become a bit of a lost art, the serial
console code in the kernel is pretty carefully designed to be the last
man standing when things start to die. It's possible (though I
wouldn't say probable) that a serial console will be able to show you
stuff closer to the event horizon than a network console can.
Anyway, since I'm still in the serial port business (yes, there are
still plenty of people using serial ports in industrial settings) I
had to mention it...
[1] For this purpose you want a plain old UART on the motherboard type
serial port. You'd be surprised how many motherboards still have
them. Even though they're never brought out to a DB9 connector on
the back panel, there's often an 8-pin header on the edge of the
board somewhere, so you'd need one of these:
https://www.amazon.com/C2G-27550-Adapter-Bracket-Motherboards/dp/B0002J27R8/
I do have one which I use for my speech synthesizer. I also have one
on my other box which I could hook up -- if I can find my null modem
cable. I think I will try the netconsole first and the serial console
if that does not work.
Thanks for the hint.
https://wiki.gentoo.org/wiki/Kernel_Crash_Dumps is another option if
you're somehow not getting enough information out of the console. More
complex to set up, but you can take an actual debugger to the result
and hopefully find out exactly what's going on.
Well, some progress, but no joy. I found actual messages from
netconsole, and it seems that no matter what device I put for the
source, netconsole says it doesn't exist. I tried eno1, and also eth0
and eth1. In my normal boot sequence I see that udev renamed eth1 to
eno1, but netconsole still said it does not exist. So I may have to
use the serial console method; I have to find my cables for that. I
did also try adding net.ifnames=0 to my boot options, but no joy
there.
--
Your life is like a penny. You're going to lose it. The question is:
How do you spend it?
John Covici wb2una
covici@ccs.covici.com
On Fri, Feb 17, 2023 at 12:03 PM John Covici <covici@ccs.covici.com> wrote: <SNIP>
John,
I did a bad job at trying to point you in this direction the other
day, and in my testing I'm not sure how well it works. However, another
option you might investigate is that on the receiving end you can
apparently set the transmitter's IP address by using the
transmitter's MAC address. Supposedly you would execute
something like the following, with extra spaces added
for readability:
sudo arp -s 192.168.86.244 90:e6:ba:10:a3:e7 temp
which supposedly says 'when you see a packet with this
MAC address associate it with this IP address'. The temp
part says don't add it to the permanent tables.
After executing this you are supposed to be able to use tools
that filter by IP address but I didn't have great results.
Hope this helps,
Mark