I have an "image" iso, that was copied OK to my server, but now when I
try to either scp it to another server or copy it locally my server crashes/panics.
S11.4
Supermicro H/W (X11SSSZ-F), I5 7400
48Gb
File size is 16Gb
*always* panics at 9.1Gb copied
Not a prod box.
runs a number of zones (including a KZ)
Just wondering if a known issue, or one that needs chasing.
I'll try and test with vbox, when I have created another large image
file :-)
And will also test my T4
On 10/12/2021 18:03, Doug McIntyre wrote:
Most likely it is bad hardware. I've certainly dealt with many files
(including ISOs) larger than 16GB in size on Solaris boxes, as have
a bazillion other people.
If you are running ZFS, what does 'zpool status' show? I'm guessing
you'd see errors there. On a healthy pool the READ, WRITE and CKSUM
columns should all be zeros.
On 11/12/2021 09:50, YTC#1 wrote:
That was my first port of call.
---8<
  pool: rpool
    id: 7278334453663277700
 state: ONLINE
  scan: scrub repaired 0 in 3h58m with 0 errors on Tue Nov 23 13:32:11 2021
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c1t0d0  ONLINE       0     0     0
            c4t2d0  ONLINE       0     0     0
---8<
I've just started another scrub, as I have realised the file was copied
to the system after Sunday.
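
The "should be all zeros" check on the zpool status output can also be scripted. This is only a minimal Python sketch, assuming the `zpool status` text has already been captured into a string; the function name `nonzero_error_rows` is made up for illustration and is not part of any ZFS tooling.

```python
def nonzero_error_rows(zpool_status_text):
    """Return (name, read, write, cksum) for config rows with a non-zero counter.

    Counters are kept as strings because some zpool versions abbreviate
    large values (an assumption; plain integers work the same way).
    """
    flagged = []
    in_config = False
    for line in zpool_status_text.splitlines():
        fields = line.split()
        # The device table starts after the NAME/STATE/READ/WRITE/CKSUM header.
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_config = True
            continue
        if not in_config or len(fields) < 5:
            continue
        name, _state, read, write, cksum = fields[:5]
        # Skip anything that is not a device row (e.g. the "errors:" line).
        if not all(c[0].isdigit() for c in (read, write, cksum)):
            continue
        if any(c != "0" for c in (read, write, cksum)):
            flagged.append((name, read, write, cksum))
    return flagged
```

An empty list corresponds to the all-zero table shown in the thread. Zero counters only mean ZFS has not logged errors against those devices; they do not by themselves rule out a drive or controller problem.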
On 11/12/2021 18:27, YTC#1 wrote:
Well, that broke it. Good style.
Fails to boot past device detection; it hangs at
pci@0,0/pci15d9,888@14/storage,c/esi@0,1 (ses 0) unknown
(nah, I don't know what the esi is either :-) )
It appears to have seen all (4) of the disks.
Looks like I need to go into debug mode tomorrow, probably try single
disk (no mirror) boots (after I bring up an inspect via PXE)
But of course my PXE boot is a zone on the server :-) (I'll have to use
my spare on my Mac :-) ).
Looks like I will have to test my backup/DR procedures then .....
On 13/12/2021 09:14, YTC#1 wrote:
After letting it "rest" and having a mull over it, I concluded it is
possibly a SATA controller issue. I have 2 controllers in the server
(built in and a PCIe card).
I disconnected all drives except a single rpool disk, connected to the
onboard SATA.
System booted.
I added a single data pool to the on board SATA.
System booted.
I added all disks to on board SATA only.
System booted.
Next test (tomorrow) will be to copy the large file again.
If it is the 2nd SATA controller, that will annoy me, as I would have
expected to just lose 1/2 my disks if that failed, not the entire system.
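
Since the failure always showed up at the same point in the copy, a sequential read that reports how far it got can help narrow things down without involving scp. This is a minimal Python sketch, not something used in the thread; the function name `first_read_error` and the 1 MiB chunk size are arbitrary choices for illustration.

```python
def first_read_error(path, chunk_size=1 << 20):
    """Read a file front to back and report how far the read got.

    Returns (bytes_read_ok, error): error is None if the whole file was
    read, otherwise the OSError raised by the failing read.
    """
    offset = 0
    # buffering=0 gives raw, unbuffered reads straight from the OS.
    with open(path, "rb", buffering=0) as f:
        while True:
            try:
                chunk = f.read(chunk_size)
            except OSError as exc:
                return offset, exc
            if not chunk:  # empty read means end of file
                return offset, None
            offset += len(chunk)
```

It would be called with the path to the large image file. If the box panics or hangs rather than returning an error, nothing is reported, so printing the offset periodically would be a sensible addition in that case.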
YTC#1 wrote:
And during the copy (scp to desktop), the following appeared in
/var/adm/messages at approx 9.1Gb copied. The copy stalled and then
continued.
---8<
Dec 14 10:12:47 ytc1 genunix: [ID 408114 kern.notice] /pci@0,0/pci15d9,888@14/storage@7 (scsa2usb2) removed
Dec 14 10:20:14 ytc1 ahci: [ID 296163 kern.warning] WARNING: ahci0: ahci port 2 has task file error
Dec 14 10:20:14 ytc1 ahci: [ID 687168 kern.warning] WARNING: ahci0: ahci port 2 is trying to do error recovery
Dec 14 10:20:14 ytc1 ahci: [ID 693748 kern.warning] WARNING: ahci0: ahci port 2 task_file_status = 0x4041
Dec 14 10:20:14 ytc1 ahci: [ID 657156 kern.warning] WARNING: ahci0: error recovery for port 2 succeed
Dec 14 10:20:14 ytc1 ahci: [ID 811322 kern.notice] NOTICE: ahci0: ahci_tran_reset_dport port 2 reset device
Dec 14 10:20:20 ytc1 ahci: [ID 296163 kern.warning] WARNING: ahci0: ahci port 2 has task file error
Dec 14 10:20:20 ytc1 ahci: [ID 687168 kern.warning] WARNING: ahci0: ahci port 2 is trying to do error recovery
Dec 14 10:20:20 ytc1 ahci: [ID 693748 kern.warning] WARNING: ahci0: ahci port 2 task_file_status = 0x4041
Dec 14 10:20:20 ytc1 ahci: [ID 657156 kern.warning] WARNING: ahci0: error recovery for port 2 succeed
Dec 14 10:20:20 ytc1 ahci: [ID 811322 kern.notice] NOTICE: ahci0: ahci_tran_reset_dport port 2 reset device
Dec 14 10:20:25 ytc1 ahci: [ID 296163 kern.warning] WARNING: ahci0: ahci port 2 has task file error
Dec 14 10:20:25 ytc1 ahci: [ID 687168 kern.warning] WARNING: ahci0: ahci port 2 is trying to do error recovery
Dec 14 10:20:25 ytc1 ahci: [ID 693748 kern.warning] WARNING: ahci0: ahci port 2 task_file_status = 0x4041
Dec 14 10:20:25 ytc1 ahci: [ID 657156 kern.warning] WARNING: ahci0: error recovery for port 2 succeed
Dec 14 10:20:25 ytc1 ahci: [ID 811322 kern.notice] NOTICE: ahci0: ahci_tran_reset_dport port 2 reset device
Dec 14 10:20:30 ytc1 ahci: [ID 296163 kern.warning] WARNING: ahci0: ahci port 2 has task file error
Dec 14 10:20:30 ytc1 ahci: [ID 687168 kern.warning] WARNING: ahci0: ahci port 2 is trying to do error recovery
Dec 14 10:20:30 ytc1 ahci: [ID 693748 kern.warning] WARNING: ahci0: ahci port 2 task_file_status = 0x4041
Dec 14 10:20:30 ytc1 ahci: [ID 657156 kern.warning] WARNING: ahci0: error recovery for port 2 succeed
Dec 14 10:20:30 ytc1 ahci: [ID 811322 kern.notice] NOTICE: ahci0: ahci_tran_reset_dport port 2 reset device
Dec 14 10:20:35 ytc1 ahci: [ID 296163 kern.warning] WARNING: ahci0: ahci port 2 has task file error
Dec 14 10:20:35 ytc1 ahci: [ID 687168 kern.warning] WARNING: ahci0: ahci port 2 is trying to do error recovery
Dec 14 10:20:35 ytc1 ahci: [ID 693748 kern.warning] WARNING: ahci0: ahci port 2 task_file_status = 0x4041
Dec 14 10:20:35 ytc1 ahci: [ID 657156 kern.warning] WARNING: ahci0: error recovery for port 2 succeed
Dec 14 10:20:36 ytc1 ahci: [ID 811322 kern.notice] NOTICE: ahci0: ahci_tran_reset_dport port 2 reset device
Dec 14 10:20:41 ytc1 ahci: [ID 296163 kern.warning] WARNING: ahci0: ahci port 2 has task file error
Dec 14 10:20:41 ytc1 ahci: [ID 687168 kern.warning] WARNING: ahci0: ahci port 2 is trying to do error recovery
Dec 14 10:20:41 ytc1 ahci: [ID 693748 kern.warning] WARNING: ahci0: ahci port 2 task_file_status = 0x4041
Dec 14 10:20:41 ytc1 ahci: [ID 657156 kern.warning] WARNING: ahci0: error recovery for port 2 succeed
Dec 14 10:20:41 ytc1 ahci: [ID 811322 kern.notice] NOTICE: ahci0: ahci_tran_reset_dport port 2 reset device
Dec 14 10:20:41 ytc1 scsi: [ID 583609 kern.warning] WARNING: /pci@0,0/pci15d9,888@17/disk@2,0 (sd8): disk not responding to selection
---8<
sd8 is the rpool mirror disk, which had been attached to the PCIe SATA
controller.
No issue now when copying from internal disk to internal disk (rpool to
data pool).
Recopied from server to desktop (scp); the messages did not reappear.
So, I guess I am looking at an HDD issue, time to buy a new one. Or maybe upgrade to SSD :-)
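
The repeated task_file_status = 0x4041 value in the messages above can be unpacked a little further. The Python sketch below assumes the ahci driver is printing the raw AHCI port task file data value, with the ATA status register in bits 7:0 and the ATA error register in bits 15:8; the bit names are the usual ATA ones and only a subset is listed. The function names `decode_task_file_status` and `summarize` are invented for this example.

```python
import re
from collections import Counter

# Subset of ATA status / error register bits (assumed layout, see above).
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}

def decode_task_file_status(value):
    """Split a task_file_status value into named status and error bits."""
    status = value & 0xFF
    error = (value >> 8) & 0xFF
    return {
        "status": [name for bit, name in STATUS_BITS.items() if status & bit],
        "error": [name for bit, name in ERROR_BITS.items() if error & bit],
    }

# \s+ lets the pattern match even when a mail client wrapped the log line.
TFS_RE = re.compile(
    r"(ahci\d+):\s+ahci\s+port\s+(\d+)\s+task_file_status\s*=\s*(0x[0-9a-fA-F]+)"
)

def summarize(log_text):
    """Count task_file_status reports per (controller, port, value)."""
    counts = Counter()
    for m in TFS_RE.finditer(log_text):
        counts[(m.group(1), int(m.group(2)), int(m.group(3), 16))] += 1
    return counts
```

Under those assumptions 0x4041 decodes to status DRDY and ERR with error bit UNC (uncorrectable data), which would be consistent with a failing disk rather than a cabling or controller fault, though it is not proof on its own.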