Hello all,
I updated to the latest stable kernel, 5.15.11-gentoo, with the same configuration as the old kernel, and now the boot time is much longer.

Test:
5.10.76-gentoo-r1 kernel: boot time 30s
5.15.11-gentoo kernel: boot time 70s

My setup (non-EFI):
- 250 GB SSD: /dev/sdd1 ext2 for /boot and /dev/sdd2 ext4 for /
- 250 GB SSD: /dev/sdc1 ext4 for /home
- Two 2 TB SATA disks (Seagate BarraCuda 3.5"): /dev/sda1 ext4 for data and /dev/sdb1 ext4 for data backup (not RAID)

With the new kernel, the two Seagate disks seem to make the boot take much longer.

Test:
booting without mounting the disks: 20s
booting with mounting only one disk: 25s
booting with both disks: more than 60s

Testing the disks:
- smartctl -s on -a /dev/sda and smartctl -s on -a /dev/sdb: no errors reported
- fsck -a /dev/sda1 and fsck -a /dev/sdb1: clean
- e2fsck -cfpv /dev/sda1: clean

Nevertheless, dmesg shows a lot of errors (attached image) with the new kernel. Those errors do not appear with the 5.10.76-gentoo-r1 kernel.

I'm rather confused... Do you have any idea?
On 26/12/2021 18:50, Jacques Montier wrote:
> [...]
What does fdisk say? Are your partitions mis-aligned?

Unlikely, but it depends on how long ago they were partitioned. There's all this stuff about switching from 512B to 4K sectors, and that *could* be the problem.
Cheers,
Wol
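For reference, a minimal way to inspect alignment from userspace (device names follow the OP's setup; adjust as needed):

  fdisk -l /dev/sda
      # "Sector size (logical/physical): 512 bytes / 4096 bytes" flags a
      # 4K-sector drive; with 512-byte logical sectors, a partition whose
      # Start value is divisible by 8 is 4 KiB aligned.
  cat /sys/block/sda/alignment_offset
      # 0 means the kernel detected no alignment offset for the device.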
On Mon, 27 Dec 2021 at 01:44, Wols Lists <antlists@youngman.org.uk> wrote:
> [...]
Well, I don't know whether my partitions are aligned or mis-aligned... How could I check?
On 27/12/2021 11:07, Jacques Montier wrote:
> Well, I don't know whether my partitions are aligned or mis-aligned...
> How could I check?
fdisk would have spewed a bunch of warnings. So you're okay.

I'm not sure of the details, but it's the classic "off by one" problem: if there's a mismatch between the kernel block size and the disk block size, every write requires a read-update-write cycle, which of course knackers performance. I was hit by that a while back.

But seeing as fdisk isn't moaning, that isn't the problem ...
Cheers,
Wol
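For completeness, parted can also test alignment explicitly; a small sketch (partition number 1 is just a placeholder):

  parted /dev/sda align-check optimal 1
      # prints "1 aligned" if partition 1 honours the drive's optimal
      # I/O geometry, "1 not aligned" otherwise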
On Monday, 27 December 2021 11:32:39 GMT Wols Lists wrote:
> [...]
I also thought of misaligned boundaries when I first saw the error, but the OP's mention of Seagate pointed me to another edge case, which crept up with zstd compression on ZFS. I'm mentioning it here in case it is relevant:
https://livelace.ru/posts/2021/Jul/19/unaligned-write-command/
On 27/12/2021 13:40, Michael wrote:
> [...]

That might be of interest to me ... I'm getting system lockups, but it's not an SSD. I've got two IronWolves and a Barracuda.
But I notice the OP has a Barra*C*uda. Note the different spelling.
That's a shingled drive I believe, which shouldn't make a lot of
difference in light usage, but you don't want to hammer it!
Cheers,
Wol
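A note on checking for SMR from software: the kernel only exposes the zone model for host-visible SMR, so drive-managed disks (such as the desktop BarraCudas) report "none" and can only be identified from the model number on the vendor's spec sheet. A quick probe:

  cat /sys/block/sda/queue/zoned
      # "none"                        -> conventional drive, or drive-managed
      #                                  SMR (invisible to the host)
      # "host-aware" / "host-managed" -> SMR with host-visible zones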
Wols Lists wrote:
> [...]

I don't recall seeing this mentioned, but this may be part of the issue, unless I'm missing something that rules it out. Could it be that one of the drives is an SMR drive?

I recently made a new backup after wiping the backup drive, which I know is an SMR drive. At first it copied at a fairly normal speed, but after a short while it started slowing down; at times it would do only about 50 to 60 MB/s, after starting out at well over 100 MB/s, which is fairly normal for this rig. I would stop the copy process, let the drive catch up, and restart, just to give it some time to process. I can't say it was any faster that way, though.
The way I noticed my drive was SMR: I could feel the heads going back and forth by putting my hand on the enclosure. It had a bumpy feel to it. You can't really hear it, though. If you can feel those little bumps even when the drive isn't mounted, I'd suspect it is an SMR drive. There are also sites where you can look this sort of thing up; if needed, I can dig out some links.

Just thought it worth a mention.
Dale
:-) :-)
A point to keep in mind: if you can feel the drive moving, it may be generating errors! Depending on the drive, the errors may just be handled internally, and I can see that slowing things down, though it would probably be barely noticeable. I have seen it myself, with random errors from a WD Green drive disappearing once it was properly immobilised. When investigating, I ran across articles discussing the problem, one of which fastened the drives to a granite slab for tests! Also see discussions on NAS setups and vibrations affecting co-located drives.

BillK

** Interesting read: https://www.ept.ca/features/everything-need-know-hard-drive-vibration/
William Kenworthy wrote:
> [...]
This is just because it is an SMR drive. It's done this ever since I bought it, and it has passed all tests. There's a whole thread on this dating back several years. I managed to buy an SMR drive before I even knew they existed. Once it fills up that PMR section, it gets really slow.
Dale
:-) :-)
On Tue, 28 Dec 2021 at 13:32, Dale <rdalek1967@gmail.com> wrote:
> [...]

Hello all,
Thanks a lot for all your responses!

I think this issue is kernel-related. No problem with 5.10.76-gentoo-r1, but the issue appears with 5.15.11-gentoo.

I read on the net that it is possible to disable the SATA NCQ protocol (Native Command Queuing). So, in the GRUB file, I added the line GRUB_CMDLINE_LINUX=libata.force=noncq. Now all the error messages are gone, and the boot time is down to 24s with both kernel versions.

BUT: do you think this could damage or slow down my SSD and HDD disks?
Thanks again,
Regards,
--
Jacques
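For what it's worth, disabling NCQ doesn't damage anything: commands are simply issued one at a time, which mainly costs throughput under concurrent I/O (most noticeable on SSDs). The setting can be checked per device at runtime, and libata.force can be scoped to individual ports rather than every disk; a sketch, where the port numbers are hypothetical and should be matched against the ataN numbers in dmesg:

  cat /sys/block/sda/device/queue_depth
      # 1 = NCQ disabled; 31 (or 32) = NCQ active
  echo 1 > /sys/block/sda/device/queue_depth
      # disables NCQ for a single disk at runtime, no reboot needed

  GRUB_CMDLINE_LINUX="libata.force=1.00:noncq,2.00:noncq"
      # hypothetical: applies noncq only to ata1 device 0 and ata2
      # device 0, leaving NCQ enabled on the SSDs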
Me again!

Well, I cleaned my dusty mobo and unplugged and re-plugged the SATA cables. Now, with or without NCQ, boot time is rather short (~28s). So it seems it was a connection problem.

I still have some errors such as:

[ 24.708377] ata6.00: ...

To be sure, I'll buy some new SATA cables.

Sorry for the noise, and thanks again for having helped me.

Regards,
--
Jacques
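One way to watch whether the remaining errors are link-level (which would point at cables or connectors rather than the drives) is to filter the kernel log for libata traffic and compare it with the drive's own error log; a sketch, with /dev/sda standing in for the affected disk:

  dmesg | grep -E 'ata[0-9]'
      # look for SError flags, "hard resetting link", "failed command"
  smartctl -l error /dev/sda
      # the drive-side error log; interface CRC (ICRC) errors here also
      # point at cabling rather than the platters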
On Mon, Dec 27, 2021 at 08:15:51 -0600, Dale wrote:
> I don't recall seeing this mentioned, but this may be part of the issue,
> unless I'm missing something that rules it out. Could it be that one of
> the drives is an SMR drive?
SMR may slow down drive response time and throughput, but it should never generate I/O errors in the syslog. If reseating or swapping the SATA cables does not help, then I'd suspect the drive is going bad. A long self-test might be in order (smartctl -t long); smartctl -a shows approximately how long it will take (it's rather accurate).

For my PC's rust drive (1 TB WD Blue) it says:
Extended self-test routine
recommended polling time: ( 113) minutes.
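In practice the sequence looks like this (a sketch; /dev/sda stands in for whichever drive is being tested):

  smartctl -t long /dev/sda
      # starts the extended self-test; it runs inside the drive, so the
      # shell returns immediately
  smartctl -a /dev/sda | grep -A1 'Self-test execution'
      # shows the percentage remaining while the test runs
  smartctl -l selftest /dev/sda
      # shows the self-test result log once it completes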
Frank Steinmetzger wrote:
> [...]
If it helps any, my 6 TB drive takes around 700 minutes. My 8 TB drive takes around 1200 minutes.
Yea, over two days. O_O
Dale
:-) :-)
On Sun, Jan 02, 2022 at 13:38:01 -0600, Dale wrote:
> If it helps any, my 6 TB drive takes around 700 minutes. My 8 TB drive
> takes around 1200 minutes.

Same for my 6 TB Reds in the NAS. But 1200 is a rather big increase; did you ever try it? Almost double the time for only one third more capacity.

I suspect that internally the drive can run the long self-test in parallel -- all platters at the same time. But when going from CMR to SMR, platter count does not grow linearly with capacity. So the drive may have ⅓ more capacity, but the number of platters stayed the same.

> Yea, over two days. O_O

Uhm, even without a calculator, I challenge that: 1 hour is 60 minutes, so 10 hours = 600 minutes, making 1200 minutes a mere 20 hours. ;-)