Thanks. I've done some reading and there is more to do plus some experimentation.
I understand that mdadm is used to create the raid arrays; it is not
part of lvm itself?
OK. Simple to set up with the SSDs as one is blank and the other has
free space.
With RAID-10 far 2 am I correct in assuming that the available capacity
for a two device array of 3TB disks would be 3TB (two copies of data)?
Part of the RAID-10 with SSD can be used as cache.
OK, so having done some reading up but not carried out any testing I
think the following is a possible setup. Note I am using LUKS so I will
add this extra layer to the mix. I have not noticed any significant performance difference with and without it.
SSDs:
Create empty partition on larger system SSD.
Add smaller empty SSD.
Create a raid 10 far 2 array with mdadm, for example (need to check
what metadata means!):
mdadm --create /dev/md-ssd --level=10 --metadata=0.90 --raid-devices=2 --layout=f2 /dev/sda4 /dev/sdb
Use cryptsetup to create an encrypted block device on top of this raid
10 far 2 device.
Use LVM to create a volume group on top of the encrypted raid 10 far 2 device.
Create a logical volume on top of the above to use as the cache device.
HDD:
Similar to the SSD case above except that the logical volume will be for
the slow disks doing the bulk of the work.
Create a cached device:
Use lvmcache to create a device using the two logical volumes
created above (bcache would also work).
Create a file system on top of the cached device. If using btrfs (what
I do now) use single for data as the raid is occurring a couple of levels down.
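For concreteness, I imagine the whole SSD-side stack looking roughly
like this (an untested sketch; the device, VG and LV names and the
sizes are placeholders I've made up, not tested values):

mdadm --create /dev/md/r10ssd --level=10 --layout=f2 --raid-devices=2 /dev/sda4 /dev/sdb1
cryptsetup luksFormat /dev/md/r10ssd         # encrypt the whole array
cryptsetup open /dev/md/r10ssd crypt-ssd     # exposes /dev/mapper/crypt-ssd
pvcreate /dev/mapper/crypt-ssd               # LVM on top of the LUKS device
vgcreate vg-ssd /dev/mapper/crypt-ssd
lvcreate -L 100G -n cache vg-ssd             # LV to serve as the cache later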
Of course, I could have got completely the wrong idea above and invented
some horrid monster!
It looks like I could do this without the LVM layer between the LUKS and cache parts. However, this does give me the flexibility to create
logical volumes that are on HDD RAID10, SSD RAID10 solely as well as
cached. Or different cache options. I think I could have put another
LVM on top of the cached item I was creating but I think that was
overdoing it.
On 2016-12-10, Peter Chant <pete@petezilla.co.uk> wrote:
Is there a reason why you gave that one as an example, chipset,
manufacturer or was it simply an inexpensive board with useful specs?
Just an example of a low-cost SATA III board. There are plenty around.
I used one similar to this for my SSD before upgrading to a board that
had built-in SATA III.
It sounds like your bottleneck is the mechanical drives. To significantly speed up you'll need faster drives, or at least a faster type of RAID array.
I'm wondering if there are any cheap / easy speed ups for an IO bound machine?
ASUS M4A78 Pro motherboard
Phenom II x6 1090T processor (fastest or second fastest that will
go in this mobo)
8GB RAM
480GB SSD for system files, btrfs, single.
2x 3TB WD red HDD as btrfs RAID1 for data.
Also barely used DVD burner.
I use the on-board graphics.
Slackware 14.1 (in process of 14.2 upgrade) with 4.8.5 kernel.
Using atop etc. it seems that the machine is IO bound. I also notice that
if apache / php / mariadb, which are running on the same machine, are churning away and unresponsive, I can still browse the internet with
little impact, which makes me think it is a local IO thing.
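One way I can double-check this, assuming the sysstat tools are
installed, is to watch per-device utilisation and per-process I/O:

iostat -dxm 5    # %util and await per disk, every 5 seconds
pidstat -d 5     # which processes are doing the reads/writes

If %util sits near 100% on the HDDs while the CPU idles, that would
support the IO-bound theory.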
It would also be possible to create two PVs out of the two
RAID-10 and a single VG with both.
Then the LV can be fitted in one or the other PV.
LUKS can be at RAID level or on top of LV.
If I understand correctly: One mdadm raid10 far 2 from SSDs, one from
HDDs and then add them BOTH to one single VG, and then build the cached
file system from that one volume group? I did not think it worked like
that, as you'd have to control which disks the cache was on.
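For what it's worth, lvcreate does take an optional list of PVs at the
end of the command, which looks like it gives exactly that control
(names below are invented):

lvcreate -n bulk -L 2T vg0 /dev/md/r10hdd      # allocate only from the HDD PV
lvcreate -n cache -L 64G vg0 /dev/md/r10ssd    # allocate only from the SSD PV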
My first thought is to ask what the machine is being used for. It is
not normal to have apache/php/mariadb and firefox/browsing on the same machine. Usually you have either a server (and therefore no desktop processes), or a desktop (and therefore no server processes of
particular relevance). So what are you doing with the system? What is
it that is using the I/O ? Is the machine really too slow, and what do
you /feel/ is slow on it?
My second thought is that often the best way to improve I/O performance
is to add more ram. Is that a possibility with your hardware?
On 12/12/2016 08:11 AM, David Brown wrote:
My first thought is to ask what the machine is being used for. It is
not normal to have apache/php/mariadb and firefox/browsing on the same
machine. Usually you have either a server (and therefore no desktop
processes), or a desktop (and therefore no server processes of
particular relevance). So what are you doing with the system? What is
it that is using the I/O ? Is the machine really too slow, and what do
you /feel/ is slow on it?
Home desktop / server. I have mariadb on it and the easiest way for a
front end for some stuff I am doing seemed to be a lamp stack.
Also have mediatomb on it and serve files to mpd on a raspberry pi.
Mediatomb seems to churn the disk sometimes. Not installed right now.
My second thought is that often the best way to improve I/O performance
is to add more ram. Is that a possibility with your hardware?
I've got 8GB and about 40% acts as a disk cache. I could try hunting
down 16GB on ebay but I paid full price a year or so ago for the 8GB.
Looking at kinfocentre right now I have 19% of my memory free, 35% in
disk cache and 44% application data. So would adding more ram do
anything but add to free memory?
On 2016-12-10 21:44, Peter Chant wrote:
[...]
Well, I've not taken any formal stats but a good part of the load is
mariadb at the same time as apache / php and firefox. I'd assume random.
This kind of load can fit on SSD, or on HDD with a cache on SSD.
[...]
I'm a bit confused here. My understanding is that RAID10 requires four
disks, that they are striped in pairs and the pairs then mirror each
other. So I can't do that if I have two drives.
Look at the Linux MD RAID-10 documentation.
You'll see any number, even odd, will do.
Here some reference:
https://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10
In the case of RAID-10 with 2 disks and layout "far 2",
the array combines RAID-0 and RAID-1.
So the RAID will survive the failure of 1 disk,
but the read performance is that of RAID-0.
In your case, I'd combine the two HDDs into one
RAID-10 and the two SSDs into another RAID-10.
Both with layout "far 2".
Part of the RAID-10 with SSD can be used as cache.
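As a sketch (device names are placeholders,
and the metadata question still applies):

mdadm --create /dev/md/r10hdd --level=10 --layout=f2 --raid-devices=2 /dev/sdc1 /dev/sdd1
mdadm --create /dev/md/r10ssd --level=10 --layout=f2 --raid-devices=2 /dev/sda4 /dev/sdb1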
[...]
Hmm. bcache is simpler as a starting point.
Well, not really.
If you already use LVM, then lvmcache is easier,
because you can add and remove the cache on an
LVM volume on the fly.
With bcache, you have to set it up from
the beginning.
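Roughly, following lvmcache(7)
(the VG/LV names are placeholders):

lvcreate --type cache-pool -L 64G -n cpool vg0 /dev/md/r10ssd   # pool on the fast PV
lvconvert --type cache --cachepool vg0/cpool vg0/bulk           # attach to an existing LV
lvconvert --splitcache vg0/bulk                                 # detach again, origin intact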
bye,
On 2016-12-11 12:54, Peter Chant wrote:
[...]
Thanks. I've done some reading and there is more to do plus some
experimentation.
Experimentation is good.
It is the only way to get some idea of the
different peculiarities.
I understand that mdadm is used to create the raid arrays; it is not
part of lvm itself?
LVM (dmraid) and md share a lot of code, but I'm
not sure about this RAID-10. Maybe it is only md.
[...]
OK. Simple to set up with the SSDs as one is blank and the other has
free space.
Be careful, it is easy to destroy data.
Maybe you can practice with loop devices.
There are some howtos around.
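For example, a throwaway playground
(run as root; all the names are invented):

truncate -s 1G disk0.img disk1.img
losetup -f --show disk0.img    # prints the loop device, e.g. /dev/loop0
losetup -f --show disk1.img
mdadm --create /dev/md/test --level=10 --layout=f2 --raid-devices=2 /dev/loop0 /dev/loop1
# experiment, then tear down:
mdadm --stop /dev/md/test
losetup -d /dev/loop0 /dev/loop1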
With RAID-10 far 2 am I correct in assuming that the available capacity
for a two device array of 3TB disks would be 3TB (two copies of data)?
Yep.
Part of the RAID-10 with SSD can be used as cache.
OK, so having done some reading up but not carried out any testing I
think the following is a possible setup. Note I am using LUKS so I will
add this extra layer to the mix. I have not noticed any significant
performance difference with and without it.
OK, you'll have to decide at which layer LUKS fits.
There are pros and cons for each case.
SSDs:
Create empty partition on larger system SSD.
Add smaller empty SSD.
Create a raid 10 far 2 array with mdadm, for example (need to check
what metadata means!):
mdadm --create /dev/md-ssd --level=10 --metadata=0.90 --raid-devices=2
--layout=f2 /dev/sda4 /dev/sdb
Metadata 1.0, 1.1 and 1.2 are the new ones, use these,
not the 0.90, which has fewer features.
Maybe, not really sure, but partitioning /dev/sdb might
be better.
Use cryptsetup to create an encrypted block device on top of this raid
10 far 2 device.
Use LVM to create a volume group on top of the encrypted raid 10 far 2
device.
Or the other way around.
I'm not sure which is better, maybe your proposal.
Create a logical volume on top of the above to use as the cache device.
Yep, if you use lvmcache, maybe bcache can do as well.
HDD:
Similar to the SSD case above except that the logical volume will be for
the slow disks doing the bulk of the work.
OK, I think.
Create a cached device:
Use lvmcache to create a device using the two logical volumes
created above (bcache would also work).
Seems good to me.
Create a file system on top of the cached device. If using btrfs (what
I do now) use single for data as the raid is occurring a couple of levels
down.
Well, it might be that btrfs already has RAID-10.
Again, code is shared between this and md too.
Of course, I could have got completely the wrong idea above and invented
some horrid monster!
If I understood it right, it sounds OK to me.
I would, in any case, strongly suggest experimenting,
maybe, as mentioned above, with loop devices.
Not for performance, but for practising possible
combinations and layouts.
Then there is the story of the caching, which has
a different scope and performance between bcache
and lvmcache.
In your specific case, I cannot judge which is better,
but lvmcache seems easier to me.
It looks like I could do this without the LVM layer between the LUKS and
cache parts. However, this does give me the flexibility to create
logical volumes that are on HDD RAID10, SSD RAID10 solely as well as
cached. Or different cache options. I think I could have put another
LVM on top of the cached item I was creating but I think that was
overdoing it.
Probably it is.
I would try to use no more than one component at a time.
So, 1 md, 1 LUKS, 1 LVM, at maximum.
If possible, fewer.
It would also be possible to create two PVs out of the two
RAID-10 and a single VG with both.
Then the LV can be fitted in one or the other PV.
LUKS can be at RAID level or on top of LV.
This type of "generic" setup has some advantages in case
of hardware updates (easy to add / remove storage devices,
by using pvmove).
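The move itself is then simple
(sketch, invented names; LVs stay online during the move):

pvmove /dev/md/r10hdd          # migrate its extents to free space on other PVs
vgreduce vg0 /dev/md/r10hdd    # drop the emptied PV from the VG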
bye,
OK, so having done some reading up but not carried out any testing I
think the following is a possible setup. Note I am using LUKS so I will
add this extra layer to the mix. I have not noticed any significant
performance difference with and without it.
This makes me even more doubtful that you (the OP) really have an I/O problem, or have identified where it is.
Create a raid 10 far 2 array with mdadm, for example (need to check
what metadata means!):
mdadm --create /dev/md-ssd --level=10 --metadata=0.90 --raid-devices=2
--layout=f2 /dev/sda4 /dev/sdb
That looks like you are raiding a partition on one device with the
entire second device. Are you sure that is what you want?
I don't think anything has been said about the sizes and partitioning of
the SSD. For smaller or cheaper SSDs, it is worth leaving a small
amount of unpartitioned space at the end of the disk to give it more flexibility in garbage collection. (Do a secure erase before
partitioning if it is not a new clean SSD.)
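For reference, the ATA secure erase is usually done with hdparm along
these lines; it wipes the entire drive, and the drive must not be in
the "frozen" state, so double-check the device name first:

hdparm -I /dev/sdX | grep -i frozen                        # must say "not frozen"
hdparm --user-master u --security-set-pass pass /dev/sdX   # set a temporary password
hdparm --user-master u --security-erase pass /dev/sdX      # issue the erase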
Metadata 1.0, 1.1 and 1.2 are the new ones, use these,
not the 0.90, which has fewer features.
Maybe, not really sure, but partitioning /dev/sdb might
be better.
Use cryptsetup to create an encrypted block device on top of this raid
10 far 2 device.
Use LVM to create a volume group on top of the encrypted raid 10 far 2
device.
Or the other way around.
I'm not sure which is better, maybe your proposal.
Neither is better, IMHO, unless you have some reason to be seriously paranoid. It is understandable why one would want to encrypt a portable machine that you use a lot while travelling, but a home desktop? Think
about whether encryption here is really a useful thing - adding layers
of complexity does not make anything faster, and it makes it a whole lot
more difficult if something goes wrong or if you want to recover your
files from a different system.
Create a logical volume on top of the above to use as the cache device.
Yep, if you use lvmcache, maybe bcache can do as well.
Again, that's unnecessary complexity for a system like this. The
benefits of lvmcache and bcache are debatable even for loads that match
them.
btrfs does not have raid10, and it does not share significant raid1 or
raid0 code with md. (It /does/ share code for calculating raid5 and
raid6 parities, but that's another issue - and don't use btrfs raid5/6
until the bugs are sorted out!).
You only want the raid1 at one level. Your choice is raid1 on btrfs for
best performance and efficiency (since only the actual useful data is mirrored, rather than the entire raw disk), or raid10,far on the md
layer (for greater large file streaming read speed). This can be a big
issue with SSDs - with btrfs raid1 you avoid initially copying over an
entire diskful of data from one device to the other.
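The btrfs raid1 choice is a one-liner at mkfs time (placeholder devices):

mkfs.btrfs -m raid1 -d raid1 /dev/sda4 /dev/sdb1    # mirror both metadata and data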
Of course, I could have got completely the wrong idea above and invented
some horrid monster!
If I understood it right, it sounds OK to me.
I would, in any case, strongly suggest experimenting,
maybe, as mentioned above, with loop devices.
Agreed.
On 13/12/16 00:08, Peter Chant wrote:
On 12/12/2016 08:11 AM, David Brown wrote:
My first thought is to ask what the machine is being used for. It is
not normal to have apache/php/mariadb and firefox/browsing on the same
machine. Usually you have either a server (and therefore no desktop
processes), or a desktop (and therefore no server processes of
particular relevance). So what are you doing with the system? What is
it that is using the I/O ? Is the machine really too slow, and what do
you /feel/ is slow on it?
Home desktop / server. I have mariadb on it and the easiest way for a
front end for some stuff I am doing seemed to be a lamp stack.
Do you mean you are running an active webserver on the same
system you are using for browsing, development, games, email, etc.?
That is a poor setup, for efficiency, reliability, and security. Of
course, economics can be a factor - but if you can afford to be playing around with SSDs and multiple disks, then you should also consider if
you should have a separate machine for the server. A small Intel NUC
with a single disk is likely to be good enough for your LAMP stack and mediatomb server, leaving your desktop free to be a desktop.
Also have mediatomb on it and serve files to mpd on a raspberry pi.
Mediatomb seems to churn the disk sometimes. Not installed right now.
My second thought is that often the best way to improve I/O performance
is to add more ram. Is that a possibility with your hardware?
I've got 8GB and about 40% acts as a disk cache. I could try hunting
down 16GB on ebay but I paid full price a year or so ago for the 8GB.
Looking at kinfocentre right now I have 19% of my memory free, 35% in
disk cache and 44% application data. So would adding more ram do
anything but add to free memory?
Yes, adding memory will make a /serious/ difference when you are trying
to work as a server and a desktop - /if/ you really are having
performance issues with I/O. But to be honest, I don't think you /are/ having I/O performance issues - I suspect you just think you are. If
you have a lot of free memory (and 19% is quite a lot, unless you have
just stopped a large process) then you are not actually doing a lot of
I/O, because disk data is cached in ram whenever there is /any/ free ram.
With more ram, writes go faster because they stay in memory for longer
and don't get flushed to disk as often. Reads go faster because it is
much more likely that the data is already in memory. You also have the option of putting /tmp on tmpfs to speed up processes that use a lot of temporary files.
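That is a single line in /etc/fstab; the size cap below is just an
example figure to tune, not a recommendation:

tmpfs  /tmp  tmpfs  defaults,size=2G,mode=1777  0  0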
But again, I would strongly suggest you try to identify what /feels/
slow, and consider how you use the machine. What are you doing in
parallel? What sort of serving are you actually doing, and is it
running in parallel with desktop usage? Why do you think your I/O is slow?
On 12/13/2016 08:29 AM, David Brown wrote:
On 13/12/16 00:08, Peter Chant wrote:
On 12/12/2016 08:11 AM, David Brown wrote:
My first thought is to ask what the machine is being used for. It is
not normal to have apache/php/mariadb and firefox/browsing on the same
machine. Usually you have either a server (and therefore no desktop
processes), or a desktop (and therefore no server processes of
particular relevance). So what are you doing with the system? What is
it that is using the I/O? Is the machine really too slow, and what do
you /feel/ is slow on it?
Home desktop / server. I have mariadb on it and the easiest way for a
front end for some stuff I am doing seemed to be a lamp stack.
Do you mean you are running an active webserver on the same
system you are using for browsing, development, games, email, etc.?
That is a poor setup, for efficiency, reliability, and security. Of
course, economics can be a factor - but if you can afford to be playing
around with SSDs and multiple disks, then you should also consider if
you should have a separate machine for the server. A small Intel NUC
with a single disk is likely to be good enough for your LAMP stack and
mediatomb server, leaving your desktop free to be a desktop.
I think my lamp stack is somewhat atypical. I'm storing numerical data
in it and doing some calcs on that. Might be good for storage but calcs
in python and php are not optimal. However, this is partly historic and partly convenience and rework would be a major pita.
However, if I get slowdowns on this fairly elderly machine with a six
core cpu and 8GB of ram, then I don't see a NUC, albeit with a newer
generation CPU, being much faster, though I admit I've never got my
hands on one. Plus there is not room for the two hdds plus the ssd OS disk.
Using this machine as the server and the nuc as the desktop would make
more sense.
I did think about this in the past, or getting a nice laptop / docking station combination and a server setup.
The system is fairly responsive right now but I am not hitting the HDDs
right now using thunderbird as I type this. This is with duperemove
hitting the HDDs hard and I'd not consider doing anything else that hit
the HDDs. Incidentally application data is now 72% of physical memory
and disk cache 24-25% with the remainder few % free.
Also have mediatomb on it and serve files to mpd on a raspberry pi.
Mediatomb seems to churn the disk sometimes. Not installed right now.
My second thought is that often the best way to improve I/O performance
is to add more ram. Is that a possibility with your hardware?
I've got 8GB and about 40% acts as a disk cache. I could try hunting
down 16GB on ebay but I paid full price a year or so ago for the 8GB.
Looking at kinfocentre right now I have 19% of my memory free, 35% in
disk cache and 44% application data. So would adding more ram do
anything but add to free memory?
Yes, adding memory will make a /serious/ difference when you are trying
to work as a server and a desktop - /if/ you really are having
performance issues with I/O. But to be honest, I don't think you /are/
having I/O performance issues - I suspect you just think you are. If
you have a lot of free memory (and 19% is quite a lot, unless you have
just stopped a large process) then you are not actually doing a lot of
I/O, because disk data is cached in ram whenever there is /any/ free ram.
Given that there is little free now the numbers I quoted earlier might
not have been representative. I did not see a noticeable difference
between 4 & 8 GB therefore I'd not considered more ram. However, if it
is really likely to make a big difference and with 16GB of DDR2 going
for between £15 and £45 on ebay then some research is warranted.
With more ram, writes go faster because they stay in memory for longer
and don't get flushed to disk as often. Reads go faster because it is
much more likely that the data is already in memory. You also have the
option of putting /tmp on tmpfs to speed up processes that use a lot of
temporary files.
I've put /tmp on tmpfs before. I have /dev/shm on tmpfs at the moment
as part of slackware's default config. Generally I've abandoned this
when compiling packages filled up /tmp and I ran out of tmp space.
Generally a failure to clean up /tmp, but some packages are large and
have a lot of dependencies.
But again, I would strongly suggest you try to identify what /feels/
slow, and consider how you use the machine. What are you doing in
parallel? What sort of serving are you actually doing, and is it
running in parallel with desktop usage? Why do you think your I/O is slow?
I used the term 'IO bound' as I've seen the HDDs hit 80-90% for long
periods yet CPU usage has been relatively low. So to me IO was the
limiting factor. Going out and spending lots of cash (not cache!) on
the latest i7 + motherboard and memory therefore I assume would not
improve the user experience whereas speeding up the existing IO would,
if possible.
The lamp load above is likely excessive but I have seen slowdowns before
with this machine. Sometimes btrfs seems to build up a backlog of stuff
to do (btrfs cleaner, transactions etc) for a while after doing
something disk intensive. But I've noticed this less lately. Btrfs has
not got a reputation for being slow although odd and specific cases do
show up on the mailing list from time to time. I'm not planning on
swapping file systems unless to another with subvolumes and probably snapshots as subvolumes have let me organise things in a much more
logical and efficient manner since I have started using them.
I have a nagging feeling that something just is not right. However, I
need to benchmark. I also have a cheap two-port SATA III card.
If there is something odd with the disk interface (can't see what) maybe
that will shake it out. It should allow the SSD to function to its
potential anyway, so it is not a bad idea.
Unfortunately the slightly higher range 4 port SATA III PCIe x2 cards
seem limited right now, I'd have to go up quite a notch in price to
eight port / SAS cards and I'm starting to throw reasonable sums of
money at an elderly mobo / processor / ram combination with no assured outcome. However, cheap improvements and especially improvements with existing kit are definitely worth pursuing.
On 12/13/2016 09:17 AM, David Brown wrote:
OK, so having done some reading up but not carried out any testing I
think the following is a possible setup. Note I am using LUKS so I will
add this extra layer to the mix. I have not noticed any significant
performance difference with and without it.
This makes me even more doubtful that you (the OP) really have an I/O
problem, or have identified where it is.
I've had periods where the disks have been solidly at 80-90% utilisation
for many seconds yet the CPU has been lightly loaded.
Create a raid 10 far 2 array with mdadm, for example (need to check
what metadata means!):
mdadm --create /dev/md-ssd --level=10 --metadata=0.90 --raid-devices=2
--layout=f2 /dev/sda4 /dev/sdb
That looks like you are raiding a partition on one device with the
entire second device. Are you sure that is what you want?
Well, if raiding SSDs there is some logic to this horrible looking asymmetric setup. I'm using an SSD for the OS and that has plenty of
free space. I've also the older smaller SSD it replaced. Although it
looks messy I could free up some space on the current system SSD to use
a partition for RAID and use that in combination with the old, currently unused SSD. That saves shelling out for another SSD if I want to RAID a pair. Aesthetically it is horrid and obviously would impact the speed
of OS access. But it is hardware I have so the monetary cost is no
issue. The time and hair loss cost might not be so trivial.
Would image the SSD in case of mess ups before resizing partitions.
As for the whole of /dev/sdb - I've been using btrfs for a while and it
is normal to give it whole disks, a simple slip of the finger.
I don't think anything has been said about the sizes and partitioning of
the SSD. For smaller or cheaper SSDs, it is worth leaving a small
amount of unpartitioned space at the end of the disk to give it more
flexibility in garbage collection. (Do a secure erase before
partitioning if it is not a new clean SSD.)
Have done that already.
Metadata 1.0, 1.1 and 1.2 are the new ones, use these,
not the 0.90, which has fewer features.
Maybe, not really sure, but partitioning /dev/sdb might
be better.
Use cryptsetup to create an encrypted block device on top of this raid
10 far 2 device.
Use LVM to create a volume group on top of the encrypted raid 10 far 2
device.
Or the other way around.
I'm not sure which is better, maybe your proposal.
Neither is better, IMHO, unless you have some reason to be seriously
paranoid. It is understandable why one would want to encrypt a portable
machine that you use a lot while travelling, but a home desktop? Think
about whether encryption here is really a useful thing - adding layers
of complexity does not make anything faster, and it makes it a whole lot
more difficult if something goes wrong or if you want to recover your
files from a different system.
Well, in this day and age it seemed like a reasonable idea.
Create a logical volume on top of the above to use as the cache device.
Yep, if you use lvmcache, maybe bcache can do as well.
Again, that's unnecessary complexity for a system like this. The
benefits of lvmcache and bcache are debatable even for loads that match
them.
So it is about as fast as it will get now?
I've a spare hdd and ssd so I'll have a play when I get time. Need to
think about a useful benchmark.
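Maybe fio; it can imitate a mixed database-style load better than a
plain dd. A sketch, where the numbers are starting guesses to tune
rather than recommendations:

fio --name=dbsim --filename=/path/to/testfile --size=4G \
    --rw=randrw --rwmixread=70 --bs=4k \
    --ioengine=libaio --iodepth=16 --direct=1 \
    --runtime=60 --time_based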
btrfs does not have raid10, and it does not share significant raid1 or
raid0 code with md. (It /does/ share code for calculating raid5 and
raid6 parities, but that's another issue - and don't use btrfs raid5/6
until the bugs are sorted out!).
Oh yes it does. :-). But I don't see anything about far 2.
You only want the raid1 at one level. Your choice is raid1 on btrfs for
best performance and efficiency (since only the actual useful data is
mirrored, rather than the entire raw disk), or raid10,far on the md
layer (for greater large file streaming read speed). This can be a big
issue with SSDs - with btrfs raid1 you avoid initially copying over an
entire diskful of data from one device to the other.
Replacing a disk and the associated balance took a week.
I'm at RAID1 with btrfs now. Yes, not RAIDing btrfs over another RAID
as that make little sense here.
Of course, I could have got completely the wrong idea above and invented
some horrid monster!
If I understood it right, it sounds OK to me.
I would, in any case, strongly suggest experimenting,
maybe, as mentioned above, with loop devices.
Agreed.
Spare hdd and ssd. Could use loop devices but perhaps real hw is useful, though not enough to RAID.
I think my lamp stack is somewhat atypical. I'm storing numerical data
in it and doing some calcs on that. Might be good for storage but calcs
in python and php are not optimal. However, this is partly historic and
partly convenience and rework would be a major pita.
For Python, you can look at numpy and scipy for serious calculations -
if you can work with your data as homogeneous arrays then numpy will do
the calculations in fast C libraries rather than interpreted Python.
Also look at pypy or psyco as ways to speed up Python code. You may
also find that if you have heavy Python pages you are better using
Twisted as a webserver rather than Apache so that you work entirely
within the one Python process rather than starting and stopping
processes all the time.
Given that there is little free now the numbers I quoted earlier might
not have been representative. I did not see a noticeable difference
between 4 & 8 GB therefore I'd not considered more ram. However, if it
is really likely to make a big difference and with 16GB of DDR2 going
for between £15 and £45 on ebay then some research is warranted.
I have seen extra ram make an impressive difference to speed. Not long
ago a fellow developer here thought he needed a new graphics card at
about £500 because his current £300 one was too slow for the 3D
rendering he was doing. But £30 more ram doubled the speed of the
system, while a new graphics card would have made little difference.
It is often more efficient to have /tmp on tmpfs and let it spill out
into swap, than to have the /tmp directly on the disk.
A newer cpu and motherboard may not seem useful from the viewpoint of processor power, but they will have better throughput on the I/O and
faster native SATA.
On 12/15/2016 12:15 PM, David Brown wrote:
I think my lamp stack is somewhat atypical. I'm storing numerical data
in it and doing some calcs on that. Might be good for storage but calcs
in python and php are not optimal. However, this is partly historic and
partly convenience and rework would be a major pita.
For Python, you can look at numpy and scipy for serious calculations -
if you can work with your data as homogeneous arrays then numpy will do
the calculations in fast C libraries rather than interpreted Python.
Also look at pypy or psyco as ways to speed up Python code. You may
also find that if you have heavy Python pages you are better using
Twisted as a webserver rather than Apache so that you work entirely
within the one Python process rather than starting and stopping
processes all the time.
At this stage the rewrite is probably not worthwhile. I'm not sure this
bit is necessarily the bottleneck anyway. Investigation required.
On 12/15/2016 11:55 AM, David Brown wrote:
On 14/12/16 21:03, Peter Chant wrote:
On 12/13/2016 09:17 AM, David Brown wrote:
Well, the old SSD is not that old, it's a Samsung 840 Pro. Anyway, I have
it and am not otherwise using it and the same goes for a 2TB HDD. So I
can play with caching (but not necessarily RAID). So I can add them to
the machine and have a go. However, this is much larger than the
mariadb files and some others so I can try using it as a dedicated drive
for some things and see if shifting some storage to it makes a
difference. With more effort I can probably play with RAID and caching.
I've also looked up mysql optimisation and xfs is suggested. So I can
make a dedicated xfs partition and see what happens. With or without LVM...
There seem to be two schools of thought on RAID1 here. Anyway, I can
play as far as I like provided I don't waste time or money.
/If/ your old SSD is reasonably fast, you can first run a secure erase
to tell it to drop all data. Then partition it, leaving a little extra
space unused - this overprovisioning can help a lot. And use it with
btrfs raid1, not md raid1, so that only the actual useful data is
replicated.
Think I'll boot on a USB stick and unplug all other drives when trying that.
The system is on a simple partition with btrfs in single. But boot is
ext3 or 4. And there is a partition I consider the 'maintenance'
partition with a full slackware install on ext3 or 4 that I hardly touch
but is handy in case I break something. Also you can't build a kernel
from a rescue disk but you can with this which is nice.
You can reasonably argue that it is a waste of space especially on an
SSD which is expensive compared to a HDD but it is a nice comfort
blanket and doubly so on the rare occasions it is needed.
If you have a lot of data, it takes a while to copy it all over. And a
re-balance actually does a good deal more work than just a plain copy.
But at least it doesn't copy the unused space too.
Yes, there is the checksumming for a start. But that is a good thing.
Loop devices work well in testing, especially for seeing how to add
disks, replace disks, resize things, etc. Of course they are of little
help in speed testing.
I usually test using loop devices made in a tmpfs mount - but then,
my machines normally have lots of ram.
I'd never have guessed. :-)
Anyway, food for thought here from Piergorgio and yourself.
On 14/12/16 21:03, Peter Chant wrote:
On 12/13/2016 09:17 AM, David Brown wrote:
OK, so having done some reading up but not carried out any testing I
think the following is a possible setup. Note I am using LUKS so I will
add this extra layer to the mix. I have not noticed any significant
performance difference with and without it.
This makes me even more doubtful that you (the OP) really have an I/O
problem, or have identified where it is.
I've had periods where the disks have been solidly at 80-90% utilisation
for many seconds yet the CPU has been lightly loaded.
What is the computer doing at the time?
Is this actually slowing down something that you are waiting for?
It is /normal/ for a computer to run at maximum in some aspects, for
some time. If you are transferring a large file, you /want/ the disks
to be as close to 100% as possible. If you are doing a raid1 scrub, you /want/ the disks to be close to 100%, perhaps for hours at a time if the disks are big (but at low I/O priority so that other tasks can also run).
So far, you have just told me that your system is working.
Well, if raiding SSDs there is some logic to this horrible looking
asymmetric setup. I'm using an SSD for the OS and that has plenty of
free space. I've also the older smaller SSD it replaced. Although it
looks messy I could free up some space on the current system SSD to use
a partition for RAID and use that in combination with the old, currently
unused SSD. That saves shelling out for another SSD if I want to RAID a
pair. Aesthetically it is horrid and obviously would impact the speed
of OS access. But it is hardware I have so the monetary cost is no
issue. The time and hair loss cost might not be so trivial.
OK - my first thought was that it was a mistake (possibly just a missing character from a cut-and-paste). Now I see it is intentional.
I am not convinced this will be a good thing - in fact, I am confident
that it is a /bad/ thing. An old small SSD can easily be a /lot/ slower
than a new one - typically, old and small SSDs have poor garbage
collection and little over-provisioning. This means they get very slow
at writes when they are full - even slower, sometimes, than hard disks.
When you create an md raid1 array like this, the first thing md will do
is copy block-for-block from /dev/sda4 to /dev/sdb. It will write to
the entire disk - the old SSD will think that /all/ its normal blocks
are full of important data. As soon as you try to write something else
to it, it must now try to do garbage collection in "panic" mode with
minimal free space - you might find your write latencies measured in /seconds/. And if the SSD is old enough, then it won't be able to
handle reads during the erases and garbage collection. If you are
lucky, reads will come from the other disk. If you are unlucky, reads
will be stalled too.
So I would expect your system to be a good deal slower by doing this, compared to simply using the new SSD on its own.
/If/ your old SSD is reasonably fast, you can first run a secure erase
to tell it to drop all data. Then partition it, leaving a little extra
space unused - this overprovisioning can help a lot. And use it with
btrfs raid1, not md raid1, so that only the actual useful data is
replicated.
Would image the SSD in case of mess ups before resizing partitions.
As for the whole of /dev/sdb - I've been using btrfs for a while and it
is normal to give it whole disks, a simple slip of the finger.
I always use partitions, but I usually want a couple of partitions for
other things (like swap).
I don't think anything has been said about the sizes and partitioning of
the SSD. For smaller or cheaper SSDs, it is worth leaving a small
amount of unpartitioned space at the end of the disk to give it more
flexibility in garbage collection. (Do a secure erase before
partitioning if it is not a new clean SSD.)
Have done that already.
Including the unused space at the end?
Metadata 1.0, 1.1 and 1.2 are the new ones, use these,
not the 0.90, which has fewer features.
Maybe, not really sure, but partitioning /dev/sdb might
be better.
Use cryptsetup to create an encrypted block device on top of this raid
10 far 2 device.
Use LVM to create a volume group on top of the encrypted raid 10 far 2
device.
Or the other way around.
I'm not sure which is better, maybe your proposal.
Neither is better, IMHO, unless you have some reason to be seriously
paranoid. It is understandable why one would want to encrypt a portable
machine that you use a lot while travelling, but a home desktop? Think
about whether encryption here is really a useful thing - adding layers
of complexity does not make anything faster, and it makes it a whole lot
more difficult if something goes wrong or if you want to recover your
files from a different system.
Well, in this day and age it seemed like a reasonable idea.
It can be fun to play around with this sort of thing, but that doesn't
mean it is a good idea if you are aiming for a useful system.
One particular thing that can be a serious pain with such complex setups
is if something goes wrong. If something breaks badly, you might have
to put the disks in a different machine to recover the data, or boot the
same machine from a live USB. If you have an lvm cache or bcache in
writeback mode, you need both the SSD and the HD online, and you may
need a system with the right kernel and utilities in order to get the
disks working. Even then, it's easy to accidentally corrupt things
along the way such as by writing to the HD while there are uncommitted changes on the SSD part. I have enough experience with complicated recoveries to know that when something goes wrong, you'll be glad you
kept things simple - and that you documented everything :-)
Create a logical volume on top of the above to use as the cache device.
Yep, if you use lvmcache, maybe bcache can do as well.
Again, that's unnecessary complexity for a system like this. The
benefits of lvmcache and bcache are debatable even for loads that match
them.
So it is about as fast as it will get now?
I've a spare hdd and ssd so I'll have a play when I get time. Need to
think about a useful benchmark.
It's all about the type of load you are using. lvmcache and/or bcache
can help a lot in some cases - but be no faster than a single HD in
other cases. And for many of the things that /do/ run faster with the caches, you could just as easily run them entirely from the SSD,
significantly faster - or get even better results by adding more RAM.
The key to getting /really/ optimal systems is, as you say, useful benchmarks. There is no benefit in looking at the timings of a test
that writes lots of small blocks from lots of threads unless that really
is what you are doing in practice. There is no benefit in running a benchmark after clearing your caches, because you don't clear your
caches in practice - but if you run without clearing the caches, you are
not testing your disk performance. The only "true" benchmark is to run
your system with your typical real tasks and see how it performs.
You only want the raid1 at one level. Your choice is raid1 on btrfs for
best performance and efficiency (since only the actual useful data is
mirrored, rather than the entire raw disk), or raid10,far on the md
layer (for greater large file streaming read speed). This can be a big
issue with SSDs - with btrfs raid1 you avoid initially copying over an
entire diskful of data from one device to the other.
Replacing a disk and the associated balance took a week.
If you have a lot of data, it takes a while to copy it all over. And a re-balance actually does a good deal more work than just a plain copy.
But at least it doesn't copy the unused space too.
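As an aside, if a failing disk is being swapped for a new one, 'btrfs
replace' is normally much quicker than the old device delete / device
add plus balance route, provided the new disk is at least as big
(sketch, placeholder names):

btrfs replace start /dev/old_disk /dev/new_disk /mnt/point
btrfs replace status /mnt/point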
Loop devices work well in testing, especially for seeing how to add
disks, replace disks, resize things, etc. Of course they are of little
help in speed testing.
I usually test using loop devices made in a tmpfs mount - but then,
my machines normally have lots of ram.