• [gentoo-user] Synchronous writes over the network.

    From Mark Knecht@21:1/5 to All on Mon Dec 20 20:00:02 2021
    I wonder if someone can help me get educated about synchronous writes
    to a file server over the network? Is this something that's designed
    into specific apps, or something that I have control of at the
    sysadmin level?

    I've recently built 2 TrueNAS file servers. The first (and main) unit
    runs all the time and serves to back up my home user machines.
    Generally speaking I (currently) put data onto it using rsync, but it
    also has an NFS mount that serves as a location for my Raspberry Pi to
    store duplicate copies of astrophotography pictures live as they come
    off the DSLR in the middle of the night.

    The second TrueNAS machine serves to back up this first machine but
    resides at the other end of the house to protect data in case of
    fire. Eventually I'll probably back up all of this offsite, but for
    now it's two old computers and a bunch of disks.

    The question about synchronous writes comes up in the configuration of
    TrueNAS. TrueNAS supports what it calls a ZIL (ZFS Intent Log), which
    is a smaller SSD at the front end of the write data flow. The idea (as
    I understand it) is that the ZIL allows writes to the server to be
    cached quickly onto, in my case, an SSD, and then eventually written
    to spinning drives when the system gets around to it. Once new data
    arrives at the ZIL it remains until it's written and verified, at which
    time the entries in the ZIL are removed. The ZIL does not do anything
    to speed up reads from the file server.

    The thing is that the ZIL is only used for synchronous writes and I
    don't know whether anything I'm doing to back up my user machines,
    which currently is just rsync commands, is synchronous or could be
    made synchronous, and I do not know if the NFS writes from the R_Pi
    are synchronous or could be made so.

    If someone can point me in the right direction in terms of reading and
    study I'd appreciate it.

    Thanks,
    Mark

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rich Freeman@21:1/5 to markknecht@gmail.com on Mon Dec 20 21:00:01 2021
    On Mon, Dec 20, 2021 at 1:52 PM Mark Knecht <markknecht@gmail.com> wrote:

    I've recently built 2 TrueNAS file servers. The first (and main) unit
    runs all the time and serves to back up my home user machines.
    Generally speaking I (currently) put data onto it using rsync, but it
    also has an NFS mount that serves as a location for my Raspberry Pi to
    store duplicate copies of astrophotography pictures live as they come
    off the DSLR in the middle of the night.

    ...

    The thing is that the ZIL is only used for synchronous writes and I
    don't know whether anything I'm doing to back up my user machines,
    which currently is just rsync commands, is synchronous or could be
    made synchronous, and I do not know if the NFS writes from the R_Pi
    are synchronous or could be made so.


    Disclaimer: some of this stuff is a bit arcane and the documentation
    isn't great, so I could be missing a nuance somewhere.

    First, one of your options is to set sync=always on the zfs dataset,
    if synchronous behavior is strongly desired. That will force ALL
    writes at the filesystem level to be synchronous. It will of course
    also normally kill performance but the ZIL may very well save you if
    your SSD performs adequately. This still only applies at the
    filesystem level, which may be an issue with NFS (read on).
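
    For example, setting and checking it is just (pool/dataset names
    invented - check yours with "zfs list"):

        zfs set sync=always tank/backups
        zfs get sync tank/backups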

    I'm not sure how exactly you're using rsync from the description above
    (rsyncd, direct client access, etc). In any case I don't think
    rsync has any kind of option to force synchronous behavior. I'm not
    sure if manually running a sync on the server after using rsync will
    use the ZIL or not. If you're using sync=always then that should
    cover rsync no matter how you're doing it.
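
    For clarity, the variants I mean look something like this (host and
    paths hypothetical):

        # over ssh, straight into the server's filesystem
        rsync -a /home/mark/ nas1:/mnt/tank/backups/mark/
        # against an rsync daemon on the server
        rsync -a /home/mark/ rsync://nas1/backups/mark/
        # the manual flush I mentioned, with no promises about the ZIL
        ssh nas1 sync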

    NFS is a little different as both the server side and client side
    have possible asynchronous behavior. By default the NFS client is
    asynchronous, so caching can happen on the client before the file is
    even sent to the server. This can be disabled with the mount option
    sync on the client side. That will force all data to be sent to the
    server immediately. Any NFS server or filesystem settings on the
    server side will not have any impact if the client doesn't transmit
    the data to the server. The server also has a sync setting, which
    defaults to on, and it additionally has another layer of caching on
    top of that which can be disabled with no_wdelay on the export.
    Those server-side settings probably delay anything getting to the
    filesystem and so they would take precedence over any
    filesystem-level settings.
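
    Concretely, something like this (addresses and paths hypothetical,
    and I haven't benchmarked these exact combinations):

        # client side, /etc/fstab - "sync" pushes writes out immediately
        nas1:/mnt/tank/photos  /mnt/photos  nfs  sync,hard  0  0

        # server side, /etc/exports - "sync" is the default on modern
        # servers; "no_wdelay" disables the extra write-gathering delay
        /mnt/tank/photos  192.168.1.0/24(rw,sync,no_wdelay)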

    As you can see you need to use a bit of a kill-it-with-fire approach
    to get synchronous behavior, as it traditionally performs so poorly
    that everybody takes steps to try to prevent it from happening.

    I'll also note that the main thing synchronous behavior protects you
    from is unclean shutdown of the server. It has no bearing on what
    happens if a client goes down uncleanly. If you don't expect server
    crashes it may not provide much benefit.

    If you're using a ZIL you should consider having it mirrored, as
    any loss of the ZIL device will otherwise cause data loss. Use of
    the ZIL is also going to create wear on your SSD, so consider that and
    your overall disk load before setting sync=always on the dataset.
    Since the setting is at the dataset level you could have multiple
    mountpoints and have a different sync policy for each. The default is
    normal POSIX behavior which only syncs when requested (sync, fsync,
    O_SYNC, etc).
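
    If you do go the mirrored route, adding the log device is a one-liner
    (pool and device names hypothetical):

        zpool add tank log mirror /dev/sdb /dev/sdc

    and the per-dataset policy is just:

        zfs set sync=always tank/backups
        zfs set sync=standard tank/media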

    --
    Rich

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Wols Lists@21:1/5 to Mark Knecht on Mon Dec 20 21:00:02 2021
    On 20/12/2021 18:52, Mark Knecht wrote:
    The thing is that the ZIL is only used for synchronous writes and I
    don't know whether anything I'm doing to back up my user machines,
    which currently is just rsync commands, is synchronous or could be
    made synchronous, and I do not know if the NFS writes from the R_Pi
    are synchronous or could be made so.

    "Synchronous writes" basically means "in the order they were written".

    And it might also mean blocking writes, which is why you don't want it
    on spinning rust. But it also means that it is (almost) guaranteed to
    get to permanent storage, which is why you do want it for mail,
    databases, etc.

    Your typical (asynchronous) app calls "write", chucks it at the kernel,
    and forgets about it. Hence "asynchronous" - "without regard to time".

    Your app which has switched on synchronicity will block until the write
    has completed.

    Your understanding about the ZIL sounds about right - whatever you throw
    at the NAS will be saved to the ZIL before it gets written properly
    later. Your apps (rsync etc) don't need to worry, the kernel will cache
    stuff, flood it through to the ZIL, and the NAS will take it from there.

    The only thing I'd worry about is how "bursty" the data being chucked
    at the NAS is. A backup is likely to be a stream that could easily
    overwhelm the buffers, and that's not good. Do you have an rsync daemon
    on the NAS? The more you can break the writes into smaller bursts the
    better, and running an rsync daemon is one way to do that.
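
    A minimal daemon config is only a few lines, something like this
    (module name and path invented, untested):

        # /etc/rsyncd.conf on the NAS
        [backups]
            path = /mnt/tank/backups
            read only = false

    and the clients then push to rsync://nas/backups/ instead of over ssh.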

    Cheers,
    Wol

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rich Freeman@21:1/5 to antlists@youngman.org.uk on Mon Dec 20 21:20:01 2021
    On Mon, Dec 20, 2021 at 2:52 PM Wols Lists <antlists@youngman.org.uk> wrote:

    And it might also mean blocking writes, which is why you don't want it
    on spinning rust. But it also means that it is (almost) guaranteed to
    get to permanent storage, which is why you do want it for mail,
    databases, etc.


    The reason that mail/databases/etc use synchronous behavior isn't
    because it is "almost" guaranteed to make it to storage. The reason
    they use it is because you have multiple hosts, and each host can
    guarantee non-loss of data internally, but synchronous behavior is
    necessary to ensure that data is not lost on a handoff.

    Take a mail server. If your SMTP connection goes down for any reason
    before the server communicates that the mail was accepted then the
    sender will assume the mail was not delivered, and will try again. So
    if the network goes down, or the SMTP server crashes, then the client
    will cache the mail and try again. Most mail servers will have the
    data already on-disk before even attempting to deliver mail, so even
    if all the computers involved go down during this handoff nothing is
    lost as it is still in the client cache on-disk.

    On the other hand, once the server confirms delivery then
    responsibility is handed off and the client can forget about the mail.
    It is important that the mail server not communicate that the mail was
    received until it can guarantee that it won't lose the mail. That is
    usually accomplished by the server syncing the mail file to the
    on-disk spool and blocking until that is successful before
    communicating back to the client that the mail was delivered.

    Database transactions behave similarly.

    If the userspace application either does a write on a file opened with
    O_SYNC or does an fsync system call on the file, and the system call
    returns, then the data is present on-disk and will be persistent even
    if the power is lost at the very next moment. It is acceptable for a
    filesystem to return the call if the data is in a persistent journal,
    which is what the ZIL is, as long as it is flushed to disk.
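
    You can see the difference from the shell with dd if you're curious
    (path arbitrary - don't point it anywhere you care about):

        # async: returns once the page cache has the data
        dd if=/dev/zero of=/tank/scratch/test bs=1M count=512
        # O_SYNC on every write: each block must hit stable storage
        # (the ZIL, on a sync=standard ZFS dataset) before dd continues
        dd if=/dev/zero of=/tank/scratch/test bs=1M count=512 oflag=sync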

    Of course, you can still accept mail or implement a database
    asynchronously, but you lose a number of data protections that are
    otherwise designed into the software (well, assuming you're not
    storing your data in MyISAM...). :)

    --
    Rich

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mark Knecht@21:1/5 to rich0@gentoo.org on Thu Dec 23 18:00:01 2021
    On Mon, Dec 20, 2021 at 12:52 PM Rich Freeman <rich0@gentoo.org> wrote:

    <SNIP>


    Rich & Wols,
    Thanks for the responses. I'll post a single response here. I had
    thought of the need to mirror the ZIL but didn't have enough physical
    disk slots in the backup machine for the 2nd SSD. I do think this is
    a critical point if I were to use the ZIL at all.

    Based on inputs from the two of you I'm investigating a different
    overall setup for my home network:

    Previously - a new main desktop that holds all my data. Lots of disk
    space, lots of data. All of my big data work - audio recording
    sessions and astrophotography - are done on this machine. Two
    _backup_ machines. Desktop machines are backed up to machine 1,
    machine 1 backed up to machine 2, machine 2 eventually backed up to
    some cloud service.

    Now - a new desktop machine that holds audio recording data currently
    being recorded and used due to real-time latency requirements. Two
    new network machines: Machine 1 would be both a backup machine and a
    file server. The file server portion of this machine holds
    astrophotography data and recorded video files. PixInsight running on
    my desktop accesses and stores over the network to machine 1. Instead
    of a ZIL in machine 1 the SSD becomes a ZLOG cache most likely holding
    a cached copy of the currently active astrophotography projects.
    Machine 1 may also run a couple of VMs over time. Machine 2 is a pure
    backup machine of everything on Machine 1.

    FYI - Machine 1 will always be located close to my desktop machines
    and use the 1Gb/s wired network. iperf suggests I get about 850Mb/s
    on and off of Machine 1. Machine 2 will be remote and generally
    backed up overnight using wireless.
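
    (That's just the usual iperf pair, something like "iperf3 -s" on
    Machine 1 and "iperf3 -c machine1" from the desktop.)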

    As always I'm interested in your comments about what works or
    doesn't work about this sort of setup.

    Cheers,
    Mark

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Wols Lists@21:1/5 to Rich Freeman on Thu Dec 23 18:40:01 2021
    On 23/12/2021 17:26, Rich Freeman wrote:
    Plus it is an SSD that you're forcing a lot of writes
    through, so that is going to increase your risk of failure at some
    point.

    A lot of people can't get away from the fact that early SSDs weren't
    that good. And I won't touch micro-SD for that reason. But all the
    reports now are that a decent SSD is likely to outlast spinning rust.

    Cheers,
    Wol

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mark Knecht@21:1/5 to antlists@youngman.org.uk on Thu Dec 23 18:40:01 2021
    On Thu, Dec 23, 2021 at 10:35 AM Wols Lists <antlists@youngman.org.uk> wrote:

    On 23/12/2021 17:26, Rich Freeman wrote:
    Plus it is an SSD that you're forcing a lot of writes
    through, so that is going to increase your risk of failure at some
    point.

    A lot of people can't get away from the fact that early SSDs weren't
    that good. And I won't touch micro-SD for that reason. But all the
    reports now are that a decent SSD is likely to outlast spinning rust.

    Cheers,
    Wol


    I'll respond to Rich's points in a bit but on this point I think
    you're both right - new SSDs are very, very reliable and I'm not
    overly worried, but it seems a given that forcing more and more
    writes to an SSD has to raise the probability of a failure at some
    point. Zero writes means almost no chance of failure; trillions of
    writes eventually wear something out.

    Mark

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rich Freeman@21:1/5 to markknecht@gmail.com on Thu Dec 23 18:30:01 2021
    On Thu, Dec 23, 2021 at 11:56 AM Mark Knecht <markknecht@gmail.com> wrote:

    Thanks for the responses. I'll post a single response here. I had
    thought of the need to mirror the ZIL but didn't have enough physical
    disk slots in the backup machine for the 2nd SSD. I do think this is
    a critical point if I were to use the ZIL at all.

    Yeah, I wouldn't run a ZIL non-mirrored, especially if your underlying
    storage is mirrored. The whole point of sync is to sacrifice
    performance for reliability, and if all it does is force the write to
    the one device in the array that isn't mirrored, that isn't helping.
    Plus if you're doing a lot of syncs then that ZIL could have a lot of
    data on it. Plus it is an SSD that you're forcing a lot of writes
    through, so that is going to increase your risk of failure at some
    point.

    Nobody advocates for non-mirrored ZIL, at least if your array itself
    is mirrored.

    Instead
    of a ZIL in machine 1 the SSD becomes a ZLOG cache most likely holding
    a cached copy of the currently active astrophotography projects.

    I think you're talking about L2ARC. I don't think "ZLOG" is a thing,
    and a log device in ZFS is just another name for ZIL (since that's
    what it is - a high performance data journal).

    L2ARC drives don't need to be mirrored and their failure is harmless.
    They generally only improve things, but of course they do nothing to
    improve write performance - just read performance.
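
    Adding one is a single command, and since losing it is harmless a
    lone device is fine (pool and device names hypothetical):

        zpool add tank cache /dev/nvme0n1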

    As always I'm interested in your comments about what works or
    doesn't work about this sort of setup.

    Ultimately it all comes down to your requirements and how you use
    stuff. What is the impact to you if you lose this real-time audio
    recording? If you will just have to record something over again but
    that isn't a big deal, then what you're doing sounds fine to me. If
    you are recording stuff that is mission-critical and can't be repeated
    and you're going to lose a lot of money or reputation if you lose a
    recording, then I'd have that recording machine be pretty reliable
    which means redundant everything (server grade hardware with fault
    tolerance and RAID/etc, or split the recording onto two redundant sets
    of cheap consumer hardware).

    I do something similar - all the storage I care about is on
    Linux/ZFS/lizardfs with redundancy and backup. I do process
    photos/video on a windows box on an NVMe, but that is almost never the
    only copy of my data. I might offload media to the windows box from
    my camera, but if I lose that then I still have the camera. I might
    do some processing on windows like generating thumbnails/etc on NVMe
    before I move it to network storage. In the end though it goes to zfs
    on linux and gets backed up and so on. If I need to process some
    videos I might copy data back to a windows NVMe for more performance
    if I don't want to directly spool stuff off the network, but my risks
    are pretty minimal if that goes down at any point. And this is just
    personal stuff - I care about it and don't want to lose it, but it
    isn't going to damage my career if I lose it. If I were dealing with
    data professionally it still wouldn't be a bad arrangement but I might
    invest in a few things differently.

    Just ask yourself what hardware needs to fail for you to lose
    something you care about at any moment of time. If you can tolerate
    the loss of just about any individual piece of hardware that's a
    pretty good first step for just about anything, and is really all you
    need for most consumer stuff. Backups are fine as long as they're
    recent enough and you don't mind redoing work.

    --
    Rich

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rich Freeman@21:1/5 to markknecht@gmail.com on Thu Dec 23 19:00:01 2021
    On Thu, Dec 23, 2021 at 12:39 PM Mark Knecht <markknecht@gmail.com> wrote:

    I'll respond to Rich's points in a bit but on this point I think
    you're both right - new SSDs are very, very reliable and I'm not
    overly worried, but it seems a given that forcing more and more
    writes to an SSD has to raise the probability of a failure at some
    point. Zero writes means almost no chance of failure; trillions of
    writes eventually wear something out.


    Every SSD has a rating for total writes. This varies, and the ones
    that cost more will get more writes (often significantly more), and
    wear pattern matters a great deal. Chia fortunately seems to have
    died off pretty quickly, but there is still a ton of data from those
    who were speculating on it, and they were buying high-end SSDs and
    treating them as expendable resources. Plotting Chia is actually
    a fairly ideal use case, as you write a few hundred GB and then you
    trim it all when you're done, so the entirety of the drive is getting
    turned over regularly. People plotting Chia were literally going
    through cases of high-end SSDs due to write wear, running them until
    failure in a matter of weeks.

    Obviously if you just write something and read it back constantly then
    wear isn't an issue.

    Just googled the Samsung 870 EVO and they're rated to 600x their
    capacity in writes, for example. If you write 600TB to the 1TB
    version of the drive, then it is likely to fail on you not too long
    after.
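
    To put that in perspective with your numbers: ~10GB of images a night
    is under 4TB a year, so that alone would take more than a century to
    reach 600TB. It's sustained sync-heavy loads - busy ZILs, databases,
    Chia-style churn - that actually eat the rating.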

    Sure, it is a lot better than it used to be, and for typical use cases
    I agree that they last longer than spinning disks. However, a ZIL is
    not a "typical use case" as such things are measured.

    --
    Rich

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Wols Lists@21:1/5 to Mark Knecht on Thu Dec 23 19:00:01 2021
    On 23/12/2021 16:56, Mark Knecht wrote:
    Rich & Wols,
    Thanks for the responses. I'll post a single response here. I had
    thought of the need to mirror the ZIL but didn't have enough physical
    disk slots in the backup machine for the 2nd SSD. I do think this is
    a critical point if I were to use the ZIL at all.

    Okay, how heavily are you going to hammer the server writing to it? If
    you aren't going to stress it, don't bother with the ZIL.

    Based on inputs from the two of you I'm investigating a different
    overall setup for my home network:

    Previously - a new main desktop that holds all my data. Lots of disk
    space, lots of data. All of my big data work - audio recording
    sessions and astrophotography - are done on this machine. Two
    __backup__ machines. Desktop machines are backed up to machine 1,
    machine 1 backed up to machine 2, machine 2 eventually backed up to
    some cloud service.

    Now - a new desktop machine that holds audio recording data currently
    being recorded and used due to real-time latency requirements.

    Sounds good...

    Two new
    network machines: Machine 1 would be both a backup machine and a
    file server. The file server portion of this machine holds
    astrophotography data and recorded video files. PixInsight running on
    my desktop accesses and stores over the network to machine 1. Instead
    of a ZIL in machine 1 the SSD becomes a ZLOG cache most likely holding
    a cached copy of the currently active astrophotography projects.

    Actually, it sounds like the best use of the SSD would be as the
    working directory on your desktop.

    Machine 1 may also run a couple of VMs over time.

    Whatever :-) Just make sure that it's easy to back up! I'd be inclined
    to have a bunch of raid-5'd disks ...

    Machine 2 is a pure
    backup machine of everything on Machine 1.

    I'd say don't waste your money. You don't need a *third* machine. Spend
    the money on some large disk drives, an eSATA card for machine 1, and a
    hard disk docking station ...

    FYI - Machine 1 will always be located close to my desktop machines
    and use the 1Gb/s wired network. iperf suggests I get about 850Mb/s
    on and off of Machine 1. Machine 2 will be remote and generally
    backed up overnight using wireless.

    As always I'm interested in your comments about what works or
    doesn't work about this sort of setup.

    My main desktop/server currently has two 4TB drives split 1TB/3TB. The
    two 3TB partitions are raid-5'd with a 3TB drive to give me 6TB of /home
    space.

    I'm planning to buy an 8TB drive as a backup. The plan is it will go
    into a test-bed machine that will be used for all sorts of stuff, but
    it will at least keep a copy of my data off my main machine.

    But you get the idea. If you get two spare drives you can back up onto
    them. I don't know what facilities ZFS offers for sync'ing filesystems,
    but if you go somewhere regularly, where you can stash a hard disk
    (even a shed down the bottom of the garden :-), you back up onto disk 1,
    swap it for disk 2, back up onto disk 1, swap it for disk 2 ...
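
    (I gather ZFS does have send/receive for exactly this sort of thing -
    untested by me, pool names invented:

        zfs snapshot tank/home@weekly
        zfs send -i tank/home@lastweek tank/home@weekly | \
            zfs receive backuppool/home

    - but plain rsync onto the spare disk works too.)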

    AND YOUR BACKUP IS OFF SITE!

    Cheers,
    Wol

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mark Knecht@21:1/5 to rich0@gentoo.org on Thu Dec 23 23:00:01 2021
    On Thu, Dec 23, 2021 at 10:27 AM Rich Freeman <rich0@gentoo.org> wrote:

    On Thu, Dec 23, 2021 at 11:56 AM Mark Knecht <markknecht@gmail.com> wrote:

    <SNIP>

    Instead
    of a ZIL in machine 1 the SSD becomes a ZLOG cache most likely holding
    a cached copy of the currently active astrophotography projects.

    I think you're talking about L2ARC. I don't think "ZLOG" is a thing,
    and a log device in ZFS is just another name for ZIL (since that's
    what it is - a high performance data journal).


    Thank you. Yes, L2ARC.

    L2ARC drives don't need to be mirrored and their failure is harmless.
    They generally only improve things, but of course they do nothing to
    improve write performance - just read performance.

    As always I'm interested in your comments about what works or
    doesn't work about this sort of setup.

    Ultimately it all comes down to your requirements and how you use
    stuff. What is the impact to you if you lose this real-time audio
    recording? If you will just have to record something over again but
    that isn't a big deal, then what you're doing sounds fine to me.

    Actually, no.

    If
    you are recording stuff that is mission-critical and can't be repeated
    and you're going to lose a lot of money or reputation if you lose a
    recording, then I'd have that recording machine be pretty reliable
    which means redundant everything (server grade hardware with fault
    tolerance and RAID/etc, or split the recording onto two redundant sets
    of cheap consumer hardware).

    Closer to mission critical.

    When recording live music, most especially in situations with
    lots of musicians, you don't want to miss a good take. In cases where
    you are just capturing a band playing, it's just about getting it on
    disk. However, in cases where you are adding to music that's already
    on disk, say a vocalist singing live over the top of music the band
    played earlier, having the hardware screw up a good take is really a
    downer.


    I do something similar - all the storage I care about is on
    Linux/ZFS/lizardfs with redundancy and backup. I do process
    photos/video on a windows box on an NVMe, but that is almost never the
    only copy of my data. I might offload media to the windows box from
    my camera, but if I lose that then I still have the camera. I might
    do some processing on windows like generating thumbnails/etc on NVMe
    before I move it to network storage. In the end though it goes to zfs
    on linux and gets backed up and so on. If I need to process some
    videos I might copy data back to a windows NVMe for more performance
    if I don't want to directly spool stuff off the network, but my risks
    are pretty minimal if that goes down at any point. And this is just
    personal stuff - I care about it and don't want to lose it, but it
    isn't going to damage my career if I lose it. If I were dealing with
    data professionally it still wouldn't be a bad arrangement but I might
    invest in a few things differently.


    In the case of recording audio it just gets down to how large a
    project you are working on. 3 minute pop songs aren't much of an
    issue. 10-20 stereo tracks at 96KHz isn't all that large. For those
    the audio might fit in DRAM. However, if you're working on some
    wonderful 30 minute prog rock piece with 100 or more stereo tracks
    it can get a lot larger, but (in my mind anyway) the main desktop
    machine will have some sort of M.2, maybe it fits in there, and it
    gets read off hard disk before the session starts, so there's
    probably no problem.
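
    (Rough math, assuming 24-bit samples: a stereo track at 96KHz is
    about 0.55MB/sec, so a 30 minute piece is roughly 1GB per track, and
    100 tracks is on the order of 100GB - too big for DRAM, but fine on
    an M.2.)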

    I haven't given this a huge amount of worry because my current
    machine does an almost perfect job with 8-9 year old technology.

    In the case of astrophotography I will have multiple copies of the
    original photos. The process of stacking the individual photos can
    create gigabytes of intermediate files, but as long as the originals
    are safe then it's just a matter of starting over. In my
    astrophotography setup I create about 50MB per minute and take
    pictures for hours, so a set of photos coming in at 1-2GB and up to
    maybe 10GB isn't uncommon. I might create 30-50GB of intermediate
    files which eventually get deleted, but they can reside on the server
    while I'm working. None of that has to be terribly fast.

    Just ask yourself what hardware needs to fail for you to lose
    something you care about at any moment of time. If you can tolerate
    the loss of just about any individual piece of hardware that's a
    pretty good first step for just about anything, and is really all you
    need for most consumer stuff. Backups are fine as long as they're
    recent enough and you don't mind redoing work.

    Agreed.

    Thanks,
    Mark

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Wols Lists@21:1/5 to Mark Knecht on Thu Dec 23 23:30:01 2021
    On 23/12/2021 21:50, Mark Knecht wrote:
    In the case of astrophotography I will have multiple copies of the
    original photos. The process of stacking the individual photos can
    create gigabytes of intermediate files, but as long as the originals
    are safe then it's just a matter of starting over. In my
    astrophotography setup I create about 50MB per minute and take
    pictures for hours, so a set of photos coming in at 1-2GB and up to
    maybe 10GB isn't uncommon. I might create 30-50GB of intermediate
    files which eventually get deleted, but they can reside on the server
    while I'm working. None of that has to be terribly fast.

    :-)

    Seeing as I run lvm, that sounds like a perfect use case. Create an
    LV, dump the files on it, and when you're done unmount and delete the
    LV.

    I'm thinking of pulling the same stunt with wherever gentoo dumps its
    build files etc. Let it build up until I think I need a clearout, then
    create a new LV and scrap the old one.
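
    The whole cycle is only a handful of commands (VG name and size
    invented):

        lvcreate -L 100G -n scratch vg0
        mkfs.ext4 /dev/vg0/scratch
        mount /dev/vg0/scratch /mnt/scratch
        # ... stack to your heart's content, then:
        umount /mnt/scratch
        lvremove vg0/scratch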

    Cheers,
    Wol

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)