• Serial, concurrent, parallel

    From Don Y@21:1/5 to All on Tue Jan 14 11:10:06 2025
    I am surprised (disturbed?) at the problems people seem to have
    sorting these things out in their thought processes. I don't
    *believe* people think strictly "serially" but I am beginning to
    question that belief as I witness smart/capable people stuck in
    that mindset!

    I've been finding problems with their implementations that
    would not be present if they TRULY "thought in parallel". Things
    that should be obvious seem to slip right past them. This came
    to a head when I asked a colleague to explain multitasking:

    "Well, FIRST, the processor does... and THEN..."

    "Ah, 'THEN'! So, you are thinking about multitasking as the
    SERIALIZATION of multiple tasks instead of the CONCURRENT
    execution of them! So, when your code is run in true parallel
    form (multiple cores, multiple processors), all of those hidden
    assumptions that you've baked into your design fall apart because
    they are no longer implicitly serialized!

    "THIS is what your algorithm does (see Petri net X) and this is what
    you THINK it does (see Petri net Y). See the differences? The
    *hardware* does! And THIS is where the bug manifests..."

    I tackled the problem with folks failing to understand how an IPC/RPC/RMI
    can fail *in* the (apparent) function invocation -- by sugarizing the
    syntax to make RMIs more obvious. One source of problems (largely) gone.

    But, there still seems to be a problem wrapping people's heads around
    the fact that something is truly executing in parallel -- and, likely
    on some other processor (with all that entails).

    I.e., unlike a multicore/SMP implementation where the other processor
    is there, beside you (so, "available" as much as 'YOUR' processor is),
    an RMI can actually get started and then fail "later" (e.g., if the
    remote node is powered down, disconnected, etc.). So, you can't
    rely on the fact that the invocation succeeded to be indicative of the
    actual method running to completion. This is particularly important
    for asynchronous actions where your thread doesn't virtually migrate
    to the remote node but, effectively, forks, expecting a later join.
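
    A minimal sketch of that fork/join hazard, using Python's asyncio as a
    stand-in for the remote call. The names remote_method, caller and
    RemoteNodeLost are hypothetical and belong to no real RPC/RMI library.
    The point is only that creating the task (the "invocation") succeeds
    immediately, and the failure surfaces later, at the join.

```python
import asyncio


class RemoteNodeLost(Exception):
    """Hypothetical error: the remote node vanished after the call started."""


async def remote_method(node_stays_up: bool) -> str:
    # Stand-in for work done on another node; not a real RPC library.
    await asyncio.sleep(0.01)          # the method is now "running remotely"
    if not node_stays_up:
        raise RemoteNodeLost("node powered down mid-call")
    return "done"


async def caller(node_stays_up: bool) -> str:
    # "Fork": the invocation itself succeeds and returns a handle at once.
    handle = asyncio.create_task(remote_method(node_stays_up))
    local_result = "local work finished"   # we carry on in the meantime
    try:
        # "Join": only here do we learn whether the method completed.
        remote_result = await handle
    except RemoteNodeLost as err:
        return f"{local_result}; remote failed: {err}"
    return f"{local_result}; remote {remote_result}"


if __name__ == "__main__":
    print(asyncio.run(caller(True)))
    print(asyncio.run(caller(False)))
```

    A successful create_task() says nothing about completion; the caller
    has to be written to handle the failure at the join.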

    TL;DR. I need a mental model that is easy for folks to consult to
    "remind/reinforce" this true parallelism. I'm hoping to discuss it at
    an upcoming off-site where they can feed off each other's understanding
    of it.

    Perhaps "inspired" by the travel issue, I'm thinking along the lines
    of "taking a trip" and the steps involved:

    Contact airline to arrange travel
    Airline (phone/www) may not be available at that time. Or,
    you may lose the connection (voice/data) during the
    transaction. How to recover: Spin? Reschedule? Spawn
    a separate task to handle that contact (i.e., ask your
    secretary to do it WHILE you do something else)
    Book hotel
    No vacancies. Find another?
    Pack clothing/supplies
    Some items may need to be washed/purchased. Spin? Reschedule?
    Spawn a task to handle those dependencies (i.e., have spouse
    run out and pick up those few items needed)
    Go to airport
    Family car? Taxi? Uber?
    Check luggage
    Board flight
    Fly
    Arrive destination
    Deplane
    Collect luggage
    ...

    I.e., some of these things are truly serial (can't "fly" until after
    you have boarded the flight). Others can be handled concurrently.
    And, some are truly parallel. Order is flexible WITHIN a set of
    dependency constraints (e.g., you can easily imagine packing BEFORE
    having your travel reservations. Or, while "on hold" with the
    travel agent.)

    NEEDLESSLY serializing them just means it will take you longer
    (wall time) to get them done. At the extreme, you may miss your
    intended arrival time/date!
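
    The wall-time cost of needless serialization can be sketched as
    follows. This is an illustration under assumptions: every step is
    given the same made-up duration (STEP), and asyncio only interleaves
    waits on one thread, so it models overlapped waiting rather than true
    parallel execution. The dependency rule is the one from the trip:
    the preparations are independent, but going to the airport must
    follow all of them.

```python
import asyncio
import time

STEP = 0.05  # hypothetical duration of every step, in seconds


async def step(name: str, log: list) -> None:
    await asyncio.sleep(STEP)
    log.append(name)


async def serial_trip() -> list:
    log = []
    for name in ("book flight", "book hotel", "pack", "go to airport"):
        await step(name, log)
    return log


async def overlapped_trip() -> list:
    log = []
    # The three preparations do not depend on each other, so overlap them.
    await asyncio.gather(
        step("book flight", log), step("book hotel", log), step("pack", log)
    )
    # Going to the airport depends on all three, so it stays serialized.
    await step("go to airport", log)
    return log


def timed(coro_fn):
    """Run a trip and return (log, elapsed wall time in seconds)."""
    start = time.perf_counter()
    log = asyncio.run(coro_fn())
    return log, time.perf_counter() - start
```

    Calling timed(serial_trip) and timed(overlapped_trip) should show the
    same set of steps completed, with the overlapped version taking
    roughly half the wall time.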

    *But*, the more important issue that hides in these dependencies
    is that they can abend and you (the traveler) will have to catch
    those "exceptions" and handle them if you really want to ensure
    your travel!

    E.g., your secretary could tell you (after you'd "delegated" that
    task) that there are no flights available on your desired travel
    days. YOU, then, have to decide on a new strategy/timetable. Or,
    the flight could get canceled AFTER you have a confirmed reservation.
    Your luggage might not arrive WITH your flight. The hotel might
    have double-booked your room. Etc.

    So, WHILE you are proceeding on what you think is a successful
    path to your goal, exceptions can be thrown ASYNCHRONOUSLY that
    require your attention. In some cases, you can ignore them until
    they manifest as hard failures (i.e., you get to the airport and
    THEN realize the flight was canceled and the SMS that you ignored
    likely was a notification of this event!)

    This is complex (in writing) but trivial -- and relatable -- in a
    discussion. I *think* it draws attention to the fact that things
    are actually happening in parallel despite the fact that you might
    WANT to think of this as a SERIES of steps towards a goal.

    You certainly wouldn't describe it as "I called the airline AND THEN..."
    because you would know that many things were (or could be) happening
    at literally the same time.

    Can anyone suggest a simpler "real world" problem to promote these
    "situations/scenarios"?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Liz Tuddenham@21:1/5 to Don Y on Tue Jan 14 19:43:04 2025
    Don Y <blockedofcourse@foo.invalid> wrote:

    Can anyone suggest a simpler "real world" problem to promote these
    "situations/scenarios"?

    Yes, good cooks do it all the time:

    Put the kettle on to boil water because you can't do anything until you
    have boiling water. Work out what is going to take longest to cook
    (potatoes) and put them on first. Cut up the meat and fry it while the
    potatoes cook but as soon as it is in the frying pan, put a saucepan of
    water on the hob to bring it to the boil.

    While the water is coming to the boil, start cutting up the green
    vegetables so they are ready when the water boils. With 10 minutes to
    go, put in the vegetables and, with luck, the whole meal will be ready
    to serve at the same time.

    I had a friend who used to cook serially. Her fried eggs had been
    sitting cold and rubbery for 15 minutes by the time the bacon was ready.
    For a while she cooked for a lodger - on one occasion he sneaked out and
    bought himself some fish and chips, ate them in his room and went to
    bed. She was very upset when she called him down for his evening meal
    and found he wasn't hungry and had been asleep for over half an hour.


    --
    ~ Liz Tuddenham ~
    (Remove the ".invalid"s and add ".co.uk" to reply)
    www.poppyrecords.co.uk

  • From Liz Tuddenham@21:1/5 to Don Y on Tue Jan 14 21:12:33 2025
    Don Y <blockedofcourse@foo.invalid> wrote:


    Imagine, instead, someone (sous chef?) running out to BUY some potatoes
    for the meal and:
    - coming home empty handed ("they ran out of potatoes")

    I have had a similar scenario when cooking for two of us in my van for
    12 days. My companion was in the early stages of dementia and things
    kept going 'missing', only to turn up next day in the most unexpected
    places. That included items of food, so changes to a planned meal had
    to be improvised as I went along and I had to always bear in mind what
    to do if something we had bought for supper just wasn't there when I
    needed it.

    Although cooking on a one-burner diesel stove in the back of the van is
    a serial affair, it mimics a parallel one when several different items
    have to be heated up or kept hot at once. I sometimes found I was
    literally operating my hands in parallel, getting ready to take
    something off the stove with one hand whilst preparing the next item to
    go on it with the other - then a quick swap-over.

    --
    ~ Liz Tuddenham ~
    (Remove the ".invalid"s and add ".co.uk" to reply)
    www.poppyrecords.co.uk

  • From Don Y@21:1/5 to Liz Tuddenham on Tue Jan 14 13:44:44 2025
    On 1/14/2025 12:43 PM, Liz Tuddenham wrote:
    Don Y <blockedofcourse@foo.invalid> wrote:

    Can anyone suggest a simpler "real world" problem to promote these
    "situations/scenarios"?

    Yes, good cooks do it all the time.:

    Put the kettle on to boil water because you can't do anything until you
    have boiling water. Work out what is going to take longest to cook
    (potatoes) and put them on first.

    But, that's only because you want all of those items to be ready at the
    same time. One could alternatively decide to prepare the desserts, first,
    and throw them in the refrigerator so they are out of the way...

    Cut up the meat and fry it while the
    potatoes cook but as soon as it is in the frying pan, put a saucepan of
    water on the hob to bring it to the boil.

    While the water is coming to the boil, start cutting up the green
    vegetables so they are ready when the water boils. With 10 minutes to
    go, put in the vegetables and, with luck, the whole meal will be ready
    to serve at the same time.

    But that's concurrency (from the cook's perspective). It is analogous
    to multitasking -- the COOK'S actions are serialized.

    The parallelism happens because the appliances can "apply heat",
    unattended.

    And, there is no (practical) possibility of the frying of the meat
    abending without the potatoes also suffering a similar "interruption".

    [In my travel example, it is obvious that there are other "actors"
    ("processors") involved, beyond the equipment that assists them.]

    Imagine, instead, someone (sous chef?) running out to BUY some potatoes
    for the meal and:
    - coming home empty handed ("they ran out of potatoes")
    - not coming home promptly ("stuck in traffic")
    - NEVER coming home ("ran away with the babysitter!" :> )
    In some of those cases, you can be "notified" of the "failure"
    so you don't end up proceeding AS IF all is well and only discovering
    the problem when you actually NEED the potatoes.
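
    The errand's outcomes can be sketched like this, again with asyncio as
    a stand-in for another actor. Everything here is hypothetical
    (fetch_potatoes, OutOfStock, the "switch to rice" fallback, the
    patience value). The sketch folds "late" and "never" into one case,
    since the cook cannot tell them apart except by a deadline.

```python
import asyncio


class OutOfStock(Exception):
    """Hypothetical failure: the shop had no potatoes."""


async def fetch_potatoes(outcome: str) -> str:
    # Stand-in for the errand; the outcomes are illustrative only.
    if outcome == "empty-handed":
        raise OutOfStock("they ran out of potatoes")
    if outcome == "never":
        await asyncio.sleep(3600)      # effectively never comes home
    await asyncio.sleep(0.01)
    return "potatoes"


async def cook(outcome: str, patience: float = 0.05) -> str:
    errand = asyncio.create_task(fetch_potatoes(outcome))
    # ... other prep would happen here while the errand runs ...
    try:
        # Bound the wait so a silent failure cannot stall the meal forever.
        return await asyncio.wait_for(errand, timeout=patience)
    except OutOfStock:
        return "switch to rice (told early)"
    except asyncio.TimeoutError:
        return "switch to rice (gave up waiting)"
```

    An explicit failure report and a timeout are handled separately: the
    first is a notification, the second is the only defense when no
    notification ever arrives.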

    Or, imagine ORDERING a pizza, for delivery, and starting to prepare
    a salad and sides while awaiting its delivery. But, the order gets
    misplaced or the driver gets waylaid.

    I'm trying to show the value of dealing with these asynchronous
    exceptions/notifications so you don't find yourself AT a wait-point
    only to discover the thing you are waiting on is not coming. This
    allows you to design more robust implementations.

    I.e., if the pizza delivery guy texts you to say he was in a car
    wreck, you can take action to acquire/make something in lieu of the
    pizza (maybe even order another!) BEFORE the expected delivery time
    arrives and you find out the hard way that you're SoL.
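
    One way to picture "acting on the text message" in code: wait on
    whichever outstanding activity finishes or fails FIRST, instead of
    waiting on them in a fixed order. This is a sketch, not a recipe;
    pizza_delivery, make_salad and the event strings are invented for the
    illustration.

```python
import asyncio


async def pizza_delivery(wrecked: bool) -> str:
    await asyncio.sleep(0.01)
    if wrecked:
        # The "text message": a failure reported long before the deadline.
        raise RuntimeError("driver was in a car wreck")
    await asyncio.sleep(0.04)
    return "pizza"


async def make_salad() -> str:
    await asyncio.sleep(0.03)
    return "salad"


async def dinner(wrecked: bool) -> list:
    pizza = asyncio.create_task(pizza_delivery(wrecked))
    salad = asyncio.create_task(make_salad())
    events = []
    pending = {pizza, salad}
    while pending:
        # Wake on whichever finishes (or fails) first, not in a fixed order.
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        for task in done:
            if task.exception() is not None:
                events.append(f"notified early: {task.exception()}")
                events.append("ordered replacement")
            else:
                events.append(task.result())
    return events
```

    With wrecked=True the failure is handled while the salad is still
    being made, rather than being discovered at the point where the pizza
    was supposed to arrive.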

    Most folks who think serially would never imagine that "memory"
    could suddenly disappear. Or, that a file they had opened would
    suddenly close itself. Or, have entirely different contents than
    those YOU had written (i.e., that's why we have locks!).
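
    A small sketch of what the lock buys you, using Python threads and a
    hypothetical Account record. Without the lock, two threads can both
    read the same balance and one update is lost; with it, each
    read-modify-write happens as a unit.

```python
import threading


class Account:
    """Hypothetical shared record touched by several threads."""

    def __init__(self) -> None:
        self.balance = 0
        self._lock = threading.Lock()

    def deposit(self, amount: int) -> None:
        # Read-modify-write must not interleave with another thread's.
        with self._lock:
            current = self.balance
            self.balance = current + amount


def run_deposits(threads: int = 8, per_thread: int = 10_000) -> int:
    account = Account()

    def worker() -> None:
        for _ in range(per_thread):
            account.deposit(1)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return account.balance
```

    With the lock in place the final balance always equals
    threads * per_thread; removing the "with self._lock" line makes that
    no longer guaranteed.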

    I had a friend who used to cook serially. Her fried eggs had been
    sitting cold and rubbery for 15 minutes by the time the bacon was ready.

    Mashed potatoes are the classic test of good meal planning.

    For a while she cooked for a lodger - on one occasion he sneaked out and
    bought himself some fish and chips, ate them in his room and went to
    bed. She was very upset when she called him down for his evening meal
    and found he wasn't hungry and had been asleep for over half an hour.

    I tend to *eat* serially, rather than "around the plate". Of course,
    I'm often eating AT the stove while cooking something else to plate
    (concurrent eating and cooking?? :> )

    I used to use "changing a flat tire" to explain algorithms to
    manglement. It tends to fit the serial pattern of thought as
    there are few REAL opportunities for concurrency (as changing
    a flat tends to be a one-man operation).

    Perhaps the actions of a pit crew would be a better analogy on
    those lines? But, again, all the actors are visible in such
    scenarios; you never have to *wonder* if the vehicle was gassed
    up as you can SEE the guy who's responsible for that task.

    The travel example hides actors from direct observation -- yet
    you know they are there, supposedly acting on your behalf. (And,
    will, hopefully, contact you if they encounter a problem doing so:
    "We've had a fire in the building and we'll be closed for two
    weeks while we make repairs. We've taken the liberty of arranging
    for you to stay at a nearby hotel -- at our expense -- to minimize
    the inconvenience to you." Or, "We've had a fire. Find someplace
    else to stay!" The "old" alternative would be to show up and
    find the building "closed for repairs" and have to process that
    event LATE.)

  • From Don Y@21:1/5 to Liz Tuddenham on Tue Jan 14 20:22:23 2025
    On 1/14/2025 2:12 PM, Liz Tuddenham wrote:
    Don Y <blockedofcourse@foo.invalid> wrote:

    Imagine, instead, someone (sous chef?) running out to BUY some potatoes
    for the meal and:
    - coming home empty handed ("they ran out of potatoes")

    I have had a similar scenario when cooking for two of us in my van for
    12 days. My companion was in the early stages of dementia and things
    kept going 'missing', only to turn up next day in the most unexpected
    places. That included items of food, so changes to a planned meal had
    to be improvised as I went along and I had to always bear in mind what
    to do if something we had bought for supper just wasn't there when I
    needed it.

    I've learned to double-check the ACTUAL availability of ingredients
    when baking. Here, "extracts" tend to be packaged in the same little
    1 oz bottles. So, spying such a bottle in the cupboard SUGGESTS that
    I have said extract on hand. Unfortunately, I have taken to saving
    the empty (glass) bottles as convenient containers in which to mix
    extracts from concentrates (e.g., just add grain alcohol).

    So, there are times when I go to reach for a bottle and find it awfully
    *light* -- i.e., empty! And, manage the panic by reaching for ANOTHER...
    only to find it, also, empty!

    Yes, I can mix up some extract on-the-fly, but, usually, I am at a point
    where I need it *now*, not 3 minutes hence. <frown>

    Utensils *tend* to be better behaved -- unless SWMBO has taken it upon
    herself to make something. As I tend to (safely) assume *I* was the
    last person to use the kitchen, if something isn't where it SHOULD be,
    I will run through my most recent activities to recall when I last used
    it and what might have become of it. If, OTOH, *she* has intervened,
    then all bets are off!

    And, of course, rarely used items stress the memory: where the hell did I
    put the cannoli forms? When did I last use them? (particularly difficult
    to recall as I don't eat the things)

    But, that makes it more of an adventure than a chore! :-/

    Although cooking on a one-burner diesel stove in the back of the van is
    a serial affair, it mimics a parallel one when several different items
    have to be heated up or kept hot at once. I sometimes found I was
    literally operating my hands in parallel, getting ready to take
    something off the stove with one hand whilst preparing the next item to
    go on it with the other - then a quick swap-over.

    I'm that way when making pancakes (w/sausage links). Keeping the
    sausage links from burning (and/or splattering grease), the skillet
    greased (butter), the pancakes from burning all WHILE eating my
    share of them (while cooking SWMBO's) is a juggling act. And, like
    the mashed potatoes example, not something you want to let get cold
    as the appeal quickly fades with temperature.

    [But, as mentioned before, I will eat the pancakes, THEN the sausage
    links, then the second stack of pancakes, etc. while standing at the
    stove instead of sitting down to a regular "meal". I suspect trying
    to make pancakes for MANY people seated at the same time would be a
    bigger effort.]

  • From Martin Brown@21:1/5 to Don Y on Thu Jan 16 10:14:37 2025
    On 14/01/2025 18:10, Don Y wrote:
    I am surprised (disturbed?) at the problems people seem to have
    sorting these things out in their thought processes.  I don't
    *believe* people think strictly "serially" but I am beginning to
    question that belief as I witness smart/capable people stuck in
    that mindset!

    Most people do think in a very linear fashion so I'm not too surprised
    at your finding. Good realtime programmers are as rare as hen's teeth.

    I've been finding problems with their implementations that
    would not be present if they TRULY "thought in parallel".  Things
    that should be obvious seem to slip right past them.  This came
    to a head when I asked a colleague to explain multitasking:

      "Well, FIRST, the processor does... and THEN..."

      "Ah, 'THEN'!  So, you are thinking about multitasking as the
      SERIALIZATION of multiple tasks instead of the CONCURRENT
      execution of them!  So, when your code is run in true parallel
      form (multiple cores, multiple processors), all of those hidden
      assumptions that you've baked into your design fall apart because
      they are no longer implicitly serialized!

    There are major shrink wrap programs out there that have these bugs.

    In Excel 2007 as first released, when run on a multi-core machine and
    programmed via VBA, the charting package would quite happily try to
    plot data points *before* the axis scales had been established. This
    did not end well and broke a lot of previously stable working
    commercial code.

    It could be "solved" by the addition of suitable small delays here and
    there to prevent the race condition triggering. Heavy users went back to
    XL2003 which I recall was a particularly good vintage.

    Can anyone suggest a simpler "real world" problem to promote these
    "situations/scenarios"?

    For things to get interesting you need at least 4 threads with some of
    them depending on synchronisation with the others. Building a house or
    putting up one of those insert tab A into slot B tents might do it.

    A house without foundations will obviously fall down and putting the
    roof on before plastering and fitting out the interior is essential.

    A variant of this cartoon might also get people's attention:

    https://www.businessballs.com/amusement-stress-relief/tree-swing-cartoon-pictures-early-versions/

    --
    Martin Brown

  • From Liz Tuddenham@21:1/5 to Don Y on Thu Jan 16 11:53:27 2025
    Don Y <blockedofcourse@foo.invalid> wrote:

    On 1/14/2025 2:12 PM, Liz Tuddenham wrote:
    Don Y <blockedofcourse@foo.invalid> wrote:

    Imagine, instead, someone (sous chef?) running out to BUY some potatoes
    for the meal and:
    - coming home empty handed ("they ran out of potatoes")

    I have had a similar scenario when cooking for two of us in my van for
    12 days. My companion was in the early stages of dementia and things
    kept going 'missing', only to turn up next day in the most unexpected
    places. That included items of food, so changes to a planned meal had
    to be improvised as I went along and I had to always bear in mind what
    to do if something we had bought for supper just wasn't there when I
    needed it.

    I've learned to double-check the ACTUAL availability of ingredients
    when baking.

    The problem was that the vegetables were there on the worktop when I
    started - they just weren't there when I came to put them in the
    saucepan. A thorough search of the van failed to find them.

    The next day I discovered that my friend had put them underneath the van
    instead of the milk, which was normally kept there because it stayed
    cooler.


    --
    ~ Liz Tuddenham ~
    (Remove the ".invalid"s and add ".co.uk" to reply)
    www.poppyrecords.co.uk

  • From Don Y@21:1/5 to Martin Brown on Thu Jan 16 04:41:39 2025
    On 1/16/2025 3:14 AM, Martin Brown wrote:
    On 14/01/2025 18:10, Don Y wrote:
    I am surprised (disturbed?) at the problems people seem to have
    sorting these things out in their thought processes.  I don't
    *believe* people think strictly "serially" but I am beginning to
    question that belief as I witness smart/capable people stuck in
    that mindset!

    Most people do think in a very linear fashion so I'm not too surprised at your
    finding. Good realtime programmers are as rare as hen's teeth.

    When you think of a circuit diagram, do you "track" an electron through
    the wiring? Don't you conceptualize "this block does this WHILE this
    other block is doing that"?

    People seem to tolerate the notion of an ISR running WHILE their code
    is running. They don't seem to think of it as "my code is running and
    THEN an interrupt comes along and does...".

    Yet, when they think of multitasking (and beyond), they seem to
    intentionally serialize the actors' actions. Why the difference
    in mindsets?

    I've been finding problems with their implementations that
    would not be present if they TRULY "thought in parallel".  Things
    that should be obvious seem to slip right past them.  This came
    to a head when I asked a colleague to explain multitasking:

       "Well, FIRST, the processor does... and THEN..."

       "Ah, 'THEN'!  So, you are thinking about multitasking as the
       SERIALIZATION of multiple tasks instead of the CONCURRENT
       execution of them!  So, when your code is run in true parallel
       form (multiple cores, multiple processors), all of those hidden
       assumptions that you've baked into your design fall apart because
       they are no longer implicitly serialized!

    There are major shrink wrap programs out there that have these bugs.

    In Excel 2007 as first released, when run on a multi-core machine and
    programmed via VBA, the charting package would quite happily try to plot
    data points *before* the axis scales had been established. This did not
    end well and broke a lot of previously stable working commercial code.

    Well, that's sad. But, it is similar to what I am seeing in my
    colleagues' efforts.

    It's like folks "suddenly" working in a preemptive RTOS and being
    surprised that the instruction on line N+1 doesn't happen right after
    the one on line N. E.g., setting up <something> before you've set up the
    control structures that it uses and then being surprised when it starts
    running "undirected"!
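
    The ordering dependency in the charting story (axes before points) can
    be stated explicitly rather than left to timing. The sketch below uses
    a threading.Event as the "control structures are ready" signal. It is
    an illustration only and makes no claim about how Excel or any RTOS
    handles this; run_chart and the log strings are made up.

```python
import threading


def run_chart() -> list:
    log = []
    axes_ready = threading.Event()   # explicit "control structures exist" signal

    def set_up_axes() -> None:
        log.append("axes scaled")
        axes_ready.set()

    def plot_points() -> None:
        # Block until the dependency is satisfied; no guessed sleep() needed.
        axes_ready.wait()
        log.append("points plotted")

    plotter = threading.Thread(target=plot_points)
    scaler = threading.Thread(target=set_up_axes)
    plotter.start()                  # deliberately started first
    scaler.start()
    plotter.join()
    scaler.join()
    return log
```

    Even though the plotting thread is started first, the log order is
    fixed by the event, not by which thread the scheduler happens to run.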

    It could be "solved" by the addition of suitable small delays here and there to
    prevent the race condition triggering. Heavy users went back to XL2003 which I
    recall was a particularly good vintage.

    Dunno. I've not used a spreadsheet since Quattro. Never really saw the
    appeal (if I need some set of values calculated, I'll just write a bit of
    code to do it unambiguously -- instead of wondering what quirks the
    spreadsheet imposes). Especially as so many people seem to use spreadsheets
    in lieu of (real) databases. :<

    Can anyone suggest a simpler "real world" problem to promote these
    "situations/scenarios"?

    For things to get interesting you need at least 4 threads with some of
    them depending on synchronisation with the others. Building a house or
    putting up one of those insert tab A into slot B tents might do it.

    A house without foundations will obviously fall down and putting the
    roof on before plastering and fitting out the interior is essential.

    Yes, but the foundation won't get poured if the cement factory is "closed".
    Or, the trucks won't be able to deliver it if there is a road closure along
    the route.
    Or...

    Any of these things can affect that dependency. And, many of them can be
    signaled BEFORE the actual "need" (deadline) arises!

    A variant of this cartoon might also get people's attention:

    https://www.businessballs.com/amusement-stress-relief/tree-swing-cartoon-pictures-early-versions/

    One of the issues that they seem to have universal problems with is the
    fact that another (parallel) actor may "go away" while it is working on
    their behalf or BEFORE it has been tasked with some activity of theirs.
    My RTOS notifies all clients of objects when those objects (or the
    servers backing them) disappear (a task declares all of its dependencies
    to be sure it HAS them as well as is ENTITLED to them before starting).
    The thinking being that a client would want to take remedial action
    BEFORE the "need" for the object manifests.

    I've set default exception handlers to kill off any task whose
    dependencies disappear (seems like a logical approach) with the thinking
    that a task that *can* handle such a disappearance will register a
    different/better exception handler.
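
    A toy model of that policy, to make the moving parts concrete. This is
    NOT the RTOS's actual interface (which the post does not show); Task,
    Registry, declare, release and object_vanished are all hypothetical
    names. It shows declared dependencies, a default handler that kills
    the task, and a task that replaces the default with its own handler.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], None]


class Task:
    """Hypothetical task record; not the API of any real RTOS."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True
        self.notes: List[str] = []
        # Default policy: a lost dependency kills the task.
        self.on_loss: Handler = self._die

    def _die(self, obj: str) -> None:
        self.alive = False
        self.notes.append(f"killed: lost {obj}")


class Registry:
    def __init__(self) -> None:
        self._clients: Dict[str, List[Task]] = {}

    def declare(self, task: Task, obj: str) -> None:
        # A task states up front which objects it depends on.
        self._clients.setdefault(obj, []).append(task)

    def release(self, task: Task, obj: str) -> None:
        # Done with the object: stop receiving notifications about it.
        clients = self._clients.get(obj, [])
        if task in clients:
            clients.remove(task)

    def object_vanished(self, obj: str) -> None:
        # Notify every client now, before any of them tries to use obj.
        for task in self._clients.pop(obj, []):
            task.on_loss(obj)
```

    A task that can cope would assign its own handler, for example
    task.on_loss = my_fallback, before the object has a chance to vanish.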

    I.e., there are no "failing to test the return value of malloc" scenarios.

    My system is designed to dynamically adapt to the resources available
    so it willingly sheds tasks that it deems "less important" in any given
    situation (there are upwards of 50K tasks in a system; powering down
    a node -- or, losing it in some other way -- easily kills off several
    hundred!).

    A well-designed task knows this and checkpoints itself so it can pick
    up where it left off when resources are again plentiful enough to support
    its resurrection ("Lazarus service"). As any object on which you rely
    is backed by SOME task, SOMEwhere, you should be prepared for those objects
    to "disappear" at any time. If you know one has gone, you can adapt to
    that "loss" -- even if it means checkpointing just prior to the point where
    you access it (so you can resume at that point and benefit from all of
    your efforts-to-date, later). Similarly, if you are DONE using an object,
    you should release your dependency on it so you WON'T be notified of
    its demise.
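
    Checkpoint-and-resume in miniature, assuming a task whose whole state
    fits in a small JSON file. The file format, the process function and
    the simulated failure (fail_at) are inventions for this sketch and say
    nothing about how the "Lazarus service" is actually built.

```python
import json
import os
import tempfile


def process(items, checkpoint_path, fail_at=None):
    """Sum items, resuming from a saved checkpoint if one exists."""
    state = {"next": 0, "total": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            state = json.load(fh)           # pick up where we left off
    for i in range(state["next"], len(items)):
        if i == fail_at:
            # Simulated loss of a resource the task depends on.
            raise RuntimeError("dependency disappeared")
        state["total"] += items[i]
        state["next"] = i + 1
        with open(checkpoint_path, "w") as fh:
            json.dump(state, fh)            # checkpoint after each step
    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)          # finished: nothing to resume
    return state["total"]


def demo() -> int:
    path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
    try:
        process([1, 2, 3, 4], path, fail_at=2)   # dies part-way through
    except RuntimeError:
        pass
    return process([1, 2, 3, 4], path)           # resumes at item 2
```

    The second call does not redo the first two items; it reloads the
    saved total and carries on from the step that failed.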

    I don't see how to "remind" them of this short of deliberately throwing
    ALL of those exceptions early in the task's lifecycle -- to ensure they
    are handled "as desired" (instead of being surprised by their "random"
    occurrences, later!)

    The travel analogy works because one can imagine WANTING to be notified
    (by the airline, hotel, etc.) that your reservation has been changed
    or canceled BEFORE you actually expect to "use" it. Then, you can
    take a different approach to the "problem" instead of being stuck in
    an airport or hotel lobby and abending, there!

  • From Liz Tuddenham@21:1/5 to Don Y on Thu Jan 16 12:03:53 2025
    Don Y <blockedofcourse@foo.invalid> wrote:


    ... Especially as so many people seem to use spreadsheets
    in lieu of (real) databases. :<

    As far as I can see, a database is just a spreadsheet that has been
    crippled by having all the cells on each line tied together.


    --
    ~ Liz Tuddenham ~
    (Remove the ".invalid"s and add ".co.uk" to reply)
    www.poppyrecords.co.uk

  • From Don Y@21:1/5 to Liz Tuddenham on Thu Jan 16 06:49:55 2025
    On 1/16/2025 4:53 AM, Liz Tuddenham wrote:
    Don Y <blockedofcourse@foo.invalid> wrote:

    On 1/14/2025 2:12 PM, Liz Tuddenham wrote:

    I have had a similar scenario when cooking for two of us in my van for
    12 days. My companion was in the early stages of dementia and things
    kept going 'missing', only to turn up next day in the most unexpected
    places. That included items of food, so changes to a planned meal had
    to be improvised as I went along and I had to always bear in mind what
    to do if something we had bought for supper just wasn't there when I
    needed it.

    I've learned to double-check the ACTUAL availability of ingredients
    when baking.

    The problem was that the vegetables were there on the worktop when I
    started - they just weren't there when I came to put them in the
    saucepan. A thorough search of the van failed to find them.

    Ah. "Can you wait outside (in another room) while I make dinner?" :>
    SWMBO often takes out things she THINKS I will need. Or, puts away
    something that she thinks is "needless clutter" -- that *I* have taken
    out for use. Trying to keep track of someone else's actions is
    very distracting -- especially when you are trying to "juggle".

    The next day I discovered that my friend had put them underneath the van
    instead of the milk, which was normally kept there because it stayed
    cooler.

  • From Don Y@21:1/5 to Liz Tuddenham on Thu Jan 16 06:46:57 2025
    On 1/16/2025 5:03 AM, Liz Tuddenham wrote:
    Don Y <blockedofcourse@foo.invalid> wrote:


    ... Especially as so many people seem to use spreadsheets
    in lieu of (real) databases. :<

    As far as I can see, a database is just a spreadsheet that has been
    crippled by having all the cells on each line tied together.

    Databases (relational ones) are *so* much more. In addition to
    strict typing on each "column", you can define relationships
    between columns and specific values in columns, writing code
    to enforce constraints so that certain values are not accepted
    in certain places based on other values, etc.

    E.g., I can state that "fertile" can not be true if "sex" is not "female".
    On a per-record basis. So, Bob can never be marked as "fertile" but
    Becky might!

    Or, that the city can not be "Chicago" unless the state is "Illinois"
    (I am assuming there are no Chicagos in other states; "Springfield"
    tends to be a popular city name -- but, there is no Springfield in
    Alaska so if someone tries to enter an address of Springfield,
    Alaska, it is known to be invalid and shouldn't be accepted.).

    Or, that a social security number must be of the form XXX-##-####
    *and* XXX can't be 000, 333, 666, etc.

    Or, that a mother's birth date must precede those of her biological
    children by at least 5 years but the father's must precede his
    biological children by 9 years.

    Etc.

    And, I can ensure data is not accepted (by the database) if those
    specific conditions aren't met.
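
    The "fertile implies female" rule can be tried out with SQLite's CHECK
    constraints via Python's sqlite3 module. This is a sketch on a
    made-up person table, not the schema of the system described above;
    the table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE person (
        name    TEXT NOT NULL,
        sex     TEXT NOT NULL CHECK (sex IN ('female', 'male')),
        fertile INTEGER NOT NULL CHECK (fertile IN (0, 1)),
        -- Row-level rule from the post: fertile implies female.
        CHECK (fertile = 0 OR sex = 'female')
    )
    """
)


def try_insert(name: str, sex: str, fertile: int) -> bool:
    """Return True if the database accepted the row, False if it refused."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO person VALUES (?, ?, ?)", (name, sex, fertile)
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

    try_insert("Becky", "female", 1) is accepted, while
    try_insert("Bob", "male", 1) is refused by the database itself, so no
    application has to re-check the rule on every read.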

    I can also issue an upcall when certain constraints are met or violated.

    I can have a billion records, easily accessed (indexes), each with
    1000 plus columns. And, of course, tie as many of these "tables"
    together as if one larger entity (1:1, 1:N, etc.).

    E.g., I don't support "files" in my system as then EACH consumer would
    have to either assume they were intact and of the correct format *or*
    verify their contents prior to using them. Instead, I use "tables"
    (relations) and let the RDBMS ensure everything is correct. So, an
    application doesn't have to parse the syntax of a particular file and
    verify particular "parameters/data" are correct; instead, it can use
    them directly on the assurances of the RDBMS that it would only
    ALLOW valid data AND would *maintain* that so the application doesn't
    have to verify everything is "still" OK.

    If some CONCURRENT client opts to change a record (assuming he has
    permission to do so), I can notify other clients of the change so
    they don't have to sit and poll the data for changes (e.g., make
    a whiteboard or publish/subscribe service). All the while ensuring
    that the changes made retain data validity.

    I can change multiple records in a single "transaction" such that
    if some other (concurrent) client tries to access any of that at
    the same time, he doesn't see partial results (atomicity). Or,
    roll back the entire transaction as if it never happened.
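
    The all-or-nothing behaviour described above can be sketched with Python's built-in sqlite3 module. The account table, its non-negative balance rule and the transfer helper are assumptions made up for the example; they are not taken from the system being discussed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()

def transfer(amount):
    """Move amount from 'a' to 'b'; both updates apply, or neither does."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'b'",
                (amount,))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'a'",
                (amount,))
        return True
    except sqlite3.IntegrityError:
        return False

too_much = transfer(150)  # second UPDATE violates the CHECK; first is undone
modest = transfer(40)     # both UPDATEs succeed and are committed together
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

    After the failed call the credit to 'b' has been rolled back as well, so no client ever observes the half-finished change.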

    I can store firmware for my devices as fields (BLOBs) in a table.
    E.g., create tuples like (ModuleType, FirmwareImage, Revision)
    and then just issue a query: "I need the greatest Revision number
    of FirmwareImage for a particular ModuleType"). If newer firmware
    becomes available for a particular ModuleType, the *database* will
    signal the applications that use that firmware of that new version
    so they can be updated -- the user doesn't have to know what
    goes where, etc.
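
    The "greatest Revision for a particular ModuleType" query might look like the following sketch, again using Python's sqlite3 as a stand-in. The firmware table layout, the module names and the byte strings standing in for images are all hypothetical. The change notification mentioned above is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE firmware (
        module_type TEXT    NOT NULL,
        revision    INTEGER NOT NULL,
        image       BLOB    NOT NULL,
        PRIMARY KEY (module_type, revision)
    )
""")
conn.executemany("INSERT INTO firmware VALUES (?, ?, ?)", [
    ("thermostat", 1, b"\x00\x01"),
    ("thermostat", 3, b"\x00\x03"),
    ("thermostat", 2, b"\x00\x02"),
    ("camera",     7, b"\x07"),
])
conn.commit()

def latest_firmware(module_type):
    """Return (revision, image) for the newest firmware, or None if unknown."""
    return conn.execute(
        """
        SELECT revision, image
        FROM firmware
        WHERE module_type = ?
        ORDER BY revision DESC
        LIMIT 1
        """,
        (module_type,),
    ).fetchone()

newest = latest_firmware("thermostat")
missing = latest_firmware("toaster")
```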

    SWMBO tracked the capital expenditures for a local hospital for
    many years. The "Finance" people thought in terms of spreadsheets.
    So, any time top management wanted to know "how much did we spend
    on *paint* for the east wing renovation and how much was the associated
    labor cost", they would have to find the "paint" entries and then
    try to sort out which labor charges were associated with "painting".

    SWMBO would write a query and have the answer in a few minutes.
    Because ALL of the costs were in one place, not scattered around
    multiple spreadsheets. Just qualify the results for "Project = East
    Wing Renovation"

    She has been tracking our household expenses with the same set of
    tables. So, when we have to decide what level of membership we
    should purchase for Costco (a "members only" store where different
    classes of membership have different costs and "rewards"), she can
    identify how much we spent there, in any given year. As a first
    order approximation, we can then see if the end-of-year "rebate"
    associated with the membership class will cover the added cost of
    that membership class. AND, can further refine that by spitting
    out the big ticket items that we may have purchased in that year:
    "How likely are we to purchase ANOTHER set of tires THIS year? If
    not, then our expenses will likely be $1000 less and the rebate
    proportionately so!"

    I use a set of tables to track all of the files on all of my disks
    (including files inside "archives", VHDs, VMDKs, etc.). So, if
    I am looking for something, I can type a query and see which media
    are likely to contain the file(s) desired.

    I can also use this to tell me if there is another copy of a particular
    file, elsewhere (same size and hash -- even if the NAME is different!).
    Try doing that for just ONE computer, with a spreadsheet. E.g., THIS
    computer (with just WWW browser, MUA and a few token apps) currently
    has 302,146 files in 35,180 folders. And, that doesn't count any of the
    files *inside* ZIP, CAB, TGZ, etc. archives that appear "once" in that
    count.
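
    A sketch of the "same size and hash, different name" lookup, using Python's sqlite3 module. The file table, its columns and the sample rows are hypothetical; a real catalogue would hold full hashes and many more attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE file (
        id     INTEGER PRIMARY KEY,
        volume TEXT    NOT NULL,
        name   TEXT    NOT NULL,
        size   INTEGER NOT NULL,
        hash   TEXT    NOT NULL
    )
""")
conn.executemany("INSERT INTO file VALUES (?, ?, ?, ?, ?)", [
    (1, "disk1", "report.doc",     1024, "abc"),
    (2, "disk2", "old_report.doc", 1024, "abc"),  # same content, other name
    (3, "disk1", "notes.txt",      1024, "def"),  # same size, different hash
    (4, "disk3", "photo.png",      2048, "abc"),  # same hash, different size
])
conn.commit()

def duplicates():
    """Return pairs of files whose size AND hash both match."""
    return conn.execute(
        """
        SELECT a.volume, a.name, b.volume, b.name
        FROM file AS a
        JOIN file AS b
          ON a.size = b.size AND a.hash = b.hash AND a.id < b.id
        ORDER BY a.id, b.id
        """
    ).fetchall()

pairs = duplicates()
```

    The a.id < b.id condition keeps each pair from being reported twice and stops a file from matching itself.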

    How would you do this with a spreadsheet? N columns for the N files
    that might be *inside* a particular file? Isn't it easier to:
    ID
    Name
    ID of Container
    and let the query build a full pathname, *if* you need it:
    ...
    (5, "C:", 0)
    ...
    (9, "Windows", 5)
    ...
    (45, "notepad.exe", 9)
    to encode:
    C:\Windows\notepad.exe
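
    Letting the query build the pathname from (ID, Name, ID of Container) tuples can be sketched with a recursive query, here in SQLite via Python's sqlite3 module. The table name entry and the helper full_path are hypothetical; the sketch assumes, as the tuples above suggest, that a container ID of 0 marks the top of the tree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE entry (
        id           INTEGER PRIMARY KEY,
        name         TEXT    NOT NULL,
        container_id INTEGER NOT NULL   -- 0 is assumed to mean "no container"
    )
""")
conn.executemany("INSERT INTO entry VALUES (?, ?, ?)", [
    (5, "C:", 0),
    (9, "Windows", 5),
    (45, "notepad.exe", 9),
])
conn.commit()

def full_path(entry_id):
    """Walk up the container chain and return the assembled pathname."""
    row = conn.execute(
        """
        WITH RECURSIVE walk(container_id, path) AS (
            SELECT container_id, name FROM entry WHERE id = ?
            UNION ALL
            SELECT e.container_id, e.name || '\\' || walk.path
            FROM entry AS e
            JOIN walk ON e.id = walk.container_id
        )
        SELECT path FROM walk WHERE container_id = 0
        """,
        (entry_id,),
    ).fetchone()
    return row[0] if row else None

path = full_path(45)
```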

    My "address book" tracks family relationships. So, I can ask for
    a list of my uncle Joe's kids' *spouses'* names. How would you
    address that in a spreadsheet -- some large number of column-PAIRS
    that, hopefully, exceed the largest family size for each
    child+spouse? And, their birthdates? Phone numbers? Addresses?
    Photos?

    Isn't it simpler to have:
    ID of person
    First Name
    Last Name
    ID of mother
    ID of father
    Birthdate
    ID of spouse
    Address
    and let the query *build* the family tree by resolving each of these
    "IDs" *in* the query?

    And, as there are likely fewer addresses than persons, maybe track addresses
    in a separate table and just link to an "ID of address" for each person?
    As there are a finite number of City names, maybe a "ID of city" in the
    address table? Ditto for "ID of state"?

    Of course, you don't need to *see* all of these "linkages"; the query
    hides them and just shows you how they *resolve* (a City ID of "327"
    means "Riverside")
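
    The "uncle Joe's kids' spouses" question can be sketched as a self-join on a person table like the one listed above, using Python's sqlite3 module. The names, IDs and the helper spouses_of_children are invented for the example, and the address columns are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person (
        id         INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name  TEXT NOT NULL,
        mother_id  INTEGER REFERENCES person(id),
        father_id  INTEGER REFERENCES person(id),
        spouse_id  INTEGER REFERENCES person(id)
    )
""")
conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "Joe", "Smith", None, None, None),
    (2, "Ann", "Smith", None, 1, 4),
    (3, "Tom", "Smith", None, 1, 5),
    (4, "Raj", "Patel", None, None, 2),
    (5, "Mia", "Lopez", None, None, 3),
    (6, "Sam", "Smith", None, 1, None),   # unmarried child
])
conn.commit()

def spouses_of_children(parent_id):
    """Return (child first name, spouse first name) for each married child."""
    return conn.execute(
        """
        SELECT kid.first_name, spouse.first_name
        FROM person AS kid
        JOIN person AS spouse ON spouse.id = kid.spouse_id
        WHERE kid.father_id = ? OR kid.mother_id = ?
        ORDER BY kid.id
        """,
        (parent_id, parent_id),
    ).fetchall()

result = spouses_of_children(1)
```

    No column-pairs are needed: a family of any size is just more rows, and the IDs are resolved inside the query.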

    IMO, people like spreadsheets because they can "see" all the data.
    So what? Can you spot any that is bogus? Missing? Duplicated?
    If there are 302K rows, how long for you to scroll through them all?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Liz Tuddenham@21:1/5 to Don Y on Thu Jan 16 15:16:58 2025
    Don Y <blockedofcourse@foo.invalid> wrote:

    On 1/16/2025 5:03 AM, Liz Tuddenham wrote:
    Don Y <blockedofcourse@foo.invalid> wrote:


    ... Especially as so many people seem to use spreadsheets
    in lieu of (real) databases. :<

    As far as I can see, a database is just a spreadsheet that has been
    crippled by having all the cells on each line tied together

    Databases (relational ones) are *so* much more. In addition to
    strict typing on each "column", you can define relationships
    between columns and specific values in columns, writing code
    to enforce constraints so that certain values are not accepted
    in certain places based on other values, etc.

    E.g., I can state that "fertile" can not be true if "sex" is not "female".
    On a per-record basis. So, Bob can never be marked as "fertile" but
    Becky might!

    Or, that the city can not be "chicago" unless the state is "Illinois"
    (I am assuming there are no chicagos in other states; "Springfield"
    tends to be a popular city name -- but, there is no Springfield in
    Alaska so if someone tries to enter an address of Springfield,
    Alaska, it is known to be invalid and shouldn't be accepted.).

    Or, that a social security number must be of the form XXX-##-####
    *and* XXX can't be 000, 333, 666, etc.

    Or, that a mother's birth date must precede those of her biological
    children by at least 5 years but the father's must precede his
    biological children by 9 years.

    Etc.

    All of those can be done on a spreadsheet. ...and similar checks can be
    done between cells in different rows. The check formula is written into
    a 'hidden' cell and the final result is displayed in a 'locked' cell.
    If someone puts in faulty data, the spreadsheet can't stop them but it
    can ensure that the dud data doesn't appear in the output.

    I use spreadsheets for all sorts of things: calculating component
    values, customers and accounts, encoding sensitive information,
    addressing envelopes, uploading invoices to the Web ...etc.

    I tried using a database to keep track of my Christmas Cards, but find
    it slow and restrictive compared with a spreadsheet.


    --
    ~ Liz Tuddenham ~
    (Remove the ".invalid"s and add ".co.uk" to reply)
    www.poppyrecords.co.uk

  • From john larkin@21:1/5 to All on Thu Jan 16 07:57:03 2025
    On Thu, 16 Jan 2025 15:10:37 +0000, brian <nospam@b-howie.co.uk>
    wrote:

    In message <vm69a2$2h52d$1@dont-email.me>, Don Y
    <blockedofcourse@foo.invalid> writes
    I am surprised (disturbed?) at the problems people seem to have
    sorting these things out in their thought processes. I don't
    *believe* people think strictly "serially" but I am beginning to
    question that belief as I witness smart/capable people stuck in
    that mindset!

    It's simple project engineering and resource management. PERT,
    Critical path analysis, lead times, Gantt charts, even Microsoft
    Project if you must. You must be talking to non-engineers.

    Brian

    The more such management tools that you use, the slower a project will
    go.

    The key observation is that when things are serialized, they happen sequentially.

  • From brian@21:1/5 to blockedofcourse@foo.invalid on Thu Jan 16 15:10:37 2025
    In message <vm69a2$2h52d$1@dont-email.me>, Don Y
    <blockedofcourse@foo.invalid> writes
    I am surprised (disturbed?) at the problems people seem to have
    sorting these things out in their thought processes. I don't
    *believe* people think strictly "serially" but I am beginning to
    question that belief as I witness smart/capable people stuck in
    that mindset!

    It's simple project engineering and resource management. PERT,
    Critical path analysis, lead times, Gantt charts, even Microsoft
    Project if you must. You must be talking to non-engineers.

    Brian
    --
    Brian Howie

  • From Don Y@21:1/5 to Liz Tuddenham on Thu Jan 16 09:01:23 2025
    On 1/16/2025 8:16 AM, Liz Tuddenham wrote:
    Don Y <blockedofcourse@foo.invalid> wrote:

    On 1/16/2025 5:03 AM, Liz Tuddenham wrote:
    Don Y <blockedofcourse@foo.invalid> wrote:


    ... Especially as so many people seem to use spreadsheets
    in lieu of (real) databases. :<

    As far as I can see, a database is just a spreadsheet that has been
    crippled by having all the cells on each line tied together

    Databases (relational ones) are *so* much more. In addition to
    strict typing on each "column", you can define relationships
    between columns and specific values in columns, writing code
    to enforce constraints so that certain values are not accepted
    in certain places based on other values, etc.

    E.g., I can state that "fertile" can not be true if "sex" is not "female".
    On a per-record basis. So, Bob can never be marked as "fertile" but
    Becky might!

    Or, that the city can not be "chicago" unless the state is "Illinois"
    (I am assuming there are no chicagos in other states; "Springfield"
    tends to be a popular city name -- but, there is no Springfield in
    Alaska so if someone tries to enter an address of Springfield,
    Alaska, it is known to be invalid and shouldn't be accepted.).

    Or, that a social security number must be of the form XXX-##-####
    *and* XXX can't be 000, 333, 666, etc.

    Or, that a mother's birth date must precede those of her biological
    children by at least 5 years but the father's must precede his
    biological children by 9 years.

    Etc.

    All of those can be done on a spreadsheet. ...and similar checks can be
    done between cells in different rows. The check formula is written into
    a 'hidden' cell and the final result is displayed in a 'locked' cell.
    If someone puts in faulty data, the spreadsheet can't stop them but it
    can ensure that the dud data doesn't appear in the output.

    It's a kludge. How do you handle the 302K files on THIS machine?
    Given that a PATHNAME can be thousands of characters in length?
    Tell me where all of the files named "Readme" are located...
    How quickly for you to get a result from the tool? Or, how many
    have the hash "2094230953408573847503485034038298028023984"?

    I use spreadsheets for all sorts of things: calculating component
    values, customers and accounts, encoding sensitive information,
    addressing envelopes, uploading invoices to the Web ...etc.

    I use databases (tables/relations) for all of that. Wanna know the
    last time I telephoned XYZ Corporation? Or, who's associated
    with a particular phone number? Or, how many billable hours I
    charged on project XYZ in the week of January 25, 1990 (to
    prepare an invoice)?

    "Tables" are particularly attractive in constraining data. E.g.,
    if a user types in a city name, I can present a list of *potential*
    state names (because I know which city names are valid in each state).
    Or, a list of associated ZIP (postal) codes, from which I could
    determine the state.

    This is particularly valuable with things like speech I/O as it
    is much easier to recognize words in a limited/constrained vocabulary
    than it is to recognize unconstrained speech. "Call <entity>"
    knows that "entity" has to be one of the entities listed in the
    address book -- otherwise, it wouldn't make sense. So, if the
    speech recognizer THINKS the user said "rolf", it would know that
    to be incorrect if there are no "Rolfs" in the address book -- maybe
    "Ralph"?

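    The idea of constraining a recognizer's guess to names that actually exist in the address book can be sketched as follows. This uses plain string similarity from Python's standard difflib module as a crude stand-in for real acoustic or phonetic matching, which is an assumption of the sketch and not how any particular speech engine works. The address book entries, the cutoff value and the helper resolve_entity are hypothetical.

```python
import difflib

ADDRESS_BOOK = ["Ralph", "Becky", "Bob"]  # hypothetical entries

def resolve_entity(heard, cutoff=0.4):
    """Map what the recognizer 'heard' onto a known name, or None.

    Only names present in the address book can ever be returned, so an
    out-of-vocabulary guess is either corrected or rejected.
    """
    by_lower = {name.lower(): name for name in ADDRESS_BOOK}
    key = heard.lower()
    if key in by_lower:
        return by_lower[key]
    close = difflib.get_close_matches(key, list(by_lower), n=1, cutoff=cutoff)
    return by_lower[close[0]] if close else None

guess = resolve_entity("rolf")
exact = resolve_entity("bob")
nothing = resolve_entity("xyzzy")
```

    In a table-backed system the candidate list would come from a query against the address book rather than a hard-coded list.
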
    I tried using a database to keep track of my Christmas Cards, but find
    it slow and restrictive compared with a spreadsheet.

    "Databases" are just tables. So, you're doing something wrong.

    You have to have a "front-end" that acts on your behalf to manage
    the interface. E.g., I wouldn't try to manually piece together
    C: \ Windows \ notepad.exe
    or:
    C: \ Users \ Don \ Desktop \ apc_1500va.zip \ schem \ page5.png
    by examining each tuple. Instead, I let a query do all of that
    and give me the result AS IF it had been stored there, directly.

    I can do this in real-time as I am scanning a mounted volume
    to verify that every file is "intact" (compute hash, compare to
    stored hash, store date/time of this verification, advance to next
    file) at the speed of the medium. When "done", email a list
    of corrupt/missing files to $USER so they can take action to
    restore them from duplicates on other media (which the database
    can identify based on stored sizes and hashes: "Mount volume
    XYZ so I can restore the following entities...")

  • From Don Y@21:1/5 to brian on Thu Jan 16 09:03:00 2025
    On 1/16/2025 8:10 AM, brian wrote:
    In message <vm69a2$2h52d$1@dont-email.me>, Don Y <blockedofcourse@foo.invalid>
    writes
    I am surprised (disturbed?) at the problems people seem to have
    sorting these things out in their thought processes.  I don't
    *believe* people think strictly "serially" but I am beginning to
    question that belief as I witness smart/capable people stuck in
    that mindset!

    It's simple project engineering and resource management. PERT,
    Critical path analysis, lead times, Gantt charts, even Microsoft
    Project if you must. You must be talking to non-engineers.

    Different class of activities, entirely.

  • From Don Y@21:1/5 to Liz Tuddenham on Thu Jan 16 09:03:37 2025
    On 1/16/2025 8:16 AM, Liz Tuddenham wrote:
    Don Y <blockedofcourse@foo.invalid> wrote:

    [...]
    Ah. "Can you wait outside (in another room) while I make dinner?" :>

    Tricky in a 5ft x 8ft van conversion.
    < http://www.poppyrecords.co.uk/Van/vanconversion.htm>

    That's what "outside" is for! :>

  • From Martin Brown@21:1/5 to Liz Tuddenham on Thu Jan 16 16:16:23 2025
    On 16/01/2025 12:03, Liz Tuddenham wrote:
    Don Y <blockedofcourse@foo.invalid> wrote:


    ... Especially as so many people seem to use spreadsheets
    in lieu of (real) databases. :<

    As far as I can see, a database is just a spreadsheet that has been
    crippled by having all the cells on each line tied together

    Real databases implement some form of content addressable storage.

    --
    Martin Brown

  • From Martin Brown@21:1/5 to Don Y on Thu Jan 16 16:27:31 2025
    On 16/01/2025 11:41, Don Y wrote:
    On 1/16/2025 3:14 AM, Martin Brown wrote:
    On 14/01/2025 18:10, Don Y wrote:
    I am surprised (disturbed?) at the problems people seem to have
    sorting these things out in their thought processes.  I don't
    *believe* people think strictly "serially" but I am beginning to
    question that belief as I witness smart/capable people stuck in
    that mindset!

    Most people do think in a very linear fashion so I'm not too surprised
    at your finding. Good realtime programmers are as rare as hen's teeth.

    When you think of a circuit diagram, do you "track" an electron through
    the wiring?  Don't you conceptualize "this block does this WHILE this
    other block is doing that"?

    People seem to tolerate the notion of an ISR running WHILE their code
    is running.  They don't seem to think of it as "my code is running and
    THEN an interrupt comes along and does...".

    An ISR is sufficiently small and so mission critical that if it doesn't
    save and restore the registers it affects properly the OS dies PDQ.

    Yet, when they think of multitasking (and beyond), they seem to
    intentionally serialize the actors' actions.  Why the difference
    in mindsets?

    Most people can't imagine the various tasks running at different speeds
    either timesliced or by priority. There is always a tendency amongst
    programmers to think that their task is *the* most important one. The
    thing you learn quickly on truly massively parallel hardware is that the
    manager task that keeps all of the allocated workers busy is by far the
    highest priority.

    It could be "solved" by the addition of suitable small delays here and
    there to prevent the race condition triggering. Heavy users went back
    to XL2003 which I recall was a particularly good vintage.

    Dunno.  I've not used a spreadsheet since Quattro.  Never really saw the
    appeal (if I need some set of values calculated, I'll just write a bit of
    code to do it unambiguously -- instead of wondering what quirks the

    I like spreadsheets for making test data. The sort of mistakes you can
    make in a spreadsheet implementation are almost entirely orthogonal to
    those you can make in a conventional programming language. As such it
    makes a great scratch pad for developing tricky algorithms with all of
    the internal workings clearly visible on the screen.

    spreadsheet imposes).  Especially as so many people seem to use
    spreadsheets in lieu of (real) databases.  :<

    Sigh - yes I know they do :(
    Up to a couple of thousand lines it isn't *too* bad but after that it
    goes downhill very quickly. Doesn't stop people - typically middle
    managers with very limited skills having silly sized ones though.


    --
    Martin Brown

  • From Don Y@21:1/5 to Martin Brown on Thu Jan 16 11:06:04 2025
    On 1/16/2025 9:16 AM, Martin Brown wrote:
    On 16/01/2025 12:03, Liz Tuddenham wrote:
    Don Y <blockedofcourse@foo.invalid> wrote:


    ... Especially as so many people seem to use spreadsheets
    in lieu of (real) databases.  :<

    As far as I can see, a database is just a spreadsheet that has been
    crippled by having all the cells on each line tied together

    Real databases implement some form of content addressable storage.

    Atomicity
    Consistency
    Isolation
    Durability

    These don't just apply to the *data* stored therein but, also, to
    any other mechanisms that are associated with the data (e.g., indexes,
    constraints, triggers, etc.)

  • From Edward Rawde@21:1/5 to Liz Tuddenham on Thu Jan 16 12:57:11 2025
    "Liz Tuddenham" <liz@poppyrecords.invalid.invalid> wrote in message news:1r69dpn.1bbzdc81ozzc9wN%liz@poppyrecords.invalid.invalid...
    Don Y <blockedofcourse@foo.invalid> wrote:

    On 1/16/2025 5:03 AM, Liz Tuddenham wrote:
    Don Y <blockedofcourse@foo.invalid> wrote:


    ...

    All of those can be done on a spreadsheet. ...and similar checks can be
    done between cells in different rows. The check formula is written into
    a 'hidden' cell and the final result is displayed in a 'locked' cell.
    If someone puts in faulty data, the spreadsheet can't stop them but it
    can ensure that the dud data doesn't appear in the output.

    I use spreadsheets for all sorts of things: calculating component
    values, customers and accounts, encoding sensitive information,
    addressing envelopes, uploading invoices to the Web ...etc.

    I tried using a database to keep track of my Christmas Cards, but find
    it slow and restrictive compared with a spreadsheet.

    Not sure what you're doing wrong but once I'd learned to use
    https://mariadb.org/ and https://www.heidisql.com/
    I never went anywhere near spreadsheets again.
    The last time Excel opened was accidentally when I was trying to see the contents of a csv file.

    The only time I make a spreadsheet is when I want to send data to someone else and they can't handle a database.
    I wouldn't put customers and accounts anywhere near a spreadsheet.
    I can remember using a spreadsheet to model an op amp circuit but that was many years before I had LTSpice.

    Databases can also be backed up as human readable text files.
    Just export data as SQL.
    This gives me peace of mind that whatever software is or isn't available in the future I can always read my data.



    --
    ~ Liz Tuddenham ~
    (Remove the ".invalid"s and add ".co.uk" to reply)
    www.poppyrecords.co.uk

  • From Don Y@21:1/5 to Martin Brown on Thu Jan 16 11:25:18 2025
    On 1/16/2025 9:27 AM, Martin Brown wrote:
    On 16/01/2025 11:41, Don Y wrote:
    On 1/16/2025 3:14 AM, Martin Brown wrote:
    On 14/01/2025 18:10, Don Y wrote:
    I am surprised (disturbed?) at the problems people seem to have
    sorting these things out in their thought processes.  I don't
    *believe* people think strictly "serially" but I am beginning to
    question that belief as I witness smart/capable people stuck in
    that mindset!

    Most people do think in a very linear fashion so I'm not too surprised at
    your finding. Good realtime programmers are as rare as hen's teeth.

    When you think of a circuit diagram, do you "track" an electron through
    the wiring?  Don't you conceptualize "this block does this WHILE this
    other block is doing that"?

    People seem to tolerate the notion of an ISR running WHILE their code
    is running.  They don't seem to think of it as "my code is running and
    THEN an interrupt comes along and does...".

    An ISR is sufficiently small and so mission critical that if it doesn't
    save and restore the registers it affects properly the OS dies PDQ.

    My point being that people think of it as running WHILE their code
    is running. Even though the processor is technically single-threaded
    and it is actually operating in series.

    Yet, when they think of multitasking (and beyond), they seem to
    intentionally serialize the actors' actions.  Why the difference
    in mindsets?

    Most people can't imagine the various tasks running at different speeds
    either timesliced or by priority. There is always a tendency amongst
    programmers to think that their task is *the* most important one. The
    thing you learn quickly on truly massively parallel hardware is that the
    manager task that keeps all of the allocated workers busy is by far the
    highest priority.

    Yes. In my case, the workload manager has to also decide when to bring
    additional resources on-line (power up other nodes) or shed surplus
    resources (migrate tasks off of nodes so they can be powered DOWN).

    It could be "solved" by the addition of suitable small delays here and
    there to prevent the race condition triggering. Heavy users went back
    to XL2003 which I recall was a particularly good vintage.

    Dunno.  I've not used a spreadsheet since Quattro.  Never really saw the
    appeal (if I need some set of values calculated, I'll just write a bit of
    code to do it unambiguously -- instead of wondering what quirks the

    I like spreadsheets for making test data. The sort of mistakes you can
    make in a spreadsheet implementation are almost entirely orthogonal to
    those you can make in a conventional programming language. As such it
    makes a great scratch pad for developing tricky algorithms with all of
    the internal workings clearly visible on the screen.

    Again, I rely on tables/relations. Especially as I can just keep
    adding to them AND can have their values "fed" to the UUT (with
    results programmatically compared to the stored results).

    spreadsheet imposes).  Especially as so many people seem to use spreadsheets
    in lieu of (real) databases.  :<

    Sigh - yes I know they do :(
    Up to a couple of thousand lines it isn't *too* bad but after that it
    goes downhill very quickly. Doesn't stop people - typically middle
    managers with very limited skills having silly sized ones though.

    I just fail to see the appeal. How does seeing row after row of data
    make ANYTHING "better"?

    I like RDBMSs because I can keep building on what I've already got.
    E.g., if I knew Bob's wife's name BEFORE his divorce, I can still
    keep a record of it AFTER their divorce! AND, accommodate his
    potential to remarry.

    E.g., my address book started off with tables for city names and state
    names to allow each to be reduced to an integer "ID" and ensure I never
    had to worry about someone entering "Lnodon" as city name. Then, I opted
    to build a "localities" table that enumerated all of the possible, valid
    (CityID, StateID) combinations, replacing them with a LocalityID (and
    eliminating the need for CityID and StateID in actual address book entries).

    Once I had localities, I could create a ZIPcode-to-Locality relation
    so if you indicated you were in (Chicago, IL), I knew which subset
    of ZIP codes would be valid for that locality.
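
    A sketch of the city/state/locality/ZIP arrangement described above, using Python's sqlite3 module. The table and column names, the handful of sample rows and the helper zips_for are illustrative assumptions; the real tables would enumerate every valid combination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE city  (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE state (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    -- one row per VALID (city, state) combination
    CREATE TABLE locality (
        id       INTEGER PRIMARY KEY,
        city_id  INTEGER NOT NULL REFERENCES city(id),
        state_id INTEGER NOT NULL REFERENCES state(id),
        UNIQUE (city_id, state_id)
    );
    CREATE TABLE zip (
        code        TEXT PRIMARY KEY,
        locality_id INTEGER NOT NULL REFERENCES locality(id)
    );
    -- a few sample rows only
    INSERT INTO city  VALUES (1, 'Chicago'), (2, 'Springfield');
    INSERT INTO state VALUES (1, 'IL'), (2, 'AK');
    INSERT INTO locality VALUES (10, 1, 1), (11, 2, 1);
    INSERT INTO zip VALUES ('60601', 10), ('60602', 10), ('62701', 11);
""")

def zips_for(city, state):
    """Return the ZIP codes valid for a (city, state); empty if not a locality."""
    rows = conn.execute(
        """
        SELECT z.code
        FROM zip AS z
        JOIN locality AS l ON l.id = z.locality_id
        JOIN city     AS c ON c.id = l.city_id
        JOIN state    AS s ON s.id = l.state_id
        WHERE c.name = ? AND s.name = ?
        ORDER BY z.code
        """,
        (city, state),
    ).fetchall()
    return [code for (code,) in rows]

chicago = zips_for("Chicago", "IL")
bogus = zips_for("Springfield", "AK")   # no such locality in the table
```

    Time zone or "summer time" attributes would hang off the locality row in the same way.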

    And, which time zone.

    And, whether or not "summer time" was observed.

    etc.

    So, the chance of entering "bad data" was greatly reduced as well as
    giving me a "live" reference to consult: Who might I have called
    in area code 212? What time is it at Bob's house, now? What's his
    NEW wife's name? And, the names of his kids by his first wife?

    All of these things can reside in separate tables yet be called on
    ("joined") to suit specific needs.

  • From ehsjr@21:1/5 to Liz Tuddenham on Thu Jan 16 16:00:24 2025
    On 1/16/2025 10:16 AM, Liz Tuddenham wrote:
    Don Y <blockedofcourse@foo.invalid> wrote:

    [...]
    Ah. "Can you wait outside (in another room) while I make dinner?" :>

    Tricky in a 5ft x 8ft van conversion.
    < http://www.poppyrecords.co.uk/Van/vanconversion.htm>


    AWESOME!!
    Ed

  • From brian@21:1/5 to JL@gct.com on Thu Jan 16 23:02:28 2025
    In message <rraiojlip0hl9vjaa742fmtq9dpmo2v2l1@4ax.com>, john larkin <JL@gct.com> writes
    On Thu, 16 Jan 2025 15:10:37 +0000, brian <nospam@b-howie.co.uk>
    wrote:

    In message <vm69a2$2h52d$1@dont-email.me>, Don Y
    <blockedofcourse@foo.invalid> writes
    I am surprised (disturbed?) at the problems people seem to have
    sorting these things out in their thought processes. I don't
    *believe* people think strictly "serially" but I am beginning to
    question that belief as I witness smart/capable people stuck in
    that mindset!

    It's simple project engineering and resource management. PERT,
    Critical path analysis, lead times, Gantt charts, even Microsoft
    Project if you must. You must be talking to non-engineers.

    Brian

    The more such management tools that you use, the slower a project will
    go.

    The key observation is that when things are serialized, they happen
    sequentially.


    PERT and the like handle parallel processing paths; in fact, they almost
    force you to do it. When things are concurrent they happen in parallel.

    B .
    --
    Brian Howie

  • From Don Y@21:1/5 to Edward Rawde on Thu Jan 16 16:02:33 2025
    On 1/16/2025 10:57 AM, Edward Rawde wrote:
    Databases can also be backed up as human readable text files.

    To be fair, a spreadsheet's contents can similarly be exported (CSV).
    I'm not sure how the formulae, formats, etc. are handled, though.
    And, any non-text entries (that might be supported).

    Just export data as SQL.
    This gives me peace of mind that whatever software is or isn't available in the future I can always read my data.

    The value of SQL as a "dump" format is that you can *import* it into
    another SQL DBMS -- as long as the recipient supports the same (or
    greater) level of features. And, the import process is nothing more than
    piping the dump file to the recipient DBMS's "console" as it consists
    entirely of SQL commands.
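
    The dump-and-replay round trip can be sketched with Python's sqlite3 module, whose Connection.iterdump() produces the database as a sequence of SQL statements. SQLite stands in here for both the source and the recipient DBMS, and the card table and its row are hypothetical.

```python
import sqlite3

# Source database with a little data in it.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE card (name TEXT NOT NULL, sent INTEGER NOT NULL)")
src.execute("INSERT INTO card VALUES ('Alice', 1)")
src.commit()

# The dump is plain, human-readable text made up entirely of SQL statements.
dump = "\n".join(src.iterdump())

# "Importing" is nothing more than feeding those statements to the recipient.
dst = sqlite3.connect(":memory:")
dst.executescript(dump)

restored = dst.execute("SELECT name, sent FROM card").fetchall()
```

    Moving between different DBMS products generally needs more care than this, since dialects and supported features differ.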

  • From john larkin@21:1/5 to All on Thu Jan 16 15:36:09 2025
    On Thu, 16 Jan 2025 23:02:28 +0000, brian <nospam@b-howie.co.uk>
    wrote:

    In message <rraiojlip0hl9vjaa742fmtq9dpmo2v2l1@4ax.com>, john larkin
    <JL@gct.com> writes
    On Thu, 16 Jan 2025 15:10:37 +0000, brian <nospam@b-howie.co.uk>
    wrote:

    In message <vm69a2$2h52d$1@dont-email.me>, Don Y
    <blockedofcourse@foo.invalid> writes
    I am surprised (disturbed?) at the problems people seem to have
    sorting these things out in their thought processes. I don't
    *believe* people think strictly "serially" but I am beginning to
    question that belief as I witness smart/capable people stuck in
    that mindset!

    It's simple project engineering and resource management. PERT,
    Critical path analysis, lead times, Gantt charts, even Microsoft
    Project if you must. You must be talking to non-engineers.

    Brian

    The more such management tools that you use, the slower a project will
    go.

    The key observation is that when things are serialized, they happen
    sequentially.


    PERT and the like handle parallel processing paths; in fact, they almost
    force you to do it. When things are concurrent they happen in parallel.

    B .

    More software management tools not only slow down a design project,
    you need people to drive the tools too.

  • From Edward Rawde@21:1/5 to Don Y on Thu Jan 16 23:22:48 2025
    "Don Y" <blockedofcourse@foo.invalid> wrote in message news:vmc36h$3mt5f$1@dont-email.me...
    On 1/16/2025 10:57 AM, Edward Rawde wrote:
    Databases can also be backed up as human readable text files.

    To be fair, a spreadsheet's contents can similarly be exported (CSV).
    I'm not sure how the formulae, formats, etc. are handled, though.
    And, any non-text entries (that might be supported).

    In other words exporting a database as SQL is not in any way similar to exporting a spreadsheet as csv.
    You might be better off unzipping the xlsx and parsing the xml if you really want your spreadsheet elsewhere.


    Just export data as SQL.
    This gives me peace of mind that whatever software is or isn't available in the future I can always read my data.

    The value of SQL as a "dump" format is that you can *import* it into
    another SQL DBMS -- as long as the recipient supports the same (or
    greater) level of features. And, the import process is nothing more than
    piping the dump file to the recipient DBMS's "console" as it consists
    entirely of SQL commands.

    I frequently paste an entire SQL dump into the Query tab in HeidiSQL.




  • From Don Y@21:1/5 to Edward Rawde on Thu Jan 16 23:31:33 2025
    On 1/16/2025 9:22 PM, Edward Rawde wrote:
    "Don Y" <blockedofcourse@foo.invalid> wrote in message news:vmc36h$3mt5f$1@dont-email.me...
    On 1/16/2025 10:57 AM, Edward Rawde wrote:
    Databases can also be backed up as human readable text files.

    To be fair, a spreadsheet's contents can similarly be exported (CSV).
    I'm not sure how the formulae, formats, etc. are handled, though.
    And, any non-text entries (that might be supported).

    In other words exporting a database as SQL is not in any way similar to exporting a spreadsheet as csv.

    Of course it is. The difference lies in WHAT you are exporting.
    What does a 3MB photo (stored as a BLOB) look like when exported in SQL?
    Is it any more recognizable in that form?

    You might be better off unzipping the xlsx and parsing the xml if you really want your spreadsheet elsewhere.


    Just export data as SQL.
    This gives me peace of mind that whatever software is or isn't available in the future I can always read my data.

    The value of SQL as a "dump" format is that you can *import* it into
    another SQL DBMS -- as long as the recipient supports the same (or
    greater) level of features. And, the import process is nothing more than
    piping the dump file to the recipient DBMS's "console" as it consists
    entirely of SQL commands.

    I frequently paste an entire SQL dump into the Query tab in HeidiSQL.

    My dumps tend to be too large to cut-and-paste. Instead, I just
    pipe them to the console (or, tell it to "read input from file").

    Of course, this assumes everything will work well. If the shit hits
    the fan when you are 5MB into restoring/moving a 25MB dump, recovery
    can be difficult; how much of the dump was accepted? why did it
    abend? how can I use what HAS been accepted while adding to it
    the portions that have not?
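
    One way to sidestep the "how much was accepted?" question is to make
    the restore all-or-nothing. The sketch below is only an illustration
    and assumes SQLite driven from Python's sqlite3 module; the table,
    the sample rows and the helper names (make_dump, restore) are made up
    for the example. It relies on the fact that the text produced by
    Connection.iterdump() is bracketed by BEGIN TRANSACTION / COMMIT, so
    a failure part-way through can simply be rolled back. Other DBMSs may
    behave differently (some commit implicitly on DDL), so treat this as
    a sketch, not a general recipe.

```python
import sqlite3


def make_dump() -> str:
    """Build a tiny database and return its SQL dump as one string."""
    src = sqlite3.connect(":memory:")
    src.executescript(
        """
        CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        INSERT INTO contacts (name) VALUES ('Alice'), ('Bob');
        """
    )
    # iterdump() yields one SQL statement per item, starting with
    # BEGIN TRANSACTION and ending with COMMIT.
    dump = "\n".join(src.iterdump())
    src.close()
    return dump


def restore(conn: sqlite3.Connection, dump: str) -> bool:
    """Feed a dump to conn; leave conn untouched if any statement fails."""
    try:
        conn.executescript(dump)
        return True
    except sqlite3.Error:
        # The dump's own BEGIN is still open at this point, so a rollback
        # discards everything that had been accepted so far.
        conn.rollback()
        return False


if __name__ == "__main__":
    dump = make_dump()
    target = sqlite3.connect(":memory:")
    print("restored:", restore(target, dump))
```

    With this arrangement there is never a half-restored database to
    reason about: either every statement in the dump was applied or none
    was.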

    That's no worse than a spreadsheet import that abends; what recourse do you have, there? Manually inspecting all the cells?

    The drawback with databases is that there is a non-trivial skillset
    to be learned to make effective and efficient use of them. You not
    only need to learn another programming language (or three), but,
    also, need to think about how the data is organized and stored
    vs. how you will want to access it.

    E.g., why not store everything as a TEXT field? What value the added
    costs of BPCHARs over TEXTs? How to store an imprecise date/time (e.g.,
    if you know a person's birthdate is: "in september", "in 1970", "on
    march 15", etc. but don't have a COMPLETE date to store)

    What form of index(es) should be supported? And, on what field(s)
    (or combinations thereof)?

    Extra credit: what's the best way to store MD5 hashes? And, why?

  • From Liz Tuddenham@21:1/5 to Don Y on Fri Jan 17 09:36:02 2025
    Don Y <blockedofcourse@foo.invalid> wrote:

    On 1/16/2025 8:16 AM, Liz Tuddenham wrote:
    Don Y <blockedofcourse@foo.invalid> wrote:

    On 1/16/2025 5:03 AM, Liz Tuddenham wrote:
    Don Y <blockedofcourse@foo.invalid> wrote:


    ... Especially as so many people seem to use spreadsheets
    in lieu of (real) databases. :<

    As far as I can see, a database is just a spreadsheet that has been
    crippled by having all the cells on each line tied together

    Databases (relational ones) are *so* much more. In addition to
    strict typing on each "column", you can define relationships
    between columns and specific values in columns, writing code
    to enforce constraints so that certain values are not accepted
    in certain places based on other values, etc.

    E.g., I can state that "fertile" can not be true if "sex" is not "female".
    On a per-record basis. So, Bob can never be marked as "fertile" but
    Becky might!

    Or, that the city can not be "chicago" unless the state is "Illinois"
    (I am assuming there are no chicagos in other states; "Springfield"
    tends to be a popular city name -- but, there is no Springfield in
    Alaska so if someone tries to enter an address of Springfield,
    Alaska, it is known to be invalid and shouldn't be accepted.).

    Or, that a social security number must be of the form XXX-##-####
    *and* XXX can't be 000, 333, 666, etc.

    Or, that a mother's birth date must precede those of her biological
    children by at least 5 years but the father's must precede his
    biological children by 9 years.

    Etc.
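
    As a rough illustration of the first of these rules, here is a
    minimal sketch using SQLite through Python's sqlite3 module. The
    table layout, the column names and the try_insert helper are
    assumptions made up for the example, not anyone's actual schema. The
    point is only that the rule lives in the table definition, so the
    DBMS itself refuses the offending record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE people (
        name    TEXT    NOT NULL,
        sex     TEXT    NOT NULL CHECK (sex IN ('female', 'male')),
        fertile INTEGER NOT NULL CHECK (fertile IN (0, 1)),
        -- cross-column rule: "fertile" may only be true for females
        CHECK (fertile = 0 OR sex = 'female')
    )
    """
)


def try_insert(name: str, sex: str, fertile: bool) -> bool:
    """Return True if the DBMS accepted the record, False if it refused."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "INSERT INTO people (name, sex, fertile) VALUES (?, ?, ?)",
                (name, sex, int(fertile)),
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    print(try_insert("Becky", "female", True))
    print(try_insert("Bob", "male", True))
```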

    All of those can be done on a spreadsheet. ...and similar checks can be done between cells in different rows. The check formula is written into
    a 'hidden' cell and the final result is displayed in a 'locked' cell.
    If someone puts in faulty data, the spreadsheet can't stop them but it
    can ensure that the dud data doesn't appear in the output.

    It's a kludge. How do you handle the 302K files on THIS machine?
    Given that a PATHNAME can be thousands of characters in length?
    Tell me where all of the files named "Readme" are located...
    How quickly can you get a result from the tool? Or, how many
    have the hash "2094230953408573847503485034038298028023984"?

    I use spreadsheets for all sorts of things: calculating component
    values, customers and accounts, encoding sensitive information,
    addressing envelopes, uploading invoices to the Web ...etc.

    I use databases (tables/relations) for all of that. Wanna know the
    last time I telephoned XYZ Corporation? Or, who's associated
    with a particular phone number? Or, how many billable hours I
    charged on project XYZ in the week of January 25, 1990 (to
    prepare an invoice)?

    "Tables" are particularly attractive in constraining data. E.g.,
    if a user types in a city name, I can present a list of *potential*
    state names (because I know which city names are valid in each state).
    Or, a list of associated ZIP (postal) codes, from which I could
    determine the state.

    This is particularly valuable with things like speech I/O as it
    is much easier to recognize words in a limited/constrained vocabulary
    than it is to recognize unconstrained speech. "Call <entity>"
    knows that "entity" has to be one of the entities listed in the
    address book -- otherwise, it wouldn't make sense. So, if the
    speech recognizer THINKS the user said "rolf", it would know that
    to be incorrect if there are no "Rolfs" in the address book -- maybe
    "Ralph"?

    I tried using a database to keep track of my Christmas Cards, but find
    it slow and restrictive compared with a spreadsheet.

    "Databases" are just tables. So, you're doing something wrong.

    You have to have a "front-end" that acts on your behalf to manage
    the interface. E.g., I wouldn't try to manually piece together
    C: \ Windows \ notepad.exe
    or:
    C: \ Users \ Don \ Desktop \ apc_1500va.zip \ schem \ page5.png
    by examining each tuple. Instead, I let a query do all of that
    and give me the result AS IF it had been stored there, directly.
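
    To make "let a query do all of that" concrete, here is a minimal
    sketch, assuming each path component is stored as one tuple that
    points at its parent. The nodes table, its columns, the sample rows
    and the find helper are hypothetical, invented for the example; a
    recursive query (SQLite via Python's sqlite3 here) pieces the full
    pathnames together on demand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE nodes (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES nodes(id),  -- NULL marks a root
        name      TEXT NOT NULL
    );
    INSERT INTO nodes (id, parent_id, name) VALUES
        (1, NULL, 'C:'),
        (2, 1,    'Windows'),
        (3, 2,    'notepad.exe'),
        (4, 1,    'Users'),
        (5, 4,    'Don'),
        (6, 5,    'Readme'),
        (7, 2,    'Readme');
    """
)

# Walk from the roots downward, extending the path at every level.
# The first "?" is the separator, the second is the name to look for.
PATHS_BY_NAME = """
    WITH RECURSIVE paths(id, name, path) AS (
        SELECT id, name, name FROM nodes WHERE parent_id IS NULL
        UNION ALL
        SELECT n.id, n.name, p.path || ? || n.name
        FROM nodes AS n JOIN paths AS p ON n.parent_id = p.id
    )
    SELECT path FROM paths WHERE name = ? ORDER BY path
"""


def find(name: str) -> list:
    """Return the full pathname of every entry called `name`."""
    return [row[0] for row in conn.execute(PATHS_BY_NAME, ("\\", name))]


if __name__ == "__main__":
    for path in find("Readme"):
        print(path)
```

    The caller sees complete pathnames as if they had been stored that
    way, while the table itself only ever holds single components.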

    I can do this in real-time as I am scanning a mounted volume
    to verify that every file is "intact" (compute hash, compare to
    stored hash, store date/time of this verification, advance to next
    file) at the speed of the medium. When "done", email a list
    of corrupt/missing files to $USER so they can take action to
    restore them from duplicates on other media (which the database
    can identify based on stored sizes and hashes: "Mount volume
    XYZ so I can restore the following entities...")
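
    The verification pass described above might look roughly like the
    following sketch. Everything named here (the files table, its
    columns, open_catalog, register, verify) is an assumption for
    illustration only; the hash is MD5 simply because that is the hash
    mentioned elsewhere in the thread, and the e-mail step is left out.
    The sketch records a timestamp only for files that pass.

```python
import hashlib
import os
import sqlite3
from datetime import datetime, timezone


def file_md5(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files never have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def open_catalog():
    """Create the hypothetical catalog: one row per file with its hash."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files ("
        " path TEXT PRIMARY KEY,"
        " md5 TEXT NOT NULL,"
        " verified_at TEXT)"
    )
    return conn


def register(conn, path):
    """Store the current hash of `path` as the reference value."""
    conn.execute(
        "INSERT INTO files (path, md5) VALUES (?, ?)", (path, file_md5(path))
    )
    conn.commit()


def verify(conn):
    """Re-hash every cataloged file; return (path, problem) pairs."""
    problems = []
    rows = conn.execute("SELECT path, md5 FROM files ORDER BY path").fetchall()
    for path, stored in rows:
        if not os.path.isfile(path):
            problems.append((path, "missing"))
        elif file_md5(path) != stored:
            problems.append((path, "corrupt"))
        else:
            stamp = datetime.now(timezone.utc).isoformat()
            conn.execute(
                "UPDATE files SET verified_at = ? WHERE path = ?", (stamp, path)
            )
    conn.commit()
    return problems
```

    The list returned by verify() is what would be mailed to the user;
    finding duplicates on other media would be a further query on stored
    sizes and hashes.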

    You appear to be conflating the actual storage systems with the ways of
    getting access to them. All the things that you like about databases
    could equally well be done with spreadsheets - it's just that database
    access software manufacturers have tailored their products to particular
    needs whereas spreadsheets haven't attracted the same attention.

    Fundamentally, spreadsheets are just the same as databases but without
    the constraints of the cells on each line being tied together. To put
    it another way: a spreadsheet is a non-relational database. The
    software to manipulate them, which is what you have been describing, is
    a totally different matter.

    Spreadsheets allow you to move columns and cells around individually -
    which is a very dangerous procedure if the spatial relationship between
    the cells is key to the information being stored. Databases are safer
    in this respect but less versatile if your data is the type which does
    not depend on cell position in a table.


    --
    ~ Liz Tuddenham ~
    (Remove the ".invalid"s and add ".co.uk" to reply)
    www.poppyrecords.co.uk

  • From Edward Rawde@21:1/5 to Don Y on Fri Jan 17 07:19:31 2025
    "Don Y" <blockedofcourse@foo.invalid> wrote in message news:vmctge$3u72f$1@dont-email.me...
    On 1/16/2025 9:22 PM, Edward Rawde wrote:
    "Don Y" <blockedofcourse@foo.invalid> wrote in message news:vmc36h$3mt5f$1@dont-email.me...
    On 1/16/2025 10:57 AM, Edward Rawde wrote:
    Databases can also be backed up as human readable text files.

    To be fair, a spreadsheet's contents can similarly be exported (CSV).
    I'm not sure how the formulae, formats, etc. are handled, though.
    And, any non-text entries (that might be supported).

    In other words exporting a database as SQL is not in any way similar to exporting a spreadsheet as csv.

    Of course it is. The difference lies in WHAT you are exporting.
    What does a 3MB photo (stored as a BLOB) look like when exported in SQL?
    Is it any more recognizable in that form?

    You may as well ask what it looks like when a jpg is opened in a text editor.


    You might be better off unzipping the xlsx and parsing the xml if you really want your spreadsheet elsewhere.


    Just export data as SQL.
    This gives me peace of mind that whatever software is or isn't available in the future I can always read my data.

    The value of SQL as a "dump" format is that you can *import* it into
    another SQL DBMS -- as long as the recipient supports the same (or
    greater) level of features. And, the import process is nothing more than
    piping the dump file to the recipient DBMS's "console" as it consists
    entirely of SQL commands.

    I frequently paste an entire SQL dump into the Query tab in HeidiSQL.

    My dumps tend to be too large to cut-and-paste. Instead, I just
    pipe them to the console (or, tell it to "read input from file").

    Of course, this assumes everything will work well. If the shit hits
    the fan when you are 5MB into restoring/moving a 25MB dump, recovery
    can be difficult; how much of the dump was accepted? why did it
    abend? how can I use what HAS been accepted while adding to it
    the portions that have not?

    That's no worse than a spreadsheet import that abends; what recourse do you have, there? Manually inspecting all the cells?

    The drawback with databases is that there is a non-trivial skillset
    to be learned to make effective and efficient use of them. You not
    only need to learn another programming language (or three), but,
    also, need to think about how the data is organized and stored
    vs. how you will want to access it.

    E.g., why not store everything as a TEXT field? What value the added
    costs of BPCHARs over TEXTs? How to store an imprecise date/time (e.g.,
    if you know a person's birthdate is: "in september", "in 1970", "on
    march 15", etc. but don't have a COMPLETE date to store)

    What form of index(es) should be supported? And, on what field(s)
    (or combinations thereof)?

    Extra credit: what's the best way to store MD5 hashes? And, why?



  • From Don Y@21:1/5 to Edward Rawde on Fri Jan 17 13:04:06 2025
    On 1/17/2025 5:19 AM, Edward Rawde wrote:
    "Don Y" <blockedofcourse@foo.invalid> wrote in message news:vmctge$3u72f$1@dont-email.me...
    On 1/16/2025 9:22 PM, Edward Rawde wrote:
    "Don Y" <blockedofcourse@foo.invalid> wrote in message news:vmc36h$3mt5f$1@dont-email.me...
    On 1/16/2025 10:57 AM, Edward Rawde wrote:
    Databases can also be backed up as human readable text files.

    To be fair, a spreadsheet's contents can similarly be exported (CSV).
    I'm not sure how the formulae, formats, etc. are handled, though.
    And, any non-text entries (that might be supported).

    In other words exporting a database as SQL is not in any way similar to exporting a spreadsheet as csv.

    Of course it is. The difference lies in WHAT you are exporting.
    What does a 3MB photo (stored as a BLOB) look like when exported in SQL?
    Is it any more recognizable in that form?

    You may as well ask what it looks like when a jpg is opened in a text editor.

    You touted the fact that YOU can see your data as human readable files
    (in the event <something> happens to the database software). That
    this is an asset that DBMSs have over spreadsheets.

    My point is that this only partially works as you expect.
    A spreadsheet's contents can be exported in <whatever>-delimited
    form, which makes readable the same sorts of data that are
    "readable" when exported in an SQL dump.

  • From Don Y@21:1/5 to Liz Tuddenham on Fri Jan 17 13:50:46 2025
    On 1/17/2025 2:36 AM, Liz Tuddenham wrote:
    You appear to be conflating the actual storage systems with the ways of getting access to them. All the things that you like about databases
    could equally well be done with spreadsheets

    No, you can't. You would have to change what it means to be
    a spreadsheet by augmenting their functionality to the point where
    they *became* DBMSs.

    - it's just that database
    access software manufacturers have tailored their products to particular needs whereas spreadsheets haven't attracted the same attention.

    Fundamentally, spreadsheets are just the same as databases but without
    the constraints of the cells on each line being tied together. To put
    it another way: a spreadsheet is a non-relational database. The
    software to manipulate them, which is what you have been describing, is
    a totally different matter.

    No. A spreadsheet is just a grid of cells. The data in one cell
    can have entirely different meaning than the data in the cell below
    it.

    I can create 10 small, unrelated "spreadsheets" on a single spreadsheet
    "page". One can occupy the area bounded by A1 and C25 -- 3 columns, 25 rows. Another A40 to D57 (4 columns, 18 rows). Another from B26 to G38
    (6 columns, 13 rows). And, another from E1 to L10 (8 columns, 10 rows).

    I.e., rows 1-10 pass through two DIFFERENT "spreadsheets".
    Column A does, as well. Column B passes through *three*.
    The contents of each of those columns and rows in different
    spreadsheets need not be related or even of the same data
    types.

    Furthermore, within a single of those "(mini)spreadsheets", the
    type of data in one row need not be related to the type of data
    in the row that follows it, EVEN IF PART OF THE SAME SPREADSHEET.

    In a database, rows are *records* that are treated as units (tuples).
    Columns are fields that have a single *defined* datatype that is
    enforced by the DBMS.

    If you want to have the equivalent of mini-databases in a DBMS,
    then you have to create (and define) mini tables (relations/tuples).
    Each will have a row 1 and column 1 -- but, they will be totally
    unrelated in data type and size/length.

    E.g., my Cities table lists all of the names of the cities
    found in any state/territory in the US. The States table
    just lists the names of states/territories. Entirely different
    lengths and meanings to the data. The Localities table
    contains pairs of (city, state) identifiers (the identifiers
    defined and mapped to specific cities/states by those
    respective tables).

    You can TRY to build a spreadsheet that LOOKS like this.
    But, it won't GUARANTEE that the identifiers for the city
    (and state) defined for "locality 1" are valid city (and
    state) identifiers. This doesn't just apply to entering
    the data but, also, modifying it and deleting it!

    E.g., if you try to *delete* the city "Chicago" from the
    city table, the database won't let you -- because there
    is a locality that references it: (ChicagoID, IllinoisID)
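
    A minimal sketch of that guarantee, assuming SQLite via Python's
    sqlite3 module (where foreign-key enforcement has to be switched on
    per connection). The table and column names follow the description
    above, but the exact schema, the sample rows and the attempt helper
    are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
conn.executescript(
    """
    CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE states (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE localities (
        city_id  INTEGER NOT NULL REFERENCES cities(id),
        state_id INTEGER NOT NULL REFERENCES states(id),
        PRIMARY KEY (city_id, state_id)
    );
    INSERT INTO cities VALUES (1, 'Chicago'), (2, 'Springfield');
    INSERT INTO states VALUES (1, 'Illinois'), (2, 'Alaska');
    INSERT INTO localities VALUES (1, 1);  -- (ChicagoID, IllinoisID)
    """
)


def attempt(sql: str, params: tuple = ()) -> bool:
    """Run one statement; report whether the DBMS allowed it."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    # Refused: a locality still references Chicago.
    print(attempt("DELETE FROM cities WHERE name = ?", ("Chicago",)))
    # Refused: there is no city with identifier 999.
    print(attempt("INSERT INTO localities VALUES (?, ?)", (999, 1)))
```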

    I can augment the State table with a "Capital" table;
    it contains the ID of the City that names the capital
    of that state. Again, the DBMS won't let me pick
    a city that doesn't exist in the city table. And,
    it won't let me pick a city that isn't named as a locality
    *in* that state (because that would imply that it was ALSO
    a locality -- why not named as such?)

    The DBMS will enforce rules as to which "roles" can perform
    which actions on which tables.

    It will also ensure that multistep operations all happen
    as an indivisible (atomic) "transaction" so no other
    user (accessing the database at that same instant)
    will ever see "partial results". E.g., adjusting the
    population of locality 5 down by 321 souls to reflect the
    folks who have RELOCATED to locality 88 (you will never
    see 321 *extra* people nor 88 *fewer* people, depending
    on the order in which the *individual* population counts
    were "adjusted".)

    Spreadsheets allow you to move columns and cells around individually -
    which is a very dangerous procedure if the spatial relationship between
    the cells is key to the information being stored. Databases are safer
    in this respect but less versatile if your data is the type which does
    not depend on cell position in a table.

    "Position" is a partial constraint in a DBMS. ALL the data in a
    particular column (field) is of the same type and has the same
    constraints applied to it. You wouldn't want someone to be able
    to move a "birthdate datum" to an "account balance" column (field).
    WHERE the column sits is immaterial, relative to the other columns;
    I can move the birthdate column to column 4 or 8 or 87. But, there
    will never be any "empty/undefined" columns between existing columns!

    Similarly, fields in a row have a RELATIONship to each other.
    The population of locality #1 is specified in THAT row -- not
    in the row associated with locality #33. WHICH "physical"
    row doesn't matter but it is treated as a coherent unit.
    Again, no "empty" rows -- unless they are intentionally
    defined to contain "all NULL" data (which means they are not
    empty but just "full of NULL")

    DATA is of the utmost importance in a database. Its integrity,
    access, "role", etc.

    Spreadsheets are just "scraps of paper" with no constraints on
    what happens, where. Just like one can doodle in the margins
    of a sales receipt -- those doodles having nothing to do with
    the purchase described by that receipt.

    Many spreadsheet products are inherently "visual" -- owing to
    the emphasis the developers placed on that aspect of the UI.
    By contrast, most (serious) DBMSs are not; one relies on
    some other tool to give the user a GUI.

    Because (serious) RDBMSs have a standardized definition language,
    one can build such a tool and apply it to a variety of DBMSs.
    E.g., I can draw an Entity Relationship Diagram (ERD) that
    shows how a set of tables are /(inter)RELATED/. This can
    be used to define -- and link (join) -- those tables in a
    yet-to-be-selected DBMS as their "code generators" implement
    the same mechanisms regardless of RDBMS. With that knowledge,
    the tool can provide a spreadsheet-style "table" to enter
    data -- even if the individual data end up being distributed
    to several different tables, in the process. Or, sort the
    PRESENTATION of the rows in these conjoined tables -- even
    though the actual data in the DBMS isn't "moving".

    Spreadsheets emphasize *these* features, at the expense of the
    data's integrity.

  • From Don Y@21:1/5 to Don Y on Fri Jan 17 13:59:26 2025
    On 1/17/2025 1:50 PM, Don Y wrote:
    It will also ensure that multistep operations all happen
    as an indivisible (atomic) "transaction" so no other
    user (accessing the database at that same instant)
    will ever see "partial results".  E.g., adjusting the
    population of locality 5 down by 321 souls to reflect the
    folks who have RELOCATED to locality 88 (you will never
    see 321 *extra* people nor 88 *fewer* people, depending

    s/88/321/

    on the order in which the *individual* population counts
    were "adjusted".)

    I.e., I can:
    adjust the population of locality 5 down by 321
    THEN, adjust the population of locality 88 *up* by 321
    OR
    adjust the population of locality 88 up by 321
    THEN, adjust the population of locality 5 down by 321

    If <someone> peeks at this data between the first "step"
    and the second, they will either see 321 fewer people
    living in the country (!) or 321 *more* -- when the
    total population hasn't actually changed (as is evident
    before and after the "transaction")
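
    The effect can be sketched with two connections to the same
    database: one performs the two-step adjustment inside a transaction
    while the other "peeks" between the steps. This is only an
    illustration, assuming SQLite in its default journal mode driven
    from Python's sqlite3 module; the table layout, the starting
    populations and the helper names are made up.

```python
import os
import sqlite3
import tempfile


def total(conn):
    """Sum of all populations as seen by this connection."""
    return conn.execute("SELECT SUM(population) FROM localities").fetchall()[0][0]


def demo():
    # A file-backed database so that two connections can share it.
    path = os.path.join(tempfile.mkdtemp(), "census.db")

    writer = sqlite3.connect(path)
    writer.execute(
        "CREATE TABLE localities (id INTEGER PRIMARY KEY,"
        " population INTEGER NOT NULL)"
    )
    writer.executemany(
        "INSERT INTO localities VALUES (?, ?)", [(5, 1000), (88, 2000)]
    )
    writer.commit()

    reader = sqlite3.connect(path)
    before = total(reader)

    # Step 1 opens the writer's transaction; nothing is committed yet.
    writer.execute("UPDATE localities SET population = population - 321 WHERE id = 5")
    during = total(reader)  # the observer "peeks" between the two steps

    # Step 2, then both changes become visible together.
    writer.execute("UPDATE localities SET population = population + 321 WHERE id = 88")
    writer.commit()

    after = total(reader)
    rows = reader.execute("SELECT id, population FROM localities ORDER BY id").fetchall()
    writer.close()
    reader.close()
    return before, during, after, rows


if __name__ == "__main__":
    print(demo())
```

    The observer's total is the same before, during and after the
    transaction; only the per-locality figures change, and they change
    together.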

  • From Don Y@21:1/5 to Don Y on Fri Jan 17 14:08:53 2025
    On 1/17/2025 1:50 PM, Don Y wrote:
    E.g., my Cities table lists all of the names of the cities
    found in any state/territory in the US.  The States table
    just lists the names of states/territories.  Entirely different
    lengths and meanings to the data.  The Localities table
    contains pairs of (city, state) identifiers (the identifiers
    defined and mapped to specific cities/states by those
    respective tables).

    You can TRY to build a spreadsheet that LOOKS like this.
    But, it won't GUARANTEE that the identifiers for the city
    (and state) defined for "locality 1" are valid city (and
    state) identifiers.  This doesn't just apply to entering
    the data but, also, modifying it and deleting it!

    E.g., if you try to *delete* the city "Chicago" from the
    city table, the database won't let you -- because there
    is a locality that references it:  (ChicagoID, IllinoisID)

    The DBMS will *guarantee* (if the schema so declares) that
    no two cities (or states) have the same NAME *or* identifier.
    It will also insist that every name *has* an identifier.

    So, you can't have two Alaskas defined -- with different
    identifiers. Or, two Chicagos.
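
    A minimal sketch of such a declaration, again assuming SQLite via
    Python's sqlite3 module; the states table, the two-letter codes used
    as identifiers and the add_state helper are illustrative assumptions,
    not the actual schema being described.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE states (
        code TEXT NOT NULL PRIMARY KEY,  -- every name must have an identifier
        name TEXT NOT NULL UNIQUE        -- and no name may appear twice
    )
    """
)


def add_state(code, name) -> bool:
    """Return True if the DBMS accepted the row, False if it refused."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute("INSERT INTO states (code, name) VALUES (?, ?)", (code, name))
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    print(add_state("AK", "Alaska"))    # accepted
    print(add_state("AX", "Alaska"))    # refused: a second Alaska
    print(add_state("AK", "Arkansas"))  # refused: identifier already in use
    print(add_state(None, "Texas"))     # refused: a name without an identifier
```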

    How are you going to do that in a spreadsheet?
