• Q direct observation of statistical comparison

    From Cosine@21:1/5 to All on Mon Jun 12 08:40:39 2023
    Hi:

    A formal way to determine whether the effect of one random variable is greater than that of another is to perform a hypothesis test on the difference or ratio of the metric and check whether the result is significant.

    However, are there special cases in which one could determine whether the effect of a random variable is greater than that of another without performing the above formal procedure?

    For example, when comparing the salaries of a domestic group and a foreign group, suppose the average salaries and their associated standard errors are (Avg_d, Se_d) and (Avg_f, Se_f). Could we quickly answer the question of which salary is greater by directly inspecting these numbers? Say, for instance, that the confidence intervals of the two average salaries overlap greatly.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rich Ulrich@21:1/5 to All on Mon Jun 12 12:24:53 2023
    Hmm. You say "effect" a couple of times, suggesting
    something more complicated, before you ask about means.

    Means and their SDs are the basis of ordinary t-tests.
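As a rough illustration of how the two means and their standard errors feed a test, the sketch below computes the difference of means, its standard error, and a two-sided p-value. It assumes independent groups and samples large enough for a normal approximation. The function name `compare_means` and the salary figures are made up for the example. It also reports whether the two 95% confidence intervals overlap, since overlapping intervals do not by themselves show that the difference is non-significant.

```python
import math

def compare_means(avg_d, se_d, avg_f, se_f, z_crit=1.96):
    """Normal-approximation comparison of two independent means.

    Hypothetical helper: takes each group's mean and standard error.
    """
    diff = avg_d - avg_f
    se_diff = math.sqrt(se_d ** 2 + se_f ** 2)  # SE of the difference
    z = diff / se_diff
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    # Confidence interval for each mean (95% when z_crit is 1.96)
    ci_d = (avg_d - z_crit * se_d, avg_d + z_crit * se_d)
    ci_f = (avg_f - z_crit * se_f, avg_f + z_crit * se_f)
    overlap = ci_d[0] <= ci_f[1] and ci_f[0] <= ci_d[1]
    return {"diff": diff, "z": z, "p": p_two_sided, "ci_overlap": overlap}

# Made-up figures (in thousands): the intervals overlap, yet p is below 0.05.
result = compare_means(50.0, 1.0, 53.0, 1.0)
print(result)
```

With small samples, a t distribution with suitable degrees of freedom would replace the normal approximation used here.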

    "Directly observing" the data? Do you want something like this?

    https://www.qimacros.com/hypothesis-testing/tukey-quick-test-excel/
    Tukey's Quick Test can be used when:

    There are two unpaired samples of similar size that overlap each
    other. Ratio of sizes should not exceed 4:3.
    One sample contains the highest value, the other sample contains
    the lowest value. One sample cannot contain both the highest and the
    lowest value, nor can both samples have the same high or low value.

    By adding the counts of the number of unmatched points on either end,
    one can determine the 5%, 1% and 0.1% critical values as roughly 7,
    10, and 13 points.
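The counting rule quoted above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the cutoffs 7, 10 and 13 are the rough values quoted above, and returning a count of 0 when the conditions fail is one possible convention. The 4:3 size-ratio condition is not checked.

```python
def end_count(a, b):
    """Count the non-overlapping points at both ends of two samples.

    Returns 0 when the quick test does not apply: the extremes are tied,
    or one sample holds both the highest and the lowest value.
    """
    hi_a, hi_b = max(a), max(b)
    lo_a, lo_b = min(a), min(b)
    if hi_a == hi_b or lo_a == lo_b:
        return 0
    if (hi_a > hi_b) != (lo_a > lo_b):
        return 0  # one sample contains both extremes
    high, low = (a, b) if hi_a > hi_b else (b, a)
    upper = sum(1 for x in high if x > max(low))  # points above the other sample
    lower = sum(1 for x in low if x < min(high))  # points below the other sample
    return upper + lower

def tukey_quick_test(a, b):
    """Return (count, level), where level is the smallest quoted level reached."""
    count = end_count(a, b)
    if count >= 13:
        level = 0.001
    elif count >= 10:
        level = 0.01
    elif count >= 7:
        level = 0.05
    else:
        level = None
    return count, level

# Made-up data: four points above and four below the overlap region.
group_a = [5, 6, 7, 8, 9, 10, 11, 12]
group_b = [1, 2, 3, 4, 5.5, 6.5, 7.5, 8.5]
print(tukey_quick_test(group_a, group_b))
```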

    IIRC, the textbook that first showed me this test quoted Tukey
    exactly. Tukey described the test AND its critical values in two
    sentences. I was disappointed, a few years later, when I saw
    that the newer edition of the textbook had dropped the topic.


    If you want a full test on ranks, editors will prefer the K-S test
    on ranks.

    --
    Rich Ulrich

  • From Rich Ulrich@21:1/5 to rich.ulrich@comcast.net on Wed Jun 14 00:43:08 2023
    By the way -- I remembered the Tukey Quick Test because I
    kept it in mind and used it a number of times, for my own
    confirmation when browsing data.

    I've seen a textbook (I forget whose) that had an appendix
    with different cutoffs for various pairs of sample Ns. But I
    would not suggest trying to publish something relying on it.

    I speculate that the "4:3" ratio of Ns (mentioned above) is a
    pretty good match to where the cutoffs are exact.

    Tukey's two sentences did not specify the ratio of sample sizes,
    and called it 'approximate'.

    --
    Rich Ulrich

  • From Bruce Weaver@21:1/5 to Rich Ulrich on Wed Jun 28 11:27:08 2023
    I don't recall hearing about this test before. Apparently, it is sometimes called the Tukey-Duckworth (quick) test.

    https://en.wikipedia.org/wiki/Tukey%E2%80%93Duckworth_test


  • From Rich Ulrich@21:1/5 to bweaver@lakeheadu.ca on Thu Jun 29 00:15:16 2023
    Top-posting? Okay.

    Okay. The Wikipedia article adds that Duckworth requested a simple
    test, usable in the field, and this is what Tukey provided. I would
    not be surprised if he gave us some other Quick tests -- so someone
    added Duckworth's name?

    Tukey was a prolific statistician, with a different perspective from
    most of us. I gained useful insights from reading his textbooks,
    though I still wonder if they are 'simple' enough to be used in
    the intro courses they are written for. I think I got much of my
    perspective on the proper use of transformations from his chapters
    on the subject.

    There is some paper on presenting data with useful graphics (if I
    recall the topic rightly) which lists Tukey, whose ideas it presented,
    as author #9; a statistician friend said that his professors had
    referred to it as "et al. and Tukey".




    --
    Rich Ulrich

  • From David Jones@21:1/5 to Rich Ulrich on Thu Jun 29 11:56:53 2023
    A problem seems to be in "One sample cannot contain both the highest
    and the lowest value, nor can both samples have the same high or low
    value."

    Is a test a test, if you can't always apply it? Is there some action
    advised if the test can't be applied?

  • From Rich Ulrich@21:1/5 to dajhawkxx@nowherel.com on Fri Jun 30 14:09:05 2023
    On Thu, 29 Jun 2023 11:56:53 -0000 (UTC), "David Jones" <dajhawkxx@nowherel.com> wrote:

    > A problem seems to be in "One sample cannot contain both the highest
    > and the lowest value, nor can both samples have the same high or low
    > value."
    >
    > Is a test a test, if you can't always apply it?

    A philosophical question? "Can't" or "shouldn't, because there
    is no power or useful table of p-values"?

    Pragmatically -- If I have a computer program for it, my program
    will give SOME answer. The table of p-values must be a problem,
    but the program can return '0' for the sum of counts as a safe
    answer when there's a doubt. I wonder how robust the Quick test is
    when the data are discrete and (therefore) can have a tie at one end,
    while the other end can be counted? I don't know whether the test is
    robust to violations of that no-ties assumption. Monte Carlo
    randomization on all the data values could provide an ad-hoc
    assessment of p.
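The Monte Carlo idea can be sketched as a permutation test on the end count. This is only an illustration under stated assumptions: the function names are hypothetical, the count is set to 0 when the extremes are tied or one sample holds both of them (the "safe answer" convention mentioned above), and the number of shuffles and the seed are arbitrary choices.

```python
import random

def end_count(a, b):
    """Tukey-style end count; 0 when extremes tie or one sample holds both."""
    hi_a, hi_b, lo_a, lo_b = max(a), max(b), min(a), min(b)
    if hi_a == hi_b or lo_a == lo_b or (hi_a > hi_b) != (lo_a > lo_b):
        return 0
    high, low = (a, b) if hi_a > hi_b else (b, a)
    upper = sum(1 for x in high if x > max(low))
    lower = sum(1 for x in low if x < min(high))
    return upper + lower

def permutation_p(a, b, n_perm=2000, seed=0):
    """Estimate P(count >= observed) by reshuffling the pooled values."""
    rng = random.Random(seed)
    observed = end_count(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Split the shuffled pool into groups of the original sizes
        if end_count(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return observed, (hits + 1) / (n_perm + 1)

# Made-up discrete data with repeated values
obs, p = permutation_p([3, 4, 4, 5, 6, 6, 7, 8], [1, 1, 2, 2, 3, 4, 4, 5])
print(obs, p)
```

Because the shuffling uses the actual data values, ties in discrete data are handled the same way in every reshuffle, so no separate table of cutoffs is needed.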

    Assumptions?
    The K-S rank test as a test for location has the ASSUMPTION that
    the distributions are otherwise similar and differ by the location
    parameter. When variances are vastly different, the K-S test can
    'reject' in either direction, depending on which end the counting
    starts from.

    No Power?
    I've seen a lot of t-tests and contingency tables computed when
    the power is virtually nil. For contingency tables and 'exact' tests,
    the power for alpha = 0.05 might be exactly nil, for Ns too small.

    I have told consultees, "You don't really have a test there, because
    the N is too small."
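A small worked example of "exactly nil" power, assuming a 2x2 table analysed with Fisher's exact test and two groups of equal size n: the most extreme table (complete separation, every success in one group) has one-sided probability 1/C(2n, n), and by symmetry the two-sided p-value is twice that. The function name below is hypothetical.

```python
from math import comb

def smallest_two_sided_p(n):
    """Two-sided exact p for the completely separated 2x2 table, n per group."""
    one_sided = 1 / comb(2 * n, n)  # hypergeometric probability of that table
    return 2 * one_sided

for n in (2, 3, 4, 5):
    p_min = smallest_two_sided_p(n)
    verdict = "can reach 0.05" if p_min < 0.05 else "can never reach 0.05"
    print(n, round(p_min, 4), verdict)
```

With 3 per group the smallest attainable two-sided p-value is 2/20 = 0.10, so no outcome at all can reject at alpha = 0.05; with 4 per group it is 2/70, about 0.029.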

    > Is there some action advised if the test can't be applied?

    Use a test with other assumptions?

    --
    Rich Ulrich


    You are going to be a stickler about assumptions and the
    table of p-values?
