• Q finding the best screening methods

    From Cosine@21:1/5 to All on Tue Jun 29 04:47:38 2021
    Hi:

    How do we conduct statistical tests to find the best screening method among a set of methods?

    For example, we have 3 new methods of screening. We tested them on the same group of patients and verified the screening result of each method against a clinical standard method. Can we be sure of finding the best method in the following way?

    1>2 ^ 1>3 -> 1 the best
    1>2 ^ 1<3 -> 3 the best
    1<2 ^ 1>3 -> 2 the best
    1<2 ^ 1<3 ^ 2>3 -> 2 the best
    1<2 ^ 1<3 ^ 2<3 -> 3 the best

    Are there other easier/faster ways to find the best method?
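
    One common way to set this up (only a sketch, and not the only option) is to compare each pair of methods' agreement with the clinical standard on the same patients using McNemar's test, with a correction for the three comparisons. The data and accuracies below are made up purely for illustration.

      from itertools import combinations
      import numpy as np
      from scipy.stats import binomtest

      rng = np.random.default_rng(0)
      n_patients = 200
      # In practice correct[m][i] = 1 if method m agrees with the clinical
      # standard on patient i; here it is simulated with made-up accuracies.
      correct = {m: (rng.random(n_patients) < acc).astype(int)
                 for m, acc in zip((1, 2, 3), (0.90, 0.85, 0.80))}

      alpha = 0.05
      pairs = list(combinations(correct, 2))
      for a, b in pairs:
          n10 = int(np.sum((correct[a] == 1) & (correct[b] == 0)))  # a right, b wrong
          n01 = int(np.sum((correct[a] == 0) & (correct[b] == 1)))  # b right, a wrong
          # Exact McNemar test: under H0 the discordant patients split 50/50.
          p = binomtest(n10, n10 + n01, 0.5).pvalue
          verdict = "differ" if p < alpha / len(pairs) else "no clear difference"
          print(f"method {a} vs {b}: n10={n10}, n01={n01}, p={p:.3f} ({verdict})")

    The same pattern works for comparing sensitivities (restrict to the truly positive patients) or specificities (the truly negative ones).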

  • From Cosine@21:1/5 to All on Wed Jun 30 14:38:58 2021
    Cosine wrote on Tuesday, June 29, 2021 at 7:47:40 PM [UTC+8]:
    Hi:

    How do we conduct statistical tests to find the best screening method among a set of methods?

    For example, we have 3 new methods of screening. We tested them on the same group of patients and verified the screening result of each method against a clinical standard method. Can we be sure of finding the best method in the following way?

    1>2 ^ 1>3 -> 1 the best
    1>2 ^ 1<3 -> 3 the best
    1<2 ^ 1>3 -> 2 the best
    1<2 ^ 1<3 ^ 2>3 -> 2 the best
    1<2 ^ 1<3 ^ 2<3 -> 3 the best

    Are there other easier/faster ways to find the best method?

    A related question is: how do we determine the level of confidence?

    For example, we have methods 1, 2, and 3.

    1>2 with 95% confidence and 1>3 with 90% confidence

    then 1 is the best by logic, but with what statistical confidence?

    Even if we have 1>2 w/ 95% and 1>3 w/ 95%, could we be sure that 1 is the best with 95% confidence? Why?
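
    One standard (if conservative) answer is the Bonferroni/union bound: the chance that at least one of the two pairwise conclusions is wrong is at most the sum of the two error rates, so the joint claim "1 is the best" carries confidence of at least 1 - (0.05 + 0.10) = 0.85 in the first example above, not 90% or 95%. A minimal sketch of the arithmetic:

      # Bonferroni-style lower bound on the joint confidence that method 1
      # beats both 2 and 3, from two separate pairwise results.
      alpha_12 = 0.05   # from "1>2 with 95% confidence"
      alpha_13 = 0.10   # from "1>3 with 90% confidence"
      joint_confidence_lower_bound = 1 - (alpha_12 + alpha_13)
      print(joint_confidence_lower_bound)   # 0.85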

  • From David Duffy@21:1/5 to Cosine on Fri Jul 2 02:46:44 2021
    Cosine <asecant@gmail.com> wrote:
    Cosine wrote on Tuesday, June 29, 2021 at 7:47:40 PM [UTC+8]:
    How do we conduct statistical tests to find the best screening method
    among a set of methods?
    For example, we have 3 new methods of screening. We tested them in
    Even if we have 1>2 w/ 95% and 1>3 w/ 95%, could we be sure that 1 is the best with 95% confidence? Why?

    "Best" depends on the setting - Sens may be more important than Spec for
    a screen, so need to test screen and follow-ups simultaneously versus cost-benefit. Can give likelihood to each ordering, so can say 1-2-3 is 5x
    more likely than 2-1-3.
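
    One way to put numbers on "likelihood of each ordering" (a sketch of my own, with made-up paired data) is to bootstrap the patient-level results and count how often each ordering of the three methods comes out on top:

      from collections import Counter
      import numpy as np

      rng = np.random.default_rng(0)
      n = 200
      # correct[i, m] = 1 if method m+1 agrees with the clinical standard on patient i
      correct = np.column_stack([(rng.random(n) < acc)
                                 for acc in (0.90, 0.85, 0.80)]).astype(int)

      orderings = Counter()
      for _ in range(10_000):
          idx = rng.integers(0, n, size=n)          # resample patients with replacement
          means = correct[idx].mean(axis=0)         # bootstrap accuracy of methods 1..3
          ranking = tuple(np.argsort(-means) + 1)   # best-to-worst ordering
          orderings[ranking] += 1

      for ranking, count in orderings.most_common():
          print(ranking, count / 10_000)

    The ratio of two of those frequencies is the kind of "1-2-3 is 5x more likely than 2-1-3" statement above.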

  • From Rich Ulrich@21:1/5 to davidD@qimr.edu.au on Fri Jul 2 02:18:21 2021
    On Fri, 2 Jul 2021 02:46:44 +0000 (UTC), David Duffy
    <davidD@qimr.edu.au> wrote:

    Cosine <asecant@gmail.com> wrote:
    Cosine wrote on Tuesday, June 29, 2021 at 7:47:40 PM [UTC+8]:
    How do we conduct statistical tests to find the best screening method
    among a set of methods?
    For example, we have 3 new methods of screening. We tested them in
    Even if we have 1>2 w/ 95% and 1>3 w/ 95%, could we be sure that 1 is the best with 95% confidence? Why?

    "Best" depends on the setting - Sens may be more important than Spec for
    a screen, so need to test screen and follow-ups simultaneously versus >cost-benefit. Can give likelihood to each ordering, so can say 1-2-3 is 5x >more likely than 2-1-3.

    The original and followup questions might be a candidate
    for "best combination of MULTIPLE considerations" for
    statistical decision-making.

    Three competitors instead of two.

    "best screening" combines Sens and Spec, with cost-benefit
    (as David notes), along with choice of population to sample.
    And the "cost" can be concrete, in dollars per test, or it can be
    as subjective as the "benefit" by starting out as the projected
    number of cases missed or mistakenly mis-attributed.

    The cost-benefit must incorporate the purpose of the
    decision-making for the particular sample. - "Ideal" screening
    varies between samples with low and high prevalence.
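
    As a toy illustration of that last point (all numbers, including the dollar costs, made up), the method with the lower expected cost per person screened can flip as prevalence changes:

      # Expected cost per person screened, given sensitivity, specificity,
      # prevalence, and made-up costs for the test itself, a missed case,
      # and a false alarm (the follow-up work-up).
      def expected_cost(sens, spec, prevalence,
                        cost_test=10.0, cost_missed_case=5000.0, cost_false_alarm=200.0):
          p_false_negative = prevalence * (1 - sens)
          p_false_positive = (1 - prevalence) * (1 - spec)
          return (cost_test
                  + cost_missed_case * p_false_negative
                  + cost_false_alarm * p_false_positive)

      method_A = dict(sens=0.95, spec=0.80)   # sensitive but not very specific
      method_B = dict(sens=0.80, spec=0.98)   # specific but less sensitive
      for prev in (0.001, 0.05, 0.30):
          print(prev,
                round(expected_cost(prevalence=prev, **method_A), 2),
                round(expected_cost(prevalence=prev, **method_B), 2))
      # At very low prevalence B wins; at higher prevalence A wins.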

    Ranking of results can raise the question of whether 1>2
    and 2>3 always implies 1>3; but you might have skipped that
    complication.

    --
    Rich Ulrich

  • From Cosine@21:1/5 to All on Fri Jul 2 02:45:09 2021
    Rich Ulrich wrote on Friday, July 2, 2021 at 2:18:29 PM [UTC+8]:
    On Fri, 2 Jul 2021 02:46:44 +0000 (UTC), David Duffy
    ...
    Ranking of results can raise the question of whether 1>2
    and 2>3 always implies 1>3; but you might have skipped that
    complication.


    Well, that is also an issue.

    We actually ran tests on samples and obtained the result that 1>2 w/ 95% confidence.

    The same for 2>3 w/ 95%.

    But we did NOT run any test giving an actual result that 1>3 w/ some confidence.

    Does that mean we still need to test whether 1>3 and obtain its statistical confidence?

    Or are there ways to show that 1>3 w/ some confidence based on the results of
    1>2 w/ 95% and 2>3 w/ 90%?

  • From Rich Ulrich@21:1/5 to All on Fri Jul 2 22:29:02 2021
    On Fri, 2 Jul 2021 02:45:09 -0700 (PDT), Cosine <asecant@gmail.com>
    wrote:

    Rich Ulrich wrote on Friday, July 2, 2021 at 2:18:29 PM [UTC+8]:
    On Fri, 2 Jul 2021 02:46:44 +0000 (UTC), David Duffy
    ...
    Ranking of results can raise the question of whether 1>2
    and 2>3 always implies 1>3; but you might have skipped that
    complication.


    Well, that is also an issue.

    No, using CIs was not what I was thinking of. I don't remember the
    details, but there are /some/ complicated comparisons that are not
    transitive when you do the brute comparisons by pairs.
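
    A minimal, made-up illustration of that kind of intransitivity is the classic "intransitive dice" construction: three value distributions where each pairwise "which one tends to be larger" comparison favors a different one, so the pairwise winners cannot be chained into a single ranking.

      from itertools import product

      A = (2, 4, 9)
      B = (1, 6, 8)
      C = (3, 5, 7)

      def p_first_beats_second(x, y):
          # probability that a random draw from x exceeds a random draw from y
          wins = sum(a > b for a, b in product(x, y))
          return wins / (len(x) * len(y))

      print("P(A > B) =", p_first_beats_second(A, B))   # 5/9
      print("P(B > C) =", p_first_beats_second(B, C))   # 5/9
      print("P(C > A) =", p_first_beats_second(C, A))   # 5/9 -- no transitive "best"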

    One awkward scoring that I do recall something about is the
    scoring for women's Olympic Ice Skating. Skaters are ranked
    in each of several events. Those rank-scores are later combined
    (in some fashion... weighting?) to get a final ranking to determine
    a winner. I watched a competition where, at the time that the
    final skater did her final event, it was possible that (IIRC) any of
    the three skaters at the top could end up as #1, #2, or #3.


    We actually ran tests on samples and obtained the result that 1>2 w/ 95% confidence.


    Stating such-and-so "with 95% confidence" is a phrasing that will
    grate on a large number of good statisticians. The parameter
    (or difference) is not the proper object of the "95%"; that describes
    the CI. You can find some classic quotes on this in the Wikipedia
    article at https://en.wikipedia.org/wiki/Confidence_interval , under
    "Misunderstandings". By the way, the article (all in all) could
    benefit from expert re-writing, as mentioned in the head-notes
    by the Wiki overseers.

    The same for 2>3 w/ 95%.

    But we did NOT run any test giving an actual result that 1>3 w/ some confidence.

    Does that mean we still need to test whether 1>3 and obtain its statistical confidence?

    Or are there ways to show that 1>3 w/ some confidence based on the results of
    1>2 w/ 95% and 2>3 w/ 90%?

    I will mention another ranking complication. When you use SNK
    (Student-Newman-Keuls) for "post-hoc" range testing, the formal
    derivation requires that you test from the outside, heading in. If the
    extremes do not differ, you never test the middle value. Of course, the
    SNK tests here use different cutoff values when comparing low to next /
    low to high.
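
    A minimal sketch of that step-down logic, assuming equal group sizes and using SciPy's studentized-range distribution (the helper name snk and the simulated data are mine, purely for illustration):

      import numpy as np
      from scipy.stats import studentized_range

      def snk(groups, alpha=0.05):
          # Student-Newman-Keuls step-down: compare the most extreme means
          # first; only test narrower spans inside a span that was significant.
          k = len(groups)
          n = len(groups[0])                          # equal group sizes assumed
          means = np.array([np.mean(g) for g in groups])
          df = k * (n - 1)
          mse = np.mean([np.var(g, ddof=1) for g in groups])
          se = np.sqrt(mse / n)
          order = np.argsort(means)                   # group indices, low to high

          def test_span(lo, hi):                      # positions in the sorted order
              p = hi - lo + 1
              if p < 2:
                  return
              q_obs = (means[order[hi]] - means[order[lo]]) / se
              q_crit = studentized_range.ppf(1 - alpha, p, df)
              if q_obs < q_crit:
                  # The extremes of this span do not differ, so the values
                  # inside it are never tested.
                  print(f"groups {list(order[lo:hi+1] + 1)}: not significantly different")
                  return
              print(f"group {order[lo]+1} vs group {order[hi]+1}: differ (q = {q_obs:.2f})")
              test_span(lo, hi - 1)                   # step inward from each side
              test_span(lo + 1, hi)

          test_span(0, k - 1)

      rng = np.random.default_rng(0)
      snk([rng.normal(mu, 1.0, size=20) for mu in (0.0, 0.2, 1.0)])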

    If you are "merely" using several two-group tests, then here is a
    place where paradoxes might seem to arise: two-group tests, with
    extreme differences in variance, and groups of vastly different size.
    Oh, and when there are "paired" measurements, your correlations
    may differ and that can have consequences.

    If your question gets reduced to the question of how /this/
    test behaves, comparing A to B, B to C, and inferring A vs C:
    you probably can set limits showing, for your question above,
    that A has to differ from C (for that test), even when using
    the "p-level" as an effect-size indicator. The demonstration may
    be different for "pooled variance" tests and "separate variance"
    tests.
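
    For what it's worth, those two flavors are the pooled-variance t-test and the Welch (separate-variance) test; a made-up example where unequal variances and unequal group sizes make them disagree noticeably:

      import numpy as np
      from scipy.stats import ttest_ind

      rng = np.random.default_rng(0)
      a = rng.normal(0.0, 1.0, size=10)     # small group, small variance
      c = rng.normal(0.8, 5.0, size=200)    # large group, large variance

      print("pooled variance:           p =", ttest_ind(a, c, equal_var=True).pvalue)
      print("separate variance (Welch): p =", ttest_ind(a, c, equal_var=False).pvalue)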

    --
    Rich Ulrich
