• Q: comparing two groups in the same or different publications

  • From Cosine@21:1/5 to All on Sat Oct 9 17:08:53 2021
    Hi:

    Suppose we did a study in which we tested the effects of drugs A and B, and of a placebo, in treating disease Z. We could use a two-sample t-test on the outcomes for A and B to see whether the difference between the two drugs is statistically
    significant. The formula requires only the sample means, standard errors, and sample sizes of the two drug groups.

    Now, suppose we found another study that tested the effects of drugs C and D, and of a placebo, in treating disease Z. Could we determine whether there are differences between drug A and drug C, and between drug A and drug D, in treating disease Z, again
    by using the t-test statistic, given only that summary information and not the raw data?
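
    For concreteness, here is a minimal Python sketch of the kind of calculation I have in mind: a Welch-style two-sample t-test computed only from the published summaries (the function name and all numbers are made up for illustration):

    # Welch two-sample t-test computed from summary statistics only (no raw data).
    from scipy import stats

    def welch_t_from_summaries(mean1, se1, n1, mean2, se2, n2):
        """Return (t, df, two-sided p) using only group means, standard errors, and sizes."""
        t = (mean1 - mean2) / (se1**2 + se2**2) ** 0.5
        # Welch-Satterthwaite approximation to the degrees of freedom
        df = (se1**2 + se2**2) ** 2 / (se1**4 / (n1 - 1) + se2**4 / (n2 - 1))
        p = 2 * stats.t.sf(abs(t), df)
        return t, df, p

    # hypothetical summaries -- drug A: mean 5.2, SE 0.4, n 50; drug B: mean 4.1, SE 0.5, n 48
    print(welch_t_from_summaries(5.2, 0.4, 50, 4.1, 0.5, 48))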

    Thank you,

  • From David Duffy@21:1/5 to Cosine on Sun Oct 10 00:50:21 2021
    Cosine <asecant@gmail.com> wrote:
    > Suppose we did a study in which we tested the effects of drugs A and B
    > Now, suppose we found another study that tested the effects of drugs C and D
    See "network meta-analysis".

  • From Cosine@21:1/5 to All on Sun Oct 10 05:23:11 2021
    What if the purpose is to compare drug A published in paper 1 with drug B published in paper 2, and so on?

    Could we again use the t-test for comparing the data from different papers?

  • From Rich Ulrich@21:1/5 to All on Sun Oct 10 14:14:19 2021
    On Sat, 9 Oct 2021 17:08:53 -0700 (PDT), Cosine <asecant@gmail.com>
    wrote:

    > Hi:
    >
    > Suppose we did a study in which we tested the effects of drugs A and B, and of a placebo, in treating disease Z. We could use a two-sample t-test on the outcomes for A and B to see whether the difference between the two drugs is statistically
    > significant. The formula requires only the sample means, standard errors, and sample sizes of the two drug groups.
    >
    > Now, suppose we found another study that tested the effects of drugs C and D, and of a placebo, in treating disease Z. Could we determine whether there are differences between drug A and drug C, and between drug A and drug D, in treating disease Z, again
    > by using the t-test statistic, given only that summary information and not the raw data?


    Most studies only test ONE drug against placebo. They
    care about one drug, and they want all their "power" to
    go to that comparison.

    For the purpose of your question, comparing A to C
    (or to D), you would be looking at the performance
    of each drug in comparison to placebo.

    Describing the studies as having "two drugs" is a red
    herring, or it is a non-informative complication.

    Here is a modern form of your question, of current interest --

    If one Covid vaccine shows 95% protection in its main study
    and another vaccine shows 90% protection in its study, can
    we conclude that the first is better than the second? What
    about, compared to 80%?

    Well, as a mechanical proposition, we certainly can take the
    estimates and their SEs and generate a test. But we KNOW
    that the samples differed (location; age/sex/ethnicity?). If they
    were in a different time frame (or, even if not), maybe they
    were tested against a different dominant mutation of the virus.
    The instructions for case-ascertainment may have differed.
    And so on.
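
    The "mechanical" test mentioned above might look something like this sketch in Python, comparing the two reported efficacies on the log relative-risk scale and recovering each SE from the published 95% CI (every number below is invented for illustration):

    # Compare two reported vaccine efficacies on the log relative-risk scale,
    # recovering each SE from the published 95% confidence interval (VE = 1 - RR).
    import math
    from scipy import stats

    def log_rr_and_se(ve, ve_lo, ve_hi):
        """Point estimate and SE of log(RR) from an efficacy and its 95% CI."""
        rr, rr_hi, rr_lo = 1 - ve, 1 - ve_lo, 1 - ve_hi   # the CI flips when converting
        se = (math.log(rr_hi) - math.log(rr_lo)) / (2 * 1.96)
        return math.log(rr), se

    def compare_efficacies(ve1, lo1, hi1, ve2, lo2, hi2):
        l1, s1 = log_rr_and_se(ve1, lo1, hi1)
        l2, s2 = log_rr_and_se(ve2, lo2, hi2)
        z = (l1 - l2) / (s1**2 + s2**2) ** 0.5
        return z, 2 * stats.norm.sf(abs(z))

    # invented intervals: 95% (90-97%) vs 90% (85-94%)
    print(compare_efficacies(0.95, 0.90, 0.97, 0.90, 0.85, 0.94))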

    95% vs 90% is based on small enough numbers that, if p < 0.05,
    it probably is not p < 0.001 (or better). So that "tested" difference
    is unpersuasive. We /know/ that uncontrolled factors /exist/
    and thus could be responsible. For establishing that one is better,
    a test is necessary but not sufficient. We would have heard more
    if one of the vaccines had come in at only (say) 75%, which,
    a priori, before the studies and based on flu vaccines, did not seem
    like a terrible efficacy.

    We want to see an "effect size" large enough that it is unlikely
    to have happened by chance. If those "confounding factors"
    seem small, or if they exist such that they would bias /against/
    the better-performing drug, then a test on the difference
    showing a bigger effect can be a bit persuasive. There are
    all those (educated) readers whom you have to convince.

    For Covid, they seem to use all three obvious criteria --
    getting symptoms, getting hospitalized, dying. A vaccine
    does look better if it looks better on all three criteria.
    Performance in whole populations (states, countries) also
    washes out the idiosyncrasies of the original studies.

    --
    Rich Ulrich

  • From Cosine@21:1/5 to All on Sun Oct 10 12:37:51 2021
    Let's try the case of developing a new AI algorithm to help screen for, detect, or diagnose a disease, e.g., CoVid-19. The algorithm could use medical images as input, or any other relevant information.

    Now we face the question of comparing the performances of different algorithms. As standard practice, we would need to compare the newly developed algorithm against the state-of-the-art algorithms. We could implement those published algorithms
    and then compare them with the new one on the same dataset we have. A more convenient alternative is to compare the performance of the new algorithm with the results published in other papers, which were obtained on other datasets. Could we perform
    the second approach using the t-test, or what else should we use?

  • From Rich Ulrich@21:1/5 to All on Sun Oct 10 19:03:45 2021
    On Sun, 10 Oct 2021 12:37:51 -0700 (PDT), Cosine <asecant@gmail.com>
    wrote:


    > Let's try the case of developing a new AI algorithm to help screen for, detect, or diagnose a disease, e.g., CoVid-19. The algorithm
    > could use medical images as input, or any other relevant
    > information.

    The picture of the lung is relatively specific. But Covid reportedly
    affects a whole slew of systems. I wonder how many of them are
    easy to examine and compare.


    > Now we face the question of comparing the performances of
    > different algorithms. As standard practice, we would need to compare
    > the newly developed algorithm against the state-of-the-art algorithms.
    > We could implement those published algorithms and then compare them
    > with the new one on the same dataset we have.

    Yes - I think that any "algorithm" approach will always apply all
    algorithms to the same data. There is ENORMOUSLY more power
    in doing the "paired" comparisons than in comparing to something
    derived from some other set of data, no matter how well defined
    its sampling is. Presumably, you look at sensitivity and
    specificity, and have to make some judgment on the cases where
    two algorithms disagree (which is not possible with two separate samplings).

    "Gold standards" of dx may figure in, somewhere.

    > A more convenient
    > alternative is to compare the performance of the new algorithm
    > with the results published in other papers, which were obtained on
    > other datasets. Could we perform the second approach using the
    > t-test, or what else should we use?

    What do you imagine comparing, for two different samples and
    two different algorithms?
    If they come up with different rates of disease, you won't know
    why.

    --
    Rich Ulrich

  • From Cosine@21:1/5 to All on Mon Oct 11 08:00:33 2021
    Let's clarify some points about AI algorithms based on datasets of patient images.

    A general pattern in this kind of research is: a new algorithm is proposed and its performance is investigated, e.g., its sensitivity or specificity. This is done by comparing the AI results against a gold standard, e.g., the PCR test or something
    else. In addition, the paper will also present the results of other published AI algorithms to show that the proposed one is better.

    If the paper implemented the published algorithms, then the standard t-test for the difference between the results is performed. However, sometimes a paper chooses to compare its own results with the results published in other papers. Apparently,
    one cannot directly compare the sensitivity/specificity of the proposed algorithm with those reported in other published papers. How do we formally do this comparison, then?
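
    The only "mechanical" comparison I can think of is an unpooled two-proportion z-test on the reported sensitivities, which needs only each paper's true-positive count and number of diseased cases; a Python sketch with invented counts:

    # Unpooled two-proportion z-test on two reported sensitivities
    # (true-positive counts among the diseased cases of each dataset).
    from scipy import stats

    def compare_proportions(k1, n1, k2, n2):
        """k = true positives, n = diseased cases; returns (difference, z, two-sided p)."""
        p1, p2 = k1 / n1, k2 / n2
        se = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
        z = (p1 - p2) / se
        return p1 - p2, z, 2 * stats.norm.sf(abs(z))

    # invented counts: 88/100 detected on our dataset vs 820/1000 reported in another paper
    print(compare_proportions(88, 100, 820, 1000))

    But is that kind of cross-dataset test statistically defensible?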

    A sad truth is that, for CoVid-19, publicly available large datasets of patient images are still scarce. Maybe this is why some papers choose to compare the results of their proposed algorithm, obtained on a small-to-medium dataset, with the
    results of a published paper based on a large dataset.

  • From Rich Ulrich@21:1/5 to All on Tue Oct 12 12:49:08 2021
    On Mon, 11 Oct 2021 08:00:33 -0700 (PDT), Cosine <asecant@gmail.com>
    wrote:

    > Let's clarify some points about AI algorithms based on datasets of patient images.
    >
    > A general pattern in this kind of research is: a new algorithm is
    > proposed and its performance is investigated, e.g., its sensitivity or specificity. This is done by comparing the AI results against a
    > gold standard, e.g., the PCR test or something else. In addition,
    > the paper will also present the results of other published AI algorithms to show that the proposed one is better.

    Sensitivity/specificity go hand in hand. There is a whole curve to
    compare. The test that is best at one extreme may not be best
    at the other. One Covid-antigen survey in California, mid-2020,
    used two different cut-offs for "yes, this person has been infected"
    - depending on the base-rate of illness in that region. The final
    estimates of disease prevalence made efforts (applied formulas)
    to account for false-positives and false-negatives in the raw data.
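
    Formulas of that kind are usually some variant of the Rogan-Gladen correction; a tiny Python sketch with made-up numbers:

    # Rogan-Gladen-style correction of a raw positive rate for test error rates.
    def corrected_prevalence(apparent, sensitivity, specificity):
        """Adjust the apparent prevalence for false positives and false negatives."""
        return (apparent + specificity - 1) / (sensitivity + specificity - 1)

    # made-up numbers: 4% raw positives from a test that is 85% sensitive, 99.5% specific
    print(corrected_prevalence(0.04, 0.85, 0.995))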


    > If the paper implemented the published algorithms, then the
    > standard t-test for the difference between the results is
    > performed.

    - paired tests - Good power, and no question about "sample"
    differences.

    > However, sometimes a paper chooses to compare its own
    > results with the results published in other papers. Apparently, one
    > cannot directly compare the sensitivity/specificity of the proposed
    > algorithm with those reported in other published papers. How do we formally
    > do this comparison, then?

    You write, "One cannot directly [do A]... How do we formally [do A]?"

    As I wrote last time: You can do the test. Then you have to argue
    that your "significant" effect is large enough that it would be robust
    against the likely or possible /confounding/ differences between
    samples.

    Your best chance of that is when the potential replacement is
    tested in conditions that provide /lower/ expectations of good
    outcome.


    > A sad truth is that, for CoVid-19, publicly available
    > large datasets of patient images are still scarce. Maybe this is why
    > some papers choose to compare the results of their proposed
    > algorithm, obtained on a small-to-medium dataset, with the results of a published paper based on a large dataset.

    Exploratory work. "We think we have a good competitor" because
    it is cheaper and uses better science.

    --
    Rich Ulrich

  • From Cosine@21:1/5 to All on Tue Oct 12 11:51:15 2021
    On Wednesday, October 13, 2021 at 12:49:13 AM UTC+8, Rich Ulrich wrote:
    > On Mon, 11 Oct 2021 08:00:33 -0700 (PDT), Cosine
    > wrote:
    > ....
    >> However, sometimes a paper chooses to compare its own
    >> results with the results published in other papers. Apparently, one
    >> cannot directly compare the sensitivity/specificity of the proposed algorithm with those reported in other published papers. How do we formally
    >> do this comparison, then?
    > You write, "One cannot directly [do A]... How do we formally [do A]?"
    >
    > As I wrote last time: You can do the test. Then you have to argue
    > that your "significant" effect is large enough that it would be robust against the likely or possible /confounding/ differences between
    > samples.

    By "we cannot directly compare ..." I meant that we cannot compare directly mu1 > mu2
    and then claim that algorithm-1 performs better. However, if the other paper provided mu2, SE2,
    and n2 (sample number,) we should be able to use this information to calculate the statistical
    significance of the random variable (mu1-mu2) by using the t-test, since the formula of the t-test
    used only those three variables of the two samples: mu, SE, and n to form a new random variable.
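
    To make that concrete: since SE = s/sqrt(n), the sample standard deviation can be recovered from the published SE and n, so even the classic pooled-variance t-test is computable from the summaries alone. A Python sketch with hypothetical numbers:

    # Pooled (Student) two-sample t-test recovered from (mu, SE, n) only;
    # the sample SD comes back from the SE via s = SE * sqrt(n).
    from scipy import stats

    def pooled_t_from_summaries(mu1, se1, n1, mu2, se2, n2):
        """Return (t, df, two-sided p) for the pooled-variance t-test."""
        var1, var2 = se1**2 * n1, se2**2 * n2                        # sample variances
        sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)    # pooled variance
        t = (mu1 - mu2) / (sp2 * (1 / n1 + 1 / n2)) ** 0.5
        df = n1 + n2 - 2
        return t, df, 2 * stats.t.sf(abs(t), df)

    # hypothetical summaries: our algorithm (mu 0.88, SE 0.02, n 100)
    # vs the published one (mu 0.82, SE 0.015, n 1000)
    print(pooled_t_from_summaries(0.88, 0.02, 100, 0.82, 0.015, 1000))

    (SciPy's stats.ttest_ind_from_stats does the same summary-statistics calculation directly, for either pooled or Welch variances.)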

  • From Rich Ulrich@21:1/5 to All on Thu Oct 14 13:59:58 2021
    On Tue, 12 Oct 2021 11:51:15 -0700 (PDT), Cosine <asecant@gmail.com>
    wrote:

    > On Wednesday, October 13, 2021 at 12:49:13 AM UTC+8, Rich Ulrich wrote:
    >> On Mon, 11 Oct 2021 08:00:33 -0700 (PDT), Cosine
    >> wrote:
    > ....
    >>> However, sometimes a paper chooses to compare its own
    >>> results with the results published in other papers. Apparently, one
    >>> cannot directly compare the sensitivity/specificity of the proposed
    >>> algorithm with those reported in other published papers. How do we formally
    >>> do this comparison, then?
    >> You write, "One cannot directly [do A]... How do we formally [do A]?"
    >>
    >> As I wrote last time: You can do the test. Then you have to argue
    >> that your "significant" effect is large enough that it would be robust
    >> against the likely or possible /confounding/ differences between
    >> samples.
    >
    > By "we cannot directly compare ..." I meant that we cannot simply observe mu1 > mu2
    > and then claim that algorithm 1 performs better. However, if the other paper provided mu2, SE2,
    > and n2 (the sample size), we should be able to use this information to test the statistical
    > significance of the difference (mu1 - mu2) with the t-test, since the t-test formula
    > uses only those three quantities from each sample: mu, SE, and n.


    Okay, "directly" meant "with no test".

    Do keep in mind my warning: you have to argue that your "significant"
    effect is large enough that it would be robust against the likely or
    possible /confounding/ differences between samples.


    --
    Rich Ulrich
