Hi:

Suppose we did a study. In this study, we tested the effects of drugs A and B and of the placebo to treat the disease Z. We could use the t-test statistic of the random variables A and B to see if the difference between the two drugs is statistically significant. The formula requires the sample means, standard errors, and sample sizes of the two samples for drugs A and B.

Now, suppose we found another study that tested the effects of drugs C and D and of the placebo to treat the disease Z. Could we determine whether there are differences in treating disease Z between drugs A and C and between drugs A and D, again by using the t-test statistic, given only that summary information but not the raw data?

See "network meta-analysis".
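The comparison the question describes, a two-sample (Welch) t-test computed from each study's reported mean, standard error, and sample size, can be sketched as follows. The numbers are hypothetical, purely for illustration:

```python
import math

def welch_t_from_summary(mean1, se1, n1, mean2, se2, n2):
    """Welch's two-sample t-test from summary statistics only.

    mean1, mean2 -- reported sample means
    se1, se2     -- reported standard errors of those means
    n1, n2       -- sample sizes
    Returns the t statistic and the Welch-Satterthwaite degrees of freedom.
    """
    t = (mean1 - mean2) / math.sqrt(se1 ** 2 + se2 ** 2)
    # Welch-Satterthwaite approximation; since se**2 = s**2 / n, it can be
    # written entirely in terms of the reported standard errors
    df = (se1 ** 2 + se2 ** 2) ** 2 / (
        se1 ** 4 / (n1 - 1) + se2 ** 4 / (n2 - 1)
    )
    return t, df

# Hypothetical summary statistics from two published studies:
t, df = welch_t_from_summary(mean1=12.3, se1=0.8, n1=40,
                             mean2=10.1, se2=0.9, n2=35)
```

With these made-up numbers, t is about 1.83 with roughly 70 degrees of freedom, below the two-sided 5% critical value (about 2.0), so the difference would not be declared significant.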
Let's consider the case of developing a new AI algorithm to help screen for, detect, or diagnose a disease, e.g., COVID-19. The algorithm could use medical images as input, or any other relevant information.
Now we would face the question of comparing the performances of
different algorithms. As a standard practice, we would need to compare
the newly developed algorithm against the state-of-the-art algorithms.
We could implement those published algorithms and then compare them
with the new one using the same dataset we have.
A more convenient alternative is to compare the performance of the new algorithm with the results reported in the published papers, which were obtained on other datasets. Could we carry out this second approach using the t-test, or what else should we use?
Let's clarify some points about AI algorithms based on datasets of patient images. A general pattern of this kind of research is: a new algorithm is proposed and its performance is investigated, e.g., its sensitivity or specificity. This is done by comparing the AI results against a gold standard, e.g., the PCR test or something else. In addition, the paper will also present the results of other published AI algorithms to show that the proposed one is better.
If the paper implemented the published algorithms itself, then the standard t-test for the difference of the random variables can be performed, since all algorithms are evaluated on the same dataset.
However, sometimes, the paper chose to compare its own
results with the results published in other papers. Apparently, one
cannot directly compare the sensitivity/specificity of the proposed
algorithm with those of other published papers. How do we formally
do this comparison then?
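One formal option, assuming each paper reports its sensitivity (or specificity) together with the number of cases it was computed on, is a two-proportion z-test on the two reported rates. A minimal sketch, with all numbers hypothetical:

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """z statistic for H0: the two underlying proportions are equal.

    p1, p2 -- reported rates, e.g. sensitivities
    n1, n2 -- number of cases each rate was computed on
              (for sensitivity, the number of gold-standard positives)
    """
    p_pool = (p1 * n1 + p2 * n2) / (n1 + n2)  # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical: sensitivity 0.92 on 150 positives (our study)
# vs. sensitivity 0.88 on 600 positives (a published study)
z = two_proportion_z(0.92, 150, 0.88, 600)
```

Here |z| is about 1.39, below 1.96, so this difference would not be significant at the 5% level. And even a significant z would still have to be argued robust against confounding differences between the two datasets.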
A sad truth is that, for COVID-19, large, publicly available datasets of patient images are still scarce. Maybe this is why some papers chose to compare the results of their proposed algorithm, based on a small-to-medium dataset, with the results of a published paper based on a large dataset.
On Mon, 11 Oct 2021 08:00:33 -0700 (PDT), Cosine wrote:

> ....
> However, sometimes, the paper chose to compare its own results with
> the results published in other papers. Apparently, one cannot directly
> compare the sensitivity/specificity of the proposed algorithm with
> those of other published papers. How do we formally do this comparison
> then?

You write, "One cannot directly [do A]... How do we formally [do A]?"

As I wrote last time: You can do the test. Then you have to argue that your "significant" effect is large enough that it would be robust against the likely or possible /confounding/ differences between samples.
Rich Ulrich wrote on Wednesday, 13 October 2021 at 12:49:13 [UTC+8]:

> On Mon, 11 Oct 2021 08:00:33 -0700 (PDT), Cosine wrote:
>
> > ....
> > However, sometimes, the paper chose to compare its own results with
> > the results published in other papers. Apparently, one cannot
> > directly compare the sensitivity/specificity of the proposed
> > algorithm with those of other published papers. How do we formally
> > do this comparison then?
>
> You write, "One cannot directly [do A]... How do we formally [do A]?"
>
> As I wrote last time: You can do the test. Then you have to argue
> that your "significant" effect is large enough that it would be
> robust against the likely or possible /confounding/ differences
> between samples.

By "we cannot directly compare ..." I meant that we cannot directly compare mu1 > mu2
and then claim that algorithm 1 performs better. However, if the other paper provided mu2, SE2, and n2 (the sample size), we should be able to use this information to calculate the statistical significance of the random variable (mu1 - mu2) using the t-test, since the t-test formula uses only those three quantities from each of the two samples — mu, SE, and n — to form a new random variable.
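A sketch of that calculation, with hypothetical mu/SE values: for large samples the p-value of the resulting t statistic can be approximated with the standard normal distribution using only the standard library (for small samples the t distribution should be used instead, e.g. scipy.stats.t.sf):

```python
import math

def t_from_summary(mean1, se1, mean2, se2):
    # t statistic for mu1 - mu2, built from the reported means and SEs
    return (mean1 - mean2) / math.sqrt(se1 ** 2 + se2 ** 2)

def p_two_sided_normal(t):
    """Two-sided p-value via the large-sample normal approximation."""
    phi = 0.5 * (1.0 + math.erf(abs(t) / math.sqrt(2.0)))  # standard normal CDF
    return 2.0 * (1.0 - phi)

# Hypothetical performance figures reported as mean +/- SE in two papers:
t = t_from_summary(0.95, 0.01, 0.90, 0.02)
p = p_two_sided_normal(t)
```

With these made-up numbers, t is about 2.24 and p about 0.025, so the difference would be called significant at the 5% level; the remaining work is arguing that it is robust to confounding differences between the datasets.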