Hi:
When we want to compare the performance of two machine-learning
algorithms, we use the same set of data. Specifically, we take
algorithm 1, divide the dataset into a training set and a testing
set, train algorithm 1, and then test it. We do the same for algorithm 2.
Now we need to compare the performance of the two algorithms. If
we use Student's t-test, should we use the independent one or the
paired one? Why?
On Sun, 29 Dec 2019 13:45:57 -0800 (PST), Cosine wrote:
Use the paired test, since the data exist as pairs.
And look at the correlation (and the 2x2 table, if that
is the form of the predictions).
If there is not a high correlation, then you can
get better results by combining (1) and (2) --
by averaging the scores, if continuous; by reporting "agreed
results" and "mixed answer" for Yes/No.
On Tuesday, December 31, 2019 at 3:14:48 AM UTC+8, Rich Ulrich wrote:
Thank you for replying.
First, I'd like to clarify my question to avoid potential misunderstanding.
By testing the performance of the two algorithms, I mean that we would like to find which algorithm (or, more generally, which method) performs better. Say, we might want to test whether X-ray imaging or Y-ray imaging is better at identifying a particular diseased condition.
A further question: is it true that as long as we test these two algorithms/methods using the same dataset, i.e., the same group of patients, the t-test we use for the comparison must be the paired one? Say, even if we use different cross-validation methods for the two algorithms, must we still use a paired t-test, since the data come from the same group of patients?
How do we check that the type of t-test we used is correct?
Say, in the situation discussed, we should use the paired t-test, not
the independent one, but how do we verify this?
Cosine wrote:
How do we check to verify if the type of t-test we used is correct?
Say, in the situation discussed, we should use the paired t-test, not
the independent one, but how do we verify this?
In principle, you should know from the way you structured the
experiment which test to do. More pragmatically, the reason for doing a
paired t-test (when it is appropriate) is that the variance of the
difference of the means is much smaller than it would be if the means
were independent. The reason for the smaller variance might be
identified as arising from a positive correlation between the
individual values in the pairs. A negative correlation would be
unusual but possible, depending on circumstances. The t-test should use
an estimate of the variance of the difference that estimates the
correct variance.
So, if you only have the one sample set, you can either:
(i) compare numerically the two versions of the estimates of the
variance of the difference in the two versions of the test (or the
estimates of the standard deviations that appear as the divisors in the
two versions of the t-statistic);
(ii) investigate the correlation of the individual values in the pairs.
If you have more than one sample set, or are prepared to subsample the
sample set, you could do something more extensive.
On Tue, 14 Jan 2020 19:19:57 +0000 (UTC), "David Jones" <dajhawk18@googlemail.com> wrote:
Cosine wrote:
How do we check to verify if the type of t-test we used is correct?
Say, in the situation discussed, we should use the paired t-test, not
the independent one, but how do we verify this?
In principle, you should know from the way you structured the
experiment which test to do. More pragmatically, the reason for
doing a paired t-test (when it is appropriate) is that the variance
of the difference of the means is much smaller than it would be if
the means were independent. The reason for the smaller variance
might be identified as arising from a positive correlation between
the individual values in the pairs. A negative correlation would be
unusual but possible, depending on circumstances. The t-test should
use an estimate of the variance of the difference that estimates
the correct variance.
Yes. As I wrote at the start of my earlier post, the paired
test is the correct test, with the right error term, when
the data are paired.
So, if you only have the one sample set, you can either:
(i) compare numerically the two versions of the estimates of the
variance of the difference in the two versions of the test (or the
estimates of the standard deviations that appear as the divisors in
the two versions of the t-statistic);
(ii) investigate the correlation of the individual values in the
pairs.
David, I don't know why you are offering a choice of
comparing error terms. Using the wrong test is not going
to be justified by saying it is "more powerful (although
it is wrong)." Look at the r to get the general idea.
If you want to justify using a more powerful test, argue for
using a 10% or 20% cutoff value instead of misusing the 5% one.
"Convenience" is the main, best excuse, though not an
entirely good one, for using the wrong test. Maybe, to
display a whole slew of results.
Beyond that, I can half-way imagine having both sorts of
tests within one set of analyses, and wanting to use group-tests
throughout in order to make the apparent effect sizes
commensurable. Effect sizes get complicated to report
in mixed models, with both within- and between-effects.
And I have sympathy for the attempts to explain them.
I'm just about always curious about the size (and sign) of
the correlation. And that will tell you whether you are
gaining or losing power.
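That gain or loss can be made explicit. For n pairs with a common per-arm standard deviation s and pairwise correlation r, the standard error of the mean difference is sqrt(2*s^2*(1-r)/n) when the pairing is respected, versus sqrt(2*s^2/n) when the arms are treated as independent. A small sketch:

```python
# How the sign of the pairwise correlation r decides whether the paired
# test gains or loses power relative to the independent test.
import math

def se_paired(s, r, n):
    # standard error of the mean difference, exploiting the pairing
    return math.sqrt(2 * s * s * (1 - r) / n)

def se_independent(s, n):
    # standard error when the two arms are treated as independent
    return math.sqrt(2 * s * s / n)

s, n = 1.0, 30
for r in (0.8, 0.0, -0.5):
    print(f"r={r:+.1f}  paired SE={se_paired(s, r, n):.3f}"
          f"  independent SE={se_independent(s, n):.3f}")
```

Positive r shrinks the paired standard error (more power), r = 0 makes the two coincide, and a negative r inflates it, which is the losing case.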