• Q interpretations for different types of comparisons

    From Cosine@21:1/5 to All on Sat Feb 11 08:07:26 2023
    Hi:

    We have a new method, A, and some benchmarks: B1, B2, and B3.

    We compare the performances of the above methods. Each comparison uses a two-sided test.

    Are the first two types of comparisons identical?

    Is the interpretation of type-3 correct?

    Type-1:
    All significant: A > B1, A > B2, and A > B3 => claim: A is superior to all benchmarks, i.e., A is the best of all four methods.

    Type-2:

    All significant: B1 > B2 and B1 > B3 => B1 is the best among all B's.
    Significant: A > B1 => A is superior to all benchmarks, i.e., A is the best of all four methods.

    Type-3:

    All significant: B1 > B2 and B1 > B3 => B1 is the best among all B's.
    Non-significant: A > B1 => accepting H0, i.e., the performances of A and B1 do not differ => A is better than B2 and B3.
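    For concreteness, here is one way a single pairwise two-sided comparison of this kind might be run. This is only a sketch: the scores are made up, and a large-sample z approximation (Welch-style standard error, stdlib only) stands in for whatever two-sided test is actually used.

```python
from statistics import NormalDist, mean, stdev

def two_sided_z_p(x, y):
    """Two-sided p-value for the difference in means of two samples,
    using a large-sample z approximation with a Welch-style standard error."""
    se = (stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y)) ** 0.5
    z = (mean(x) - mean(y)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Hypothetical scores: new method A vs benchmark B1.
a  = [0.83, 0.85, 0.84, 0.86, 0.85, 0.84]
b1 = [0.80, 0.79, 0.81, 0.80, 0.78, 0.80]
print(two_sided_z_p(a, b1))  # small p-value: A vs B1 is 'significant' here
```

    Each of the comparisons in Type-1, Type-2, and Type-3 above is one such pairwise test; the question is what the pattern of significant and non-significant results licenses you to claim.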

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rich Ulrich@21:1/5 to All on Sat Feb 11 14:43:15 2023
    On Sat, 11 Feb 2023 08:07:26 -0800 (PST), Cosine <asecant@gmail.com>
    wrote:

    > Hi:
    >
    > We have a new method, A, and some benchmarks: B1, B2, and B3.
    >
    > We compare the performances of the above methods. Each comparison uses a two-sided test.
    >
    > Are the first two types of comparisons identical?
    >
    > Is the interpretation of type-3 correct?
    >
    > Type-1:
    > All significant: A > B1, A > B2, and A > B3 => claim: A is superior to all benchmarks, i.e., A is the best of all four methods.

    Clearly, yes.



    > Type-2:
    >
    > All significant: B1 > B2 and B1 > B3 => B1 is the best among all B's.
    > Significant: A > B1 => A is superior to all benchmarks, i.e., A is the best of all four methods.

    Not entirely CLEARLY. Have you ever drawn lines that underline
    the 'not-different' groups, for post-hoc testing? The basic theory
    assumes that the Ns and the variances are equal, and under those
    assumptions it produces no inconsistencies. Real data can yield
    'weird' results if you look at the separate two-group tests; so
    the recommended algorithms perform two-group tests that use the
    all-group (pooled) variance and treat the group Ns as equal.

    So, this is "True" by an inference which assumes 'nothing weird is
    happening.'
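    A minimal sketch of that recommended approach, with made-up scores: each pairwise t statistic uses the pooled all-group variance (the MSE from a one-way ANOVA) rather than each pair's own variances, in the style of Fisher's LSD. The group names and numbers are hypothetical.

```python
import math
from statistics import mean

def pooled_pairwise_t(groups):
    """Pairwise t statistics that use the pooled all-group variance
    (one-way ANOVA MSE) instead of each pair's separate variances."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    sse = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    mse = sse / (n_total - k)  # pooled within-group variance, df = N - k
    t = {}
    for i in range(k):
        for j in range(i + 1, k):
            se = math.sqrt(mse * (1 / len(groups[i]) + 1 / len(groups[j])))
            t[(i, j)] = (mean(groups[i]) - mean(groups[j])) / se
    return t

# Hypothetical scores for A, B1, B2 (n = 5 each).
a  = [12, 13, 11, 12, 12]
b1 = [10, 11, 10,  9, 10]
b2 = [ 8,  7,  8,  9,  8]
t_stats = pooled_pairwise_t([a, b1, b2])
print(t_stats)
```

    With df = 15 - 3 = 12, the two-sided 5% critical value is about 2.179, so every |t| here exceeds it. Procedures like Tukey's HSD refine this further by controlling the familywise error rate across all the pairs.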

    I've done testing against a benchmark that entailed paired tests;
    for paired data, 'nothing weird' also assumes that the correlations
    (r's) are not different.


    > Type-3:
    >
    > All significant: B1 > B2 and B1 > B3 => B1 is the best among all B's.
    > Non-significant: A > B1 => accepting H0, i.e., the performances of A and B1 do not differ => A is better than B2 and B3.

    No. This one doesn't even depend on the variances and Ns.

    It is easy to imagine that B1 is slightly better than A, though not
    significantly; and that the difference is enough that A is not
    significantly 'better' than B2 and B3. This is a common picture
    in post-hoc drawings: (B1, A) underlined together as not-different,
    and (A, B2, B3) underlined together as not-different.
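    That picture is easy to reproduce numerically. In this sketch the means and standard errors are hypothetical, chosen only to produce the pattern, and a simple z-test stands in for whatever two-sided test is actually used: B1 significantly beats B2 and B3, yet A is significantly different from nothing.

```python
import math

def two_sided_p(mean1, mean2, se_diff):
    """Two-sided p-value for a z-test on a difference of two means."""
    z = (mean1 - mean2) / se_diff
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))

# Hypothetical means; assume every mean has standard error 1.0,
# so the standard error of any difference is sqrt(2).
means = {"A": 12.0, "B1": 13.0, "B2": 10.0, "B3": 9.9}
se_diff = math.sqrt(2.0)

for a, b in [("B1", "B2"), ("B1", "B3"), ("A", "B1"), ("A", "B2"), ("A", "B3")]:
    p = two_sided_p(means[a], means[b], se_diff)
    print(f"{a} vs {b}: p = {p:.3f} -> "
          + ("significant" if p < 0.05 else "not significant"))
```

    So "A is not significantly different from the best method B1" does not transfer B1's significant wins over B2 and B3 to A: non-significance is not evidence of equality.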

    --
    Rich Ulrich
