• Hypothesis Testing: the TEST STATISTIC

    From haishatosumah@gmail.com@21:1/5 to Reef Fish on Wed Apr 1 04:25:04 2020
    On Friday, August 18, 2006 at 4:34:41 PM UTC+1, Reef Fish wrote:
    Keywords: Hypothesis Testing; Test Statistic; Type I error;
    alpha=Pr(Type I error); p-value.

    This is a mini-lecture of an important topic at the Freshman Level.
    The lecture is prompted by the conspicuous silence of both the
    pros in industry and those who teach, at Afonso's latest blunders
    about the TEST STATISTIC in the problem of testing the
    equality of two proportions (a topic found in most first courses
    in statistics).

    The pros in industry seldom deal with this GENERAL topic of
    Hypothesis Testing. They just use the values of the test statistics
    spewed out by SPSS or SAS or the p-values printed by the
    software packages associated with the test statistics.

    Several of those who are known professors and often speak
    up against Afonso's errors, or posted about OTHER statistical
    topics are conspicuously silent in seeing the same error made
    by Afonso repeatedly (about a dozen times in one day, amidst
    subject lines like "HORRIBILIS Bob Ling", "The complete
    Bob Ling K.O.", and the latest, "REQUIEM for a Bob Ling
    BLUNDER"). Even if they don't normally read anything posted
    by Afonso, those libelous subjects posted by Afonso surely
    should have attracted some attention, out of curiosity if
    nothing else.

    The silence of at least two of the professors (Rubin and
    Tomsky) on Afonso's blunder was probably because they are so far
    up the stratosphere of academia that they have never taught that
    lowly topic [testing the equality of two proportions] (usually
    at the Freshman level introductory course). The other
    professors are probably not even aware of the issue (or the
    topic), or not sure what the fuss is about. Else they surely
    could have pointed Afonso to a chapter or section of a
    textbook that deals with the particular topic.


    To ME, the fuss was about a BASIC GENERAL PRINCIPLE
    in Hypothesis Testing, in which the special topic of testing
    the equality of two proportions is the showcase to highlight
    that principle, which can easily be overlooked in the TEST
    STATISTIC of other hypothesis tests.


    So, here's the mini-lecture for ALL of the above reasons.

    In a classical Hypothesis Testing set up, the HIDDEN
    principle is that the TEST STATISTIC must assume Ho to
    be true in the execution of the test!

    Reason? The TEST STATISTIC is used to determine
    both the p-value and whether Ho is accepted or rejected
    at a fixed alpha level.

    alpha = P( Ho is rejected | Ho is true).

    Since the TEST STATISTIC is used to determine if Ho is
    to be accepted or rejected given the rejection region
    determined by alpha, it MUST assume the values of the
    tested parameters at Ho (hence assume Ho true).

    It is perhaps even more obvious because the TEST
    STATISTIC determines the p-value of the test.

    p-value = Pr( TEST STATISTIC is "more extreme" than
    observed value of the test statistic WHEN Ho is TRUE).

    Therefore, whether one uses a fixed level alpha or a
    p-value, in order for a TEST STATISTIC to serve its
    purpose relative to those measures of Type I error,
    the test statistic itself must incorporate the Ho values.


    For MOST tests of a hypothesis, such as Ho: mu = 0,
    the same FORM of the test statistic is used, for both
    hypothesis testing and confidence intervals for the
    parameter.

    In the case of testing a mean, the T statistic is often used.
    "mu = 0" MUST be incorporated in the TEST STATISTIC.
    That is why the T-statistic is (xbar - 0)/(s/sqrt(n)), where
    s is the sample standard deviation of the variable X. Whatever
    the value of mu is in Ho, it does not affect the estimate s,
    which is independent of xbar; both xbar and s enter the
    test statistic.

    So, for the execution of that test, one can in fact EITHER
    execute a formal test of the hypothesis Ho, OR construct
    a confidence interval for mu and see if it covers mu=0.
    The two methods are EQUIVALENT.
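    As a numeric sketch of that equivalence (the data are made up
    for illustration, and 2.365 is the two-sided 5% critical value
    t(0.025, df = 7) from a t-table):

```python
# Sketch of the t-test / confidence-interval equivalence for Ho: mu = 0.
# The sample below is hypothetical, purely for illustration.
from math import sqrt

x = [0.9, -0.2, 1.4, 0.3, 0.8, -0.5, 1.1, 0.6]  # made-up sample
n = len(x)
xbar = sum(x) / n
s = sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))  # sample std dev

# TEST STATISTIC with the Ho value (mu = 0) in the numerator.
t = (xbar - 0) / (s / sqrt(n))
t_crit = 2.365  # t(0.025, df = 7)

# 95% confidence interval for mu.
half = t_crit * s / sqrt(n)
ci = (xbar - half, xbar + half)

# Reject at alpha = 0.05 exactly when the CI excludes mu = 0:
# |t| > t_crit and "0 outside the CI" are the same inequality.
reject = abs(t) > t_crit
excludes_zero = not (ci[0] <= 0 <= ci[1])
assert reject == excludes_zero
```

    Dividing the CI endpoints by s/sqrt(n) turns "0 outside the CI"
    into |t| > t_crit, which is why the two methods must agree.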


    That is in fact the source of Afonso's blunder in the
    problem of testing the equality of two proportions,
    because Afonso doesn't know the elements of a
    Hypothesis Test. He ALWAYS relates a test to the
    corresponding Confidence Interval, and his confidence
    intervals are always two-sided. (One tailed tests
    correspond to one-sided confidence intervals)

    The problem of the DIFFERENCE of two proportions
    is one exception to the usual rule of the equivalence of
    a test and a confidence interval. They are NOT
    equivalent in that case.

    The estimated variance of a sample proportion p1* = x1/n1
    is p1*(1-p1*)/n1.
    Similarly for p2* = x2/n2: p2*(1-p2*)/n2.
    Thus, for two independent sample proportions, the
    variance of the DIFFERENCE is

    vd = p1*(1-p1*)/n1 + p2*(1-p2*)/n2.

    vd (no pun intended <g>) is what Afonso has, and the
    ONLY variance of the difference he knows.

    For LARGE samples where Z approximates the sampling
    distribution of (p1* - p2*)/sqrt(vd), the two sided
    CONFIDENCE INTERVAL for (p1 - p2) is given by

    (p1* - p2*) +- z(alpha/2) * sqrt(vd), for the alpha level.
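    In code, that interval can be sketched as follows (the helper
    name diff_ci is mine, not from any textbook; NormalDist supplies
    the z(alpha/2) critical value):

```python
# Large-sample two-sided confidence interval for p1 - p2, using the
# UNPOOLED variance vd -- appropriate for a CI, as explained above.
from math import sqrt
from statistics import NormalDist

def diff_ci(x1, n1, x2, n2, alpha=0.05):
    """(1 - alpha) confidence interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    vd = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2   # unpooled variance
    z = NormalDist().inv_cdf(1 - alpha / 2)        # z(alpha/2), ~1.96
    d = p1 - p2
    return (d - z * sqrt(vd), d + z * sqrt(vd))
```

    For instance, diff_ci(65, 161, 14, 32) gives an interval centered
    at p1* - p2* with half-width z(alpha/2) * sqrt(vd).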


    But the GENERAL PRINCIPLE for Hypothesis testing
    that Ho must be incorporated into the TEST STATISTIC
    comes into play in the difference of proportions TEST,
    when testing the difference is ZERO.

    It is no longer appropriate to use the same vd for
    confidence interval of the difference as the variance
    in the denominator of the test statistic, because when
    Ho is true, p1 = p2, and vd uses two different estimates
    for the variance of a SINGLE proportion p.

    Therefore, for testing p1 = p2, one must use the variance
    of their common p in the TEST STATISTIC.

    In pooling the two samples to get ONE sample proportion,
    the common proportion is p** = (x1+x2)/(n1+n2),

    and the variance of the DIFFERENCE under Ho is then given by
    v** = p**(1-p**)(1/n1 + 1/n2),

    and the TEST STATISTIC = Z = (p1* - p2*)/sqrt(v**).

    For LARGE independent samples, the above statistic is
    found in EVERY elementary textbook of introductory
    statistics.
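    A minimal sketch of that pooled test (the function name
    pooled_z_test is mine; the variance used is
    v** = p**(1-p**)(1/n1 + 1/n2), the large-sample variance of
    p1* - p2* when Ho: p1 = p2 holds):

```python
# Large-sample test of Ho: p1 = p2, with Ho built into the variance
# through the pooled proportion p**.
from math import sqrt
from statistics import NormalDist

def pooled_z_test(x1, n1, x2, n2):
    """Return (Z, two-sided p-value) for Ho: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)                    # p**: assumes Ho true
    v_pool = p_pool * (1 - p_pool) * (1/n1 + 1/n2)    # v**
    z = (p1 - p2) / sqrt(v_pool)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # two-sided p-value
    return z, p_value
```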

    So, THAT is what the fuss was all about. It's not about Afonso's
    faulty algebra or errors of computation, but it's about the GENERAL
    PRINCIPLE of Hypothesis Testing that is used in the problem of
    testing the difference of two proportions. For anyone interested,
    you may now go back to the thread

    "Question about the usage of the Binomial dist" now running
    under Afonso's heading of "REQUIEM for a Bob Ling's blunder",

    which now ran to 35 posts, in which Afonso's suggested test
    was in Post #2 of the thread (in Google). My correction of his
    errors was in Post #3.

    EVERYTHING in this mini-lecture had already been explained
    to Afonso, some of it several times, within the thread, by me.

    Afonso not only missed all the lessons but was devious enough
    to have made TWO posts under the subject

    "REQUIEM for a Bob Ling's blunder"

    in which he carefully avoided any mention of HYPOTHESIS
    TESTING, and misrepresented the entire discussion as
    his Confidence Interval claim that he used his vd as the
    sample variance, and Bob "insisted" on the use of v**.

    Jack Tomsky entered into that issue for the first time, with
    a question and a comment, neither of which reflected the
    SUBSTANCE of the issue.

    Yes, Bob DID insist on the use of v** for the reasons explained
    in this mini-lecture, and had explained it to Afonso half a
    dozen times, in the HYPOTHESIS TESTING context.

    This was what Bob POSTED YESTERDAY, prior to Afonso's
    libelous REQUIEM subject today, regarding the CONFIDENCE
    INTERVAL for the difference of two proportions. In his
    noise, Afonso carefully omitted the Hypothesis Testing context:



    The discussion stays on the dilemma:
    ___The variance (of the difference of means) must be
    estimated by
    ___the pooled variance = (65+14)/(161+32) = 0.4093
    [sic: that number is the pooled PROPORTION p**, not a variance]
    as Bob Ling insists,

    In TESTING the hypothesis Ho: p1 = p2.

    or by
    ___pA*(1-pA)/161 + pB*(1-pB)/32 = 0.0092
    as I claim. (pA = 65/161, pB = 14/32).

    That is INCORRECT for hypothesis testing, but
    correct if used to find a confidence interval for the
    =================================
    difference of the two proportions.

    Note how, up to now, Afonso kept bringing up the
    red herring of not knowing whether Ho or H1 is true, etc.,
    and suddenly he left out the HYPOTHESIS TESTING
    context altogether.

    The above would have been the appropriate response to
    Afonso's REQUIEM post today, except I had already posted it
    yesterday.
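    To make the contrast concrete, here is a worked sketch with the
    numbers quoted above (x1 = 65, n1 = 161, x2 = 14, n2 = 32):

```python
# Contrast of the two variances for the data quoted in the thread:
# vd (unpooled, for the CI) vs v** (pooled, for the TEST of p1 = p2).
from math import sqrt

x1, n1, x2, n2 = 65, 161, 14, 32
p1, p2 = x1 / n1, x2 / n2

# Unpooled variance of the difference: correct for a CONFIDENCE INTERVAL.
vd = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2        # about 0.0092

# Pooled proportion and pooled variance: what the TEST needs under Ho.
p_pool = (x1 + x2) / (n1 + n2)                      # 79/193, about 0.4093
v_pool = p_pool * (1 - p_pool) * (1 / n1 + 1 / n2)  # v**

z = (p1 - p2) / sqrt(v_pool)                        # pooled TEST STATISTIC
```

    vd comes out to about 0.0092 and p_pool to about 0.4093, matching
    the two numbers in the quoted dilemma; the 0.4093 there is the
    pooled PROPORTION, not a variance.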

    This mini-lecture covered ALL of the BASIC elements of a
    hypothesis test in general, and about the role of the TEST
    STATISTIC in particular (that it must assume Ho to be true)
    that everyone SHOULD know, and know well.

    -- Reef Fish Bob.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)