• standard error=0 in stratified sampling?

    From poboxabcde@gmail.com@21:1/5 to All on Thu Aug 3 06:12:37 2017
    Suppose I have a population with 100 events and 900 non-events. Thus, the population’s event rate is 0.1.
    When I select 10% from these 100 events and also 10% from the 900 non-events, the sample’s event rate is 0.1.
    If I repeat the process 20 times to create 20 samples, each sample’s rate is 0.1. Then the standard error (the square root of the variance of these 20 means) is 0, because all 20 event rates are 0.1.
    Am I missing something, or is this legitimate stratified sampling? Please help. Thanks.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rich Ulrich@21:1/5 to poboxabcde@gmail.com on Thu Aug 3 13:26:59 2017
    XPost: sci.stat.math, sci.stat.edu

    I'm cross-posting this in the 3 groups where I see the identical
    message.

    On Thu, 3 Aug 2017 06:13:09 -0700 (PDT), poboxabcde@gmail.com wrote:

    Suppose I have a population with 100 events and 900 non-events. Thus, the population’s event rate is 0.1.
    When I select 10% from these 100 events and also 10% from the 900 non-events, the sample’s event rate is 0.1.
    If I repeat the process 20 times to create 20 samples, each sample’s rate is 0.1. Then the standard error (the square root of the variance of these 20 means) is 0, because all 20 event rates are 0.1.
    Am I missing something, or is this legitimate stratified sampling? Please help. Thanks.

    This does not look familiar to me, but you can do a lot of stuff
    if you can justify it. What is very clear is that you cannot use
    a variance (or SE) for any inference or testing after you have
    set it to zero by the design. What are you trying to estimate?
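
    A minimal simulation sketch of that point, using the numbers from the
    question (100 events, 900 non-events, 20 repeated samples). The
    function names, the seed, and the simple-random-sampling comparison
    are assumptions added for illustration, not part of the original
    question.

```python
import random
import statistics

def stratified_rate(events, non_events, one_in, rng):
    """Draw 1/one_in of each outcome stratum and return the sample event rate."""
    picked_events = rng.sample(events, len(events) // one_in)
    picked_non_events = rng.sample(non_events, len(non_events) // one_in)
    sample = picked_events + picked_non_events
    return sum(sample) / len(sample)

def srs_rate(population, n, rng):
    """Draw a simple random sample of size n (without replacement)."""
    sample = rng.sample(population, n)
    return sum(sample) / len(sample)

rng = random.Random(0)          # fixed seed, chosen arbitrarily
events = [1] * 100              # 100 events
non_events = [0] * 900          # 900 non-events
population = events + non_events

# 20 samples under each design, 100 units per sample.
strat_rates = [stratified_rate(events, non_events, 10, rng) for _ in range(20)]
srs_rates = [srs_rate(population, 100, rng) for _ in range(20)]

strat_sd = statistics.pstdev(strat_rates)
srs_sd = statistics.pstdev(srs_rates)

print("stratified on outcome: SD of 20 rates =", strat_sd)
print("simple random sample:  SD of 20 rates =", srs_sd)
```

    Stratifying on the outcome fixes the counts at 10 events and 90
    non-events in every draw, so the spread of the 20 rates is zero by
    construction and says nothing about sampling uncertainty. The simple
    random samples vary from draw to draw.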

    A binomial rate has its own variance based on the mean, so those
    variances are ordinarily robust.
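
    As a rough worked example of that variance, assuming a simple random
    sample of n = 100 from the N = 1000 units in the question: the
    standard error of a rate p is sqrt(p(1-p)/n), optionally shrunk by a
    finite population correction when sampling without replacement. The
    function names here are made up for this sketch.

```python
import math

def binomial_se(p, n):
    """Standard error of a sample proportion, ignoring population size."""
    return math.sqrt(p * (1 - p) / n)

def srs_se_finite(p, n, N):
    """Same, with the finite population correction (N - n) / (N - 1)."""
    fpc = (N - n) / (N - 1)
    return math.sqrt(p * (1 - p) / n * fpc)

p, n, N = 0.1, 100, 1000
print("binomial SE:          ", round(binomial_se(p, n), 4))
print("with finite pop. corr:", round(srs_se_finite(p, n, N), 4))
```

    Either figure is close to 0.03, which is the kind of uncertainty one
    would ordinarily attach to an observed rate of 0.1 in a sample of 100.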

    So: If you are interested in the variance of the rate of events,
    that should not be very problematic unless you are stratifying by
    some /other/ variable that has a strong effect on the event rate.
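
    To illustrate how the strength of the stratifying variable matters,
    here is a sketch using the usual variance of a proportionately
    stratified estimate, sum of W_h^2 * p_h(1-p_h)/n_h, without a finite
    population correction. The "weak" and "strong" strata are
    hypothetical numbers invented for the example; only the last case
    (strata identical to the outcome) comes from the question.

```python
import math

def stratified_se(strata):
    """strata: list of (weight W_h, event rate p_h, sample size n_h)."""
    variance = sum(w ** 2 * p * (1 - p) / n for w, p, n in strata)
    return math.sqrt(variance)

# Hypothetical stratifier weakly related to the event (overall rate 0.1).
weak = [(0.5, 0.12, 50), (0.5, 0.08, 50)]
# Hypothetical stratifier strongly related to the event (overall rate 0.1).
strong = [(0.5, 0.18, 50), (0.5, 0.02, 50)]
# Strata defined by the outcome itself, as in the question.
outcome = [(0.1, 1.0, 10), (0.9, 0.0, 90)]

for name, strata in [("weak", weak), ("strong", strong), ("outcome", outcome)]:
    print(name, stratified_se(strata))
```

    The stronger the link between the stratifier and the event, the
    smaller the standard error, and it reaches zero in the limiting case
    where the strata are the outcome itself.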

    Stratification, jack-knife, bootstrap -- I've never done much with
    any of them, but it looks to me like you are confusing the ideas.
    Bootstrapping goes after difficult variances, but I don't picture
    that problem with a dichotomous outcome. (Jackknife, ditto.)

    You seem to have the whole sample in hand, so I don't see why your
    stratification is desirable. I see that as a sampling scheme which is
    used when resources are inadequate. Or, to avoid huge Ns (which
    is less often seen as a problem now, than when computers were
    1000 times slower).

    Hope this helps,

    --
    Rich Ulrich
