• Equivalent of MIXED ANOVA FOR NON PARAMETRIC STATISTICS

    From yzz_812@hotmail.com@21:1/5 to luigi.b...@gmail.com on Tue May 12 17:40:28 2020
    On Monday, 17 November 2014 03:51:55 UTC-8, luigi.b...@gmail.com wrote:
    Hi,
    I hope someone can help.
    My experiment:
    within-subject factor (time) with 3 levels
    between-subject factor (condition) with 2 levels
    10 subjects in each condition (20 in total)
    I have several dependent variables, but I want to analyse them one by one.

    Unfortunately my data are not normally distributed (as expected), and even after applying a correction I cannot meet this assumption, so a mixed ANOVA does not seem possible. I could certainly use the Mann-Whitney test.

    I was wondering if there is a different way to analyse the data or a sort of nonparametric GLM.

    Thank you very much

    L

    Hey, I was wondering how you solved your problem in the end.
    In fact, I have a very similar problem right now: my design has both a between-subject and a within-subject component, and the distributions are not at all normal. I am struggling to decide which non-parametric test to use.

    How did you handle the statistics in your research in the end? If we analyse the groups and the within-subject levels separately, we can't detect the interaction effect.

    best
    J

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rich Ulrich@21:1/5 to yzz_812@hotmail.com on Wed May 13 15:28:36 2020
    To the Questioner: Why is your distribution "not at all normal"?
    More importantly, do equal point-differences describe "equal
    intervals" of whatever is important in outcome? (If not, why
    not, and can't you do something sensible about that?)

    There is certainly not a one-size-fits-all solution, especially when
    the problem arises from a design that did not foresee it.


    It is highly unlikely that the hit-and-run questioner from 2014
    is still reading this group. So don't expect to hear what he did.

    If anyone is interested, the original two replies (from Bruce and
    from me) are at

    https://groups.google.com/forum/#!topic/comp.soft-sys.stat.spss/ZCppmHoKMNE

    They read pretty well.

    Googling showed me a similar question in another forum, but I
    haven't looked at it.



    --
    Rich Ulrich

  • From Bruce Weaver@21:1/5 to Rich Ulrich on Wed May 13 13:15:53 2020
    On Wednesday, May 13, 2020 at 3:28:43 PM UTC-4, Rich Ulrich wrote:
    --- snip ---

    > It is highly unlikely that the hit-and-run questioner from 2014
    > is still reading this group. So don't expect to hear what he did.
    >
    > If anyone is interested, the original two replies (from Bruce and
    > from me) are at
    >
    > https://groups.google.com/forum/#!topic/comp.soft-sys.stat.spss/ZCppmHoKMNE

    Good idea to post that link. It did not occur to me, because I am reading this via Google Groups, and so I could see all of the posts right back to the original. People reading via some other news server may not have access to all of the old posts.

    > They read pretty well.

    I thought so too. I have nothing to add to what we posted in 2014.

    Bruce

  • From yzz_812@hotmail.com@21:1/5 to Rich Ulrich on Wed May 13 17:17:53 2020
    Hello Rich,

    It was very nice of you to reply to my question; I didn't expect any response, since the original post is six years old.

    The OP's question covers most of the concerns I have with my current data analysis. Unfortunately, I don't have a strong background in statistics and am teaching myself most of what I know, so I had a hard time understanding both your and Bruce's comments.

    I have used the Kolmogorov-Smirnov normality test.

    The data are not normal largely because they contain a lot of zeros (continuous numeric data with an absolute zero), which makes the distribution positively skewed. My supervisor advised me to try to clean up the outlier, if possible, and then try a data transformation. I have done both, yet most of the dependent variables still violate the assumption of normality.

    So I turned to a non-parametric solution. My design has a between-subject component (2 groups) and time (3 time points) as the within-subject component.
    Since there is no non-parametric equivalent of the mixed-design ANOVA, I have to find a solution that is similar to what the parametric ANOVA does and teach myself how to do it in SPSS.

    I have looked at Friedman's test, the Mann-Whitney U test, the Kruskal-Wallis H test, and the Wilcoxon signed-rank test. As I see it, most of them are based on ranks, and each is only a partial solution to my analysis goal. So I am trying to find a way to work around that. Can I still analyze the interaction effect (group x time) in this setting?

    If I run the between-group and within-group tests separately on my data, what problems would follow?

    That's why I posted the question: to see how other researchers usually deal with this kind of situation.

    John

  • From Bruce Weaver@21:1/5 to yzz...@hotmail.com on Thu May 14 11:07:31 2020
    On Wednesday, May 13, 2020 at 8:17:55 PM UTC-4, yzz...@hotmail.com wrote:

    --- snip ---

    > I have used the Kolmogorov-Smirnov normality test.

    As Graeme Ruxton (2006) put it, "it is generally unwise to decide whether to perform one statistical test on the basis of the outcome of another (Zimmerman 2004 and references therein)." He was talking specifically about a preliminary test of homogeneity of variance prior to a t-test, but the basic principle holds generally, and is certainly true about testing for normality as a precursor to another test. I have a short conference presentation on that topic that may interest you.

    Ruxton (2006): https://academic.oup.com/beheco/article/17/4/688/215960

    My presentation on testing for normality: https://www.researchgate.net/publication/299497976_Silly_or_Pointless_Things_People_Do_When_Analyzing_Data_1_Testing_for_Normality_as_a_Precursor_to_a_t-test


    > the reason the data was not normal largely because there are a lot of zeros in the data (continuous numeric data with absolute zero), which make it skewed positively. My
  • From Rich Ulrich@21:1/5 to yzz_812@hotmail.com on Thu May 14 14:26:51 2020
    The most common thing that researchers and statisticians
    do about non-normality is IGNORE IT. And for pretty
    good reasons.

    Taking a rank-transformation of the scores is the starting
    point for (all?) those tests you mention. When you replace
    your scores with their rank-transformed versions ... DO you
    get a set of numbers that improve the "interval" distances
    between what you think those original scores should
    represent? If the original scores look better, more "equal
    interval", then you don't want the loss of detail from converting
    to ranks.
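
    A minimal sketch of what the rank transformation does to the intervals, using only the Python standard library. The scores are made up for illustration, and midranks is a hypothetical helper name; tied values get the average of the ranks they span, which is the usual convention.

```python
def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


# Hypothetical scores: many zeroes, a few small values, one large value.
scores = [0, 0, 0, 0, 1.5, 2.0, 2.5, 40.0]
ranks = midranks(scores)
print(ranks)  # [2.5, 2.5, 2.5, 2.5, 5.0, 6.0, 7.0, 8.0]
```

    Here the step from 2.5 to 40.0 becomes a single rank step, the same size as the step from 2.0 to 2.5. Whether that is an improvement or a loss of detail depends on what the original scale is supposed to measure.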

    Non-parametric tests.
    - By the way, their complicated formulas (in their simple form)
    include the assumption that there are no ties. So, your
    data, with many zeroes, also fail to meet the assumptions
    for the "exact" nonparametric tests. However, there are
    "approximations" available. About them -

    WJ Conover showed in the 1980s that most of the rank-
    order tests with complicated formulas can be replaced by
    performing ANOVA on the rank-transformed numbers.
    Conover showed that the ANOVA on ranks can be better
    (more accurate tests) than the approximations in use by
    stat packages, especially when there are many ties.
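
    A rough sketch of the "ANOVA on ranks" idea for the design discussed in this thread (one between-subject factor, one within-subject factor). It is an illustration under stated assumptions, not a definitive implementation: the data are invented, the groups are assumed to be of equal size, all observations are ranked jointly, and midranks and mixed_anova_f are hypothetical helper names. Only the Python standard library is used, so no p-values are computed; with SciPy available they could be obtained from the F distribution, for example scipy.stats.f.sf(F, df1, df2).

```python
def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for position in range(start, end + 1):
            ranks[order[position]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def mixed_anova_f(groups):
    """F ratios for a balanced design: groups -> subjects -> one score per time point."""
    a = len(groups)        # number of groups
    n = len(groups[0])     # subjects per group (assumed equal in every group)
    k = len(groups[0][0])  # number of time points
    scores = [x for g in groups for s in g for x in s]
    grand = sum(scores) / len(scores)
    group_means = [sum(x for s in g for x in s) / (n * k) for g in groups]
    time_means = [sum(s[t] for g in groups for s in g) / (a * n) for t in range(k)]
    cell_means = [[sum(s[t] for s in g) / n for t in range(k)] for g in groups]

    ss_total = sum((x - grand) ** 2 for x in scores)
    ss_group = n * k * sum((m - grand) ** 2 for m in group_means)
    ss_subjects = k * sum(
        (sum(s) / k - group_means[i]) ** 2 for i, g in enumerate(groups) for s in g
    )
    ss_time = a * n * sum((m - grand) ** 2 for m in time_means)
    ss_inter = n * sum(
        (cell_means[i][t] - group_means[i] - time_means[t] + grand) ** 2
        for i in range(a) for t in range(k)
    )
    # What is left over is the time-by-subject (within-group) error term.
    ss_error = ss_total - ss_group - ss_subjects - ss_time - ss_inter

    df_group, df_subjects = a - 1, a * (n - 1)
    df_time, df_inter, df_error = k - 1, (a - 1) * (k - 1), a * (n - 1) * (k - 1)
    ms_subjects = ss_subjects / df_subjects
    ms_error = ss_error / df_error
    return {
        "group": (ss_group / df_group / ms_subjects, df_group, df_subjects),
        "time": (ss_time / df_time / ms_error, df_time, df_error),
        "group x time": (ss_inter / df_inter / ms_error, df_inter, df_error),
    }


# Hypothetical raw scores with many zeroes: 2 groups x 4 subjects x 3 time points.
raw = [
    [[0, 0, 3.1], [0, 1.2, 4.0], [0, 0, 0], [2.2, 0, 5.5]],
    [[0, 0, 0], [1.0, 0, 0.5], [0, 0, 0], [0, 2.0, 0]],
]
flat_ranks = midranks([x for g in raw for s in g for x in s])
it = iter(flat_ranks)
ranked = [[[next(it) for _ in s] for s in g] for g in raw]  # same shape as raw

result = mixed_anova_f(ranked)
for effect, (f_value, df1, df2) in result.items():
    print(f"{effect}: F({df1}, {df2}) = {f_value:.2f}")
```

    The same function can be run on the raw scores instead of the ranks, which makes it easy to see how much the rank transformation changes the picture for a given data set.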

    Outliers and zeroes.
    You mention "outlier" as if (maybe) you have just one.
    Should you (maybe) drop that case, and mention it only
    as a case report, because the score is so very atypical?
    Or should one (or more) extreme be drawn in, scored as
    the next-highest value? What makes sense?
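
    As an illustration of the second option (drawing an extreme score in to the next-highest value), here is a minimal Python sketch. The function name pull_in_top_score and the scores are hypothetical, and whether this is sensible for a given variable is a judgment call that the code cannot make.

```python
def pull_in_top_score(scores):
    """Return a copy in which the largest score is replaced by the next-highest distinct value."""
    distinct = sorted(set(scores))
    if len(distinct) < 2:
        return list(scores)  # all scores are equal, nothing to pull in
    top, next_highest = distinct[-1], distinct[-2]
    return [next_highest if x == top else x for x in scores]


# Hypothetical scores with one very atypical case.
scores = [0, 0, 1.5, 2.0, 2.5, 40.0]
adjusted = pull_in_top_score(scores)
print(adjusted)  # [0, 0, 1.5, 2.0, 2.5, 2.5]
```

    The original list is left untouched, so the analysis can be run both ways and the two results compared.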

    "Many zeroes" is sometimes the justification to rescore
    everything as 0/1. Does that lose much sense in your
    data? Do those other values matter? I don't know of
    anybody giving advice on this topic, but if half your
    scores are zero, I think you have a good case to (at
    least) try out this alternate scoring.

    Non-normality.
    The problem with non-normality in the residuals of the
    model-fit is that the resulting F-test might not be accurate;
    it might reject too often, or it might reject too seldom.
    But ANOVA is pretty robust. Analyzing a 0/1 variable is
    not a problem when the proportion of 1s is between 20%
    and 80%. And it is not a problem for rank-transformed
    data, if the transformation does not screw up the
    intervals more than it helps them.
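
    A small sketch of the 0/1 rescoring mentioned above, together with a check of the 20%-80% guideline. The cell labels and scores are invented, proportion_nonzero is a hypothetical helper, and applying the guideline to each group-by-time cell (rather than to the variable overall) is an assumption made here, since the post does not say which is meant.

```python
def proportion_nonzero(scores):
    """Rescore each value as 0 (zero) or 1 (any positive value); return the share of 1s."""
    binary = [1 if x > 0 else 0 for x in scores]
    return sum(binary) / len(binary)


# Hypothetical scores for some group-by-time cells (made up for illustration).
cells = {
    ("control", "t1"): [0, 0, 0, 1.2, 3.4],
    ("control", "t2"): [0, 0.8, 0, 2.1, 0],
    ("treatment", "t1"): [0, 2.1, 0.7, 5.0, 0],
    ("treatment", "t2"): [0, 0, 0, 0, 0],
}

proportions = {cell: proportion_nonzero(scores) for cell, scores in cells.items()}
for cell, p in proportions.items():
    flag = "ok" if 0.2 <= p <= 0.8 else "outside 20%-80%"
    print(cell, f"{p:.0%}", flag)
```

    A cell in which every score is zero, like the last one here, falls outside the range and deserves a closer look before any ANOVA on the 0/1 variable is trusted.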

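    Now, if you have one or more cases where all their scores
    are zero, that could distort the picture. "Too many zeroes"
    in repeated measures is where the Greenhouse–Geisser
    correction (which adjusts the degrees of freedom of the
    within-subject F-tests for departures from sphericity) is used.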


    Hope this helps.

    --
    Rich Ulrich

  • From sasa ZHAO@21:1/5 to All on Sat Sep 26 06:17:22 2020
    On Thursday, 14 May 2020 at 2:17:55 AM UTC+2, <yzz...@hotmail.com> wrote:
    Hi John,

    I have run into a similar problem (non-normal distributions, a 2-way or 3-way mixed design, and wanting to analyze the interaction effect), and I found some information that may be helpful:

    1) Wilcox's robust ANOVA (R, the WRS2 package) https://www.researchgate.net/publication/333525893_Robust_statistical_methods_in_R_using_the_WRS2_package
    https://dornsife.usc.edu/labs/rwilcox/software/

    2) GLMM (generalized linear mixed models) (R's lme4 package)

    Sasa
