• Longitudinal Bivariate Correlations

    From Regina@21:1/5 to All on Sun Feb 26 03:21:17 2017
    Looking for analysis help. The data are as follows:

    *There are 18-19 "respondents", with a few missing values.

    *Each "respondent" refers to a patient-therapist (PT) relationship and is coded by the patient (P) and therapist (T) IDs. Mostly it's one P per T, though 4 therapists saw two patients each.

    *Each patient has 12-17 sessions (it was supposed to be 16, but a few stopped early, and there were two cases of 17 sessions, which could be dropped).

    *For each session, there are two patient variables, each a single number, one before and one after the session. There are four therapist variables (4 subscales of a questionnaire) measured once (after the session).

    *Thus, there are 250-300 rows of data (18 or 19 * 16 or 17, with some missing data, and some PTs had fewer than 16 sessions), each row having patient ID, therapist ID, session number, patient before variable, patient after variable, and 4 therapist variables.

    *The goal is to get an overall measure of correlation for each of the 8 patient*therapist variable pairs (2 patient variables x 4 therapist variables).

    *My main question is, given the data structure, how best to control for the fact that there are multiple sessions per PT, so that the measures must be assumed to have a certain degree of dependence within each PT. I've seen suggestions to use a covariance-structure (repeated-measures) approach, but I think that is problematic given the missing data and the unequal number of sessions per PT.

    *I'd prefer something which could be done using SPSS.

    REGINA

  • From Rich Ulrich@21:1/5 to All on Sun Feb 26 13:19:37 2017
    On Sun, 26 Feb 2017 03:21:17 -0800 (PST), Regina <mms0608@gmail.com>
    wrote:

    [original post quoted in full -- snipped]


    With the data in the natural form of 12-17 rows per patient...

    The Discriminant procedure provides a pooled "Within-Groups"
    correlation matrix -- that appears to be exactly what you are
    asking for, where a "Respondent" (PT pair) defines the Group
    for the discriminant analysis and the six measures are the
    variables. What you are interested in is that matrix, and not
    the other parts of the analysis.

    If your Patient IDs are not consecutive, you can use
    AUTORECODE to get values 1-19.

    If the missing values are not all six measures at once, note
    that the analysis drops any row with /any/ missing value
    (listwise deletion).
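
    Something like this would produce it (I'm guessing at the
    variable names -- pat_id, pat_pre, pat_post, ther1 to ther4 --
    so substitute your own; 19 is the number of respondents after
    recoding):

      * Recode the respondent (patient) IDs into consecutive integers 1-19.
      AUTORECODE VARIABLES=pat_id /INTO resp_num.

      * The respondent defines the Group.  /STATISTICS=CORR requests
        the pooled within-groups correlation matrix of the six measures.
      DISCRIMINANT GROUPS=resp_num(1,19)
        /VARIABLES=pat_pre pat_post ther1 ther2 ther3 ther4
        /STATISTICS=CORR.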

    The 4 therapists who are repeated should not have an undue
    effect on the results, unless (say) somebody is really weird --
    but that could be a problem with even one record. "Eyeball"
    those four pairs of records to see that they are not strange.


    --
    Rich Ulrich

  • From Regina@21:1/5 to Regina on Mon Feb 27 06:24:31 2017
    On Sunday, February 26, 2017 at 1:21:19 PM UTC+2, Regina wrote:
    [original post quoted in full -- snipped]


    Thanks so much, Rich

  • From Regina@21:1/5 to All on Sun Mar 5 13:01:17 2017
    Rich: Okay, so now I have the within-group correlations. How do I get the overall correlations, controlling for the within-group correlations?

  • From Rich Ulrich@21:1/5 to All on Mon Mar 6 12:55:39 2017
    On Sun, 5 Mar 2017 13:01:17 -0800 (PST), Regina <mms0608@gmail.com>
    wrote:

    Rich: Okay, so now I have the within-group correlations. How do I get the overall correlations, controlling for the within-group correlations?

    On the one hand -- "That's complicated."

    On the other hand -- I'm not sure that you and I would
    be talking about the same thing. So I'm going to ramble a
    bit, and you can tell me if I don't cover what you have in mind.

    When I started considering within-group correlations, I
    began wondering about the parallel to ANOVA. That is, for
    variances: Total = Within + Between. How could I apply that
    to covariances?
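
    For the record, the decomposition itself does carry over to the
    cross-products in the numerator of a covariance (a standard
    identity; g indexes the groups, i the cases within group g,
    n_g the group sizes, and bars denote means):

      \sum_{g}\sum_{i} (x_{gi}-\bar{x})(y_{gi}-\bar{y})
        = \sum_{g}\sum_{i} (x_{gi}-\bar{x}_{g})(y_{gi}-\bar{y}_{g})
        + \sum_{g} n_{g}\,(\bar{x}_{g}-\bar{x})(\bar{y}_{g}-\bar{y})

    so Total = Within + Between holds for the covariance numerators
    just as it does for sums of squares; it is the r's, each scaled
    by its own standard deviations, that do not add up so neatly.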

    Well, the ultimate conclusion was that I should try to stay
    very aware that between-group associations are not always
    the same as within-group associations. "Correlation does not
    prove causation" comes out strongly when the between-group
    correlation is contradicted by the within-group value -- and
    someone has drawn wrong conclusions from the Between.

    What is the Between-Group r? For your data, where Persons
    (respondents) are the groups, what you do is aggregate the
    data for each person and then look at the correlation across
    persons.

    The Total r is what you have if you just pool all the data.
    [I am conceptualizing a model here, not working on the
    arithmetic. The arithmetic surely will not work out readily
    for unequal Ns.]
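
    In SPSS terms, the two quantities might be sketched like this
    (again with made-up variable names; resp_num is the recoded
    respondent ID from my earlier note):

      * Total r: correlate the pooled session-level rows.
      CORRELATIONS /VARIABLES=pat_pre pat_post ther1 ther2 ther3 ther4.

      * Between-group r: aggregate to one row of means per respondent,
        then correlate across the 18-19 respondents.
      DATASET DECLARE respmeans.
      AGGREGATE OUTFILE=respmeans
        /BREAK=resp_num
        /pat_pre_m pat_post_m ther1_m ther2_m ther3_m ther4_m
          = MEAN(pat_pre pat_post ther1 ther2 ther3 ther4).
      DATASET ACTIVATE respmeans.
      CORRELATIONS /VARIABLES=pat_pre_m pat_post_m ther1_m ther2_m
        ther3_m ther4_m.

    (Each respondent gets equal weight in the Between version,
    regardless of how many sessions were contributed.)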


    Surveys have a "problem", that they start out with pooled data which
    can give them "Total" correlations that reflect group means, but
    might badly represent what happens at the level of influence and
    prediction within Groups -- be those groups recognized or not.
    In other words, surveys provide Total correlations by default, and
    often without appreciating how many assumptions they are making.


    Part of the complication of looking at r's for T = W + B
    arises because, especially when there are contradictions, the
    interesting groups are apt to differ in means and variances,
    and /not/ necessarily in the same way for both variables you
    are looking at.

    Any r is a measure of an association /in a particular sample/
    or universe. Comparing any two r's is always vulnerable to
    differences in range -- which is why, for comparing samples,
    it is far better to look for "consistent regression
    coefficients" rather than to compare two r's. That is, the
    hypothesis of interest is that the regression lines are
    "consistent with" being parallel; differences in mean should
    not affect a test, and near-absence of variation in one
    variable should not affect a test.
    - That implies that one should be careful about drawing
    inferences from groups to persons, or vice versa. I've been
    satisfied with making cautious and limited statements rather
    than introducing some test procedure to an audience.
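
    (If you did want to look at the parallel-lines question
    directly, the usual route in SPSS is to test a group-by-
    predictor interaction -- a sketch only, with one illustrative
    pair of variables and the same made-up names as before:

      * Homogeneity-of-slopes check: the resp_num*ther1 interaction
        tests whether the within-group slopes differ across respondents.
      UNIANOVA pat_post BY resp_num WITH ther1
        /DESIGN=resp_num ther1 resp_num*ther1.

    But as I said, I have usually settled for the cautious verbal
    statement instead.)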

    Hope this helps.

    --
    Rich Ulrich
