• Converting only part of string to numeric variable

    From Merel Bakker@21:1/5 to All on Mon May 30 06:42:27 2022
    I want to convert part of a string variable into a numeric variable.
    Not the easy way with "Recode into Different Variables", where for example mild -> 0, moderate -> 1, severe -> 2, and where I can then label the numbers with value labels.
    Instead, I have a string variable with a width of 2427 characters that contains mostly long sentences. From those long sentences I want to extract one word and recode it into a numeric variable. For example, from strings such as "there is an infection with toxoplasmosis" or "infection with parvo", I want the system to select only the strings that contain the word "toxoplasmosis". So far I have not been successful :)... do I need to write specific syntax for that?

    Many many thanks in advance!

    Merel

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rich Ulrich@21:1/5 to merelbakker@gmail.com on Fri Jun 3 14:01:47 2022
    On Mon, 30 May 2022 06:42:27 -0700 (PDT), Merel Bakker
    <merelbakker@gmail.com> wrote:

    > I want to convert part of a string variable into a numeric variable.
    > Not the easy way with "Recode into Different Variables", where
    > for example mild -> 0, moderate -> 1, severe -> 2, and where I can
    > then label the numbers with value labels.
    >
    > Instead, I have a string variable with a width of 2427 characters
    > that contains mostly long sentences. From those long sentences I want
    > to extract one word and recode it into a numeric variable. For example,
    > from strings such as "there is an infection with toxoplasmosis" or
    > "infection with parvo", I want the system to select only the strings
    > that contain the word "toxoplasmosis". So far I have not been
    > successful :)... do I need to write specific syntax for that?

    Yes, you probably need to write specific syntax. See the documentation
    for CHAR.INDEX() -- beware that its matches are case-sensitive.
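
    A minimal sketch of that approach, assuming the long text sits in a
    string variable called diagnosis_text (a hypothetical name, as is the
    new variable toxo). Lowercasing the text with LOWER() before searching
    is one way around the case-sensitivity:

    ```spss
    * diagnosis_text and toxo are hypothetical variable names.
    * CHAR.INDEX returns the position of the first match, or 0 if there is none.
    COMPUTE toxo = (CHAR.INDEX(LOWER(diagnosis_text), 'toxoplasmosis') > 0).
    VALUE LABELS toxo 0 'not mentioned' 1 'toxoplasmosis mentioned'.
    EXECUTE.
    ```

    Here toxo should be 1 for cases whose text contains the word and 0
    otherwise. Further IF statements with other search words, such as
    'parvo', could assign additional codes in the same way.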

    The one exception I can think of is if your long strings were
    computer-generated, so that any two strings containing the same
    clue word are identical. In that case, you could use Autorecode
    to obtain a set of numbers, 1 to n, for the n unique responses. I
    expect that the value labels, which Autorecode creates from the
    string values, would be truncations of the long strings.
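
    If the strings really are identical within each group, the Autorecode
    route might look like the sketch below. Both variable names are
    hypothetical; /PRINT lists the old and new values so the resulting
    codes can be checked:

    ```spss
    * diagnosis_text and diagnosis_code are hypothetical variable names.
    AUTORECODE VARIABLES=diagnosis_text
      /INTO diagnosis_code
      /PRINT.
    ```

    By default the codes follow the sort order of the string values, so
    they carry no meaning beyond identifying each distinct string.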

    --
    Rich Ulrich
