• repeating the last record after meeting a condition

    From raj@21:1/5 to All on Mon Jul 19 04:55:18 2021
    Hi,

    I have this sample file:

    660312,MIKE , 138555.51
    660312,MIKE , 219132.05
    660312,MIKE , 246677.63
    660312,MIKE , 268489.41 >>>>this record to be repeated
    670182,JOHN , 155591.30
    670182,JOHN , 246753.39
    670182,JOHN , 279279.87
    670182,JOHN , 303745.03
    670182,JOHN , 408252.03 >>>>this record to be repeated

    I am trying to repeat the last record of each group, i.e. the record whose $1 does not match $1 of the next record.

    awk -F, '{if(p!=$1){print $0; p=$1}print $0}'
    But this repeats the first record of each new group instead of the last record of the previous one.

    Help will be appreciated.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to raj on Mon Jul 19 14:36:59 2021
    Maybe something like this...

    awk -F, 'p && $1!=p {print d}; {p=$1; d=$0 }; 1; END{print d}'
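For readers following along, the one-liner above can be spelled out as below. This is only a sketch of the same idea, not the poster's code: the names `prev_key`, `prev_rec`, `repeat_last` and the file name `sample.txt` are made up for illustration, and the sample data is a subset of the records in the original post. Two assumptions differ from the one-liner: the key change is detected with `NR > 1` rather than by testing `p` for truth (so a key of `0` or an empty key would still be handled), and `END` prints nothing when the input is empty.

```shell
# Hypothetical sample file; the records are a subset of the original post.
cat > sample.txt <<'EOF'
660312,MIKE , 138555.51
660312,MIKE , 268489.41
670182,JOHN , 155591.30
670182,JOHN , 408252.03
EOF

# repeat_last: print every record, and print the last record of each
# run of equal $1 values a second time.
repeat_last() {
    awk -F, '
        NR > 1 && $1 != prev_key { print prev_rec }  # key changed: repeat the previous record
        { print; prev_key = $1; prev_rec = $0 }      # print the current record and remember it
        END { if (NR > 0) print prev_rec }           # repeat the last record of the final run
    ' "$@"
}

repeat_last sample.txt
```

The key point is that the comparison happens before the current record is remembered, so at the moment the key changes `prev_rec` still holds the last record of the previous group.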


    Janis

  • From raj@21:1/5 to Janis Papanagnou on Mon Jul 19 06:01:34 2021
    Thank you very much

  • From Ed Morton@21:1/5 to raj on Mon Jul 19 07:33:47 2021
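    $ awk -F',' '
    p1 != $1 { if (NR>1) print p0 }   # key changed: repeat the previous record
    { print; p1=$1; p0=$0 }           # print the record, remember its key and text
    END { print p0 }                  # repeat the last record of the final group
    ' file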
    660312,MIKE , 138555.51
    660312,MIKE , 219132.05
    660312,MIKE , 246677.63
    660312,MIKE , 268489.41 >>>>this record to be repeated
    660312,MIKE , 268489.41 >>>>this record to be repeated
    670182,JOHN , 155591.30
    670182,JOHN , 246753.39
    670182,JOHN , 279279.87
    670182,JOHN , 303745.03
    670182,JOHN , 408252.03 >>>>this record to be repeated
    670182,JOHN , 408252.03 >>>>this record to be repeated

    Regards,

    Ed.

  • From Kenny McCormack@21:1/5 to visitnag@gmail.com on Mon Jul 19 13:39:19 2021
    Others have given solutions using flag variables and such, that are
    designed to work regardless of file size, but here's the way I'd look at
    this problem. I'd assume that the file size isn't an issue - i.e., that
    your file is probably not that big, and that the definition of a "big" file
    on modern machines is orders of magnitude larger than what your
    grandfather (and the original designers of AWK) would have considered to
    be a "big" file.

    A pretty large fraction of the AWK programs that I write follow a general pattern of:

    1) In the main pattern/action space, build up an array "myArray".

    2) In the END clause, dump out myArray.

    This pattern fits your problem well. I'd do something like (untested):

    --- Cut Here ---
    BEGIN { FS="," }
    { myArray[$1] = $0 } # Keep only the last one seen.
    END { setsort(2);for (i in myArray) print myArray[i] }
    --- Cut Here ---

    Now, the only issue is the order in which the final output comes. You may
    not care what order the output is, in which case, you can leave out the
    call to setsort(). Or, you can be aware that setsort(2) will make it come
    out in ascending numeric order.

    And, in fact, assuming you are using GAWK, it should come out in ascending
    numeric order even without using setsort(). This seems to be a
    barely-documented feature of GAWK - that if array indices are all numeric,
    they will come out sorted numerically, even without explicitly setting the
    sort order.
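As posted, the program above keeps one record per key, so it would print only the last record of each group rather than the full listing with the last record doubled. A minimal sketch of the same "collect in the main rule, dump in END" pattern that produces the requested output is shown below. It sticks to plain POSIX awk, so it depends neither on setsort() (which does not appear to be a gawk built-in; gawk users would typically set PROCINFO["sorted_in"] instead) nor on any particular `for (i in array)` traversal order, which awk does not guarantee in general. The names `rec`, `key`, `last`, `repeat_last_array` and `sample2.txt` are illustrative only, and the sample data is a subset of the original post.

```shell
# Hypothetical sample file; the records are a subset of the original post.
cat > sample2.txt <<'EOF'
660312,MIKE , 138555.51
660312,MIKE , 268489.41
670182,JOHN , 155591.30
670182,JOHN , 408252.03
EOF

# repeat_last_array: store all records, then in END print them in input
# order, doubling the record at which each key was last seen.
repeat_last_array() {
    awk -F, '
        { rec[NR] = $0; key[NR] = $1; last[$1] = NR }  # remember record, its key, and the key last position
        END {
            for (i = 1; i <= NR; i++) {                 # numeric loop keeps the input order
                print rec[i]
                if (last[key[i]] == i) print rec[i]     # last record for this key: print it again
            }
        }
    ' "$@"
}

repeat_last_array sample2.txt
```

Unlike the streaming answers, this holds the whole file in memory, and if a key reappears later in the file only its final occurrence is doubled.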

    --
    "You can safely assume that you have created God in your own image when
    it turns out that God hates all the same people you do." -- Anne Lamott
