• question about preg_replace

    From alex@21:1/5 to All on Wed Feb 23 15:56:35 2022
    $ echo center | sed -E 's#(.*)#start-\1-end#'
    start-center-end

    I tried to do the same thing with php, but the result is different

    $ php -r "echo preg_replace('#(.*)#', 'start-\0-end', 'center') . PHP_EOL;"
    start-center-endstart--end
                    ^^^^^^^^^^

    Why?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lew Pitcher@21:1/5 to alex on Wed Feb 23 15:45:31 2022
    On Wed, 23 Feb 2022 15:56:35 +0100, alex wrote:

    $ echo center | sed -E 's#(.*)#start-\1-end#'
    start-center-end

    I tried to do the same thing with php, but the result is different

    $ php -r "echo preg_replace('#(.*)#', 'start-\0-end', 'center') .
    PHP_EOL;"
    start-center-endstart--end
                    ^^^^^^^^^^

    Why?

    First off, it is necessary to recognize that sed(1) uses, depending on
    the options given, either POSIX basic regular expressions (BREs) or
    POSIX extended regular expressions (EREs), while PHP's preg_* functions
    use Perl-compatible regular expressions (PCREs). These two types of
    RE (POSIX and PCRE) have differences in how they match REs, which
    would explain why you get different results from sed(1) and PHP's
    preg_replace().


    As for PHP preg_replace(), the function will /repeat/ the substitution
    as often as possible if you do not specify a limit. What
    preg_replace('#(.*)#', 'start-\0-end', 'center')
    does is:
    - match .* to the entire string 'center';
    - replace 'center' with 'start-center-end' (\0 in the replacement
    string refers to the whole match) and continue onward in the string;
    - match .* to the empty string at the end of 'center' (remember,
    .* will match ZERO or more occurrences of ANY character);
    - replace that empty string with the replacement string, giving
    'start--end';
    - run out of string, and terminate.
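
    The steps above can be checked with a minimal sketch, assuming PHP 7.1
    or later. The variable names are illustrative and not from the thread;
    preg_match_all() is used only to list the matches that preg_replace()
    would act on.

```php
<?php
// Illustrative sketch: list every match that '#(.*)#' finds in 'center'.
// $subject and $matches are names chosen for this example.
$subject = 'center';

// PREG_OFFSET_CAPTURE records the byte offset of each match.
preg_match_all('#(.*)#', $subject, $matches, PREG_OFFSET_CAPTURE);

foreach ($matches[0] as [$text, $offset]) {
    printf("offset %d: '%s'\n", $offset, $text);
}

// Each match is replaced in turn, so the replacement text appears twice.
$result = preg_replace('#(.*)#', 'start-\0-end', $subject);
echo $result, PHP_EOL;
```

    Run as a script, this should list two matches (the full string at
    offset 0 and an empty match at offset 6) before printing
    start-center-endstart--end.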

    You have a couple of possible changes that you can apply to
    bring your PHP closer to sed(1):
    1) you can limit your RE to one occurrence:
    preg_replace('#(.*)#', 'start-\0-end', 'center',1)

    or

    2) you can anchor your RE to the start of the string:
    preg_replace('#^(.*)#', 'start-\0-end', 'center')
    preg_replace('#(^.*)#', 'start-\0-end', 'center')
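
    A small sketch of both changes, assuming it is run as a script file.
    The variable names are illustrative; the fifth argument of
    preg_replace() is filled in with the number of replacements made,
    which makes the difference visible.

```php
<?php
// Illustrative sketch; $pattern, $replacement and $subject are example names.
$pattern     = '#(.*)#';
$replacement = 'start-\0-end';
$subject     = 'center';

// Default limit (-1): every match is replaced; $count reports how many.
$all = preg_replace($pattern, $replacement, $subject, -1, $count);
printf("limit -1: %s (%d replacements)\n", $all, $count);

// Limit 1: stop after the first match, like sed's s command without g.
$first = preg_replace($pattern, $replacement, $subject, 1, $count);
printf("limit  1: %s (%d replacement)\n", $first, $count);

// Anchoring at the start also leaves only one possible match.
$anchored = preg_replace('#^(.*)#', $replacement, $subject);
printf("anchored: %s\n", $anchored);
```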


    HTH
    --
    Lew Pitcher
    "In Skills, We Trust"

  • From Lew Pitcher@21:1/5 to alex on Wed Feb 23 15:49:31 2022
    On Wed, 23 Feb 2022 16:45:53 +0100, alex wrote:

    On 23/02/22 16:33, Mateusz Viste wrote:
    php -r "echo preg_replace('#^(.*)$#', 'start-\0-end', 'center')
    .PHP_EOL;"


    Warning: preg_replace(): No ending delimiter '#' found in Command line
    code on line 1

    ~ $ php -r "echo preg_replace('#(^.*)#', 'start-\0-end', 'center') .
    PHP_EOL;"
    start-center-end
    ~ $




    --
    Lew Pitcher
    "In Skills, We Trust"

  • From Lew Pitcher@21:1/5 to Lew Pitcher on Wed Feb 23 15:51:08 2022
    On Wed, 23 Feb 2022 15:49:31 +0000, Lew Pitcher wrote:

    On Wed, 23 Feb 2022 16:45:53 +0100, alex wrote:

    On 23/02/22 16:33, Mateusz Viste wrote:
    php -r "echo preg_replace('#^(.*)$#', 'start-\0-end', 'center')
    .PHP_EOL;"


    Warning: preg_replace(): No ending delimiter '#' found in Command line
    code on line 1

    ~ $ php -r "echo preg_replace('#(^.*)#', 'start-\0-end', 'center') . PHP_EOL;"
    start-center-end

    Sorry, wrong example from my testing. How about...

    ~ $ php -r "echo preg_replace('#^(.*)#', 'start-\0-end', 'center') .
    PHP_EOL;"
    start-center-end
    ~ $

    It could be that we are running different versions of PHP.
    --
    Lew Pitcher
    "In Skills, We Trust"

  • From Mateusz Viste@21:1/5 to All on Wed Feb 23 16:33:01 2022
    On 23 Feb 2022 15:56:35 +0100 alex wrote:
    $ echo center | sed -E 's#(.*)#start-\1-end#'
    start-center-end

    I tried to do the same thing with php, but the result is different

    $ php -r "echo preg_replace('#(.*)#', 'start-\0-end', 'center') .
    PHP_EOL;"
    start-center-endstart--end

    Why?

    Looks like the preg matches two times: first your 'center' content, and
    then what's left (an empty string). I can think of two ways to avoid
    that:

    Enforce that what's processed has at least one character:

    php -r "echo preg_replace('#.(.*)#', 'start-\0-end', 'center') .PHP_EOL;"

    Or make sure the preg consumes the entire line:

    php -r "echo preg_replace('#^(.*)$#', 'start-\0-end', 'center') .PHP_EOL;"


    Mateusz

  • From alex@21:1/5 to All on Wed Feb 23 16:45:53 2022
    On 23/02/22 16:33, Mateusz Viste wrote:
    php -r "echo preg_replace('#^(.*)$#', 'start-\0-end', 'center') .PHP_EOL;"


    Warning: preg_replace(): No ending delimiter '#' found in Command line
    code on line 1

  • From Mateusz Viste@21:1/5 to alex on Wed Feb 23 18:06:54 2022
    On Wed, 23 Feb 2022 16:45:53 +0100
    alex <1j9448a02@lnx159sneakemail.com.invalid> wrote:

    On 23/02/22 16:33, Mateusz Viste wrote:
    php -r "echo preg_replace('#^(.*)$#', 'start-\0-end', 'center')
    .PHP_EOL;"

    Warning: preg_replace(): No ending delimiter '#' found in Command
    line code on line 1

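    That's probably your shell expanding the '$#' pair inside the double
    quotes (it is the shell's count of positional parameters), so the
    closing '#' never reaches PHP. Escaping the dollar (\$) should help.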
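
    To confirm that the pattern itself is fine, here is a minimal sketch
    that assumes the code is saved in a script file instead of being passed
    through php -r, so no shell ever sees the '$#' pair. The variable names
    are illustrative.

```php
<?php
// Illustrative sketch: the same pattern inside a script file.
// PHP single-quoted strings do not expand variables, so '$#' stays literal.
$pattern = '#^(.*)$#';

// ^ and $ force the single match to span the whole string.
$result = preg_replace($pattern, 'start-\0-end', 'center');
echo $result, PHP_EOL;
```

    With the delimiters intact, PHP raises no warning and the replacement
    happens once.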

    Mateusz

  • From alex@21:1/5 to All on Wed Feb 23 20:10:34 2022
    On 23/02/22 16:51, Lew Pitcher wrote:
    On Wed, 23 Feb 2022 15:49:31 +0000, Lew Pitcher wrote:

    On Wed, 23 Feb 2022 16:45:53 +0100, alex wrote:

    On 23/02/22 16:33, Mateusz Viste wrote:
    php -r "echo preg_replace('#^(.*)$#', 'start-\0-end', 'center')
    .PHP_EOL;"


    Warning: preg_replace(): No ending delimiter '#' found in Command line
    code on line 1

    ~ $ php -r "echo preg_replace('#(^.*)#', 'start-\0-end', 'center') .
    PHP_EOL;"
    start-center-end

    Sorry, wrong example from my testing. How about...

    ~ $ php -r "echo preg_replace('#^(.*)#', 'start-\0-end', 'center') . PHP_EOL;"
    start-center-end
    ~ $

    It could be that we are running different versions of PHP.

    thanks

  • From alex@21:1/5 to All on Wed Feb 23 20:14:04 2022
    On 23/02/22 18:06, Mateusz Viste wrote:
    That's probably your shell expanding the '$#' pair inside the double
    quotes (it is the shell's count of positional parameters), so the
    closing '#' never reaches PHP. Escaping the dollar (\$) should help.

    Mateusz


    Exactly.

  • From alex@21:1/5 to All on Wed Feb 23 20:19:08 2022
    On 23/02/22 16:45, Lew Pitcher wrote:
    On Wed, 23 Feb 2022 15:56:35 +0100, alex wrote:

    $ echo center | sed -E 's#(.*)#start-\1-end#'
    start-center-end

    I tried to do the same thing with php, but the result is different

    $ php -r "echo preg_replace('#(.*)#', 'start-\0-end', 'center') .
    PHP_EOL;"
    start-center-endstart--end
                    ^^^^^^^^^^

    Why?

    First off, it is necessary to recognize that sed(1) uses, depending on
    the options given, either POSIX basic regular expressions (BREs) or
    POSIX extended regular expressions (EREs), while PHP's preg_* functions
    use Perl-compatible regular expressions (PCREs). These two types of
    RE (POSIX and PCRE) have differences in how they match REs, which
    would explain why you get different results from sed(1) and PHP's
    preg_replace().


    As for PHP preg_replace(), the function will /repeat/ the substitution
    as often as possible, if you do not specify a limit. What
    preg_replace('#(.*)#', 'start-\0-end', 'center')
    does is
    - match .* to the entire string 'center',
    - because of the grouping brackets in the RE, and the reference in
    the replacement string, it now replaces 'center' with
    'start-center-end', and continues onward in the string. It now
    - matches .* to the empty string at the end of 'center' (remember,
    .* will match ZERO or more occurrences of ANY character), and
    - replaces that empty string with the replacement string. It then
    - runs out of string, and terminates.

    You have a couple of possible changes that you can apply to
    bring your PHP closer to sed(1):
    1) you can limit your RE to one occurrence:
    preg_replace('#(.*)#', 'start-\0-end', 'center',1)

    or

    2) you can anchor your RE to the start of the string:
    preg_replace('#^(.*)#', 'start-\0-end', 'center')
    preg_replace('#(^.*)#', 'start-\0-end', 'center')


    HTH

    thanks

  • From Lew Pitcher@21:1/5 to Lew Pitcher on Wed Feb 23 21:02:24 2022
    On Wed, 23 Feb 2022 15:45:31 +0000, Lew Pitcher wrote:

    On Wed, 23 Feb 2022 15:56:35 +0100, alex wrote:

    $ echo center | sed -E 's#(.*)#start-\1-end#'
    start-center-end

    I tried to do the same thing with php, but the result is different

    $ php -r "echo preg_replace('#(.*)#', 'start-\0-end', 'center') .
    PHP_EOL;"
    start-center-endstart--end
                    ^^^^^^^^^^

    Why?

    [snip]

    You have a couple of possible changes that you can apply to bring your
    PHP closer to sed(1):
    1) you can limit your RE to one occurrence:
    preg_replace('#(.*)#', 'start-\0-end', 'center',1)

    or

    2) you can anchor your RE to the start of the string:
    preg_replace('#^(.*)#', 'start-\0-end', 'center')
    preg_replace('#(^.*)#', 'start-\0-end', 'center')

    You could also change the RE so that it matches ONE OR MORE characters, preventing it from matching the empty string at the end of the data.
    preg_replace('#(.+)#', 'start-\0-end', 'center')
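
    For comparison, a short sketch that runs each pattern suggested in this
    thread against 'center'. The array and variable names are illustrative
    only.

```php
<?php
// Illustrative sketch: run each pattern from the thread against 'center'.
$patterns = [
    '#(.*)#',   // original: also matches the empty string at the end
    '#^(.*)#',  // anchored at the start
    '#(^.*)#',  // anchor inside the group, same effect
    '#(.+)#',   // one or more characters, never empty
];

$results = [];
foreach ($patterns as $pattern) {
    $results[$pattern] = preg_replace($pattern, 'start-\0-end', 'center');
    printf("%-10s => %s\n", $pattern, $results[$pattern]);
}
```

    One difference to keep in mind: on an empty input the anchored
    patterns still wrap the empty string, while '#(.+)#' matches nothing
    and returns the input unchanged.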



    --
    Lew Pitcher
    "In Skills, We Trust"
