• review of guillem/next/d-m-h-root

    From Helmut Grohne@21:1/5 to All on Sun Apr 19 08:10:02 2020
    Hi Guillem,

    you asked me to review your next/d-m-h-root branch. Thanks to all who've
    worked on this! I've looked at the two commits as one diff (12961967a563..6aa3bf8f98b8) without attributing individual hunks to the respective authors. This is what I found:

    * Diagnostic messages tend to include DPKG_ROOT. Is that a useful thing
    to do? My intuition was that I'd not include DPKG_ROOT there.

    For example:

    echo "Obsolete conffile ${DPKG_ROOT}$CONFFILE has been modified by you."

    Counter examples:

    echo "Restoring backup of $SYMLINK ..."
    echo "Restoring backup of $PATHNAME ..."

    At the very least, consistency seems in order here.

    * In dir_to_symlink, in the case for postrm, in the second (long)
    condition, I think there is a line that misses a trailing backslash.

    [ \( ! -h "${DPKG_ROOT}$PATHNAME" -a

    * In readlink_canon, in the final realpath the $dst argument likely
    should be quoted.

    * In readlink_canon I was wondering whether passing DPKG_ROOT inside a
    sed expression is a good idea. How about reworking the final line:

    -realpath $dst | sed -e "s:^$DPKG_ROOT::"
    +dst=$(realpath $dst)
    +echo ${dst#"$DPKG_ROOT"}

    The quoting of the variable removes any pattern meaning from its
    contents.

    * I observe that readlink -f and readlink_canon significantly differ in
    their behaviour. While readlink -f canonicalizes symbolic links in
    every component of the path, readlink_canon only canonicalizes
    symbolic links in the final component. It then uses realpath saying
    that no symlinks are involved, which may be untrue.

    I do wonder whether implementing this feature here is a useful thing
    to do. It seems that we need a chroot-aware readlink and/or realpath
    in more occasions. Possibly extending coreutils is a better solution?

    In any case, I'm attaching a version of readlink_canon that deals
    with this.

    * I observe that the coding style is inconsistent about using test
    expressions using -a/-o and &&/||. There is a slight reason to prefer
    the latter, see:

    https://github.com/koalaman/shellcheck/wiki/SC2166

    That's all. Looks like we can go ahead with this with minor changes.

    Helmut

    #!/bin/shreadlink_canon() { local src="$1" local root="$2" local loop=0 local result="$root" local component dst if [ "${src#"$root"}" = "$src" ]; then echo "link not withing root" return 1 fi src=${src#"$root"} while [ "$src" != "${src#/}
    " ]; do src=${src#/}; done while [ -n "$src" ]; do loop=$((loop + 1)) if [ "$loop" -gt 25 ]; then echo "too many levels of symbolic links" return 1 fi component=${src%%/*} src=${src#"$component"} while [ "$src" != "${src#/}" ]; do src=$
    {src#/}; done if [ "$component" = . ]; then : elif [ "$component" = .. ]; then result=${result%/*} if [ "${result#"$root"}" = "$result" ]; then result=$root fi elif [ -h "$result/$component" ]; then dst=$(readlink "$result/$
    component") case "$dst" in /*) result=$root src="$dst${src:+/$src}" while [ "$src" != "${src#/}" ]; do src=${src#/}; done ;; *) src="$dst${src:+/$src}" ;; esac else result="$result/$component" fi done
    echo "${result#"$root"}"}readlink_canon "$1" "$2"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Guillem Jover@21:1/5 to Helmut Grohne on Wed Apr 29 11:30:02 2020
    Hi!

    On Sun, 2020-04-19 at 07:34:18 +0200, Helmut Grohne wrote:
    you asked me to review your next/d-m-h-root branch.

    Thanks for the review!

    This is what I found:

    * Diagnostic messages tend to include DPKG_ROOT. Is that a useful thing
    to do?

    I think not including DPKG_ROOT would be too weird and confusing, as
    then the user might end up acting on the host filesystem instead of
    the chrooted one after some diagnostic error.

    My intuition was that I'd not include DPKG_ROOT there.

    For example:

    echo "Obsolete conffile ${DPKG_ROOT}$CONFFILE has been modified by you."

    Counter examples:

    echo "Restoring backup of $SYMLINK ..."
    echo "Restoring backup of $PATHNAME ..."

    At the very least, consistency seems in order here.

    Right, fixed this now locally.

    * In dir_to_symlink, in the case for postrm, in the second (long)
    condition, I think there is a line that misses a trailing backslash.

    [ \( ! -h "${DPKG_ROOT}$PATHNAME" -a

    Done.

    * In readlink_canon, in the final realpath the $dst argument likely
    should be quoted.

    Done.

    * In readlink_canon I was wondering whether passing DPKG_ROOT inside a
    sed expression is a good idea. How about reworking the final line:

    -realpath $dst | sed -e "s:^$DPKG_ROOT::"
    +dst=$(realpath $dst)
    +echo ${dst#"$DPKG_ROOT"}

    The quoting of the variable removes any pattern meaning from its
    contents.

    Done, much better this way indeed.

    * I observe that readlink -f and readlink_canon significantly differ in
    their behaviour. While readlink -f canonicalizes symbolic links in
    every component of the path, readlink_canon only canonicalizes
    symbolic links in the final component. It then uses realpath saying
    that no symlinks are involved, which may be untrue.

    I do wonder whether implementing this feature here is a useful thing
    to do. It seems that we need a chroot-aware readlink and/or realpath
    in more occasions. Possibly extending coreutils is a better solution?

    Yes, I think also checked around at the time whether there was a
    "chroot"-aware canonicalizer around that could simply be used but
    found nothing. ISTR pondering about requesting this, but it seems I
    didn't.

    I'm also thinking that because we might indeed need it elswhere while
    adding DPKG_ROOT support, perhaps a better idea for now is to provide a
    very simple dpkg-readlink command or similar which would take DPKG_ROOT
    into account so that other packages can use it.

    In any case, I'm attaching a version of readlink_canon that deals
    with this.

    Thanks! I notice this is susceptible to directory traversals. I've
    amended it and added comments in the attached version. I'm thinking
    I'll need to add unit tests to cover for this among other similar
    issues.

    * I observe that the coding style is inconsistent about using test
    expressions using -a/-o and &&/||. There is a slight reason to prefer
    the latter, see:

    https://github.com/koalaman/shellcheck/wiki/SC2166

    Yeah, while several could be easily switched (I'll do that in a
    different commit), some I don't think can easily be done, as they are
    within parentheticals within test ([]). If this was bash we could use
    [[ ]] f.ex. If you have an alternative I'd be happy to use that though.

    That's all. Looks like we can go ahead with this with minor changes.

    Great, thanks for this!

    Regards,
    Guillem

    #!/bin/shset -ereadlink_canon() { local src="$1" local root="$2" local loop=0 local result="$root" local component dst if [ "${src#"$root"}" = "$src" ]; then echo "error: link not within root" >&2 return 1 fi # Remove
    prefixed root dir. src=${src#"$root"} # Remove prefixed slashes. while [ "$src" != "${src#/}" ]; do src=${src#/} done while [ -n "$src" ]; do loop=$((loop + 1)) if [ "$loop" -gt 25 ]; then echo "error: too many levels of
    symbolic links" >&2 return 1 fi # Get the first directory component. prefix=${src%%/*} # Remove the first directory component from src. src=${src#"$prefix"} # Remove prefixed slashes. while [ "$src" != "${src#/}" ]; do
    src=${src#/} done # Resolve the first directory component. if [ "$prefix" = . ]; then # Ignore, stay at the same directory. : elif [ "$prefix" = .. ]; then # Go up one directory. result=${result%/*} if [ "
    ${result#"$root"}" = "$result" ]; then echo "error: directory traversal attempt" >&2 return 1 fi elif [ -h "$result/$prefix" ]; then # Resolve the symlink within $result. dst=$(readlink "$result/$prefix") case "
    $dst" in /*) # Absolute pathname, reset result back to $root. result=$root src="$dst${src:+/$src}" # Remove prefixed slashes. while [ "$src" != "${src#/}" ]; do src=${src#/} done ;;
    *) # Relative pathname. src="$dst${src:+/$src}" ;; esac else # Otherwise append the prefix. result="$result/$prefix" fi done # We are done, print the resolved pathname, w/o $root. echo "${
    result#"$root"}"}readlink_canon "$1" "$DPKG_ROOT"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Helmut Grohne@21:1/5 to Guillem Jover on Fri May 1 09:20:01 2020
    Hi Guillem,

    On Wed, Apr 29, 2020 at 11:28:08AM +0200, Guillem Jover wrote:
    Thanks! I notice this is susceptible to directory traversals. I've
    amended it and added comments in the attached version. I'm thinking
    I'll need to add unit tests to cover for this among other similar
    issues.

    I don't think your adaption is correct. Traversing the root directory is actually supported. /../ resolves to /. Returning an error there is not correct.

    And yeah, moving this into some tool definitely seems in order.

    Helmut

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Guillem Jover@21:1/5 to Helmut Grohne on Fri May 1 21:20:01 2020
    On Wed, 2020-04-29 at 11:52:54 +0200, Helmut Grohne wrote:
    On Wed, Apr 29, 2020 at 11:28:08AM +0200, Guillem Jover wrote:
    Thanks! I notice this is susceptible to directory traversals. I've
    amended it and added comments in the attached version. I'm thinking
    I'll need to add unit tests to cover for this among other similar
    issues.

    I don't think your adaption is correct. Traversing the root directory is actually supported. /../ resolves to /. Returning an error there is not correct.

    You are absolutely right, thanks! What tripped me over was a test result
    where the directory traversal was returning empty strings, which
    should have been obvious was not traversing anything. :)

    $ readlink /tmp/symlinks/root-dir/out
    ../../../../..

    I think the correct change is the one attached, which makes sure to
    return /.

    Regards,
    Guillem

    #!/bin/shset -ereadlink_canon() { local src="$1" local root="$2" local loop=0 local result="$root" local component dst if [ "${src#"$root"}" = "$src" ]; then echo "error: link not within root" >&2 return 1 fi # Remove
    prefixed root dir. src=${src#"$root"} # Remove prefixed slashes. while [ "$src" != "${src#/}" ]; do src=${src#/} done while [ -n "$src" ]; do loop=$((loop + 1)) if [ "$loop" -gt 25 ]; then echo "error: too many levels of
    symbolic links" >&2 return 1 fi # Get the first directory component. prefix=${src%%/*} # Remove the first directory component from src. src=${src#"$prefix"} # Remove prefixed slashes. while [ "$src" != "${src#/}" ]; do
    src=${src#/} done # Resolve the first directory component. if [ "$prefix" = . ]; then # Ignore, stay at the same directory. : elif [ "$prefix" = .. ]; then # Go up one directory. result=${result%/*} if [ "
    ${result#"$root"}" = "$result" ]; then result="$root" fi elif [ -h "$result/$prefix" ]; then # Resolve the symlink within $result. dst=$(readlink "$result/$prefix") case "$dst" in /*) # Absolute pathname,
    reset result back to $root. result=$root src="$dst${src:+/$src}" # Remove prefixed slashes. while [ "$src" != "${src#/}" ]; do src=${src#/} done ;; *) # Relative pathname.
    src="$dst${src:+/$src}" ;; esac else # Otherwise append the prefix. result="$result/$prefix" fi done # We are done, print the resolved pathname, w/o $root. result="${result#"$root"}" echo "${result:-/}"}
    readlink_canon "$1" "$DPKG_ROOT"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)