• Deprecation of non-normalized Dpkg::Vendor:: modules

    From Guillem Jover@21:1/5 to All on Fri Oct 7 04:40:01 2022
    Hi!

    Some days ago Niels Thykier pointed out that the handling of origin
    files and vendor modules was not consistent. When looking into this
    for the Dpkg::Vendor::<vendor> loading code, I realized it was not
    properly handling vendor names with problematic special characters
    such as [\s:;/], and that it was not capitalizing the name to conform
    to the existing Perl module naming convention. The origin file handling
    was also only mapping the _first_ space into a «-».

    So I've added code to remap anything that is not alphanumeric into
    «-» for origin files, and to use them as word separators for module
    names and capitalization points. And then added deprecation warnings
    for the origin filenames and vendor modules that contain
    non-alphanumeric characters, and for vendor modules also for names
    starting with lower-case letters. I've updated the documentation to
    make these clear in git HEAD.
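
    The remapping described above can be sketched roughly as follows. This
    is an illustrative Python sketch, not the actual Perl code in dpkg; the
    function names are hypothetical. It assumes that "alphanumeric" means
    ASCII letters and digits, that a run of separators collapses into a
    single «-», and that only the first letter of each word is upper-cased
    while the rest is left as-is. None of those details are stated in this
    mail, and the multiple casing lookups are left out.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
_SEPARATORS = re.compile(r"[^A-Za-z0-9]+")


def origin_name(vendor: str) -> str:
    """Remap each run of non-alphanumeric characters to a single dash.

    Collapsing runs is an assumption of this sketch.
    """
    return _SEPARATORS.sub("-", vendor)


def module_name(vendor: str) -> str:
    """Use non-alphanumeric characters as word separators.

    Each word gets its first letter upper-cased (the capitalization
    point) and the separators are dropped. Leaving the rest of each
    word untouched is an assumption of this sketch.
    """
    words = [w for w in _SEPARATORS.split(vendor) if w]
    return "".join(w[0].upper() + w[1:] for w in words)


if __name__ == "__main__":
    for name in ("Some Vendor OS", "some vendor/os"):
        print(name, "->", origin_name(name), module_name(name))
```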

    I've gone through the derivatives census, and it's not clear from
    there what derivatives have a Dpkg::Vendor module, but the change
    seemed safe given the currently listed vendor names. And in any case
    there's going to be a transition period.

    If this is going to cause issues, please let me know and we can talk
    about possible solutions or alternatives, although the easy way out
    would be to provide both module names.


    But there is still some ambiguous handling, as there are multiple
    casing attempts for file and module lookups (lower-cased, as-is,
    lower-cased then capitalized, capitalized), which can make this overly
    confusing. I'm pondering whether to restrict these names further to
    have extremely clear rules, although I'm afraid this could cause
    issues. A simple rule would be that a vendor name can only contain
    alphanumeric characters in any casing, and dashes. No spaces or other
    special characters would be allowed, as those can also be problematic
    in, say, debian/rules when using $(filter ...). For origin filenames
    these names would be mapped to lowercase; for the perl module name,
    dashes would mark capitalization boundaries and then be removed, so
    we'd have:

    Vendor          origin-filename  vendor-module
    ------          ---------------  -------------
    Some-Vendor-OS  some-vendor-os   SomeVendorOs
    SOME-vendor-os  some-vendor-os   SomeVendorOs
    SomeVendorOS    somevendoros     Somevendoros
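
    As a rough illustration of the proposed strict rule, here is a Python
    sketch that reproduces the rows of the table. It is not dpkg code (dpkg
    is written in Perl), the function names are hypothetical, and rejecting
    invalid names with an exception is an assumption, since the proposal
    only says which characters would be allowed.

```python
import re

# Proposed strict rule: alphanumeric characters in any casing, and dashes.
# Restricting this to ASCII is an assumption of this sketch.
_VALID_VENDOR = re.compile(r"[A-Za-z0-9-]+")


def _check(vendor: str) -> None:
    if not _VALID_VENDOR.fullmatch(vendor):
        raise ValueError(f"invalid vendor name: {vendor!r}")


def strict_origin_filename(vendor: str) -> str:
    """Map the vendor name to lowercase, keeping the dashes."""
    _check(vendor)
    return vendor.lower()


def strict_vendor_module(vendor: str) -> str:
    """Treat dashes as capitalization boundaries, then remove them.

    str.capitalize() upper-cases the first character of each part and
    lower-cases the rest, which matches "OS" -> "Os" in the table.
    """
    _check(vendor)
    return "".join(part.capitalize() for part in vendor.split("-"))


if __name__ == "__main__":
    for name in ("Some-Vendor-OS", "SOME-vendor-os", "SomeVendorOS"):
        print(name, strict_origin_filename(name), strict_vendor_module(name))
```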

    My worry with the above proposal of further tightening the rules is
    that it might affect code expecting specific vendor names, so it
    might not be feasible or desirable even with a transition period.
    A more lenient but still safe rule could be to allow other
    non-alphanumeric characters (such as [-.:;/%]) as separators, except
    for spaces, and these would follow the same rule as the aforementioned
    «-» one. Otherwise the casing rules could still apply to both the
    origin-filename and the vendor-module, as that would involve just those
    two files, with no further interface fallout. Let me know what you think!

    Thanks,
    Guillem
