• Character.isIdentifierIgnorable

    From Arne Vajhøj@21:1/5 to Jeff Higgins on Sat Apr 25 13:51:22 2020
    On 4/25/2020 1:39 PM, Jeff Higgins wrote:
    On 4/25/20 1:01 PM, Arne Vajhøj wrote:
    On 4/25/2020 12:57 PM, Arne Vajhøj wrote:
    On 4/24/2020 11:36 PM, Jeff Higgins wrote:
    from JLS-3.8 for JavaSE 9
    Two identifiers are the same only if they are identical, that is,
    have the same Unicode character for each letter or digit.
    Identifiers that have the same external appearance may yet be
    different.

    from JLS-3.8 for JavaSE 10
    Two identifiers are the same only if, after ignoring characters
    that are ignorable, the identifiers have the same Unicode character
    for each letter or digit. An ignorable character is a character for
    which the method Character.isIdentifierIgnorable(int) returns true.
    Identifiers that have the same external appearance may yet be
    different.

    But Java 8 javac did ignore them??

    So does Java 7 javac.

    Java 6 javac gives me an error when trying to create the class file,
    which likely means that it is not ignoring these characters.

    Thanks. I don't have earlier than 8 easily available.
    Was wondering.

    I was wondering when Character.isIdentifierIgnorable(int codePoint)
    became available. The earliest javadoc that I can find is for 5, and
    it is available there. A Sun press release from 1997 features
    "Global language support based on Unicode 2.0 standards" so maybe
    available from then?

    http://www-evasion.imag.fr/Membres/Francois.Faure/enseignement/ressources/java/jdk1.3/api/index.html

    shows it in 1.3 and says "Since: JDK1.1".

    Arne
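
    The JLS 10 wording quoted above can be illustrated with a minimal
    sketch, assuming only the documented behaviour of
    Character.isIdentifierIgnorable(int). The class IgnorableDemo and
    its helper methods are hypothetical names made up for illustration;
    this is not how javac itself compares identifiers.

```java
class IgnorableDemo {

    // Returns the identifier with every code point removed for which
    // Character.isIdentifierIgnorable(int) returns true.
    public static String stripIgnorable(String identifier) {
        StringBuilder sb = new StringBuilder(identifier.length());
        identifier.codePoints()
                  .filter(cp -> !Character.isIdentifierIgnorable(cp))
                  .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    // Compares two identifiers after dropping ignorable characters,
    // in the spirit of the JLS 10 text (hypothetical helper).
    public static boolean sameIdentifier(String a, String b) {
        return stripIgnorable(a).equals(stripIgnorable(b));
    }

    public static void main(String[] args) {
        String plain = "foo";
        // U+00AD SOFT HYPHEN is a FORMAT character and so ignorable.
        String withSoftHyphen = "fo" + (char) 0x00AD + "o";

        // false: the two strings differ as sequences of chars
        System.out.println(plain.equals(withSoftHyphen));
        // true: they match once ignorable characters are dropped
        System.out.println(sameIdentifier(plain, withSoftHyphen));
    }
}
```

    Building the soft hyphen from its numeric value keeps the source
    file itself free of invisible characters.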

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Jeff Higgins@21:1/5 to All on Sat Apr 25 13:39:33 2020
    On 4/25/20 1:01 PM, Arne Vajhøj wrote:
    On 4/25/2020 12:57 PM, Arne Vajhøj wrote:
    On 4/24/2020 11:36 PM, Jeff Higgins wrote:
    On 4/24/20 10:24 PM, Arne Vajhøj wrote:
    On 4/24/2020 8:37 PM, Jeff Higgins wrote:
    On 4/24/20 8:29 PM, Arne Vajhøj wrote:
    I think we are in the far out corners of the language.

    Well, yeah, but still.
    And still leaves me questioning motivation for ignorable chars.

    We would need to find someone working at SUN back in the mid 90's
    to know.

    If I were to guess then somebody got the "smart" idea that
    the language should not distinguish between names where
    the difference could not be seen when type'd/cat'ed and
    then someone else later added the \uXXXX feature.

    Maybe not that far back.

    from JLS-3.8 for JavaSE 9
    Two identifiers are the same only if they are identical, that is,
    have the same Unicode character for each letter or digit.
    Identifiers that have the same external appearance may yet be
    different.

    from JLS-3.8 for JavaSE 10
    Two identifiers are the same only if, after ignoring characters
    that are ignorable, the identifiers have the same Unicode character
    for each letter or digit. An ignorable character is a character for
    which the method Character.isIdentifierIgnorable(int) returns true.
    Identifiers that have the same external appearance may yet be
    different.

    But Java 8 javac did ignore them??

    So does Java 7 javac.

    Java 6 javac gives me an error when trying to create the class file,
    which likely means that it is not ignoring these characters.

    Thanks. I don't have earlier than 8 easily available.
    Was wondering.

    I was wondering when Character.isIdentifierIgnorable(int codePoint)
    became available. The earliest javadoc that I can find is for 5, and
    it is available there. A Sun press release from 1997 features
    "Global language support based on Unicode 2.0 standards" so maybe
    available from then?

  • From Jeff Higgins@21:1/5 to All on Sat Apr 25 14:36:43 2020
    On 4/25/20 1:51 PM, Arne Vajhøj wrote:

    http://www-evasion.imag.fr/Membres/Francois.Faure/enseignement/ressources/java/jdk1.3/api/index.html

    shows it in 1.3 and says "Since: JDK1.1".

    Ah, thanks for that. So, since forever.

    In one of the bug reports that I mentioned upthread a fellow
    compiled his library using Maven and javac, then uploaded it to
    Maven Central, created a Maven project in Eclipse, and it would
    not compile due to a spurious \u00?? in his source!!
    So even far-flung language features can sometimes cause bug bites.

    In my post to JDT I mentioned this from the Java VM Spec for Java SE 14:

    Module names may be drawn from the entire Unicode codespace, subject to
    the following constraints:

    • A module name must not contain any code point in the range
    '\u0000' to '\u001F' inclusive.

    But no mention in JLS as far as I can tell?

    I wonder if this will be an inconsistency waiting to bite?
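
    To see where the two rules overlap, here is a small sketch. It
    assumes only the single module-name constraint quoted above and the
    documented behaviour of Character.isIdentifierIgnorable(int).
    ModuleNameCheck and its methods are hypothetical names, and the
    check covers just the quoted U+0000 to U+001F range, not any other
    constraint the VM spec may impose on module names.

```java
class ModuleNameCheck {

    // True if the name contains a code point in U+0000..U+001F,
    // the range excluded by the constraint quoted above.
    public static boolean hasForbiddenControl(String name) {
        return name.codePoints().anyMatch(cp -> cp >= 0x0000 && cp <= 0x001F);
    }

    // Counts the code points in U+0000..U+001F that
    // Character.isIdentifierIgnorable(int) reports as ignorable.
    public static int ignorableControls() {
        int count = 0;
        for (int cp = 0x0000; cp <= 0x001F; cp++) {
            if (Character.isIdentifierIgnorable(cp)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Build the control character from its numeric value so the
        // source file contains no invisible characters.
        String name = "my" + (char) 0x0001 + "module";

        // true: U+0001 falls in the forbidden range for module names
        System.out.println(hasForbiddenControl(name));
        // true: the same code point is ignorable in an identifier
        System.out.println(Character.isIdentifierIgnorable(0x0001));
        System.out.println(ignorableControls()
                + " of 32 code points in the range are ignorable");
    }
}
```

    The first two lines of output show the same code point being
    forbidden under the quoted module-name rule and ignorable under
    the identifier rule, which is the kind of mismatch I am asking
    about.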
