• SSAX bug?

    From Andrew Gierth@21:1/5 to All on Mon Nov 18 14:18:37 2019
    While investigating whether SSAX would work for a small personal
    project, I encountered this bug:

    (xml->sxml "<e><![CDATA[&gt;]]></e>")
    $2 = (*TOP* (e ">"))

    (the expected result would be (*TOP* (e "&gt;")), i.e. the literal
    five-character string "&gt;")

    This is clearly intentional behavior, since the code (in
    upstream/SSAX.scm in the Guile 2.2.6 distribution, though it seems to
    be the same in every other version of SSAX I found) has:

    ; Within a CDATA section all characters are taken at their face value,
    ; with only three exceptions:
    [...]
    ; &gt; is treated as an embedded #\> character

    and even includes test cases requiring the incorrect result.

    This seems to be a blatant misreading of the XML spec, which requires
    that within a CDATA section nothing other than the closing "]]>" be
    recognized as markup: no entity references or other special characters
    (not even &gt;) are to be interpreted. I have checked this with other
    experienced XML users and with multiple other tools, and all agree
    that interpreting &gt; inside CDATA is incorrect.
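
    One such cross-check can be sketched with Python's standard
    xml.etree.ElementTree module. Python is only an illustrative choice
    here, not necessarily one of the tools referred to above, and the
    helper name element_text is made up for this example. It parses the
    same document as above, plus a variant without the CDATA wrapper for
    contrast:

```python
import xml.etree.ElementTree as ET


def element_text(doc: str) -> str:
    """Return the character data of the root element of `doc`.

    Hypothetical helper for this example only.
    """
    return ET.fromstring(doc).text or ""


# Inside a CDATA section, "&gt;" is just five literal characters.
in_cdata = element_text("<e><![CDATA[&gt;]]></e>")

# Outside CDATA, "&gt;" is an entity reference and expands to ">".
outside_cdata = element_text("<e>&gt;</e>")

print(repr(in_cdata))       # '&gt;'
print(repr(outside_cdata))  # '>'
```

    The first result is what xml->sxml would be expected to return as
    the content of e; the second is what SSAX currently returns for the
    CDATA case.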

    As far as I can see, this bug has existed as long as SSAX has. Am I
    really the first person to notice it?

    --
    Andrew.
