While investigating whether SSAX would work for a small personal
project, I encountered this bug:
(xml->sxml "<e><![CDATA[&gt;]]></e>")
$2 = (*TOP* (e ">"))
(expected output would be "&gt;")
This is clearly intentional behavior, since the code (in
upstream/SSAX.scm in the Guile 2.2.6 distribution, but it seems to be
the same in all other versions of SSAX I found) has:
; Within a CDATA section all characters are taken at their face value,
; with only three exceptions:
[...]
; &gt; is treated as an embedded #\> character
and even includes test cases requiring the incorrect result.
This seems to be a blatant misreading of the XML spec, which requires
that other than "]]>", no entities or other special characters (not even
&gt;) be interpreted within a CDATA section. I have checked this with
other experienced XML users and with multiple other tools, and all agree
that interpreting &gt; inside CDATA is incorrect.
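
A cross-check of this kind can be sketched as follows. This is only an illustration: it assumes Python's standard-library xml.etree.ElementTree parser as a stand-in for "another tool", which is not necessarily one of the tools actually consulted. The variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Same document as in the report: the CDATA section holds the four
# literal characters '&', 'g', 't', ';'.
cdata_doc = "<e><![CDATA[&gt;]]></e>"
cdata_text = ET.fromstring(cdata_doc).text

# For contrast: outside a CDATA section the same characters form an
# entity reference, and the parser expands it to a single '>'.
plain_doc = "<e>&gt;</e>"
plain_text = ET.fromstring(plain_doc).text

print(repr(cdata_text))  # the text is left untouched inside CDATA
print(repr(plain_text))  # the reference is expanded outside CDATA
```

With this parser the CDATA content comes back as the four characters unchanged. SSAX instead returns ">" for the CDATA case as well.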
As far as I can see, this bug has existed as long as SSAX has. Am I
really the first person to notice it?
--
Andrew.