XML.java doesn't cope with encoded ampersand followed by semicolon #361

@DangerPete

Hi,

I've come across an issue when using XML.toJSONObject(String) where the string contains an encoded ampersand immediately followed by a semicolon (e.g. <xml>Can cope &amp;; </xml>): the call throws a StringIndexOutOfBoundsException.
The issue is in the unescape method in XML.java. Since XMLTokener has already converted the &amp; to an &, the string being unescaped looks like <xml>Can cope &; </xml>, and unescape assumes there is a non-empty entity name between the & and the ;.
One option would be to add a length check before reading the first character, i.e.

final String entity = string.substring(i + 1, semic);
if (!entity.isEmpty()) { // "&;" yields an empty entity, so skip the charAt(0) check
    if (entity.charAt(0) == '#') {
...
    }
}
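To make the suggestion concrete, here is a minimal, self-contained sketch of an unescape routine with that guard in place. It is not the library's actual code: the class name UnescapeSketch, the decode helper, and the choice to leave unrecognised entities untouched are all assumptions made for illustration.

```java
// Hypothetical stand-in for the unescape logic discussed above; NOT org.json's implementation.
final class UnescapeSketch {

    // Replaces XML entities in the input, leaving anything it cannot decode as-is.
    static String unescape(String string) {
        StringBuilder sb = new StringBuilder(string.length());
        for (int i = 0, length = string.length(); i < length; i++) {
            char c = string.charAt(i);
            if (c == '&') {
                int semic = string.indexOf(';', i);
                if (semic > i) {
                    String entity = string.substring(i + 1, semic);
                    // The guard from the issue: "&;" gives an empty entity name,
                    // so charAt(0) must not be called on it.
                    String replacement = entity.isEmpty() ? null : decode(entity);
                    if (replacement != null) {
                        sb.append(replacement);
                        i = semic; // jump to the ';' so the loop resumes after it
                        continue;
                    }
                }
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Decodes a non-empty entity name; returns null when it is not recognised.
    static String decode(String entity) {
        if (entity.charAt(0) == '#') {
            try {
                boolean hex = entity.length() > 1
                        && (entity.charAt(1) == 'x' || entity.charAt(1) == 'X');
                int codePoint = hex
                        ? Integer.parseInt(entity.substring(2), 16)
                        : Integer.parseInt(entity.substring(1), 10);
                return new String(Character.toChars(codePoint));
            } catch (IllegalArgumentException e) {
                return null; // malformed number or invalid code point
            }
        }
        switch (entity) {
            case "amp":  return "&";
            case "lt":   return "<";
            case "gt":   return ">";
            case "quot": return "\"";
            case "apos": return "'";
            default:     return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(unescape("Can cope &; "));
        System.out.println(unescape("a &lt; b"));
    }
}
```

With the guard, an input such as "Can cope &; " passes through unchanged instead of throwing, while ordinary entities like &lt; are still decoded.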

Full stack trace:

java.lang.StringIndexOutOfBoundsException: String index out of range: 0
	at java.lang.String.charAt(String.java:658)
	at org.json.XML.unescape(XML.java:193)
	at org.json.XML.stringToValue(XML.java:434)
	at org.json.XML.parse(XML.java:398)
	at org.json.XML.toJSONObject(XML.java:485)
	at org.json.XML.toJSONObject(XML.java:456)
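
The top two frames can be reproduced without the library. The sketch below assumes only what is described above: the entity name is taken as the substring between the & and the following ;. The class and method names (EmptyEntityDemo, entityName) are hypothetical, and the helper assumes the input contains both characters.

```java
// Hypothetical demo of the failure mechanism; not code from org.json.
final class EmptyEntityDemo {

    // Returns the text between the first '&' and the next ';'.
    // Assumes both characters are present in the input.
    static String entityName(String s) {
        int amp = s.indexOf('&');
        int semic = s.indexOf(';', amp);
        return s.substring(amp + 1, semic);
    }

    public static void main(String[] args) {
        // After the tokener has turned "&amp;" into "&", the text contains "&;".
        String entity = entityName("Can cope &; ");
        System.out.println("entity is empty: " + entity.isEmpty());
        try {
            entity.charAt(0); // same call as in the XML.unescape frame
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("charAt(0) threw " + e.getClass().getSimpleName());
        }
    }
}
```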

Unit test for you too:

import org.json.XML;
import org.junit.Assert;
import org.junit.Test;

public class AuthorsNaturalOrderingPreservedInMetadataXslTest {
    @Test
    public void testXmlToJson() {
        String xml;

        // A different entity followed by a semicolon works.
        xml = "<xml>Can cope &lt;; </xml>";
        Assert.assertEquals("{\"xml\":\"Can cope <;\"}", XML.toJSONObject(xml).toString());

        // An encoded ampersand separated from the semicolon by a space works.
        xml = "<xml>Can cope &amp; ; </xml>";
        Assert.assertEquals("{\"xml\":\"Can cope & ;\"}", XML.toJSONObject(xml).toString());

        // An encoded ampersand immediately followed by a semicolon throws.
        xml = "<xml>Can cope &amp;; </xml>";
        try {
            Assert.assertEquals("{\"xml\":\"Can cope &;\"}", XML.toJSONObject(xml).toString());
        } catch (StringIndexOutOfBoundsException e) {
            e.printStackTrace();
            Assert.fail("Failed");
        }
    }
}
