Support Article
'>' characters improperly encoded in the same XML streams
Summary
When using a SOAP Connector, the XML request received has the less-than character ("<") encoded as "&lt;", but the greater-than character (">") is NOT encoded as "&gt;".
This XML is rejected by the company's Enterprise Application Integration (EAI) layer, which requires both the < and > characters to be encoded.
Error Messages
No error message, but the receiving application is unable to proceed.
Steps to Reproduce
The issue can be simulated using PRPC as both the SERVICE provider (that generates a WSDL) AND the CONNECTOR (import the PRPC-generated WSDL into PRPC).
Once the Service and Connector are set up, use a TEXT property (ID) set to the *STRING* "<TEST>".
Examine the XML in transit using SOAPService DEBUG logging and observe that the "<" is encoded as "&lt;" while the ">" is NOT encoded.
Example log generated for the PRPC SOAP Connector:
2015-01-14 16:33:31,686 [http-apr-8091-exec-6] [ ] [ ] ( services.soap.SOAPService) DEBUG xxxxxxxxx-2|xx.x.xx.xx - Received SOAP request message, trying to obtain SOAP Action value from SOAPAction HTTP Header
2015-01-14 16:33:31,686 [http-apr-8091-exec-6] [ ] [ ] ( services.soap.SOAPService) DEBUG xxxxxxxxx-2|xx.x.xx.xx - Obtained SOAP Action value from SOAPAction HTTP Header: urn:PegaRULES:SOAP:xxxxxxxxxxxxxDataCustomer:Services#GetCustomerbyID
2015-01-14 16:33:31,686 [http-apr-8091-exec-6] [ ] [ ] ( services.soap.SOAPService) INFO xxxxxxxxx-2|xx.x.xx.xx - SOAP Request Envelope:
<?xml version='1.0' encoding='UTF-8'?><soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><soapenv:Body><ns1:GetCustomerbyIDRequest xmlns:ns1="urn:PegaRULES:SOAP:xxxxxxxxxxxxxxxxDataCustomer:Services"> <ID>&lt;TEST></ID> <ID>&lt;TEST></ID> </ns1:GetCustomerbyIDRequest></soapenv:Body></soapenv:Envelope>
Example log generated for the PRPC SOAP Service (which accepts the generated XML without issue):
2015-01-14 16:33:31,725 [http-apr-8091-exec-6] [ STANDARD] [APPxxxxxxxx:01.01.01] (PC62SP2FW_Data_Customer.Action) INFO xxxxxxxxx-2|xx.x.xx.xx|SOAP|xxxxxxxxxxxxxDataCustomer|Services|GetCustomerbyID|A15EB5EC2C43C19D477806F390B15A8C6 - GetCustomerbyID - <TEST>
2015-01-14 16:33:31,743 [http-apr-8091-exec-6] [ STANDARD] [APPxxxxxxxx:01.01.01] ( services.soap.SOAPService) INFO xxxxxxxxx-2|xx.x.xx.xx - SOAP Response Envelope:
<?xml version="1.0"?>
<soap:Envelope
xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<ns1:GetCustomerbyIDResponse xmlns:ns1="urn:PegaRULES:SOAP:xxxxxxxxxxxxxDataCustomer:Services"> <Address></Address> <DateOfBirth>1970-01-01</DateOfBirth> <FirstName></FirstName> <ID><TEST></ID> <LastName></LastName> <Married></Married> <NoOfChildren></NoOfChildren> <Salary></Salary> </ns1:GetCustomerbyIDResponse>
</soap:Body>
</soap:Envelope>
Root Cause
As part of the investigation, Engineering replicated the reported behaviour outside of PRPC using a simple Java class. This confirmed that the behaviour occurs in the StAXOMBuilder, which is used in the Rule-connect-SOAP.invokeaxis2 activity as shown below:
com.pega.apache.axiom.om.impl.builder.StAXOMBuilder builder = new com.pega.apache.axiom.om.impl.builder.StAXOMBuilder(streamReader);
If we input <some>&lt;test&gt;hello&lt;/test&gt;</some>, the output of builder.getDocumentElement() was <some>&lt;test>hello&lt;/test></some>. You can see that "&gt;" has been decoded as ">".
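The loss of the original escaping can be illustrated with the JDK's own StAX reader. This is a minimal sketch, not Engineering's test class: it uses javax.xml.stream from the JDK rather than the com.pega.apache.axiom classes, and the class name StaxDecodeDemo and method characterData are hypothetical. It shows that a StAX reader reports character data with entities already decoded, so whichever component serializes the data afterwards decides which characters to escape again.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxDecodeDemo {
    // Concatenate all character data that the StAX reader reports for the document.
    static String characterData(String xml) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent text events so the text arrives in as few pieces as possible.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.CHARACTERS) {
                text.append(reader.getText());
            }
        }
        reader.close();
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Both "<" and ">" are escaped on input.
        String input = "<some>&lt;test&gt;hello&lt;/test&gt;</some>";
        // The reader hands back plain characters; the escaping is no longer visible.
        System.out.println(characterData(input));
    }
}
```

Because the builder only ever sees the decoded characters, the form of the output depends entirely on the escaping rules of the serializer, not on how the input was written.")) throw new AssertionError("entities should be decoded");
if (!StaxDecodeDemo.characterData("<ID>&lt;TEST></ID>").equals("<TEST>")) throw new AssertionError("partly escaped input should decode to <TEST>");
</test>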
The reported behaviour occurs in the StAX-based parser implementation and not in the PRPC XMLStream rule execution, which generates the expected encoded data. A request to change the behaviour has already been rejected by the AXIOM developers, as per the AXIOM Jira entry "AXIOM is not correctly encoding all XML special characters when inserted into a String (">" not changed to "&gt;" when preceeded by "&lt;")".
From a PRPC perspective, the generated Connect SOAP XML stream is valid per the W3C standard. The "W3C Extensible Markup Language (XML) 1.1 (Second Edition)" documentation confirms that the greater-than character (">") may be escaped as "&gt;" but is not required to be, except where it appears in the sequence "]]>" in content and that sequence is not marking the end of a CDATA section.
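The rule can be checked against any conforming parser. The following is a minimal sketch using the JDK's built-in DOM parser; the class name GtEscapeCheck and method rootTextOrNull are hypothetical names chosen for illustration. It parses the partly escaped and fully escaped forms of the same value, and also tries the one case where an unescaped ">" is not allowed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class GtEscapeCheck {
    // Parse the XML and return the root element's text, or null if it is not well-formed.
    static String rootTextOrNull(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors instead of printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement()
                          .getTextContent();
        } catch (SAXException notWellFormed) {
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        // Only "<" escaped, as in the connector request.
        System.out.println(rootTextOrNull("<ID>&lt;TEST></ID>"));
        // Both characters escaped, as the receiving system expects.
        System.out.println(rootTextOrNull("<ID>&lt;TEST&gt;</ID>"));
        // "]]>" in ordinary content is the exception: ">" must be escaped here.
        System.out.println(rootTextOrNull("<ID>]]></ID>"));
        System.out.println(rootTextOrNull("<ID>]]&gt;</ID>"));
    }
}
```

Both forms of the ID element yield the same text value, which is why a conforming receiver should not need the ">" to be escaped.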
Resolution
To overcome this problem, explore alternative options either outside of PRPC or through a change to the receiving service so that it accepts the > character in unencoded form.
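One possible option outside of PRPC is an intermediary step that rewrites the message before it reaches the receiving system. The sketch below is hypothetical and is not part of PRPC or AXIOM: the class GtEscaper and the method escapeGtInText are illustrative names. It assumes well-formed input with no DOCTYPE declaration (SOAP messages do not carry one), and it replaces ">" with "&gt;" only in character data, leaving tags, attribute values, comments, CDATA sections, and processing instructions untouched.

```java
public class GtEscaper {
    // Index just past the first occurrence of `end` at or after `from`,
    // or the end of the input if `end` is not found.
    private static int skipPast(String s, int from, String end) {
        int idx = s.indexOf(end, from);
        return idx < 0 ? s.length() : idx + end.length();
    }

    // Index just past the '>' that closes the tag starting at `from`,
    // ignoring any '>' inside quoted attribute values.
    private static int skipTag(String s, int from) {
        char quote = 0;
        for (int i = from; i < s.length(); i++) {
            char c = s.charAt(i);
            if (quote != 0) {
                if (c == quote) quote = 0;
            } else if (c == '"' || c == '\'') {
                quote = c;
            } else if (c == '>') {
                return i + 1;
            }
        }
        return s.length();
    }

    // Escape every '>' that appears in character data (outside of markup).
    static String escapeGtInText(String xml) {
        StringBuilder out = new StringBuilder(xml.length() + 16);
        int i = 0;
        while (i < xml.length()) {
            char c = xml.charAt(i);
            if (c == '<') {
                // In well-formed XML a literal '<' always starts markup; copy it verbatim.
                int end;
                if (xml.startsWith("<!--", i)) end = skipPast(xml, i, "-->");
                else if (xml.startsWith("<![CDATA[", i)) end = skipPast(xml, i, "]]>");
                else if (xml.startsWith("<?", i)) end = skipPast(xml, i, "?>");
                else end = skipTag(xml, i);
                out.append(xml, i, end);
                i = end;
            } else {
                if (c == '>') out.append("&gt;");
                else out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeGtInText("<ID>&lt;TEST></ID>"));
    }
}
```

Whether such a filter is practical depends on where it can be placed between the connector and the EAI layer; changing the receiving service to accept the unencoded ">" avoids the extra processing step.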
Published January 31, 2016 - Updated October 8, 2020