
This content has been archived and is no longer being updated. Links may not function; however, this content may be relevant to outdated versions of the product.

Support Article

Characters decoded incorrectly when email is sent as HTML

SA-83652

Summary



When an email is sent as HTML, special characters are missing from the content or displayed incorrectly for some email addresses. This occurs even though the UTF-8 charset is set for decoding.


Error Messages



Not Applicable


Steps to Reproduce

  1. Send an email to Gmail from Outlook.
  2. Wait about a minute for the notification that a case was created.
  3. Search for the case created in the Pega instance.


Root Cause

In Outlook for Windows, the default encoding for outgoing messages is ISO-8859-1. As a result, the HTML source of the email declares the ISO-8859-1 charset.
Setting the correct content type is important for email accessibility.
UTF-8 is a standard encoding that can represent every Unicode character, so all characters, including non-Latin ones, are decoded correctly.
ISO-8859-1, the Outlook default, covers only Western European (Latin-based) languages.
The Pega application decodes the content as UTF-8. Bytes above 0x7F in ISO-8859-1 content, such as accented letters, are generally not valid UTF-8 sequences, so those characters are decoded incorrectly unless the content is also encoded with UTF-8.
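
The mismatch described above can be reproduced outside Pega with plain Java. The following is a minimal sketch, not part of the Pega code: the class name CharsetMismatchDemo and the method roundTrip are hypothetical and exist only for illustration. It encodes a string with one charset and decodes the bytes with another.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the Pega activity.
class CharsetMismatchDemo {

    // Encodes text with one charset and decodes the resulting bytes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an accented e

        // Sender encodes as ISO-8859-1, receiver decodes as UTF-8: the accented
        // letter becomes the Unicode replacement character U+FFFD.
        String mismatched = roundTrip(original,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        // Sender and receiver agree on ISO-8859-1: the text survives unchanged.
        String matched = roundTrip(original,
                StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);

        System.out.println("mismatched equals original: " + mismatched.equals(original));
        System.out.println("matched equals original:    " + matched.equals(original));
    }
}
```

Plain ASCII text is unaffected by the mismatch because ISO-8859-1 and UTF-8 encode those characters identically, which may explain why only some emails show the problem.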



Resolution



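Perform the following local change:

Instead of changing Outlook's default encoding for outgoing messages from ISO-8859-1 to UTF-8, modify the code in Step 3 of the pyExtractHtmlFromAttachment activity as shown. The code decodes the content as UTF-8 first and, if the first meta tag does not declare UTF-8, decodes it again as ISO-8859-1: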


String inputStream = htmlBase64;

byte[] bytes = org.apache.commons.codec.binary.Base64.decodeBase64(inputStream);
java.io.InputStream is = new java.io.ByteArrayInputStream(bytes);
java.nio.charset.Charset charset = java.nio.charset.StandardCharsets.UTF_8;
StringBuilder stringBuilder = new StringBuilder();
String line = null;
java.io.InputStreamReader reader = new java.io.InputStreamReader(is, charset);

try (java.io.BufferedReader bufferedReader = new java.io.BufferedReader(reader)) {        
  while ((line = bufferedReader.readLine()) != null) {
    stringBuilder.append(line);
  }
} catch (Exception ex) {
  oLog.error("Error while reading file", ex);
}

FileString = stringBuilder.toString();
oLog.debug("Html string: " + FileString);
//-------------------------------------------
org.jsoup.nodes.Document doc = org.jsoup.Jsoup.parse(FileString);

//read the charset declared in the first meta tag; if it is not UTF-8, decode the bytes again as ISO-8859-1
org.jsoup.select.Elements elements = doc.select("meta");
if(elements != null && !elements.isEmpty()){
  
  String charsetAttr = elements.get(0).attr("content");
  if(!charsetAttr.toUpperCase().contains("UTF-8")){
    stringBuilder = new StringBuilder();
    charset = java.nio.charset.StandardCharsets.ISO_8859_1;
    is = new java.io.ByteArrayInputStream(bytes);
    reader = new java.io.InputStreamReader(is, charset);

    try (java.io.BufferedReader bufferedReader = new java.io.BufferedReader(reader)) {    
      while ((line = bufferedReader.readLine()) != null) {
        stringBuilder.append(line);
      }
    }catch(Exception ex){
      oLog.error("Error while reading file", ex);
    } 
    FileString = stringBuilder.toString();
  }
}
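
For comparison, the same idea can be sketched with only the Java standard library. This is an illustrative alternative, not the code shipped in the activity: the class HtmlCharsetSniffer and its methods detectCharset and decodeHtml are hypothetical names, a regular expression stands in for the jsoup meta lookup, and java.util.Base64 stands in for the Apache Commons decoder. It assumes the charset is declared in a meta tag and falls back to UTF-8 otherwise.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Pega activity.
final class HtmlCharsetSniffer {

    // Finds charset=<name> inside a meta tag, for example
    // <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    // or <meta charset="utf-8">.
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta[^>]*?charset\\s*=\\s*[\"']?([A-Za-z0-9_\\-]+)",
            Pattern.CASE_INSENSITIVE);

    // Returns the charset declared in the HTML, or UTF-8 if none is usable.
    static Charset detectCharset(byte[] htmlBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so the ASCII markup
        // can be scanned safely whatever the real encoding is.
        String probe = new String(htmlBytes, StandardCharsets.ISO_8859_1);
        Matcher matcher = META_CHARSET.matcher(probe);
        if (matcher.find()) {
            try {
                return Charset.forName(matcher.group(1));
            } catch (IllegalArgumentException ex) {
                // Illegal or unsupported charset name: use the fallback below.
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Decodes Base64-encoded HTML using the charset the HTML declares.
    static String decodeHtml(String htmlBase64) {
        byte[] bytes = Base64.getMimeDecoder().decode(htmlBase64);
        return new String(bytes, detectCharset(bytes));
    }
}
```

Unlike the activity code, this sketch honors any declared charset that the JVM supports rather than choosing only between UTF-8 and ISO-8859-1, and it decodes the bytes in one pass so line terminators are preserved.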

Published August 15, 2019 - Updated December 2, 2021
