I am using JavaMail to parse incoming messages and I am hitting the following error: java.io.UnsupportedEncodingException: us-ascii big5 at sun.nio.cs.StreamDecoder.forInputStreamReader

This is the MIME header that causes the issue:

Content-Type: text/plain; charset="us-ascii, big5"

(I see non-English characters in the content.)

Is this header valid? What could be a solution?

A related issue: I see several malformed variations of the charset value (special characters around or inside it) that cause the same exception, e.g.:

charset="'UTF-8'"
charset=`UTF-8`
charset=UTF=8
charset=utf
charset=\"UTF-8\"

and so on. Note that this is not limited to UTF-8; it happens with other charsets too. However, email clients such as Outlook open and decode these emails without trouble.

Any ideas?

Sathish Kumar
  • Can you try message.setHeader("Content-Type", "text/plain; charset=UTF-8"); – Sirsendu Dec 21 '17 at 08:02
  • No, messages come in (i have no control) and i had to run javamail lib to parse to get content. the incoming messages are not created by me.. – Sathish Kumar Dec 21 '17 at 08:10
  • Possible duplicate of [Is there a way to add aliases for Java's Charset names](https://stackoverflow.com/questions/40876598/is-there-a-way-to-add-aliases-for-javas-charset-names) – jmehrens Dec 21 '17 at 15:02
  • Big5 itself is not a problem and can be handled in the code. The bigger issue is couple of charsets are combined. eg: charset="us-ascii, big5" Not sure how can this be handled in code, as this should be based on content and we will need to do complex parsing. – Sathish Kumar Dec 22 '17 at 09:20
  • @SathishKumar Ignore big5 in that answer. The dupe explained that you can create a method to transform the content type header into any string before JavaMail processes it. – jmehrens Dec 22 '17 at 15:41

2 Answers

Can you try message.setHeader("Content-Type", "text/plain; charset=UTF-8")?

No, messages come in (i have no control) and i had to run javamail lib to parse to get content. the incoming messages are not created by me

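Use the mail.mime.contenttypehandler system property to transform the content type without actually modifying the emails. The property names a class with a public static cleanContentType(MimePart, String) method, which JavaMail calls with the raw Content-Type value before parsing it.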

package cool.part.team;

import java.util.Arrays;
import javax.mail.Session;
import javax.mail.internet.ContentType;
import javax.mail.internet.MimeMessage;
import javax.mail.internet.MimePart;


public class EverythingIsAscii {

    /**
     * -Dmail.mime.contenttypehandler=cool.part.team.EverythingIsAscii
     */
    public static void main(String[] args) throws Exception {
        MimeMessage msg = new MimeMessage((Session) null);
        msg.setText("test", "us-ascii, big5");
        msg.saveChanges();
        System.out.println("Transformed = "+ msg.getContentType());
        System.out.println("Original = " + Arrays.toString(msg.getHeader("Content-Type")));
    }

    public static String cleanContentType(MimePart p, String mimeType) {
        if (mimeType != null) {
            String newContentType = mimeType;
            try {
                ContentType ct = new ContentType(mimeType);
                String cs = ct.getParameter("charset");
                if (cs == null || cs.contains("'")
                        || cs.contains(",")) { //<--Insert logic here
                    ct.setParameter("charset", "us-ascii");
                    newContentType = ct.toString();
                }
            } catch (Exception ignore) {
                //Insert logic to manually repair.
                //newContentType = ....
            }
            return newContentType;
        }
        return mimeType;
    }
}

Which will output:

Transformed = text/plain; charset=us-ascii
Original = [text/plain; charset="us-ascii, big5"]

You must adapt this example code so that it maps each malformed charset to the correct one, because not everything is ASCII.
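As a starting point for that transformation, here is a minimal sketch that uses only the JDK. The class name CharsetLabels, the method normalize, and the heuristics are all assumptions made for illustration and are not part of JavaMail. It strips stray quotes, backticks and backslashes, treats '=' as a mistyped '-', guesses that a bare "utf" means UTF-8, and, when several charsets are combined, prefers the first supported one that is not US-ASCII.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JavaMail: repairs malformed charset labels.
final class CharsetLabels {

    private CharsetLabels() {
    }

    /** Returns the name of a JDK-supported charset for a possibly malformed label, or the fallback's name. */
    static String normalize(String raw, Charset fallback) {
        if (raw == null) {
            return fallback.name();
        }
        String asciiName = null;
        // Combined labels such as "us-ascii, big5" are split into candidates.
        for (String piece : raw.split("[,;\\s]+")) {
            Charset cs = lookup(clean(piece));
            if (cs == null) {
                continue;
            }
            if (cs.equals(StandardCharsets.US_ASCII)) {
                asciiName = cs.name(); // remember it, but keep looking for a richer charset
            } else {
                return cs.name();
            }
        }
        return asciiName != null ? asciiName : fallback.name();
    }

    /** Heuristic cleanup: '=' becomes '-', characters illegal in charset names are dropped. */
    private static String clean(String piece) {
        String s = piece.replace('=', '-').replaceAll("[^A-Za-z0-9._:+-]", "");
        // Assumption: a bare "utf" was meant to be UTF-8.
        return s.equalsIgnoreCase("utf") ? "UTF-8" : s;
    }

    /** Returns the charset for the label, or null if it is empty, illegal, or unsupported. */
    private static Charset lookup(String label) {
        if (label.isEmpty()) {
            return null;
        }
        try {
            return Charset.isSupported(label) ? Charset.forName(label) : null;
        } catch (IllegalArgumentException illegalName) {
            return null;
        }
    }

    public static void main(String[] args) {
        String[] samples = {"us-ascii, big5", "'UTF-8'", "`UTF-8`", "UTF=8", "utf", "\\\"UTF-8\\\""};
        for (String s : samples) {
            System.out.println(s + " -> " + normalize(s, StandardCharsets.UTF_8));
        }
    }
}
```

Inside cleanContentType, the value of ct.getParameter("charset") could be passed through such a method and the result written back with ct.setParameter("charset", ...). Whether a non-ASCII charset should win over us-ascii in a combined label is a judgment call that depends on your mail.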

jmehrens

All of those are invalid charsets. Whenever possible, report such problems to the owners of the programs that created these messages. If the messages are spam (they often are), just throw them away; these errors are a pretty good heuristic for detecting spam.

The JavaMail FAQ has strategies for dealing with these errors.
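One approach that fits this situation, sketched here under assumptions rather than quoted from the FAQ, is to read the part's raw bytes and decode them yourself, falling back to a default charset when the declared one is illegal or unsupported. The class LenientDecoder and its decode method are hypothetical names; in JavaMail the bytes would typically come from the part's getInputStream().

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JavaMail: decodes bytes with a fallback charset.
final class LenientDecoder {

    private LenientDecoder() {
    }

    /** Decodes body with the declared charset if the JDK supports it, otherwise with the fallback. */
    static String decode(byte[] body, String declaredCharset, Charset fallback) {
        Charset cs = fallback;
        try {
            if (declaredCharset != null && Charset.isSupported(declaredCharset)) {
                cs = Charset.forName(declaredCharset);
            }
        } catch (IllegalArgumentException illegalName) {
            // Illegal names such as "us-ascii, big5" end up here; keep the fallback.
        }
        return new String(body, cs);
    }

    public static void main(String[] args) {
        byte[] body = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        // The declared charset is illegal, so the UTF-8 fallback is used.
        System.out.println(decode(body, "us-ascii, big5", StandardCharsets.UTF_8));
    }
}
```

The fallback is only a guess about the sender's real encoding, so text decoded this way may still contain wrong characters.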

Bill Shannon