I was able to fix this by modifying the OWASP code so that it detects base64 image data URLs and skips HTML-encoding them, since encoding them doesn't seem necessary.
I believe this is still secure because this method only does the HTML encoding, not the security checks; the change just skips the encodeHtml step for data URLs. If anybody knows otherwise, I'd like to know. Thanks!
    private static void encodeHtmlOnto(
        String plainText, Appendable output, @Nullable String braceReplacement)
        throws IOException {
      if (plainText != null && plainText.startsWith("data:image")) {
        // Don't touch the base64-encoded images. Encoding them messes up the diffing of things.
        output.append(plainText);
        return;
      }
      ...
The following patch for the OWASP code gets it to leave the image data URLs alone.
Index: org/owasp/html/Encoding.java
===================================================================
diff --git a/api/app-ejb/src/main/java/org/owasp/html/Encoding.java b/api/app-ejb/src/main/java/org/owasp/html/Encoding.java
--- a/api/app-ejb/src/main/java/org/owasp/html/Encoding.java (revision c5c815dda1f5c89d2e515d676b8c143591b68d8c)
+++ b/api/app-ejb/src/main/java/org/owasp/html/Encoding.java (date 1649080667669)
@@ -234,6 +234,13 @@
   private static void encodeHtmlOnto(
       String plainText, Appendable output, @Nullable String braceReplacement)
       throws IOException {
+
+    if(plainText!=null && plainText.startsWith("data:image")) {
+      //Don't touch the base64 encoded images. This messes up the diffing of things.
+      output.append(plainText);
+      return;
+    }
+
     int n = plainText.length();
     int pos = 0;
     for (int i = 0; i < n; ++i) {
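
One caveat with a bare startsWith("data:image") test is that it also skips encoding for any value that merely begins with that prefix, including values that contain quotes or angle brackets. A narrower check might look something like the sketch below. This is only an illustration: the class name DataUrlCheck, the method names, and the regular expression are my own assumptions and are not part of the OWASP library, and the pattern may need tightening for your media types.

```java
import java.io.IOException;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the OWASP sanitizer.
final class DataUrlCheck {
  // Assumed shape: "data:image/<subtype>;base64," followed only by characters
  // from the base64 alphabet, with at most two '=' padding characters at the end.
  private static final Pattern BASE64_IMAGE_DATA_URL = Pattern.compile(
      "data:image/[a-zA-Z0-9.+-]+;base64,[A-Za-z0-9+/]*={0,2}");

  /** Returns true only if the entire value matches the pattern above. */
  static boolean isBase64ImageDataUrl(String value) {
    return value != null && BASE64_IMAGE_DATA_URL.matcher(value).matches();
  }

  /**
   * Stand-in for the early return in encodeHtmlOnto: appends the value
   * unencoded and returns true only when it passes the strict check.
   * A false result means the caller should fall through to normal encoding.
   */
  static boolean appendIfDataUrl(String plainText, Appendable output)
      throws IOException {
    if (isBase64ImageDataUrl(plainText)) {
      output.append(plainText);
      return true;
    }
    return false;
  }

  public static void main(String[] args) throws IOException {
    StringBuilder out = new StringBuilder();
    // A well-formed base64 image URL is passed through unchanged.
    System.out.println(
        appendIfDataUrl("data:image/png;base64,iVBORw0KGgo=", out)); // true
    // A value that tries to break out of the attribute is rejected here,
    // so it would still go through the regular HTML encoding.
    System.out.println(
        appendIfDataUrl("data:image/png;base64,AAAA\" onerror=\"alert(1)", out)); // false
  }
}
```

In the patched method, the condition plainText.startsWith("data:image") would then be replaced by a call like DataUrlCheck.isBase64ImageDataUrl(plainText), so anything that is not purely base64 still gets encoded as before.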