
I need help converting a byte array into a format in which the € symbol can be read. Currently, the program I'm working on reads the file like this:

new String(bytes, "ISO-8859-1")

I found that ISO-8859-15 and UTF-8 can represent €, but I can't find a way to convert my array to ISO-8859-15. When I try UTF-8, my byte array isn't decoded (it triggers a loop). The byte array comes from another application, which I can't modify and whose code I can't even look at.

Is there a way to make the € symbol appear? Is there a way to analyze the contents of the byte array to detect whether it can be converted?

if (truc) {

  // Initialize an odet API resource
  Resource gor = new Resource(...);
  PliPDFBean pdf = new PliPDFBean();
  byte[] flux = null;

  pdf = ImpressionNOE.recuperationFluxNOE(...);

  // HTTP call
  int status = gor.webCall(pdf);

  // Retrieve the byte stream
  flux = gor.getResult();

  if (flux != null) {
    // Set the length and the content type of the response
    response.setHeader("Content-Disposition", "attachment; filename=Reimpression.pdf");
    response.setContentLength(flux.length);
    response.setContentType("application/pdf");

    // Write the data into the PDF
    try {
      String string_flux = new String(flux, "ISO-8859-1");
      response.getWriter().write(string_flux);
    } catch (Exception e) {
      log.error("erreur ImpressionReedition" + e.getMessage());
    }
  }

}
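For reference, the encodings mentioned in the question store the euro sign as different bytes: 0xA4 in ISO-8859-15, 0x80 in windows-1252, and the three bytes 0xE2 0x82 0xAC in UTF-8. ISO-8859-1 has no euro sign at all, so it decodes 0xA4 as the generic currency sign ¤. The sketch below only illustrates this. The class name `EuroBytes` and the helper `decode` are hypothetical, and it assumes the JVM provides the ISO-8859-15 and windows-1252 charsets (usual in practice, but not guaranteed by the Java specification).

```java
import java.nio.charset.Charset;

class EuroBytes {

    // Decodes the given unsigned byte values with the named charset.
    // Hypothetical helper, written only for this illustration.
    static String decode(int[] unsignedBytes, String charsetName) {
        byte[] bytes = new byte[unsignedBytes.length];
        for (int i = 0; i < unsignedBytes.length; i++) {
            bytes[i] = (byte) unsignedBytes[i];
        }
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Print code points rather than glyphs, so the result does not
        // depend on the console encoding. U+20AC is the euro sign.
        String[][] cases = {
            {"ISO-8859-15", "A4"},
            {"windows-1252", "80"},
            {"ISO-8859-1", "A4"},
        };
        for (String[] c : cases) {
            int value = Integer.parseInt(c[1], 16);
            String s = decode(new int[] {value}, c[0]);
            System.out.println(c[0] + " 0x" + c[1] + " -> U+"
                    + Integer.toHexString(s.codePointAt(0)).toUpperCase());
        }
        String utf8 = decode(new int[] {0xE2, 0x82, 0xAC}, "UTF-8");
        System.out.println("UTF-8 0xE2 0x82 0xAC -> U+"
                + Integer.toHexString(utf8.codePointAt(0)).toUpperCase());
    }
}
```

Looking for one of these byte patterns at the position where the € is expected is one way to guess which encoding the producer used, provided the bytes are text at all.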
Jason Aller
ANAFLY
  • ISO-8859-15 is a very rarely-used encoding, maybe you’re confusing it with the (vastly more frequently used) Windows code page 1252? – Konrad Rudolph Aug 05 '21 at 15:00
  • Sorry - I made a mistake ;) – g00se Aug 05 '21 at 15:01
  • I think it's accurate to say that ISO-8859-1 cannot encode the Euro sign, but your goal to me is unclear – g00se Aug 05 '21 at 15:07
  • There is no universal way to detect the encoding of an arbitrary byte[]. You have to *know* it. **However** if you have *known text* encoded with a given encoding, you can quite easily check which one it could be. A first attempt would be to replace the `ISO-8859-1` above with `ISO-8859-15` and see if that produces sensible output. – Joachim Sauer Aug 05 '21 at 15:10
  • @JoachimSauer indeed, I tried that and it didn't work – ANAFLY Aug 05 '21 at 15:13
  • Can you post a sample of the content? Like a hexdump of the file or a subsection of it? An important indication is trying to find the bytes that you think should represent the `€`. If most other characters use only 1 byte and the `€` uses 3 bytes, then it's probably UTF-8. – Joachim Sauer Aug 05 '21 at 15:15
  • Copyable please, not a picture – g00se Aug 05 '21 at 15:16
  • Actually my goal is quite complicated to explain, @g00se. I have to print a PDF from an XML feed which comes from another application. The problem is that the € appears as a ?, because my feed is decoded as ISO-8859-1 in the code – ANAFLY Aug 05 '21 at 15:16
  • Then the Euro symbol couldn't be there in the first place, could it? (Since ISO-8859-1 cannot encode it) – g00se Aug 05 '21 at 15:26
  • "print a PDF thanks to xml feed"? What does that even mean? **What format** is the source of your data? XML? PDF? Something else? You seem to be confusing multiple layers here and I'm trying to figure out which ones are relevant. – Joachim Sauer Aug 05 '21 at 15:28
  • the feed is super long, I don't think I can find an € equivalent in the bytes. – ANAFLY Aug 05 '21 at 15:33
  • @JoachimSauer the bean in `pdf=ImpressionNOE.recuperationFluxNOE(...);` is indeed populated from an XML string. This data is used to call the service, like this: `int status = gor.webCall(pdf);`, and then it is written out as a PDF by the remaining lines of code – ANAFLY Aug 05 '21 at 15:38
  • Sorry, I understand the words you write, but the order they are in makes 0 sense to me. – Joachim Sauer Aug 05 '21 at 15:44
  • All very mysterious, but I suppose if someone wanted to send a byte array that was encoded as ISO-8859-1 and meant to have a Euro character as well, they could use one of the 'holes' in that charset to stash it, but of course Java would not know about that and throw an error – g00se Aug 05 '21 at 15:53
  • *`response.setContentType("application/pdf");`* That would require 'binary content' and not a string. You would need to write the raw bytes directly to the stream (not with a `Writer`) – g00se Aug 05 '21 at 16:03
  • Also: if the issue is actually with a PDF, then all this talk about encodings could be ignored: PDF files a.) are binary data and not text and b.) can define their own mapping of "character" to glyph, so even the parts of the PDF that might look like plain text at first glance could have a basically arbitrary mapping of any number to €. – Joachim Sauer Aug 06 '21 at 08:34
  • Interesting, @g00se. My response object is an HttpServletResponse. How can I send the bytes so they are written into a PDF? – ANAFLY Aug 06 '21 at 08:34
  • @ANAFLY: use `response.getOutputStream` instead of `getWriter` – Joachim Sauer Aug 06 '21 at 08:38
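
On the question of analyzing the byte array: as the comments note, there is no universal way to detect an encoding, but it is possible to check whether the bytes are at least well-formed UTF-8. A minimal sketch, assuming the payload is text (for example the XML feed, not the binary PDF); the class name `Utf8Check` and the method `isValidUtf8` are hypothetical. It uses a strict `CharsetDecoder` that reports malformed input instead of silently replacing it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8Check {

    // Returns true if the whole array decodes as UTF-8 without errors.
    // A true result means "could be UTF-8", not "is certainly UTF-8":
    // pure ASCII, for instance, is valid in many encodings at once.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

A lone 0xA4 or 0x80 byte (the euro sign in ISO-8859-15 or windows-1252) fails this check, while the UTF-8 sequence 0xE2 0x82 0xAC passes, so the result helps narrow down the candidates.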

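The last two comments suggest writing the raw bytes instead of going through a `String` and a `Writer`. The sketch below illustrates why, without depending on the servlet API: a `ByteArrayOutputStream` stands in for `response.getOutputStream()`, and the class `RawBytes` with its methods is hypothetical. It assumes the writer would encode with UTF-8, which is only one possible configuration of the response.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class RawBytes {

    // Writes the bytes unchanged. In the servlet from the question, `out`
    // would be response.getOutputStream() (assumption for illustration).
    static void writeRaw(byte[] flux, OutputStream out) {
        try {
            out.write(flux);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Simulates the original approach: decode as ISO-8859-1, then let a
    // writer re-encode the text, here assumed to use UTF-8.
    static byte[] viaString(byte[] flux) {
        String text = new String(flux, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }
}
```

With the writer path, every byte of 0x80 or above becomes two bytes, so the body no longer matches the value passed to `setContentLength(flux.length)` and the PDF is corrupted. With the stream path the bytes arrive exactly as the other application produced them.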
0 Answers