92

Is it possible to put a byte[] (byte array) to JSON?

If so, how can I do that in Java, and then read that JSON and convert the field back to a byte[]?

Amin Sh
  • 13
    JSON does not support that. Use Base64. – SLaks Dec 20 '13 at 15:22
  • 1
    it does. I used this: jsonObj.put(byte[]); – Amin Sh Dec 21 '13 at 14:05
  • 3
    That is your library supporting it, not JSON itself. The byte array won't be stored as a byte array in the JSON; JSON is a text format meant to be human-readable. Your library may interpret the byte array as a UTF-8 encoded String and display that, or it may show a binary string, Base64, or a hex string. Who knows. – Zabuzard Aug 14 '20 at 07:31

6 Answers

88

Here is a good example of Base64-encoding byte arrays (see the link below). Things get more complicated when Unicode characters enter the mix, for example when sending PDF documents. Once a byte array has been encoded, the resulting string can be used as a JSON property value.

Apache commons offers good utilities:

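 // Needs Apache Commons Codec 1.4 or later; older versions lack encodeBase64String
 import org.apache.commons.codec.binary.Base64;

 byte[] bytes = getByteArr();
 String base64String = Base64.encodeBase64String(bytes);
 byte[] backToBytes = Base64.decodeBase64(base64String);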

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Base64_encoding_and_decoding

Java server side example:

public String getUnsecureContentBase64(String url)
        throws ClientProtocolException, IOException {

    // getUnsecureContent will generate some byte[]
    byte[] result = getUnsecureContent(url);

    // Use Apache's org.apache.commons.codec.binary.Base64.
    // If you're sending this back as the result of an HTTP request, you may
    // also have to use org.apache.commons.httpclient.util.URIUtil.encodeQuery.
    return Base64.encodeBase64String(result);
}

JavaScript decode:

//decode URL encoding if encoded before returning result
var uriEncodedString = decodeURIComponent(response);

var byteArr = base64DecToArr(uriEncodedString);

//from mozilla
function b64ToUint6 (nChr) {

  return nChr > 64 && nChr < 91 ?
      nChr - 65
    : nChr > 96 && nChr < 123 ?
      nChr - 71
    : nChr > 47 && nChr < 58 ?
      nChr + 4
    : nChr === 43 ?
      62
    : nChr === 47 ?
      63
    :
      0;

}

function base64DecToArr (sBase64, nBlocksSize) {

  var
    sB64Enc = sBase64.replace(/[^A-Za-z0-9\+\/]/g, ""), nInLen = sB64Enc.length,
    nOutLen = nBlocksSize ? Math.ceil((nInLen * 3 + 1 >> 2) / nBlocksSize) * nBlocksSize : nInLen * 3 + 1 >> 2, taBytes = new Uint8Array(nOutLen);

  for (var nMod3, nMod4, nUint24 = 0, nOutIdx = 0, nInIdx = 0; nInIdx < nInLen; nInIdx++) {
    nMod4 = nInIdx & 3;
    nUint24 |= b64ToUint6(sB64Enc.charCodeAt(nInIdx)) << 18 - 6 * nMod4;
    if (nMod4 === 3 || nInLen - nInIdx === 1) {
      for (nMod3 = 0; nMod3 < 3 && nOutIdx < nOutLen; nMod3++, nOutIdx++) {
        taBytes[nOutIdx] = nUint24 >>> (16 >>> nMod3 & 24) & 255;
      }
      nUint24 = 0;

    }
  }

  return taBytes;
}
Sam Nunnally
  • Somehow, I get these errors `error: cannot find symbol method encodeBase64String(byte[])` and `error: incompatible types: String cannot be converted to byte[]` while compiling. Do you have any clue? – Yusril Maulidan Raji Sep 09 '19 at 14:44
  • There's some truth in this answer (i.e. the usual way is to Base64 encode), but it's not to the point. The question is about bytes; it does not matter whether these are a PDF, nor which charset you might have used to produce the bytes. "Server side", some `getUnsecureContent` method, and JavaScript are also off topic. (It almost feels like the question was edited later. Is that possible even if it's not marked as such? If so, I apologize.) – EndlosSchleife Apr 03 '20 at 13:11
  • 1
    @sam-nunnally Any problem here if we encode as UTF-8 string? – Ayyappa Sep 24 '20 at 10:23
15

The typical way to send binary data in JSON is to Base64-encode it.

Java provides different ways to Base64-encode and decode a byte[]. One of these is javax.xml.bind.DatatypeConverter.

Very simply:

byte[] originalBytes = new byte[] { 1, 2, 3, 4, 5};
String base64Encoded = DatatypeConverter.printBase64Binary(originalBytes);
byte[] base64Decoded = DatatypeConverter.parseBase64Binary(base64Encoded);

How you hook this conversion in depends on the JSON parser/generator library you use: encode before writing the field and decode after reading it.
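As a complement to the snippet above, here is a minimal sketch using `java.util.Base64` (available since Java 8), which may be preferable on newer JDKs because the `javax.xml.bind` classes are no longer bundled from Java 11 onward. The class name `Base64JsonSketch`, the field name `data`, and the hand-built JSON are assumptions made for illustration; in real code a JSON library would write and read the string field.

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64JsonSketch {

    // Encode the bytes as Base64 and embed them as a JSON string value.
    // The standard Base64 alphabet (A-Z, a-z, 0-9, '+', '/', '=') contains
    // no character that needs escaping inside a JSON string.
    static String toJson(byte[] payload) {
        String b64 = Base64.getEncoder().encodeToString(payload);
        return "{\"data\":\"" + b64 + "\"}";
    }

    // Hand-rolled extraction that only understands the exact one-field shape
    // produced by toJson; a JSON library should do this in real code.
    static byte[] fromJson(String json) {
        String prefix = "{\"data\":\"";
        String suffix = "\"}";
        if (!json.startsWith(prefix) || !json.endsWith(suffix)) {
            throw new IllegalArgumentException("Unexpected JSON shape: " + json);
        }
        String b64 = json.substring(prefix.length(), json.length() - suffix.length());
        return Base64.getDecoder().decode(b64);
    }

    public static void main(String[] args) {
        byte[] original = {1, 2, 3, 4, 5};
        String json = toJson(original);
        System.out.println(json); // {"data":"AQIDBAU="}
        System.out.println(Arrays.equals(original, fromJson(json))); // true
    }
}
```

The same two calls (`encodeToString` before writing, `decode` after reading) apply whichever JSON library carries the string.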

Sotirios Delimanolis
4

In line with @Qwertie's suggestion, but going further on the lazy side, you could just pretend that each byte is an ISO-8859-1 character. For the uninitiated, ISO-8859-1 is a single-byte encoding that matches the first 256 code points of Unicode.

So @Ash's answer is actually redeemable with a charset:

byte[] args2 = getByteArry();
String byteStr = new String(args2, Charset.forName("ISO-8859-1"));
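To make the round trip explicit, here is a small sketch of the ISO-8859-1 trick in both directions. The class and method names are hypothetical. It checks that all 256 byte values survive the conversion; it does not cover the JSON layer, where the serializer still has to escape control characters, quotes, and backslashes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTrip {

    // Map each byte to the Unicode code point with the same value (0..255).
    static String encode(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Map each char back to a single byte; valid only for chars <= U+00FF.
    static byte[] decode(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Helper producing every possible byte value once.
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    public static void main(String[] args) {
        byte[] original = allByteValues();
        String asString = encode(original);
        System.out.println(asString.length()); // 256: one char per byte
        System.out.println(Arrays.equals(original, decode(asString))); // true
    }
}
```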

This encoding has the same readability as BAIS, with the advantage that it is processed faster than either BAIS or base64 as less branching is required. It might look like the JSON parser is doing a bit more, but it's fine because dealing with non-ASCII by escaping or by UTF-8 is part of a JSON parser's job anyways. It could map better to some formats like MessagePack with a profile.

Space-wise however, it is usually a loss. With UTF-8 each non-ASCII byte would occupy 2 bytes, while BAIS uses (2+4n + r?(r+1):0) bytes for every run of 3n+r such bytes (r is the remainder). It would be a win on UTF-16, but who uses that for JSON?
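The space trade-off can be checked with a rough sketch like the following, which compares the UTF-8 size of the ISO-8859-1 string with the length of the Base64 string. The class name and sample inputs are assumptions, and JSON escaping overhead (for control characters, quotes, and backslashes) is deliberately ignored.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class Latin1VsBase64Size {

    // Size in bytes of the ISO-8859-1 string once written out as UTF-8.
    static int latin1Utf8Size(byte[] bytes) {
        String s = new String(bytes, StandardCharsets.ISO_8859_1);
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Size of the padded Base64 text (all ASCII, so chars == bytes in UTF-8).
    static int base64Size(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes).length();
    }

    public static void main(String[] args) {
        // Worst case for the charset trick: every byte is non-ASCII.
        byte[] highBytes = new byte[300];
        Arrays.fill(highBytes, (byte) 0xFF);
        System.out.println(latin1Utf8Size(highBytes)); // 600
        System.out.println(base64Size(highBytes));     // 400

        // Best case: pure ASCII input stays at one byte per input byte.
        byte[] ascii = "hello world".getBytes(StandardCharsets.US_ASCII);
        System.out.println(latin1Utf8Size(ascii)); // 11
        System.out.println(base64Size(ascii));     // 16
    }
}
```

Under these assumptions the charset trick only beats Base64 when a good share of the bytes are ASCII.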

(This encoding trick works in any language; ISO-8859-1 is very widely supported.)

Mingye Wang
2

If your byte array may contain runs of ASCII characters that you'd like to be able to see, you might prefer BAIS (Byte Array In String) format instead of Base64. The nice thing about BAIS is that if all the bytes happen to be ASCII, they are converted 1-to-1 to a string (e.g. the byte array {65,66,67} becomes simply "ABC"). Also, BAIS often gives you a smaller file size than Base64, although this isn't guaranteed.

After converting the byte array to a BAIS string, write it to JSON like you would any other string.

Here is a Java class (ported from the original C#) that converts byte arrays to string and back.

import java.io.*;
import java.util.*;

public class ByteArrayInString
{
  // Encodes a byte array to a string with BAIS encoding, which 
  // preserves runs of ASCII characters unchanged.
  //
  // For simplicity, this method's base-64 encoding always encodes groups of 
  // three bytes if possible (as four characters). This decision may 
  // unfortunately cut off the beginning of some ASCII runs.
  public static String convert(byte[] bytes) { return convert(bytes, true); }
  public static String convert(byte[] bytes, boolean allowControlChars)
  {
    StringBuilder sb = new StringBuilder();
    int i = 0;
    int b;
    while (i < bytes.length)
    {
      b = get(bytes,i++);
      if (isAscii(b, allowControlChars))
        sb.append((char)b);
      else {
        sb.append('\b');
        // Do binary encoding in groups of 3 bytes
        for (;; b = get(bytes,i++)) {
          int accum = b;
          if (i < bytes.length) {
            b = get(bytes,i++);
            accum = (accum << 8) | b;
            if (i < bytes.length) {
              b = get(bytes,i++);
              accum = (accum << 8) | b;
              sb.append(encodeBase64Digit(accum >> 18));
              sb.append(encodeBase64Digit(accum >> 12));
              sb.append(encodeBase64Digit(accum >> 6));
              sb.append(encodeBase64Digit(accum));
              if (i >= bytes.length)
                break;
            } else {
              sb.append(encodeBase64Digit(accum >> 10));
              sb.append(encodeBase64Digit(accum >> 4));
              sb.append(encodeBase64Digit(accum << 2));
              break;
            }
          } else {
            sb.append(encodeBase64Digit(accum >> 2));
            sb.append(encodeBase64Digit(accum << 4));
            break;
          }
          // Only leave binary mode if the next (up to) three bytes are all ASCII
          if (isAscii(get(bytes,i), allowControlChars) &&
            (i+1 >= bytes.length || isAscii(get(bytes,i+1), allowControlChars)) &&
            (i+2 >= bytes.length || isAscii(get(bytes,i+2), allowControlChars))) {
            sb.append('!'); // return to ASCII mode
            break;
          }
        }
      }
    }
    return sb.toString();
  }

  // Decodes a BAIS string back to a byte array.
  public static byte[] convert(String s)
  {
    byte[] b;
    try {
      b = s.getBytes("UTF8");
    } catch(UnsupportedEncodingException e) { 
      throw new RuntimeException(e.getMessage());
    }
    for (int i = 0; i < b.length - 1; ++i) {
      if (b[i] == '\b') {
        int iOut = i++;

        for (;;) {
          int cur;
          if (i >= b.length || ((cur = get(b, i)) < 63 || cur > 126))
            throw new RuntimeException("String cannot be interpreted as a BAIS array");
          int digit = (cur - 64) & 63;
          int zeros = 16 - 6; // number of 0 bits on right side of accum
          int accum = digit << zeros;

          while (++i < b.length)
          {
            if ((cur = get(b, i)) < 63 || cur > 126)
              break;
            digit = (cur - 64) & 63;
            zeros -= 6;
            accum |= digit << zeros;
            if (zeros <= 8)
            {
              b[iOut++] = (byte)(accum >> 8);
              accum <<= 8;
              zeros += 8;
            }
          }

          if ((accum & 0xFF00) != 0 || (i < b.length && b[i] != '!'))
            throw new RuntimeException("String cannot be interpreted as BAIS array");
          i++;

          // Start taking bytes verbatim
          while (i < b.length && b[i] != '\b')
            b[iOut++] = b[i++];
          if (i >= b.length)
            return Arrays.copyOfRange(b, 0, iOut);
          i++;
        }
      }
    }
    return b;
  }

  static int get(byte[] bytes, int i) { return ((int)bytes[i]) & 0xFF; }

  public static int decodeBase64Digit(char digit)
    { return digit >= 63 && digit <= 126 ? (digit - 64) & 63 : -1; }
  public static char encodeBase64Digit(int digit)
    { return (char)((digit + 1 & 63) + 63); }
  static boolean isAscii(int b, boolean allowControlChars)
    { return b < 127 && (b >= 32 || (allowControlChars && b != '\b')); }
}

See also: C# unit tests.

Qwertie
1

Amazingly, org.json now lets you put a byte[] object directly into a JSON object, and it remains readable. You can even send the resulting object over a WebSocket and it will be readable on the other side. I am not yet sure whether the resulting object is bigger or smaller than it would be if you converted the byte array to Base64; it would certainly be neat if it were smaller.

It seems to be incredibly hard to measure how much space such a JSON object takes up in Java. If your JSON consists merely of strings, you can simply stringify it, but with a byte array inside I fear it is not as straightforward.

Stringifying the JSON in Java replaces my byte array with a 10-character string that looks like an id. Doing the same in Node.js replaces the byte[] with an unquoted value reading <Buffered Array: f0 ff ff ...>, whose length indicates a size increase of ~300%, as would be expected.
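One possible explanation for the short id-like string, in line with Savior's comments below, is that it is the array's default `Object.toString()` output rather than its contents. The sketch below uses only the JDK (it does not call org.json) to show that default form next to a JSON-style number array and a Base64 string; the class and method names are hypothetical, and the number array is built by hand for illustration.

```java
import java.util.Arrays;
import java.util.Base64;

public class ByteArrayStringForms {

    // Arrays do not override toString(), so this yields the type tag "[B",
    // an '@', and a hex identity hash code, not the array contents.
    static String defaultToString(byte[] bytes) {
        return bytes.toString();
    }

    // JSON-style array of decimal numbers, e.g. [1,2,3].
    static String asNumberArray(byte[] bytes) {
        return Arrays.toString(bytes).replace(" ", "");
    }

    static String asBase64(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    // Helper producing every possible byte value once.
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    public static void main(String[] args) {
        byte[] small = {1, 2, 3};
        System.out.println(defaultToString(small)); // starts with "[B@"
        System.out.println(asNumberArray(small));   // [1,2,3]

        byte[] all = allByteValues();
        // The number-array form is considerably longer than Base64 here.
        System.out.println(asNumberArray(all).length());
        System.out.println(asBase64(all).length()); // 344
    }
}
```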

quealegriamasalegre
  • So what if the byte array just contains integer values or string values? – kuldeep May 19 '20 at 08:32
  • I mean, if your byte array just contains strings or integers, I guess it would be easier to send the strings and integers in the JSON directly. – quealegriamasalegre May 19 '20 at 17:42
  • Hmm, so suppose I get some byte data back from, let's say, a machine that only does sequential memory access, with each address starting at some offset and each index holding a variable number of bits. Do you mean this data can be dumped directly into JSON, with some structural definition representing a sequential array? E.g. at address 0x122 there are 2 bits of data, and then starting immediately at address 0x125 there are 4 bits of data. – kuldeep May 19 '20 at 19:48
  • Whatever your data format, you could simply transform it to a byte array, then use json.put("some name", bytearray) and send it; with org.json you can later read it back as byte[] readarray=(byte[])json.get("some name") – quealegriamasalegre May 20 '20 at 03:27
  • This isn't particularly new and is not specific to `byte[]`. org.json treats array types the same: as JSON arrays and serializes their elements' values. A `byte` is treated as a number so the resulting JSON is just a JSON array of numbers. – Savior Aug 14 '20 at 14:39
  • @Savior Do you have an answer to the doubts I expressed in my question? It certainly never looked like an array of numbers to me. Do you know what the size implications of putting a byte array into JSON with org.json are, compared to Base64? And do you know why org.json replaces the byte array with a weird id when you stringify a JSON object that contains a byte array? – quealegriamasalegre Aug 14 '20 at 17:15
  • Say you did `obj.put("wtv", new byte[] { 1, 2, 3 });` where `obj` is an `org.json.JSONObject`, and then called `toString` on it, you'd get `{"wtv":[1,2,3]}`. It would be the same result if you used `int[]` or `long[]`, so it's not unique to `byte[]`. Since it's serializing individual byte values as their numerical (decimal) representation, it's gonna take considerably more space the longer the array is. – Savior Aug 14 '20 at 22:06
  • I'm not sure what you mean in your last question. Are you referring to array types' `toString` result? – Savior Aug 14 '20 at 22:06
  • @Savior I actually passed the byte array of a photo to my JSON, and when using toString() what I get back is a 10-letter string code. – quealegriamasalegre Aug 15 '20 at 05:46
  • I don't think org.json behaves the same as in your example when dealing with the much larger byte array of an audio or image file. It can't even do what you suggested, as such a byte array would in all likelihood contain special characters that conflict with the JSON format. I invite you to test it so you see what I mean. The interesting question here is whether it is larger or smaller size-wise than Base64. – quealegriamasalegre Aug 15 '20 at 06:17
  • Array types don't override `toString`, so you get the implementation from their `Object` superclass, see [here](https://stackoverflow.com/questions/409784/whats-the-simplest-way-to-print-a-java-array). The content of the byte array doesn't matter: whether you're serializing as Base64 or as a number array, both of those are fully representable in JSON. The number of bytes doesn't matter either. I don't see why you think so. The JSON number array option will quickly become larger in size than Base64 as the byte array grows. – Savior Aug 22 '20 at 18:32
  • @Savior Have you actually tried it out? I feel like we are not talking about the same thing. What you are saying is simply not true: a byte array can very much contain the byte representation of a bracket or apostrophe that will break a JSON document. That is the entire reason why Base64 exists: so you get a string representation of your byte arrays (be they pictures, audio files, or whatever) that does not contain colons, brackets, and other special characters used in the JSON format. – quealegriamasalegre Aug 22 '20 at 19:03
  • Please just try it out with an image. Take the image, encode it as a byte array, use org.json to put it in a JSONObject, apply toString() to that JSONObject, and explain to me what comes out in the place where your byte array should be. I just want to understand what org.json is doing there. I simply don't see the same result as what you posted 5 comments ago in your example with the numbers. Please indulge me. – quealegriamasalegre Aug 22 '20 at 19:22
-7

What about simply this:

byte[] args2 = getByteArry();
String byteStr = new String(args2);
Ash
    The String(byte[]) constructor decodes the bytes into a string using the platform default charset. This may alter the original byte contents. Moreover, the receiving side needs to know which charset was used in order to get the bytes back. – fcracker79 Aug 21 '15 at 07:15
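The data loss described in this comment can be reproduced with a short sketch. It passes UTF-8 explicitly to stand in for a typical default charset (an assumption, since the actual default depends on the platform and JDK version) and contrasts it with ISO-8859-1; the class name and sample bytes are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class DefaultCharsetPitfall {

    // Decode the bytes to a String and encode them straight back.
    static byte[] roundTrip(byte[] bytes, Charset charset) {
        return new String(bytes, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // 0xFF and 0xFE can never appear in well-formed UTF-8.
        byte[] original = {(byte) 0xFF, (byte) 0xFE, 0x41};

        // With UTF-8 the invalid bytes are replaced during decoding,
        // so the original content cannot be recovered.
        byte[] viaUtf8 = roundTrip(original, StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(original, viaUtf8)); // false

        // ISO-8859-1 maps every byte value to a char, so nothing is lost.
        byte[] viaLatin1 = roundTrip(original, StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(original, viaLatin1)); // true
    }
}
```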