Using ByteBuffer, I can convert a string into a byte array:

import java.nio.ByteBuffer

val x = ByteBuffer.allocate(10).put("Hello".getBytes()).array()
> Array[Byte] = Array(72, 101, 108, 108, 111, 0, 0, 0, 0, 0)

To convert the byte array back into a string, I can use new String(x). However, the trailing zero bytes are decoded too, so the string becomes Hello?????. I need to trim the byte array before converting it into a string. How can I do that?

I use this code to trim the zeros, but I wonder if there is a simpler way.

def byteArrayToString(x: Array[Byte]) = {
    val loc = x.indexOf(0)
    if (-1 == loc)
      new String(x)
    else if (0 == loc)
      ""
    else
      new String(x.slice(0,loc))
}
prosseek

4 Answers

Assuming that zero bytes (0: Byte) appear only as trailing padding, then

implicit class RichToString(val x: java.nio.ByteBuffer) extends AnyVal {
  def byteArrayToString() = new String( x.array.takeWhile(_ != 0), "UTF-8" )
}

Hence for

val x = ByteBuffer.allocate(10).put("Hello".getBytes())

x.byteArrayToString()
res: String = Hello
elm

Several of the String constructors accept an offset and a length into a byte[], which eliminates the need to create a new trimmed array beforehand.

Using one of the overloaded constructors might look like:

def byteArrayToString(x: Array[Byte]) = {
    val loc = x.indexOf(0)
    if (-1 == loc)
      new String(x, "UTF-8") // same encoding as the trimmed branch below
    else if (0 == loc)
      ""
    else
      new String(x, 0, loc, "UTF-8") // or appropriate encoding
}

Or, a slight variation keeping the indexOf:

def byteArrayToString(arr: Array[Byte]) = {
    val loc = arr.indexOf(0)
    // length passed to constructor can be 0..arr.length
    new String(arr, 0, if (loc >= 0) loc else arr.length, "UTF-8")
}

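Or, as a single expression (thanks to Option):

def byteArrayToString(arr: Array[Byte]) = {
    // indexWhere returns -1 when there is no zero byte; fall back to the full length
    new String(arr, 0, Option(arr.indexWhere(_ == 0)).filter(_ >= 0).getOrElse(arr.length), "UTF-8")
}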

Thoughts on the encoding:

  1. Using an explicit encoding is often recommended, and it should be the same encoding that was passed to getBytes, since the platform default may differ between environments. The standard charset names are listed in the java.nio.charset.Charset documentation.

  2. The byte 0 may appear in the encoded output before the end of the data, depending on the input String (e.g. one that contains NUL) and on the encoding used.
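
To make the second point concrete, here is a small sketch (the names NulByteDemo and decodeUntilNul are made up for illustration). It assumes the same "stop at the first zero byte" strategy as the code above and shows where that strategy loses data: with UTF-16BE, or with an input that contains NUL.

```scala
import java.nio.charset.{Charset, StandardCharsets}

object NulByteDemo {
  // Hypothetical helper: decode everything before the first zero byte,
  // or the whole array if it contains no zero byte.
  def decodeUntilNul(bytes: Array[Byte], charset: Charset): String = {
    val loc = bytes.indexWhere(_ == 0)
    new String(bytes, 0, if (loc >= 0) loc else bytes.length, charset)
  }

  def main(args: Array[String]): Unit = {
    val utf8  = "Hello".getBytes(StandardCharsets.UTF_8)
    val utf16 = "Hello".getBytes(StandardCharsets.UTF_16BE)

    println(utf8.mkString(", "))  // 72, 101, 108, 108, 111
    println(utf16.mkString(", ")) // 0, 72, 0, 101, 0, 108, 0, 108, 0, 111

    // UTF-8 output for this input has no zero bytes, so trimming is safe.
    println(decodeUntilNul(utf8, StandardCharsets.UTF_8)) // Hello

    // UTF-16BE output starts with a zero byte, so everything is cut off.
    println(decodeUntilNul(utf16, StandardCharsets.UTF_16BE).isEmpty) // true

    // An embedded NUL in the input truncates the result even with UTF-8.
    val withNul = "a\u0000b".getBytes(StandardCharsets.UTF_8)
    println(decodeUntilNul(withNul, StandardCharsets.UTF_8)) // a
  }
}
```

If either case can occur in your data, tracking the written length explicitly is safer than searching for a zero byte.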

user2864740

If you have just one String, I would use .getBytes() directly:

val x:Array[Byte] = "Hello".getBytes("UTF-8");

Output is

x: Array[Byte] = Array(72, 101, 108, 108, 111)

For more than one String, I would use a ByteArrayOutputStream, like so:

val baos = new java.io.ByteArrayOutputStream(10); //  <-- I might not use 10.
                                                  //  <-- Smells of premature opt.
baos.write("Hello".getBytes("UTF-8"));
baos.write(", World!".getBytes("UTF-8"));

val x:Array[Byte] = baos.toByteArray(); // <-- x:Array[Byte], to specify the type.

Output is

x: Array[Byte] = Array(72, 101, 108, 108, 111, 44, 32, 87, 111, 114, 108, 100, 33)
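
Because toByteArray returns exactly the bytes that were written, the result has no padding to trim when it is decoded back. A minimal sketch of the round trip follows; the names ConcatDemo, concatBytes and decode are illustrative only, and UTF-8 is assumed on both sides.

```scala
import java.io.ByteArrayOutputStream
import java.nio.charset.StandardCharsets

object ConcatDemo {
  // Hypothetical helper: encode each part as UTF-8 and append it to one stream.
  def concatBytes(parts: String*): Array[Byte] = {
    val baos = new ByteArrayOutputStream()
    parts.foreach(p => baos.write(p.getBytes(StandardCharsets.UTF_8)))
    baos.toByteArray // sized to the bytes written, so no trailing zeros
  }

  // Decode with the same charset that was used for encoding.
  def decode(bytes: Array[Byte]): String =
    new String(bytes, StandardCharsets.UTF_8)

  def main(args: Array[String]): Unit = {
    val bytes = concatBytes("Hello", ", World!")
    println(bytes.length)  // 13
    println(decode(bytes)) // Hello, World!
  }
}
```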
Elliott Frisch

You could do it like so:

val bb = java.nio.ByteBuffer.allocate(10).put("Hello".getBytes)
val s = new String(bb.array, 0, bb.position)

though this won't indicate in the ByteBuffer that you've read anything. The normal pattern would be to flip and use limit, but if you're just grabbing the array you may as well just use position instead and clear when you're done before reading more.
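
For comparison, the flip-and-limit pattern mentioned above might look roughly like this. It is a sketch, not the only way to do it: the names FlipDemo and readWritten are invented here, and UTF-8 is assumed.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

object FlipDemo {
  // Hypothetical helper: read back everything written to the buffer so far.
  def readWritten(bb: ByteBuffer): String = {
    bb.flip()                                 // limit = old position, position = 0
    val out = new Array[Byte](bb.remaining()) // remaining = limit - position
    bb.get(out)                               // consumes the bytes; position moves to limit
    bb.clear()                                // position = 0, limit = capacity, ready for writes
    new String(out, StandardCharsets.UTF_8)
  }

  def main(args: Array[String]): Unit = {
    val bb = ByteBuffer.allocate(10).put("Hello".getBytes(StandardCharsets.UTF_8))
    println(readWritten(bb)) // Hello
  }
}
```

Unlike reading bb.array directly, this copies the bytes out, but it leaves the buffer's position and limit in a consistent state and also works for buffers that have no accessible backing array.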

Rex Kerr