3

I'm new to Java and I'm not sure how to do the following:

A Scala application somewhere converts a String into bytes:

ByteBuffer.wrap(str.getBytes)

I receive this byte array as a Java String, and I want to invert what the Scala code above did, that is, recover the original String (the object str above).

Getting the ByteBuffer as a String to begin with is the only option I have, as I'm reading it from an AWS Kinesis stream (or is it?). The Scala code shouldn't change either.

Example string:

String str = "AAAAAAAAAAGZ7dFR0XmV23BRuufU+eCekJe6TGGUBBu5WSLIse4ERy9............";

How can this be achieved in Java?

EDIT

Okay, so I'll try to elaborate a little more about the process:

  1. A 3rd party Scala application produces CSV rows which I need to consume
  2. Before storing those rows in an AWS Kinesis stream, the application does the following to each row:

    ByteBuffer.wrap(output.getBytes);
    
  3. I read the data from the stream as a string, and the string could look like the following one:

    String str = "AAAAAAAAAAGZ7dFR0XmV23BRuufU+eCekJe6TGGUBBu5WSLIse4ERy9............";
    
  4. I need to restore the contents of the string above to its original, readable form.

I hope I've made it clearer now, sorry for puzzling you all to begin with.

Yuval Herziger
  • Won't something like `str.toCharArray.map(_.toByte)` work in Java? – 4lex1v Sep 16 '14 at 12:21
  • Possibly, but I'm not sure I understand what you did there with the `map(_.toByte)` part – Yuval Herziger Sep 16 '14 at 12:34
  • "I read the data from the stream as a string" -- how? Do you just pass the byte array into a `String` constructor, or do you use some kind of encoding like base64? – Mike Strobel Sep 16 '14 at 13:12
  • I googled aws kinesis and it seems like they base64 encode the records. Updated my answer. – aioobe Sep 16 '14 at 13:25
  • Another lesson learnt from this: `GetShardIteratorResult.getShardIterator()` returns only a string, while `GetRecordsRequest getRecordsRequest = new GetRecordsRequest();` along with `getRecords(getRecordsRequest);` gets the desired ByteBuffer type. – Yuval Herziger Sep 17 '14 at 08:32

5 Answers

3

If you want to go from byte[] to String, try new String(yourBytes).

Both getBytes and the String(byte[]) constructor use the platform's default character encoding.
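
As a minimal sketch of that round trip, assuming the producer's default charset is UTF-8 (the Scala call `str.getBytes` does not name one) and that the consumer already holds the `ByteBuffer`. The class name `ByteBufferRoundTrip` and the sample row are made up for illustration. Naming the charset on both sides means the result does not depend on either machine's default:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ByteBufferRoundTrip {
    // Copies the remaining bytes out of the buffer and decodes them as UTF-8.
    // Assumption: the producer encoded the text as UTF-8.
    static String restore(ByteBuffer buffer) {
        byte[] bytes = new byte[buffer.remaining()];
        // duplicate() shares the content but has its own position,
        // so the caller's buffer is left untouched.
        buffer.duplicate().get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "id,name,1,Alice";
        // Mirrors the Scala side: ByteBuffer.wrap(str.getBytes)
        ByteBuffer wrapped = ByteBuffer.wrap(original.getBytes(StandardCharsets.UTF_8));
        System.out.println(restore(wrapped)); // prints id,name,1,Alice
    }
}
```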


From Amazon Kinesis Service API Reference:

The data blob to put into the record, which is Base64-encoded when the blob is serialized.

You need to base64 decode the string. Using Java 8 it would look like:

import java.nio.charset.StandardCharsets;
import java.util.Base64;
byte[] bytes = Base64.getDecoder().decode("AAAAAAAAAAGZ7dFR0XmV23BR........");
String str = new String(bytes, StandardCharsets.UTF_8);

Other options: Base64 Encoding in Java
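
A self-contained sketch of the same decoding, assuming the row was UTF-8 encoded before it was Base64-serialized. The class name `Base64Restore` and the sample row are made up for illustration; the example encodes the row itself so that it has a complete Base64 string to decode:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Restore {
    // Decodes a Base64 string and interprets the resulting bytes as UTF-8 text.
    // Assumption: the original text was encoded as UTF-8.
    static String restore(String base64) {
        byte[] bytes = Base64.getDecoder().decode(base64);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String row = "id,name,1,Alice";
        // Simulate the serialized form of the record.
        String encoded = Base64.getEncoder()
                .encodeToString(row.getBytes(StandardCharsets.UTF_8));
        System.out.println(restore(encoded)); // prints id,name,1,Alice
    }
}
```

`Base64.getDecoder().decode` throws `IllegalArgumentException` if the input is not valid Base64, so a truncated or otherwise altered string fails fast rather than producing garbage.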

aioobe
  • I actually tried that, this could have been simpler if my input wasn't a String that is bytes (see above). What you suggest simply treats the string as if it was the content of it in its readable form. – Yuval Herziger Sep 16 '14 at 12:37
  • Could you elaborate. I thought you basically wanted the inverse of getBytes? – aioobe Sep 16 '14 at 12:39
  • Sure: I have a string that looks like this: `String str = "AAAAAAAAAAGZ7dFR0XmV23BR"`. I know that it's been converted to bytes, but I get it as a String type. I want to know what's behind those bytes in a readable form. – Yuval Herziger Sep 16 '14 at 12:42
  • But how was `"AAAAAAAAAAGZ7dFR0XmV23BR"` produced? You mention getBytes, but that gives byte[] not String. – aioobe Sep 16 '14 at 12:44
  • My apologies, my question was disorganized. What happens is this: 1. A Scala application takes a String, calls `getBytes()`, and wraps the result in a `ByteBuffer`. ---> 2. I read this ByteBuffer as a String. ---> 3. I'd like to know the original content of the string. In one line of code, this is what's done with the original string: `ByteBuffer.wrap(output.getBytes)` – Yuval Herziger Sep 16 '14 at 12:48
  • I'm having a hard time following. Any chance you could amend your question with a step-by-step example that shows the string/byte data at each stage, and what result you are ultimately expecting? – Mike Strobel Sep 16 '14 at 13:01
  • Yes, you're both right, I'll edit the question and will elaborate the steps. – Yuval Herziger Sep 16 '14 at 13:04
  • @aioobe, that worked perfectly. I had two problems: (1) unfamiliarity with Java; (2) unfamiliarity with the AWS Java SDK. I combined your solution with a change to the AWS method I use to fetch the data, so that instead of a String representation of the data I retrieve the original ByteBuffer representation. Thanks again for the help – Yuval Herziger Sep 17 '14 at 08:32
1

I'm not sure I understand the question exactly, but do you mean this?

String decoded = new String(bytes);
aioobe
Katerina A.
0
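import java.nio.charset.StandardCharsets;

public class DecodeBytes {
    public static void main(String[] args) {
        // Placeholder input; in practice these are the bytes read from the stream.
        byte[] bytesData = "example".getBytes(StandardCharsets.UTF_8);
        // Decode with an explicit charset rather than the platform default.
        String actualString = new String(bytesData, StandardCharsets.UTF_8);
        System.out.println("String is " + actualString);
    }
}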
Matthieu
  • What you suggest simply treats the string as if it were the content of it in its readable form. The original String looks like this: `String str = "AAAAAAAAAAGZ7dFR0XmV23BR........"` – Yuval Herziger Sep 16 '14 at 12:40
0

Sorry, wrong answer. Again, ByteBuffer is a Java class, so it may work the same way in both languages. You need the Java version.

From kafka ApiUtils:

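// Requires: import java.nio.ByteBuffer
def writeShortString(buffer: ByteBuffer, string: String): Unit = {
  if (string == null) {
    // A length of -1 marks a null string.
    buffer.putShort(-1)
  } else {
    val encodedString = string.getBytes("utf-8")
    if (encodedString.length > Short.MaxValue) {
      // Stand-in exception type and message; the answer left both as placeholders.
      throw new IllegalArgumentException("String exceeds the maximum size of " + Short.MaxValue + ".")
    } else {
      // Write the length as a short, followed by the encoded bytes.
      buffer.putShort(encodedString.length.asInstanceOf[Short])
      buffer.put(encodedString)
    }
  }
}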

Lincoln
0

For Kinesis data blobs:

private CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder();
decoder.decode(record.getData()).toString();
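
A slightly fuller sketch of this approach, assuming the blob holds UTF-8 text. The class `BlobDecoder` is hypothetical, and a wrapped byte array stands in for `record.getData()`. `CharsetDecoder.decode` throws the checked `CharacterCodingException` on malformed input, which the sketch rethrows as an unchecked exception:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class BlobDecoder {
    // Strictly decodes the buffer's remaining bytes as UTF-8.
    // Assumption: the producer wrote UTF-8 text into the blob.
    static String decode(ByteBuffer data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            // asReadOnlyBuffer() gives an independent position,
            // so the caller's buffer is not consumed.
            return decoder.decode(data.asReadOnlyBuffer()).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("Blob is not valid UTF-8", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for record.getData().
        ByteBuffer data = ByteBuffer.wrap("id,name,1,Alice".getBytes(StandardCharsets.UTF_8));
        System.out.println(decode(data)); // prints id,name,1,Alice
    }
}
```

A new decoder is created per call because `CharsetDecoder` instances are not safe for concurrent use; a single-threaded consumer could reuse one instead.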
binshi