I would treat this as a serialization problem and implement it as follows (complete, working Java code):
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class Serialization {
    // Encodes each string as a 4-byte big-endian length followed by its UTF-8 bytes.
    public static byte[] serialize(String[] strs) {
        List<Byte> byteList = new ArrayList<>();
        for (String str : strs) {
            // Encode once with an explicit charset so the length and the payload always agree.
            byte[] strArray = str.getBytes(StandardCharsets.UTF_8);
            byte[] lenArray = ByteBuffer.allocate(4).putInt(strArray.length).array();
            for (byte b : lenArray) {
                byteList.add(b);
            }
            for (byte b : strArray) {
                byteList.add(b);
            }
        }
        byte[] result = new byte[byteList.size()];
        for (int i = 0; i < byteList.size(); i++) {
            result[i] = byteList.get(i);
        }
        return result;
    }

    // Reverses serialize(): reads a length prefix, then that many bytes, until the input is exhausted.
    public static String[] unserialize(byte[] bytes) {
        List<String> strList = new ArrayList<>();
        for (int i = 0; i < bytes.length; ) {
            int len = ByteBuffer.wrap(bytes, i, 4).getInt();
            strList.add(new String(bytes, i + 4, len, StandardCharsets.UTF_8));
            i += 4 + len;
        }
        return strList.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] input = {"This is", "a serialization problem;", "string concatenation will do as well", "in some cases."};
        byte[] byteArray = serialize(input);
        String[] output = unserialize(byteArray);
        for (String str : output) {
            System.out.println(str);
        }
    }
}
The idea is that the resulting byte array stores the length of the first string (always 4 bytes, since it is written as an int), followed by the bytes of the first string (whose count can later be read from the preceding 4 bytes), then the length and bytes of the second string, and so on. This way the string array can be recovered easily from the byte array, as the code above demonstrates. Because no character sequence is reserved as a delimiter, this approach works whatever the strings contain, including empty strings.
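To make the layout concrete, here is a minimal sketch that frames a tiny input and prints the raw bytes. The class name LayoutDemo and the helper frame are hypothetical names introduced only for this illustration, and UTF-8 is an assumed encoding; the framing itself is the same length-then-bytes scheme described above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LayoutDemo {
    // Hypothetical helper: frames each string as [4-byte big-endian length][UTF-8 bytes].
    static byte[] frame(String[] strs) {
        int total = 0;
        byte[][] encoded = new byte[strs.length][];
        for (int i = 0; i < strs.length; i++) {
            encoded[i] = strs[i].getBytes(StandardCharsets.UTF_8);
            total += 4 + encoded[i].length;
        }
        ByteBuffer out = ByteBuffer.allocate(total);
        for (byte[] e : encoded) {
            out.putInt(e.length).put(e); // length prefix first, then the payload
        }
        return out.array();
    }

    public static void main(String[] args) {
        // In UTF-8, "ab" is the bytes 97, 98 and "c" is the byte 99.
        System.out.println(Arrays.toString(frame(new String[] {"ab", "c"})));
        // prints [0, 0, 0, 2, 97, 98, 0, 0, 0, 1, 99]
    }
}
```

The first four bytes (0, 0, 0, 2) say that two payload bytes follow, and the next four (0, 0, 0, 1) announce the single byte of the second string.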
The code can be much simpler if we make an assumption about the input string array:
import java.nio.charset.StandardCharsets;

public class Concatenation {
    public static byte[] concatenate(String[] strs) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < strs.length; i++) {
            sb.append(strs[i]);
            if (i != strs.length - 1) {
                sb.append("*.*"); // concatenate by this splitter
            }
        }
        // Use an explicit charset so encoding and decoding agree on every platform.
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static String[] split(byte[] bytes) {
        String entire = new String(bytes, StandardCharsets.UTF_8);
        // The limit of -1 keeps trailing empty strings, which split() drops by default.
        return entire.split("\\*\\.\\*", -1);
    }

    public static void main(String[] args) {
        String[] input = {"This is", "a serialization problem;", "string concatenation will do as well", "in some cases."};
        byte[] byteArray = concatenate(input);
        String[] output = split(byteArray);
        for (String str : output) {
            System.out.println(str);
        }
    }
}
The assumption is that *.* does not appear in any string in the input array. In other words, if you know in advance that some special sequence of symbols won't appear in any input string, you can use that sequence as the splitter.
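To see why the assumption matters, the following sketch shows what happens when it is violated. The class name SplitterPitfall is hypothetical, and the join and split logic is a compact stand-in for the concatenation approach above (using String.join and Pattern.quote, with UTF-8 as an assumed encoding), not the exact code from this answer.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

class SplitterPitfall {
    static final String SPLITTER = "*.*";

    static byte[] concatenate(String[] strs) {
        return String.join(SPLITTER, strs).getBytes(StandardCharsets.UTF_8);
    }

    static String[] split(byte[] bytes) {
        String entire = new String(bytes, StandardCharsets.UTF_8);
        // Pattern.quote treats the splitter literally instead of as a regex.
        return entire.split(Pattern.quote(SPLITTER), -1);
    }

    public static void main(String[] args) {
        // The first element contains the splitter, which breaks the assumption.
        String[] input = {"a*.*b", "c"};
        String[] output = split(concatenate(input));
        System.out.println(input.length + " -> " + output.length);
        // prints 2 -> 3
    }
}
```

Two strings go in and three come out, because the splitter inside the first string is indistinguishable from a real boundary. When such input is possible, the length-prefixed serialization is the safer choice.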