(I don't have enough reputation to comment on individual answers, but I have read through them all, so here is a short review of why they shouldn't be used.)
user2085092's answer is not reliable at all.
Muzammal Abdul Ghafoor's answer / Miloš Černilovský's answer / Jeffrey Blattman's answer / Rizki Sunaryo's answer / dazed's answer / Travis' answer / Mark Buikema's answer / enl8enmentnow's answer / neeraj t's answer do not account for the byte size of non-ASCII characters. They could still be used, but the chunk length would have to be lowered to roughly 1000 characters to stay safe with non-ASCII messages.
(Miloš Černilovský's answer is actually quite thoughtful, with nice tag-length determination logic.)
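To see why character-based chunking breaks down for non-ASCII text, here is a small sketch (the class name is mine; the 4000-character chunk size is the one used in the answers above):

```java
import java.nio.charset.StandardCharsets;

public class ByteVsCharDemo {
    public static void main(String[] args) {
        // 4000 ASCII characters encode to exactly 4000 UTF-8 bytes...
        String ascii = "a".repeat(4000);
        // ...but 4000 CJK characters encode to 12000 UTF-8 bytes,
        // roughly three times over logcat's per-entry payload limit
        String cjk = "\u4e16".repeat(4000); // U+4E16 (世) is 3 bytes in UTF-8

        System.out.println(ascii.getBytes(StandardCharsets.UTF_8).length); // 4000
        System.out.println(cjk.getBytes(StandardCharsets.UTF_8).length);  // 12000
    }
}
```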
Homayoon Ahmadi's answer works with non-ASCII text and is much better than the rest, but I dislike that its trim function's truncated-character detection creates extra String objects; I believe there is a better way of doing it.
My solution (chunkUtf8StringBySize):
    static void chunkUtf8StringBySize(
            String string,
            int chunkSizeInBytes,
            Consumer<String> chunkedStringConsumer
    ) {
        // requires: import java.nio.charset.StandardCharsets;
        // the charset is passed explicitly because the no-argument getBytes() /
        // new String(byte[]) overloads use the platform default charset, which
        // is UTF-8 on Android but not guaranteed to be on every JVM
        chunkUtf8BytesBySize(string.getBytes(StandardCharsets.UTF_8), 0, chunkSizeInBytes, byteArray -> {
            chunkedStringConsumer.accept(new String(byteArray, StandardCharsets.UTF_8));
        });
    }
    static void chunkUtf8BytesBySize(
            byte[] utf8bytes,
            int startingIndex,
            int chunkSizeInBytes,
            ByteArrayConsumer chunkedUtf8BytesConsumer
    ) {
        assert startingIndex >= 0 : "`startingIndex` must be at least 0!";
        assert chunkSizeInBytes >= 4 : "`chunkSizeInBytes` must be at least 4 bytes!";
        if (startingIndex == 0 && utf8bytes.length <= chunkSizeInBytes) {
            // the whole array fits in one chunk; pass it through without copying
            chunkedUtf8BytesConsumer.accept(utf8bytes);
            return;
        }
        int i = startingIndex;
        while (i < utf8bytes.length) {
            int end = i + chunkSizeInBytes; // exclusive end of this chunk
            if (end >= utf8bytes.length) {
                end = utf8bytes.length;
            } else {
                // UTF-8 continuation bytes have the form 0b10xxxxxx
                // (-128..-65 as signed bytes); if `end` lands on one, the chunk
                // would cut a code point in half, so back up to the start of
                // that code point (at most 3 steps, hence chunkSizeInBytes >= 4)
                while (-128 <= utf8bytes[end] && utf8bytes[end] <= -65) {
                    end--;
                }
            }
            byte[] ba = new byte[end - i];
            System.arraycopy(utf8bytes, i, ba, 0, end - i);
            chunkedUtf8BytesConsumer.accept(ba);
            i = end;
        }
    }
    interface ByteArrayConsumer {
        void accept(byte[] byteArray);
    }
    // same shape as java.util.function.Consumer; redeclared here so the code
    // also works on Android API levels that predate the java.util.function package
    interface Consumer<T> {
        void accept(T t);
    }
Usage example:
    class LongLog {
        public static void i(String tag, String msg) {
            chunkUtf8StringBySize(msg, 4000, s -> Log.i(tag, s));
        }
        public static void d(String tag, String msg) {
            chunkUtf8StringBySize(msg, 4000, s -> Log.d(tag, s));
        }
        public static void e(String tag, String msg) {
            chunkUtf8StringBySize(msg, 4000, s -> Log.e(tag, s));
        }
        /* add more logging functions if needed */
    }
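A quick way to sanity-check this kind of splitter: collect the chunks, verify each stays within the byte budget, and verify they reassemble into the original string. The snippet below is a compact, self-contained variant of a UTF-8-boundary splitter (the class name, the use of java.util.function.Consumer, and the bit test `(b & 0xC0) == 0x80` -- equivalent to the signed-range check on continuation bytes -- are mine, not the answer's exact code):

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ChunkRoundTripTest {
    static void chunkUtf8StringBySize(String s, int chunkSizeInBytes, Consumer<String> out) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        int i = 0;
        while (i < bytes.length) {
            int end = Math.min(i + chunkSizeInBytes, bytes.length);
            // back up while `end` lands on a continuation byte (0b10xxxxxx),
            // i.e. in the middle of a multi-byte code point
            while (end < bytes.length && (bytes[end] & 0xC0) == 0x80) {
                end--;
            }
            out.accept(new String(bytes, i, end - i, StandardCharsets.UTF_8));
            i = end;
        }
    }

    public static void main(String[] args) {
        String original = "héllo wörld ".repeat(500) + "世界".repeat(300);
        List<String> chunks = new ArrayList<>();
        chunkUtf8StringBySize(original, 4000, chunks::add);

        StringBuilder rebuilt = new StringBuilder();
        boolean allWithinBudget = true;
        for (String chunk : chunks) {
            // every chunk must respect the byte budget...
            if (chunk.getBytes(StandardCharsets.UTF_8).length > 4000) {
                allWithinBudget = false;
            }
            rebuilt.append(chunk);
        }
        System.out.println(allWithinBudget);                  // true
        // ...and concatenating the chunks must restore the original string
        System.out.println(original.contentEquals(rebuilt));  // true
    }
}
```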
Explanation:
My solution is very similar to Homayoon Ahmadi's, except it works on the UTF-8 bytes directly instead of creating and comparing new String instances.
Also, unlike some other answers, my solution takes consumer functions. This makes it more reusable and flexible, and it avoids allocating intermediate arrays or lists.
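To illustrate the consumer-based design (with a toy ASCII splitter standing in for the real one; the class and method names here are mine): the same splitting routine can feed logcat, a list, or a plain counter, without ever materializing an array of chunks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ConsumerSinksDemo {
    // stand-in for the splitter above: emits fixed-size pieces of an ASCII string
    static void chunkAscii(String s, int size, Consumer<String> out) {
        for (int i = 0; i < s.length(); i += size) {
            out.accept(s.substring(i, Math.min(i + size, s.length())));
        }
    }

    public static void main(String[] args) {
        String msg = "x".repeat(10);

        // sink 1: collect into a list (handy in tests)
        List<String> collected = new ArrayList<>();
        chunkAscii(msg, 4, collected::add);
        System.out.println(collected); // [xxxx, xxxx, xx]

        // sink 2: just count the chunks -- no list is ever built
        int[] count = {0};
        chunkAscii(msg, 4, s -> count[0]++);
        System.out.println(count[0]); // 3
    }
}
```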