I'd like to get the bytes of a very large string: around a million lines, each 1400 characters long.

The large String is successfully generated using a StringBuilder. But when I call getBytes on it, I get this exception:

java.lang.NegativeArraySizeException: -1574300626
    at java.base/java.lang.StringCoding.encodeUTF8_UTF16(StringCoding.java:923) ~[na:na]
    at java.base/java.lang.StringCoding.encodeUTF8(StringCoding.java:898) ~[na:na]
    at java.base/java.lang.StringCoding.encode(StringCoding.java:449) ~[na:na]
    at java.base/java.lang.String.getBytes(String.java:964) ~[na:na]
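
The negative size in the trace is consistent with a 32-bit integer overflow while sizing the output array. The following is a minimal sketch, assuming (the trace itself does not confirm this) that the UTF-16 encoding path reserves three bytes per char; `OverflowDemo`, `worstCaseUtf8Size` and `worstCaseUtf8SizeLong` are hypothetical names used only for illustration:

```java
class OverflowDemo {
    // Hypothetical helper: worst-case UTF-8 size in int arithmetic,
    // assuming 3 bytes per char (an assumption about the JDK internals).
    static int worstCaseUtf8Size(int charCount) {
        return charCount * 3; // silently wraps once the product exceeds Integer.MAX_VALUE
    }

    // Same computation in long arithmetic, which does not wrap for any int input.
    static long worstCaseUtf8SizeLong(int charCount) {
        return charCount * 3L;
    }

    public static void main(String[] args) {
        // Char count chosen so that the wrapped product matches the value in the trace.
        int charCount = 906_888_890;
        System.out.println(worstCaseUtf8Size(charCount));     // prints -1574300626
        System.out.println(worstCaseUtf8SizeLong(charCount)); // prints 2720666670
        System.out.println(Integer.MAX_VALUE);                // prints 2147483647
    }
}
```

Under that assumption, the requested array would need about 2.7 billion elements, which is more than an `int` length can express, so the wrapped value comes out negative.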

The code I'm using:

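import java.util.Arrays;

byte[] getByteArray() {
    int nbLine = 1_000_000;
    StringBuilder sb = new StringBuilder();
    // One line of 1400 'A' characters.
    char[] fakeArray = new char[1400];
    Arrays.fill(fakeArray, 'A');
    // String.valueOf(char[]) copies the characters; fakeArray.toString()
    // would only return the array's identity string (e.g. "[C@1b6d3586").
    String oneLine = String.valueOf(fakeArray) + "\n";
    for (int i = 0; i < nbLine; i++) {
        sb.append(oneLine);
    }
    String contentString = sb.toString();

    // This is the call that throws NegativeArraySizeException.
    return contentString.getBytes();
}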

Is there a way to make getBytes work with a very large string? Is there any alternative?

soung
  • The maximum size of an array is `Integer.MAX_VALUE`. What are you really trying to do - why do you need a byte array from a string? – g00se Apr 25 '22 at 17:45
  • Any effects might be mitigated by specifying an economical encoding to `getBytes` – g00se Apr 25 '22 at 17:53
  • 1M lines, 1400 chars per line, 2 bytes per char = 2.8 billion bytes. `Integer.MAX_VALUE` is about 2 billion. So if you split your array into two arrays, you will probably get it to work. But that amount of data should be kept in a database, and all manipulations should be done in the database. – Vladimir.V.Bvn Apr 25 '22 at 17:56
  • I'm trying to generate a zip file for testing purposes (like in this link https://stackoverflow.com/a/1091817/3143009). I don't want to store the file before creating the zip (because creating the file takes some time). – soung Apr 25 '22 at 18:19
  • I suggest you redesign the method into a class that extends `OutputStream` and accepts an `InputStream`. That way you can process the values in chunks instead of trying to store everything in memory. – Ryan Apr 25 '22 at 18:20
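
The chunked approach suggested in the comments can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the goal is a single zip entry of generated ASCII lines, and `StreamingZipDemo`, `writeZip` and the entry name `big.txt` are hypothetical names. Each line is encoded once and written straight into a `ZipOutputStream`, so neither the full string nor its byte array ever has to exist in memory:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class StreamingZipDemo {
    // Writes nbLine lines of lineLength 'A' characters into one zip entry,
    // one line at a time. Returns the number of uncompressed bytes written.
    static long writeZip(OutputStream target, int nbLine, int lineLength) {
        char[] chars = new char[lineLength];
        Arrays.fill(chars, 'A');
        // Encode a single line once; only this small array is kept in memory.
        byte[] oneLine = (new String(chars) + "\n").getBytes(StandardCharsets.US_ASCII);
        long written = 0; // long, because the total may exceed Integer.MAX_VALUE
        try (ZipOutputStream zip = new ZipOutputStream(target)) {
            zip.putNextEntry(new ZipEntry("big.txt")); // hypothetical entry name
            for (int i = 0; i < nbLine; i++) {
                zip.write(oneLine);
                written += oneLine.length;
            }
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }

    public static void main(String[] args) {
        // Small line count so the demo can use an in-memory target.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long written = writeZip(out, 1_000, 1400);
        System.out.println(written + " bytes in, " + out.size() + " bytes of zip out");
    }
}
```

For the full one million lines, the target would more plausibly be a `FileOutputStream` or a network stream than a `ByteArrayOutputStream`, since the uncompressed content is about 1.4 billion bytes (1,000,000 × 1,401) even at one byte per character.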

0 Answers