I have this code in my Java program:

    public static int NumOfStringByte(String str) throws UnsupportedEncodingException {
        return str.getBytes("UTF-8").length + 2;
    }

Is this correct? How can I calculate the number of bytes in a string?

SurvivalMachine
dan

1 Answer

In Java, calling getBytes("UTF-8") already gives you exactly the bytes of the string's UTF-8 encoding, so you should simply return the length of that byte array. The only reason to add to that number is if you are writing additional bytes (such as a NUL terminator or a byte-order mark); if you do that, choose a clearer function name.

Note, however, that the length of the UTF-8 encoding is NOT the same as the String's footprint in memory. Java represents strings as UTF-16. The number of bytes used to store the character data is str.length() * 2: str.length() gives you the number of char values in the underlying buffer, and each char is 2 bytes. (Since Java 9, compact strings can store Latin-1-only text at 1 byte per char, so treat this figure as an upper bound on the character data.)
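
To make the difference between the two measurements concrete, here is a minimal sketch. The class and method names (StringSizes, utf8Length, utf16CharBytes) are made up for illustration and are not from the question. It uses StandardCharsets.UTF_8 instead of the "UTF-8" string so that no checked exception needs to be declared.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class StringSizes {
    // Number of bytes in the UTF-8 encoding of str (no terminator, no BOM).
    static int utf8Length(String str) {
        return str.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the character data takes when held as 2-byte char values.
    // This ignores object headers and any compact-string optimisation.
    static int utf16CharBytes(String str) {
        return str.length() * 2;
    }

    public static void main(String[] args) {
        // ASCII, a Latin-1 letter, the euro sign, and a surrogate pair.
        String[] samples = {"hello", "h\u00e9llo", "\u20ac", "\uD83D\uDE00"};
        for (String s : samples) {
            System.out.println(s + ": utf8=" + utf8Length(s)
                    + ", utf16=" + utf16CharBytes(s));
        }
    }
}
```

For plain ASCII text such as "hello" the UTF-8 length is 5 while the char-based figure is 10, so the two numbers only coincide by accident for particular inputs.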

Michael Aaron Safyan
  • So I need to double the length I got from the byte array? Actually, I need this information to know how to skip over a string's bytes in a binary data file with RandomAccessFile. So why, when I tried this code and also measured the size of the text in bytes, did I get the same size? – dan Jul 24 '16 at 17:43
  • @dan, no. It depends on what the intended meaning of "NumOfStringByte()" is. If the intended meaning is the length of the UTF-8 encoding, then str.getBytes("UTF-8").length; if the meaning is the memory consumption of the String object, then str.length() * 2 (without invoking getBytes() at all). – Michael Aaron Safyan Jul 24 '16 at 17:45
  • @dan, some clarification of what the usage of this function is would help. For example, if you are trying to figure out memory footprint (e.g. for analyzing your memory usage), there are better approaches. See: http://stackoverflow.com/questions/9368764/calculate-size-of-object-in-java – Michael Aaron Safyan Jul 24 '16 at 17:46
  • I need this information to know how to skip over a string's bytes in a binary data file with RandomAccessFile. So which of the options you suggest should I use? – dan Jul 24 '16 at 17:52
  • @dan how was the file written? (And do you have any control over the file format?) This is definitely the wrong way to do it for seeking in the file (especially since reading the file into a string in the first place requires you to know exactly how many bytes to read into the string). However, the proper way will depend on the answer to that question. If you can control the format, I'd suggest writing out the size as a fixed-size integer (like a 32-bit int or a 64-bit long) prior to each variable-length record. This way, you can just read the length field before each record to skip it. – Michael Aaron Safyan Jul 24 '16 at 19:44
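
The length-prefix layout suggested in the last comment could be sketched roughly as follows. This assumes you control the file format; the class and method names (LengthPrefixedRecords, writeRecord, skipRecord, readRecord, thirdOfThree) are hypothetical, and the choice of a 32-bit prefix followed by UTF-8 bytes is one option among several.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical record format: 4-byte big-endian length, then UTF-8 payload.
class LengthPrefixedRecords {
    // Writes the payload's byte count, then the payload itself.
    static void writeRecord(RandomAccessFile file, String value) throws IOException {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        file.writeInt(bytes.length);
        file.write(bytes);
    }

    // Reads the length prefix and jumps past the payload without decoding it.
    static void skipRecord(RandomAccessFile file) throws IOException {
        int length = file.readInt();
        file.seek(file.getFilePointer() + length);
    }

    // Reads the length prefix, then exactly that many bytes, and decodes them.
    static String readRecord(RandomAccessFile file) throws IOException {
        int length = file.readInt();
        byte[] bytes = new byte[length];
        file.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Demo: writes three records to a temp file, skips two, returns the third.
    static String thirdOfThree(String a, String b, String c) {
        try {
            File temp = File.createTempFile("records", ".bin");
            try (RandomAccessFile file = new RandomAccessFile(temp, "rw")) {
                writeRecord(file, a);
                writeRecord(file, b);
                writeRecord(file, c);
                file.seek(0);
                skipRecord(file);
                skipRecord(file);
                return readRecord(file);
            } finally {
                temp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(thirdOfThree("first", "s\u00e9cond", "third"));
    }
}
```

With this layout the reader never has to compute a string's encoded size: the byte count is stored in the file, so skipping a record is a readInt followed by a seek.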