I've looked for a speed comparison between StringReader and a simple for loop, and haven't been able to find anything useful. What I'm interested in is using JAXB to convert an XML string into a JAXB object. The answer I found (Use JAXB to create Object from XML String) indicated that I need to wrap my string in a StringReader, which makes sense, but it would take some work to put together properly. (I'm using a custom class to marshal and unmarshal objects, which I would have to check out, modify, submit, and then re-import into my project. This is a big time sink.) I found that I can do the same thing by simply converting my string into a byte[] and then using existing methods to get the object I need, like so:

    String responseAsString = " <?xml ver....";
    byte[] myResponse = new byte[responseAsString.length()];
    for (int i = 0; i < responseAsString.length(); i++) {
        myResponse[i] = (byte) responseAsString.charAt(i);
    }

So my question is this: if my responseAsString is around 200,000 characters long, is this method going to be significantly slower than using a StringReader to get my JAXB object?

Ted Delezene
  • You could implement both and run benchmark tests to find out for yourself. Post the results here for future reference as well. – Al.Sal Jul 28 '14 at 17:59
  • Your code is just creating a `byte[]` (ignoring character encoding), no JAXB object. – isnot2bad Jul 28 '14 at 18:01
  • @isnot2bad Right, but I have a class that was created to do the marshalling / unmarshalling for me, so with a `byte[]` I can just do this: `marshal.bytesToObject(myResponse)` and get the JAXB object that I need. @Al.Sal If I don't get any responses here I may just do that later on today. – Ted Delezene Jul 28 '14 at 19:08
  • Do you really think you can speed up your app significantly by replacing a `StringWriter` by a `byte[]`??? This is like if you want to reduce total reading time of 'Lord of the Rings' by tweaking the printer that prints the pages... – isnot2bad Jul 28 '14 at 20:14
  • I don't mean to be rude @isnot2bad, but it seems like you didn't really read my question. I am more worried that using my method for marshalling is going to cause a slowdown that could be avoided if I instead wrote more code, but used the `StringReader` way. – Ted Delezene Jul 28 '14 at 20:50
  • @Ted I think you didn't get my point: Why do you think marshalling might be negatively affected by a StringWriter in opposite to something wrapped around a byte[]? – isnot2bad Jul 28 '14 at 22:14
  • I don't. I think that converting it to a `byte[]` and then marshalling that is going to be slower than just throwing the string into a `StringWriter` and marshalling that way. What I was hoping to discover is whether it is going to be significantly slower, i.e. worth the hassle of checking out a class, writing the method, checking it back in, then re-importing the jar into my existing Grails project and fighting with that. (The last part is what really fills me with trepidation, since it only goes smoothly about 1/4 of the time and is a 2-hour ordeal every other time.) – Ted Delezene Jul 28 '14 at 22:53

2 Answers

Although this question is a year old, those arriving from a Google search for StringReader performance should know that it is pretty abysmal when reading char-by-char, as you would be if you were parsing XML or JSON.

Each call to read() includes a synchronized block, which is unnecessary in most situations and quite expensive when you are really trying to squeeze out performance. In only a few lines of code you can write your own replacement for StringReader that functions the same way but doesn't synchronize, for potentially quite a large win.
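As a rough illustration of the replacement described above, here is a minimal sketch. The class name `UnsynchronizedStringReader` is made up for this example, it is not thread-safe by design, and it leaves `mark`/`reset` unsupported to stay short, so treat it as a starting point rather than a drop-in equivalent of `java.io.StringReader`.

```java
import java.io.IOException;
import java.io.Reader;

/**
 * Hypothetical unsynchronized stand-in for java.io.StringReader.
 * Assumes single-threaded use; mark/reset are intentionally not supported.
 */
final class UnsynchronizedStringReader extends Reader {
    private final String source;
    private int position = 0;
    private boolean closed = false;

    UnsynchronizedStringReader(String source) {
        if (source == null) {
            throw new NullPointerException("source");
        }
        this.source = source;
    }

    /** Reads one char without taking a lock; returns -1 at end of input. */
    @Override
    public int read() throws IOException {
        ensureOpen();
        return position < source.length() ? source.charAt(position++) : -1;
    }

    /** Bulk read, also without locking; returns -1 at end of input. */
    @Override
    public int read(char[] buffer, int offset, int length) throws IOException {
        ensureOpen();
        if (offset < 0 || length < 0 || length > buffer.length - offset) {
            throw new IndexOutOfBoundsException();
        }
        if (length == 0) {
            return 0;
        }
        if (position >= source.length()) {
            return -1;
        }
        int count = Math.min(length, source.length() - position);
        source.getChars(position, position + count, buffer, offset);
        position += count;
        return count;
    }

    @Override
    public void close() {
        closed = true;
    }

    private void ensureOpen() throws IOException {
        if (closed) {
            throw new IOException("Stream closed");
        }
    }
}
```

Because it extends `Reader`, an instance can be passed anywhere a `Reader` is accepted. Whether the missing synchronization gives a measurable win depends on the JVM and on how the consumer reads, so it is worth measuring before adopting it.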

Mike B

Although your implementation will work for some strings, a Java char is 16 bits wide, so if a character carries any information in its upper 8 bits (which a byte cannot hold), casting it to a byte truncates that information. Casting directly to a byte also disregards the character encoding. So even though your for loop may be faster, you are losing information. Instead of doing it this way, try responseAsString.getBytes(). This encodes the string using the platform's default charset instead of simply casting each char to a byte. If you know that you need a specific encoding (such as UTF-8), you can call getBytes with the encoding as a parameter.
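To make the difference concrete, here is a small sketch comparing the cast-based loop with `getBytes(StandardCharsets.UTF_8)`. The class name `EncodingDemo`, the helper `castToBytes`, and the sample strings are invented for this illustration; the choice of UTF-8 is an assumption about what the consumer of the bytes expects.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class EncodingDemo {
    /** The cast-based conversion from the question: keeps only the low 8 bits of each char. */
    static byte[] castToBytes(String text) {
        byte[] result = new byte[text.length()];
        for (int i = 0; i < text.length(); i++) {
            result[i] = (byte) text.charAt(i);
        }
        return result;
    }

    public static void main(String[] args) {
        String ascii = "<a>ok</a>";
        String euro = "<a>\u20AC</a>"; // contains U+20AC EURO SIGN

        // For pure ASCII input both approaches produce identical bytes.
        System.out.println(Arrays.equals(
                castToBytes(ascii), ascii.getBytes(StandardCharsets.UTF_8))); // true

        // For a non-ASCII char the cast drops the upper 8 bits (0x20AC becomes 0xAC).
        byte[] truncated = castToBytes(euro);
        byte[] utf8 = euro.getBytes(StandardCharsets.UTF_8);
        System.out.println(truncated.length); // 8 (one byte per char)
        System.out.println(utf8.length);      // 10 (the euro sign takes 3 bytes in UTF-8)

        // Only the properly encoded bytes decode back to the original string.
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(euro));      // true
        System.out.println(new String(truncated, StandardCharsets.UTF_8).equals(euro)); // false
    }
}
```

If the XML is guaranteed to be plain ASCII the two approaches agree, but that guarantee is easy to break silently once real data arrives.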

As a side note, I wrote a basic timing test that runs both methods 1,000 times on a string that is 20,000 characters long. The for loop implementation above averaged 0.25 ms, whereas getBytes() averaged 0.75 ms. Because of the potential loss of information, I would still go with getBytes().
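The original test code was not posted, so the following is only a guess at what such a timing test might look like. The names `ConversionTiming`, `castToBytes`, `repeatChar`, and `averageMillis` are invented here, the test string is plain ASCII, and `System.nanoTime()` timing like this is naive: results vary with JVM, warm-up, and hardware, so the numbers it prints should not be expected to match those quoted above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ConversionTiming {
    /** The cast-based loop from the question. */
    static byte[] castToBytes(String text) {
        byte[] result = new byte[text.length()];
        for (int i = 0; i < text.length(); i++) {
            result[i] = (byte) text.charAt(i);
        }
        return result;
    }

    /** Builds a string consisting of {@code count} copies of {@code c}. */
    static String repeatChar(char c, int count) {
        char[] chars = new char[count];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    /** Runs the task {@code iterations} times and returns the average time per run in milliseconds. */
    static double averageMillis(Runnable task, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1_000_000.0 / iterations;
    }

    public static void main(String[] args) {
        String text = repeatChar('x', 20_000);
        int iterations = 1_000;
        long[] sink = new long[1]; // accumulates results so the JIT cannot discard the work

        Runnable loop = () -> sink[0] += castToBytes(text).length;
        Runnable getBytes = () -> sink[0] += text.getBytes(StandardCharsets.UTF_8).length;

        // Warm-up pass so both paths are JIT-compiled before measuring.
        averageMillis(loop, iterations);
        averageMillis(getBytes, iterations);

        System.out.printf("for loop:   %.4f ms%n", averageMillis(loop, iterations));
        System.out.printf("getBytes(): %.4f ms%n", averageMillis(getBytes, iterations));
        System.out.println("(sink = " + sink[0] + ")");
    }
}
```

For anything beyond a quick sanity check, a dedicated harness such as JMH would give more trustworthy numbers than a hand-rolled loop.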

Jeremy Farrell