Ok, so, first of all, you're barking up the wrong tree by focusing on the fact that specifying an initial capacity leads to different serialized bytes. In fact, if you look at the difference:
pbA from your example:
: ac ed 00 05 73 72 00 0f 71 33 39 31 39 33 34 39 ....sr..q3919349
: 34 2e 53 74 61 74 65 00 00 00 00 00 00 00 01 02 4.State.........
: 00 01 4c 00 04 6d 61 70 73 74 00 10 4c 6a 61 76 ..L..mapst..Ljav
: 61 2f 75 74 69 6c 2f 4c 69 73 74 3b 78 70 73 72 a/util/List;xpsr
: 00 13 6a 61 76 61 2e 75 74 69 6c 2e 41 72 72 61 ..java.util.Arra
: 79 4c 69 73 74 78 81 d2 1d 99 c7 61 9d 03 00 01 yListx.....a....
: 49 00 04 73 69 7a 65 78 70 00 00 00 01 77 04 00 I..sizexp....w..
: 00 00 01 73 72 00 14 71 33 39 31 39 33 34 39 34 ...sr..q39193494
: 2e 4d 61 70 57 72 61 70 70 65 72 00 00 00 00 00 .MapWrapper.....
: 00 00 01 02 00 01 4c 00 03 6d 61 70 74 00 0f 4c ......L..mapt..L
: 6a 61 76 61 2f 75 74 69 6c 2f 4d 61 70 3b 78 70 java/util/Map;xp
: 73 72 00 11 6a 61 76 61 2e 75 74 69 6c 2e 48 61 sr..java.util.Ha
: 73 68 4d 61 70 05 07 da c1 c3 16 60 d1 03 00 02 shMap......`....
: 46 00 0a 6c 6f 61 64 46 61 63 74 6f 72 49 00 09 F..loadFactorI..
: 74 68 72 65 73 68 6f 6c 64 78 70 3f 40 00 00 00 thresholdxp?@...
: 00 00 02 77 08 00 00 00 02 00 00 00 00 78 78 ...w.........xx
zero from your example:
: ac ed 00 05 73 72 00 0f 71 33 39 31 39 33 34 39 ....sr..q3919349
: 34 2e 53 74 61 74 65 00 00 00 00 00 00 00 01 02 4.State.........
: 00 01 4c 00 04 6d 61 70 73 74 00 10 4c 6a 61 76 ..L..mapst..Ljav
: 61 2f 75 74 69 6c 2f 4c 69 73 74 3b 78 70 73 72 a/util/List;xpsr
: 00 13 6a 61 76 61 2e 75 74 69 6c 2e 41 72 72 61 ..java.util.Arra
: 79 4c 69 73 74 78 81 d2 1d 99 c7 61 9d 03 00 01 yListx.....a....
: 49 00 04 73 69 7a 65 78 70 00 00 00 01 77 04 00 I..sizexp....w..
: 00 00 01 73 72 00 14 71 33 39 31 39 33 34 39 34 ...sr..q39193494
: 2e 4d 61 70 57 72 61 70 70 65 72 00 00 00 00 00 .MapWrapper.....
: 00 00 01 02 00 01 4c 00 03 6d 61 70 74 00 0f 4c ......L..mapt..L
: 6a 61 76 61 2f 75 74 69 6c 2f 4d 61 70 3b 78 70 java/util/Map;xp
: 73 72 00 11 6a 61 76 61 2e 75 74 69 6c 2e 48 61 sr..java.util.Ha
: 73 68 4d 61 70 05 07 da c1 c3 16 60 d1 03 00 02 shMap......`....
: 46 00 0a 6c 6f 61 64 46 61 63 74 6f 72 49 00 09 F..loadFactorI..
: 74 68 72 65 73 68 6f 6c 64 78 70 3f 40 00 00 00 thresholdxp?@...
: 00 00 00 77 08 00 00 00 01 00 00 00 00 78 78 ...w.........xx
The only difference is a few bytes encoding the HashMap's threshold and bucket count (the load factor, 0.75, is identical in both dumps). Of course these bytes differ: one map was constructed with an initial capacity of 2, while the other came out of a deserialization that ignored that capacity. This is a red herring.
You are concerned about a corrupt deep copy, but that concern is misplaced. The only thing that matters, in terms of correctness, is the result of the deserialization: it just needs to be a correct, fully functional deep copy that doesn't violate any of your program's invariants. Focusing on the precise serialized bytes is a distraction; you only care that the result is correct.
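To convince yourself of this, you can check the deep copy directly rather than comparing bytes. Here is a minimal, self-contained sketch (the class and method names are just for illustration, not from your code) that round-trips a map built with an initial capacity of 2 and verifies the copy is a distinct object with identical contents:

```java
import java.io.*;
import java.util.HashMap;
import java.util.Map;

public class DeepCopyCheck {
    // Round-trip any serializable object through a byte array and return the copy.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(original);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (T) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // The capacity hint only shapes the serialized bytes, not the contents.
        HashMap<String, Integer> original = new HashMap<>(2);
        original.put("a", 1);
        original.put("b", 2);

        Map<String, Integer> copy = deepCopy(original);

        // The copy is a distinct object with identical contents.
        System.out.println(copy != original);      // true
        System.out.println(copy.equals(original)); // true
    }
}
```

If these checks pass for your real object graph, the deep copy is correct, regardless of how the bytes compare.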
Which brings us to the next point:
The only real issue you face here is a potential difference in long-term performance characteristics (both speed and memory), stemming from the fact that some Java versions ignore the initial map capacity when deserializing. This does not affect your data (that is, it will not break invariants); it only potentially affects performance.
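The reason invariants can't break is that a HashMap's capacity is only a sizing hint: the map resizes itself transparently as entries are added, so an undersized map merely pays for some internal rehashing. A quick sketch (illustrative names, exaggerated sizes) demonstrates this:

```java
import java.util.HashMap;
import java.util.Map;

public class CapacityIsOnlyAHint {
    public static void main(String[] args) {
        // A map whose capacity hint is far too small for its eventual size.
        Map<Integer, Integer> tiny = new HashMap<>(1);
        // A map sized generously up front.
        Map<Integer, Integer> roomy = new HashMap<>(100_000);

        for (int i = 0; i < 50_000; i++) {
            tiny.put(i, i * 2);
            roomy.put(i, i * 2);
        }

        // Both maps hold exactly the same data; the undersized one just
        // paid for some internal resizes along the way.
        System.out.println(tiny.equals(roomy)); // true
    }
}
```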
So your very first step is to verify that this is actually a problem; otherwise you risk a premature optimization. Ignore the difference in the deserialized map's initial capacity for now. If your application runs with sufficient performance, you have nothing else to worry about. If it doesn't, and you are able to narrow the bottleneck down to decreased hash map performance caused by the different initial capacity, only then should you approach this problem.
Finally, if you determine that the performance characteristics of the deserialized map actually are insufficient, there are a number of things you can do. The simplest, most obvious one I can think of is to implement readResolve() on your object, and take that opportunity to:
- Construct a new map with the appropriate parameters (initial capacity, etc.)
- Copy all of the items from the old deserialized map to the new one.
- Then discard the old map and replace it with your new one.
Example (from your original code example, choosing the map that yielded the "false" result):
class MapWrapper implements Serializable {
    private static final long serialVersionUID = 1L;
    Map<String, Integer> map = new HashMap<>(2);

    private Object readResolve() throws ObjectStreamException {
        // Replace deserialized 'map' with one that has the desired
        // capacity parameters.
        Map<String, Integer> fixedMap = new HashMap<>(2);
        fixedMap.putAll(map);
        map = fixedMap;
        return this;
    }
}
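To sanity-check the approach, a quick round trip confirms that readResolve() runs during readObject() and that the rebuilt map keeps its contents. This is a self-contained sketch that re-declares MapWrapper as a nested class (the demo class name is just for illustration):

```java
import java.io.*;
import java.util.HashMap;
import java.util.Map;

public class ReadResolveDemo {
    static class MapWrapper implements Serializable {
        private static final long serialVersionUID = 1L;
        Map<String, Integer> map = new HashMap<>(2);

        private Object readResolve() throws ObjectStreamException {
            // Rebuild the map with the desired capacity parameters.
            Map<String, Integer> fixedMap = new HashMap<>(2);
            fixedMap.putAll(map);
            map = fixedMap;
            return this;
        }
    }

    public static void main(String[] args) throws Exception {
        MapWrapper original = new MapWrapper();
        original.map.put("answer", 42);

        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(original);
        }
        MapWrapper copy;
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            copy = (MapWrapper) ois.readObject();
        }

        // readResolve ran as part of deserialization; contents are intact.
        System.out.println(copy.map.equals(original.map)); // true
    }
}
```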
But first question whether this is really causing issues for you. I believe you are overthinking it, and that hyper-focusing on a byte-for-byte comparison of the serialized data is not productive.