To answer the second question that was asked, this prints each windows-1252 byte of the string as an unsigned value:
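    final String str = "™æ‡©Æ";
    // Note: getBytes(String) throws the checked UnsupportedEncodingException.
    final byte[] cp1252Bytes = str.getBytes("windows-1252");
    for (final byte b : cp1252Bytes) {
        // Java bytes are signed; masking with 0xFF yields the value 0..255.
        final int code = b & 0xFF;
        System.out.println(code);
    }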
Associating the code with each text element is more work, since the string has to be walked one code point at a time:
    final String str = "™æ‡©Æ";
    final int length = str.length();
    for (int offset = 0; offset < length; ) {
        // A code point may occupy one or two chars (surrogate pair).
        final int codepoint = str.codePointAt(offset);
        final int codepointLength = Character.charCount(codepoint);
        final String codepointString = str.substring(offset, offset + codepointLength);
        System.out.println(codepointString);
        // Encode just this code point and print its unsigned byte value(s).
        final byte[] cp1252Bytes = codepointString.getBytes("windows-1252");
        for (final byte code : cp1252Bytes) {
            System.out.println(code & 0xFF);
        }
        offset += codepointLength;
    }
This is somewhat easier with Java 8's String.codePoints() method:
    final String str = "™æ‡©Æ";
    str.codePoints()
        .mapToObj(i -> new String(Character.toChars(i)))
        .forEach(c -> {
            try {
                System.out.println(
                    String.format("%s %s",
                        c,
                        unsignedBytesToString(c.getBytes("Windows-1252"))));
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
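The snippet above calls unsignedBytesToString without showing it. Here is a minimal, self-contained sketch of what such a helper might look like, together with the code point loop. The helper body, the class name Cp1252Codes and the describe method are my own assumptions for illustration. The sketch also uses Charset.forName instead of a charset name string, which removes the need for the try/catch:

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class Cp1252Codes {
    // getBytes(Charset) does not throw the checked UnsupportedEncodingException
    // that getBytes(String) does.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    // Hypothetical helper (an assumption, not from the original snippet):
    // renders each byte as an unsigned decimal value, separated by spaces.
    static String unsignedBytesToString(final byte[] bytes) {
        final List<String> parts = new ArrayList<>();
        for (final byte b : bytes) {
            parts.add(Integer.toString(b & 0xFF));
        }
        return String.join(" ", parts);
    }

    // Hypothetical wrapper: one "<character> <code(s)>" entry per code point.
    static List<String> describe(final String str) {
        return str.codePoints()
                .mapToObj(cp -> new String(Character.toChars(cp)))
                .map(c -> c + " " + unsignedBytesToString(c.getBytes(CP1252)))
                .collect(Collectors.toList());
    }

    public static void main(final String[] args) {
        // The sample string from the examples above, written with Unicode
        // escapes so that the source file encoding does not matter.
        describe("\u2122\u00E6\u2021\u00A9\u00C6").forEach(System.out::println);
    }
}
```

For the sample string the codes come out as 153, 230, 135, 169 and 198. Note that getBytes(Charset) silently substitutes the charset's replacement byte for characters that have no windows-1252 mapping, so such characters would need a separate check if that matters to you.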