First, for all practical purposes a Java String is always UTF-16, although since Java 9 it may be stored differently internally (compact strings keep Latin-1 text in one byte per character).
To achieve what you want ("Get only the first five characters from the input String!"), it should look like this:
public String truncate( String input )
{
    var retValue = (input != null) && (input.length() > 5)
        ? input.substring( 0, 5 )
        : input;
    return retValue;
}
There should be no need to play around with codepoints for this particular task.
Unfortunately, this is not fully correct.
It works for the String s = "Dies ist ein langer String", but it does not work for s = "12345678".
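Why the char-based version breaks can be seen with a small sketch. The input below is an assumption chosen for illustration (it is not the string from the question): it contains U+1F600, a supplementary character that Java stores as a surrogate pair of two chars. The class name SurrogateSplitDemo is made up for this example.

```java
final class SurrogateSplitDemo {
    // Same logic as the char-based truncate above, made static for the demo.
    static String truncate(String input) {
        return (input != null) && (input.length() > 5)
            ? input.substring(0, 5)
            : input;
    }

    public static void main(String[] args) {
        // Assumed input: "abcd", then U+1F600 (two chars), then "efg".
        String s = "abcd\uD83D\uDE00efg";
        String t = truncate(s);

        // The cut lands between the two halves of the surrogate pair.
        System.out.println(t.length());                             // prints 5
        System.out.println(Character.isHighSurrogate(t.charAt(4))); // prints true
    }
}
```

The result has five chars, but the last one is a lone high surrogate, so the fifth visible character is destroyed rather than kept.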
Unfortunately, String.offsetByCodePoints() is of no help here; when using the original code from the question, like this:
public String truncate( String input )
{
    int x = 5;
    if( input.codePointCount( 0, input.length() ) > 5 )
    {
        return input.substring( 0, input.offsetByCodePoints( 0, x ) );
    }
    return input;
}
the correct value for x depends on the contents of the String.
That's because one of the characters in the input counts for two codepoints while another is just one, and both take more than one char.
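The mismatch between chars, codepoints and visible characters can be made concrete with a short sketch. The two sample characters are assumptions picked for illustration, not the ones from the question: a flag built from the regional indicators U+1F1E9 and U+1F1EA, and the single codepoint U+1F600. The class and helper names are hypothetical.

```java
final class CodePointCountDemo {
    // Builds a String from raw codepoints via the String(int[], int, int) constructor.
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    // Reports both the char count and the codepoint count of a String.
    static String describe(String s) {
        return s.length() + " chars, " + s.codePointCount(0, s.length()) + " code points";
    }

    public static void main(String[] args) {
        String flag = fromCodePoints(0x1F1E9, 0x1F1EA); // one visible flag
        String face = fromCodePoints(0x1F600);          // one visible emoji

        System.out.println(describe(flag)); // prints "4 chars, 2 code points"
        System.out.println(describe(face)); // prints "2 chars, 1 code points"
    }
}
```

Keeping exactly one visible character would need x = 2 for the flag but x = 1 for the face, which is why no fixed value of x works for every input.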
So this one failed, too:
public String truncate( String input )
{
    var retValue = input;
    if( input.codePointCount( 0, input.length() ) > 5 )
    {
        int [] codepoints = input.codePoints().limit( 5 ).toArray();
        retValue = new String( codepoints, 0, 5 );
    }
    return retValue;
}
And here I am stuck …
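One direction that may be worth trying, offered as an untested-against-the-question sketch rather than a verified fix: count user-perceived characters (grapheme clusters) instead of chars or codepoints, using java.text.BreakIterator.getCharacterInstance(). The class name GraphemeTruncate and the maxGraphemes parameter are made up here. How well emoji sequences are segmented depends on the JDK version, so the example only relies on a base letter plus a combining accent, which the character iterator has long treated as one unit.

```java
import java.text.BreakIterator;

final class GraphemeTruncate {
    // Returns at most maxGraphemes user-perceived characters from input.
    static String truncate(String input, int maxGraphemes) {
        if (input == null) {
            return null;
        }
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(input);
        int end = it.first();
        for (int i = 0; i < maxGraphemes; i++) {
            int next = it.next();
            if (next == BreakIterator.DONE) {
                return input; // fewer clusters than the limit: nothing to cut
            }
            end = next;
        }
        // end is a cluster boundary, so no cluster is split.
        return input.substring(0, end);
    }

    public static void main(String[] args) {
        // "a" + combining acute accent counts as one cluster.
        String s = "a\u0301bcdefg";
        System.out.println(truncate(s, 5).length()); // prints 6
        System.out.println(truncate("Dies ist ein langer String", 5)); // prints "Dies "
    }
}
```

Whether this handles the string from the question correctly would still need to be checked on the JDK in use.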