A code and/or documentation review is probably your best option here, but you can probe if you want. It seems that a sufficient test is the goal and minimizing it is less important. It is hard to figure out what a sufficient test is based only on speculation about what the threat might be, but here is my suggestion: all codepoints, including U+0000, with proper handling of "combining characters".
The method you want to test takes a Java string as a parameter. Java doesn't have "UTF-8 encoded strings": Java's native text datatypes use the UTF-16 encoding of the Unicode character set. This is common for in-memory representations of text; it's used by Java, .NET, JavaScript, VB6, VBA, and so on. UTF-8 is commonly used for streams and storage, so it does make sense to ask about it in the context of "saving and fetching". Databases typically offer one or more of UTF-8, 3-byte-limited UTF-8, or UTF-16 (NVARCHAR) datatypes and collations.
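To make the distinction concrete, here is a small sketch (nothing component-specific assumed): the string exists as UTF-16 code units in memory, and UTF-8 bytes only appear when you encode at a boundary such as a stream or a database driver.

```java
import java.nio.charset.StandardCharsets;

public class EncodingBoundary {
    public static void main(String[] args) {
        String s = "\u20AC"; // U+20AC EURO SIGN: one codepoint, one UTF-16 code unit
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8); // encoding happens here
        System.out.println(s.length());  // 1 UTF-16 code unit in memory
        System.out.println(utf8.length); // 3 UTF-8 bytes on the wire or disk
        // Decoding at the other end of the boundary restores the same string
        System.out.println(s.equals(new String(utf8, StandardCharsets.UTF_8))); // true
    }
}
```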
The encoding is an implementation detail. If the component accepts a Java string, it should either throw an exception for data it is unwilling to handle or handle it properly.
"Characters" is a rather ill-defined term. Unicode codepoints range from 0x0 to 0x10FFFF (21 bits). Some codepoints are not assigned (aka "defined"), depending on the revision of the Unicode Standard. Java datatypes can hold any codepoint, but what Java knows about a codepoint depends on the Java version: for Java 8, "Character information is based on the Unicode Standard, version 6.2.0." You can limit the test to "defined" codepoints, or test all possible codepoints.
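As an aside on mechanics (not part of the suggested test itself): codepoints above U+FFFF are represented in a Java string by a surrogate pair, so char-level and codepoint-level lengths differ. A small sketch:

```java
public class SupplementaryCodepoints {
    public static void main(String[] args) {
        int cp = 0x1F600; // GRINNING FACE, a supplementary (non-BMP) codepoint
        System.out.println(Character.isValidCodePoint(cp)); // true
        System.out.println(Character.charCount(cp));        // 2: needs a surrogate pair
        String s = new String(Character.toChars(cp));
        System.out.println(s.length());                      // 2 UTF-16 code units
        System.out.println(s.codePointCount(0, s.length())); // 1 codepoint
    }
}
```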
A codepoint is either a base "character" or a "combining character", and each codepoint falls in exactly one Unicode category. Two categories are for combining characters. To form a grapheme, a base character is followed by zero or more combining characters. It might be difficult to lay out graphemes graphically (see Zalgo text), but for text storage all that is needed is to not mangle the sequence of codepoints (and the byte order, if applicable).
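For illustration, here is a two-codepoint grapheme, plus a way to count user-perceived characters with java.text.BreakIterator (which approximates grapheme boundaries):

```java
import java.text.BreakIterator;

public class GraphemeSketch {
    public static void main(String[] args) {
        // U+0065 LATIN SMALL LETTER E + U+0301 COMBINING ACUTE ACCENT; renders as "é"
        String grapheme = "e\u0301";
        System.out.println(grapheme.length()); // 2 codepoints, 2 UTF-16 code units
        BreakIterator boundaries = BreakIterator.getCharacterInstance();
        boundaries.setText(grapheme);
        int count = 0;
        while (boundaries.next() != BreakIterator.DONE) {
            count++;
        }
        System.out.println(count); // 1 user-perceived character
        // Storage only has to keep the codepoint sequence intact:
        System.out.println(grapheme.equals("\u0065\u0301")); // true
    }
}
```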
So, here is a non-minimal, somewhat comprehensive test:
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import static org.junit.Assert.assertEquals;

final Stream<Integer> codepoints = IntStream
    .rangeClosed(Character.MIN_CODE_POINT, Character.MAX_CODE_POINT)
    .filter(cp -> Character.isDefined(cp)) // optional filtering
    .boxed();
// Listed in ascending numeric order (ENCLOSING_MARK is 7, COMBINING_SPACING_MARK
// is 8) because Arrays.binarySearch requires a sorted array.
final int[] combiningCategories = {
    Character.ENCLOSING_MARK,
    Character.COMBINING_SPACING_MARK
};
final Map<Boolean, List<Integer>> partitionedCodepoints = codepoints
    .collect(Collectors.partitioningBy(cp ->
        // true: not in a combining category, i.e. a base codepoint
        Arrays.binarySearch(combiningCategories, Character.getType(cp)) < 0));
final Integer[] baseCodepoints = partitionedCodepoints.get(true)
    .toArray(new Integer[0]);
final Integer[] combiningCodepoints = partitionedCodepoints.get(false)
    .toArray(new Integer[0]);
final int baseLength = baseCodepoints.length;
final int combiningLength = combiningCodepoints.length;
final StringBuilder graphemes = new StringBuilder();
for (int i = 0; i < baseLength; i++) {
    graphemes.append(Character.toChars(baseCodepoints[i]));
    graphemes.append(Character.toChars(combiningCodepoints[i % combiningLength]));
}
final String test = graphemes.toString();
// getBytes yields exactly the encoded bytes; Charset.encode(...).array() could
// include unused trailing buffer capacity.
final byte[] testUTF8 = test.getBytes(StandardCharsets.UTF_8);
// Expected counts for Java 8 (Unicode 6.2.0) when filtering by Character.isDefined;
// recompute them for other Java versions.
assertEquals(736681, test.length()); // number of UTF-16 code units
assertEquals(3241399, testUTF8.length); // number of UTF-8 code units
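What to do with the generated string depends on the component's API, which the question doesn't show. As a sketch, with hypothetical save/fetch methods standing in for the real component, the essential assertion is an exact round trip; the in-memory store below is just a placeholder:

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class RoundTrip {
    // Stand-in for the component under test; it stores UTF-8 bytes the way a
    // database column might. Replace save/fetch with the real component's calls.
    private static final Map<String, byte[]> store = new HashMap<>();

    static void save(String key, String value) {
        store.put(key, value.getBytes(StandardCharsets.UTF_8));
    }

    static String fetch(String key) {
        return new String(store.get(key), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stands in for the generated test string; a grapheme plus a surrogate pair
        String test = "e\u0301\uD83D\uDE00";
        save("k", test);
        // The property under test: the exact codepoint sequence survives
        if (!test.equals(fetch("k"))) {
            throw new AssertionError("round trip mangled the text");
        }
        System.out.println("round trip ok");
    }
}
```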