Process of elimination
You said:
i am stumped on the consonants
The basic idea is that once you have established that a character is (a) in the Latin script, (b) a letter (not a digit, punctuation, etc.), and (c) not a vowel, you can assume it is a consonant.
As you can see at the center of the code example below, we examine each character with a cascading if statement, summarized here as pseudo-code:
if( … not part of the Latin script, such as Korean or emoji )
{
other++;
}
else if( … not a letter, such as digit or punctuation )
{
other++;
}
else if ( … definitely a vowel )
{
vowel++;
}
else if ( … maybe a vowel (`y`) )
{
maybeVowel++;
}
else // Else definitely not a vowel, so it must be a consonant.
{
consonant++;
}
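The cascade above can be sketched as a small Java method that classifies one code point at a time. This is only an illustration: the class name CascadeSketch, the Kind enum, and the classify method are hypothetical names made up for this sketch, and the vowel set covers English only.

```java
final class CascadeSketch {
    // Hypothetical category names; they mirror the counters in the pseudo-code.
    enum Kind { VOWEL, MAYBE_VOWEL, CONSONANT, OTHER }

    static Kind classify(int codePoint) {
        if (Character.UnicodeScript.of(codePoint) != Character.UnicodeScript.LATIN) {
            return Kind.OTHER;        // Not part of the Latin script, such as Korean or emoji.
        } else if (!Character.isLetter(codePoint)) {
            return Kind.OTHER;        // Not a letter.
        }
        int lower = Character.toLowerCase(codePoint);
        if ("aeiou".indexOf(lower) >= 0) {
            return Kind.VOWEL;        // Definitely a vowel (English only).
        } else if (lower == 'y') {
            return Kind.MAYBE_VOWEL;  // Maybe a vowel.
        } else {
            return Kind.CONSONANT;    // Not a vowel, so it must be a consonant.
        }
    }

    public static void main(String[] args) {
        for (int cp : "Hy 7".codePoints().toArray()) {
            System.out.println(new String(Character.toChars(cp)) + " -> " + classify(cp));
        }
    }
}
```

Returning a category instead of incrementing a counter keeps the decision logic separate from the tallying, which makes each branch easy to check on its own.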
char is legacy
The first two Answers are basically right, but they use the legacy char type. A char is a 16-bit value, so on its own it can represent fewer than half of the over 140,000 characters defined in Unicode. Those Answers also assume English text without diacritics and such.
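To make the limitation concrete, here is a minimal sketch comparing the number of char values with the number of code points for a single emoji. The class and method names are hypothetical; the emoji is built from its code point number (U+1F637) so the source file encoding does not matter.

```java
final class CharVersusCodePoint {
    // Number of 16-bit char values in the string.
    static int charCount(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1F637 FACE WITH MEDICAL MASK lies outside the 16-bit range of char.
        String mask = new String(Character.toChars(0x1F637));
        System.out.println(charCount(mask));                        // 2
        System.out.println(codePointCount(mask));                   // 1
        System.out.println(mask.codePointAt(0));                    // 128567
        System.out.println(Character.isSurrogate(mask.charAt(0)));  // true
    }
}
```

Code that walks the string char by char would see two meaningless surrogate halves here rather than one character.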
Unicode code point
Instead, make a habit of working with code point integer numbers.
String input = "😷 Face with Medical Mask" ;
Make a stream of the code point numbers for each character in the text.
IntStream intStream = input.codePoints() ;
Materialize an array from the stream.
int[] codePoints = intStream.toArray();
Loop each code point.
for ( int codePoint : codePoints )
{
…
}
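As a rough illustration of what such a loop sees, the sketch below lists each code point with its hexadecimal number and its official Unicode name. The class CodePointWalk and the method describe are hypothetical names for this example; Character.getName is the standard JDK call.

```java
import java.util.ArrayList;
import java.util.List;

final class CodePointWalk {
    // Builds one line per code point, e.g. "U+0041 LATIN CAPITAL LETTER A".
    static List<String> describe(String input) {
        List<String> lines = new ArrayList<>();
        for (int codePoint : input.codePoints().toArray()) {
            lines.add(String.format("U+%04X %s", codePoint, Character.getName(codePoint)));
        }
        return lines;
    }

    public static void main(String[] args) {
        String input = new String(Character.toChars(0x1F637)) + " A";
        describe(input).forEach(System.out::println);
    }
}
```

Each loop iteration handles one whole character, however many char values it occupies in the string.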
First see if the character is within the Latin script defined in Unicode. See the Question, Identify if a Unicode code point represents a character from a certain script such as the Latin script?
if ( Character.UnicodeScript.LATIN.equals( Character.UnicodeScript.of( codePoint ) ) ) { … } else { other ++ ; }
Next we must test if this character is a letter or not.
if ( Character.isLetter( codePoint ) ) { … } else { other ++ ; }
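The sketch below shows why both tests are needed. A Cyrillic letter passes the letter test but not the script test, while digits and emoji belong to the COMMON script and so fail the script test. The class name and the isLatinLetter helper are hypothetical; the sample characters are given by code point number.

```java
final class ScriptAndLetter {
    // True only for letters that belong to the Latin script.
    static boolean isLatinLetter(int codePoint) {
        return Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.LATIN
                && Character.isLetter(codePoint);
    }

    public static void main(String[] args) {
        // 'a', Cyrillic small letter ya, a Hangul syllable, a digit, an emoji.
        int[] samples = { 'a', 0x044F, 0xD55C, '7', 0x1F637 };
        for (int cp : samples) {
            System.out.println(String.format("U+%04X script=%s letter=%b latinLetter=%b",
                    cp, Character.UnicodeScript.of(cp), Character.isLetter(cp), isLatinLetter(cp)));
        }
    }
}
```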
To simplify our comparisons, we should convert to lowercase.
int lowercaseCodePoint = Character.toLowerCase( codePoint );
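A quick sketch of what the int overload of Character.toLowerCase does with a few inputs. The class and method names are hypothetical. Note that accented letters are lowercased too, but they keep their accent, so they will not match a plain "aeiou" vowel set.

```java
final class LowercaseSketch {
    // Thin wrapper around the code point overload of Character.toLowerCase.
    static int lower(int codePoint) {
        return Character.toLowerCase(codePoint);
    }

    public static void main(String[] args) {
        System.out.println((char) lower('F'));     // f
        System.out.println((char) lower('f'));     // f (already lowercase, unchanged)
        System.out.println((char) lower('7'));     // 7 (no case, unchanged)
        System.out.println((char) lower(0x00C9));  // é (from É)
    }
}
```

Lowercasing first means the vowel sets only need the lowercase forms.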
Next test for vowels. As far as I know, neither Java nor Unicode provides a test for vowel versus consonant, so we must define a set of vowels ourselves. I do not know about all Latin-based languages, but I can at least cover English vowels. Of course, y is tricky, so I will count it separately in a maybeVowel count.
int[] vowelCodePoints = "aeiou".codePoints().toArray();
int[] maybeVowelCodePoints = "y".codePoints().toArray();
We will want to see if those arrays contain each character's code point number. So sort the arrays to enable a binary search.
Arrays.sort( vowelCodePoints );
Arrays.sort( maybeVowelCodePoints );
Add a test for vowel.
if ( Arrays.binarySearch( vowelCodePoints , lowercaseCodePoint ) >= 0 )
Add a test for maybe vowel.
else if ( Arrays.binarySearch( maybeVowelCodePoints , lowercaseCodePoint ) >= 0 )
And if we get past both those vowel-related tests, we can assume our non-vowel lowercase Latin-script character is a consonant.
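For reference, here is a small sketch of how Arrays.binarySearch behaves as a membership test. It returns the index when the value is found and a negative number when it is not, which is why the tests above compare against zero. The class name and the contains helper are hypothetical, and the input is deliberately unsorted to show that the sort step matters.

```java
import java.util.Arrays;

final class SortedLookup {
    // Membership test; the array must already be sorted in ascending order.
    static boolean contains(int[] sortedCodePoints, int codePoint) {
        return Arrays.binarySearch(sortedCodePoints, codePoint) >= 0;
    }

    public static void main(String[] args) {
        int[] vowels = "uoiea".codePoints().toArray();  // Deliberately out of order.
        Arrays.sort(vowels);                            // Now [97, 101, 105, 111, 117].
        System.out.println(Arrays.binarySearch(vowels, 'e'));  // 1
        System.out.println(Arrays.binarySearch(vowels, 'y'));  // -6
        System.out.println(contains(vowels, 'e'));             // true
        System.out.println(contains(vowels, 'y'));             // false
    }
}
```

For a set this small a linear scan would do equally well; binary search is simply the approach this Answer uses.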
Put all the code together.
// Requires: import java.util.Arrays; and import java.util.stream.IntStream;
String input = "😷 Face with Medical Mask";
IntStream intStream = input.codePoints();
int[] codePoints = intStream.toArray();

int[] vowelCodePoints = "aeiou".codePoints().toArray();
int[] maybeVowelCodePoints = "y".codePoints().toArray();

// Sort those arrays to enable binary search.
Arrays.sort( vowelCodePoints );
Arrays.sort( maybeVowelCodePoints );

int vowel = 0;
int maybeVowel = 0;
int consonant = 0;
int other = 0;

for ( int codePoint : codePoints )
{
    if ( Character.UnicodeScript.LATIN.equals( Character.UnicodeScript.of( codePoint ) ) )
    {
        if ( Character.isLetter( codePoint ) )
        {
            int lowercaseCodePoint = Character.toLowerCase( codePoint );
            if ( Arrays.binarySearch( vowelCodePoints , lowercaseCodePoint ) >= 0 )
            {  // If definitely a vowel…
                vowel++;
            } else if ( Arrays.binarySearch( maybeVowelCodePoints , lowercaseCodePoint ) >= 0 )
            {  // Else if maybe a vowel…
                maybeVowel++;
            } else
            {  // Else this non-vowel lowercase letter from Latin-script must be a consonant.
                consonant++;
            }
        } else { other++; }  // Else not a letter.
    } else { other++; }  // Else not in Latin script.
}
Dump to console.
// Report
System.out.println( "RESULTS ----------------------------------------------" );
System.out.println( "input = " + input );
System.out.println( "codePoints = " + Arrays.toString( codePoints ) );
System.out.println( "Count code points: " + codePoints.length );
System.out.println( "vowelCodePoints = " + Arrays.toString( vowelCodePoints ) );
System.out.println( "maybeVowelCodePoints = " + Arrays.toString( maybeVowelCodePoints ) );
System.out.println( "vowel = " + vowel );
System.out.println( "maybeVowel = " + maybeVowel );
System.out.println( "consonant = " + consonant );
System.out.println( "other = " + other );
System.out.println( "vowel + maybeVowel+consonant+other = " + ( vowel + maybeVowel + consonant + other ) );
System.out.println( "END ----------------------------------------------" );
Example usage
When run.
RESULTS ----------------------------------------------
input = 😷 Face with Medical Mask
codePoints = [128567, 32, 70, 97, 99, 101, 32, 119, 105, 116, 104, 32, 77, 101, 100, 105, 99, 97, 108, 32, 77, 97, 115, 107]
Count code points: 24
vowelCodePoints = [97, 101, 105, 111, 117]
maybeVowelCodePoints = [121]
vowel = 7
maybeVowel = 0
consonant = 12
other = 5
vowel + maybeVowel+consonant+other = 24
END ----------------------------------------------
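The run above never reaches the maybeVowel branch, because the input contains no y. As a sketch only, the same logic can be wrapped in a method and tried on a second input. The class LetterTally, the method tally, and the test string "Yummy yoghurt 42" are all made up for this illustration.

```java
import java.util.Arrays;

final class LetterTally {
    // Returns counts in this order: vowel, maybeVowel, consonant, other.
    static int[] tally(String input) {
        int[] vowels = "aeiou".codePoints().sorted().toArray();
        int vowel = 0, maybeVowel = 0, consonant = 0, other = 0;
        for (int codePoint : input.codePoints().toArray()) {
            boolean latinLetter =
                    Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.LATIN
                            && Character.isLetter(codePoint);
            if (!latinLetter) {
                other++;        // Not Latin script, or not a letter.
                continue;
            }
            int lower = Character.toLowerCase(codePoint);
            if (Arrays.binarySearch(vowels, lower) >= 0) {
                vowel++;        // Definitely a vowel.
            } else if (lower == 'y') {
                maybeVowel++;   // Maybe a vowel.
            } else {
                consonant++;    // Must be a consonant.
            }
        }
        return new int[] { vowel, maybeVowel, consonant, other };
    }

    public static void main(String[] args) {
        String original = new String(Character.toChars(0x1F637)) + " Face with Medical Mask";
        System.out.println(Arrays.toString(tally(original)));            // [7, 0, 12, 5]
        System.out.println(Arrays.toString(tally("Yummy yoghurt 42")));  // [3, 3, 6, 4]
    }
}
```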
Tip: Read the humorous article, The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).