
Is there a function or technique for selecting, from an arbitrary text, all words (strings) that consist only of uppercase letters? More specifically, I want to take all uppercase words from the text and put them into a String array, because those uppercase words are important to me.

For example, from the text: "This text was just made RANDOMLY to show what I MEANT."

The string array should contain the words RANDOMLY and MEANT.

The array should look like this: String[] myArray = {"RANDOMLY", "MEANT"};

The only approach I can think of is to go through every single letter and check whether it is uppercase.

If yes:

  • save the letter to a string variable
  • increase the value of a helper integer variable (int count) by one
  • and look at the next letter:
    • if it is uppercase again, repeat this part
    • if not, move on to the next letter.

I think my solution is not very efficient, so can you tell me your opinion of it? Or perhaps how to make it more efficient?

PS: int count is there to exclude short words of 3 letters or fewer.
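The steps above can be sketched roughly as follows. This is only one possible reading of the idea: the class name UppercaseScanner and the minLength parameter are made up for illustration, the length of the collected word stands in for int count, and any non-letter character is assumed to end a word.

```java
import java.util.ArrayList;
import java.util.List;

class UppercaseScanner {
    // Hypothetical helper: walks the text once and keeps words made only of
    // uppercase letters that are at least minLength characters long.
    static String[] extract(String text, int minLength) {
        List<String> found = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        boolean allUpper = true; // becomes false once the current word has a lowercase letter
        for (int i = 0; i <= text.length(); i++) {
            // Treat the end of the text as one final separator so the last word is flushed.
            char c = (i < text.length()) ? text.charAt(i) : ' ';
            if (Character.isLetter(c)) {
                word.append(c);
                allUpper &= Character.isUpperCase(c);
            } else {
                // Assumption: every non-letter (space, dot, digit, ...) ends the current word.
                if (allUpper && word.length() >= minLength) {
                    found.add(word.toString());
                }
                word.setLength(0);
                allUpper = true;
            }
        }
        return found.toArray(new String[0]);
    }
}
```

Calling UppercaseScanner.extract(text, 4) on the example sentence should give an array holding RANDOMLY and MEANT. It is a single pass over the text, so it is not inefficient as such, just more verbose than a regex.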

Candybrk
  • Can we assume some minimal length of words you want to find? For instance do you consider `I` as correct word? Also should `U.S.A` be counted as word? – Pshemo May 04 '15 at 19:35
  • Letters ending with . (dot) shouldn't be considered. The same goes for words with fewer than 4 letters. – Candybrk May 04 '15 at 19:42

4 Answers


Probably the easiest way to achieve this is with a regex like \b[A-Z]{4,}\b, which represents a run of four or more uppercase letters A-Z ([A-Z]{4,}) with a word boundary (\b) on each side.

So your code could look like:

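import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "This text was just made RANDOMLY to show what I MEANT.";

// Four or more uppercase letters A-Z, delimited by word boundaries
Pattern p = Pattern.compile("\\b[A-Z]{4,}\\b");
Matcher m = p.matcher(s);
while (m.find()) {
    String word = m.group(); // the uppercase word that was just matched
    System.out.println(word);
}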

Besides printing each word to the console, you can also store it in a List<String>.
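A minimal sketch of that variant, collecting the matches into a list and then converting it to the String[] the question asks for. The class and method names (UppercaseRegex, extract) are hypothetical, chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UppercaseRegex {
    // Compiled once and reused: four or more letters A-Z between word boundaries.
    private static final Pattern UPPER_WORD = Pattern.compile("\\b[A-Z]{4,}\\b");

    // Hypothetical helper: returns every match of the pattern, in order of appearance.
    static String[] extract(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = UPPER_WORD.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] myArray = extract("This text was just made RANDOMLY to show what I MEANT.");
        System.out.println(String.join(", ", myArray)); // RANDOMLY, MEANT
    }
}
```

Note that [A-Z] only covers ASCII letters; if accented uppercase letters matter, \p{Lu} could be used in place of [A-Z].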

Pshemo

Split your sentence on whitespace. Then you can use, for instance, StringUtils.isAllUpperCase(CharSequence cs) from Apache Commons Lang to check each string.

http://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#isAllUpperCase(java.lang.CharSequence)
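A rough sketch of this approach without the extra dependency. The local isAllUpperCase method is a stand-in written for this example, assumed to behave like the Commons Lang method (false for null or empty input, true only if every character is uppercase). Stripping trailing punctuation and the 4-letter minimum are additions taken from the question's comments, not part of the library call.

```java
import java.util.ArrayList;
import java.util.List;

class UppercaseSplit {
    // Stand-in for StringUtils.isAllUpperCase (assumed behaviour):
    // false for null or empty input, true only if every char is uppercase.
    static boolean isAllUpperCase(CharSequence cs) {
        if (cs == null || cs.length() == 0) {
            return false;
        }
        for (int i = 0; i < cs.length(); i++) {
            if (!Character.isUpperCase(cs.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper: splits on whitespace and keeps all-uppercase words of 4+ letters.
    static List<String> extract(String text) {
        List<String> result = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            // Drop trailing punctuation so that "MEANT." is checked as "MEANT".
            String word = token.replaceAll("\\p{Punct}+$", "");
            if (word.length() >= 4 && isAllUpperCase(word)) {
                result.add(word);
            }
        }
        return result;
    }
}
```

With Commons Lang on the classpath, the local method could simply be replaced by a call to StringUtils.isAllUpperCase. Tokens such as U.S.A are rejected because the dots inside them are not uppercase letters.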

swinkler

Use a regex to extract them, like this:

public static void main(String[] args) {
    List<String> words = new ArrayList<>();
    String dataStr = "This text was just made RANDOMLY to show what I MEANT.";
    // Two or more consecutive uppercase letters A-Z
    Pattern pattern = Pattern.compile("[A-Z][A-Z]+");
    Matcher matcher = pattern.matcher(dataStr);
    while (matcher.find()) {
        words.add(matcher.group());
    }

    System.out.println(words);
}

Output:

[RANDOMLY, MEANT]

With this in place, you can later adjust the search pattern to extract whatever you want.

K139

Here is a solution with minimal use of regex.

String s = "This text was just made RANDOMLY to show what I MEANT.";
String[] words = s.split(" |\\.");
ArrayList<String> result = new ArrayList<>();

for (String word : words) {
    String wordToUpperCase = word.toUpperCase();
    if (wordToUpperCase.equals(word)) {
        result.add(word);
    }
}

The line of code:

String[] words = s.split(" |\\.");

means that the string will be split either by a space (" ") or by a dot (".").

More info on why the backslashes (escaping) are needed here: Java string split with "." (dot)

If you had split the string by spaces only, like this:

String[] words = s.split(" ");

it would have left possibly nasty results like "MEANT."

In either case, the word "I" is included in the result. If you don't want that, add a check that every word has a length greater than 1.
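A sketch of that length check combined with the loop above. The class name UppercaseFilter and the minLength parameter are hypothetical; the extra "has at least one letter" test is an added assumption, since a token such as "2015" also equals its own uppercase form.

```java
import java.util.ArrayList;
import java.util.List;

class UppercaseFilter {
    // Hypothetical helper: same split and comparison as above, plus a minimum length.
    static List<String> extract(String text, int minLength) {
        List<String> result = new ArrayList<>();
        for (String word : text.split(" |\\.")) {
            boolean longEnough = word.length() >= minLength;
            // Assumption: digit-only tokens are not wanted, so require at least one letter.
            boolean hasLetter = word.chars().anyMatch(Character::isLetter);
            if (longEnough && hasLetter && word.equals(word.toUpperCase())) {
                result.add(word);
            }
        }
        return result;
    }
}
```

With minLength set to 2 the word "I" is dropped from the example sentence; setting it to 4 matches the limit mentioned in the question's comments.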

Vlad Muresan