Below is what I have in my array:

String[] myArray = {"about", "name", "dsafasdf", "fix"};

I want to find which entries in this array are English words.

The output should be:

Words found are as below
about
name
fix

Thanks in advance!!!

Any example or link would work!!!

Actually, I want to implement a TextTwist game. I have already generated the possible letter combinations; now I would like to check whether each String found is an actual English word or not.

Update 1

Please don't advise me to create a file, put words in it, and then search for each word in that file. That would be the worst possible program.

Fahim Parkar

6 Answers

2

You need a library containing all English words, and you have to check every word against it.
This is a similar question. If you don't want to use a Java library, you should find a text file containing all the words (or something similar) and write your own method to look a word up. Note that the text file should be sorted so that you can find a word with a divide and conquer algorithm (binary search). Otherwise each lookup has to scan the whole list, which is slow for a large one.
EDIT:
You also have to remember that names are not "English words", as @amit points out, and they can appear anywhere in a text. You should check whether a word starts with an upper-case letter and is not at the start of a sentence.
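The sorted-list lookup described above could be sketched roughly as follows. This is only an illustration: the class name `SortedWordList`, the helper `findWords`, and the tiny in-memory list are made up for the example, and a real word list would be loaded from a file. It also assumes the list and the candidates use the same letter case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortedWordList {
    // Returns the candidates that occur in the sorted word list, in their original order.
    // Assumes sortedWords is sorted in natural String order.
    static List<String> findWords(List<String> sortedWords, String[] candidates) {
        List<String> found = new ArrayList<>();
        for (String candidate : candidates) {
            // Collections.binarySearch returns a non-negative index only on an exact match.
            if (Collections.binarySearch(sortedWords, candidate) >= 0) {
                found.add(candidate);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Tiny stand-in for a real word list (hypothetical data).
        List<String> words = new ArrayList<>(Arrays.asList("name", "fix", "about", "all"));
        Collections.sort(words); // binary search requires sorted input
        String[] myArray = {"about", "name", "dsafasdf", "fix"};
        System.out.println("Words found are as below");
        for (String word : findWords(words, myArray)) {
            System.out.println(word);
        }
    }
}
```

Each lookup takes a logarithmic number of comparisons instead of a scan over the whole list.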

shift66
  • Define "English words". Is Massachusetts a word? It is not in the dictionary, but it certainly has meaning. – amit Jan 23 '12 at 14:37
  • Yes. I think there must be free libraries for this that cover your case too. – shift66 Jan 23 '12 at 14:40
  • I believe the library you are seeking is Google... What about person names? Will [Haveliwala](http://research.taherh.org/pubs/) be in your dictionary? I assume not. – amit Jan 23 '12 at 14:44
  • I think people who write a Java library and share it with the world think about that kind of thing and include it in their libraries. – shift66 Jan 23 '12 at 14:53
  • And if my name is `aVeryLongImossibleNameThatIsNotAWord`, and next year I replace Larry Page as CEO of Google, but the library is only up to date as of today, then what? Using static information is very problematic in a dynamic world. – amit Jan 23 '12 at 14:57
  • Though: the OP's recent comment clarifies he is not looking for these cases. One last thing: consider a [Trie](http://en.wikipedia.org/wiki/Trie) instead of the divide and conquer approach. – amit Jan 23 '12 at 14:58
  • There is only one case in which we don't know what to do: if a sentence starts with DonPedroRiCardoKaroloPaulo, then you'll never know whether it is a name or just a Spanish word. – shift66 Jan 23 '12 at 15:01
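For reference, the trie mentioned in the comments might look roughly like the sketch below. The class `WordTrie` and its methods are hypothetical names chosen for this example, not part of any library. Besides exact lookups, a trie can answer "does any word start with these letters?", which may help prune letter combinations early in a TextTwist-style search.

```java
import java.util.HashMap;
import java.util.Map;

class WordTrie {
    // One node per character; endOfWord marks that a complete word ends here.
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean endOfWord;
    }

    private final Node root = new Node();

    // Adds a word to the trie, creating nodes as needed.
    void insert(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.endOfWord = true;
    }

    // True only if the exact word was inserted.
    boolean contains(String word) {
        Node node = find(word);
        return node != null && node.endOfWord;
    }

    // True if at least one inserted word starts with the given prefix.
    boolean hasPrefix(String prefix) {
        return find(prefix) != null;
    }

    // Walks the trie along s; returns null if the path does not exist.
    private Node find(String s) {
        Node node = root;
        for (char c : s.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return null;
            }
        }
        return node;
    }

    public static void main(String[] args) {
        WordTrie trie = new WordTrie();
        for (String w : new String[]{"about", "name", "fix"}) { // sample words only
            trie.insert(w);
        }
        for (String candidate : new String[]{"about", "name", "dsafasdf", "fix"}) {
            if (trie.contains(candidate)) {
                System.out.println(candidate);
            }
        }
    }
}
```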
1

You will need to read in an English word list file and check against that. An example of such a file can be found here: http://wordlist.sourceforge.net/

Nick Garvey
1

Instead of working with a static collection of words as suggested in other answers, I would use something much more dynamic - the web.

A good heuristic could be: check whether the word you are looking for appears in the title of a Wikipedia article, and accept it if it does!

Note that the advantage is a dynamically growing "list" of words, without the need to store them in a dictionary.

Disadvantages: slow I/O [constant internet usage], and the list is still incomplete [some terms do not appear, even in Wikipedia]. It also requires the user to be online to use this approach.

Have a look at the Wikipedia API to understand how to do it.

Another online source of information you can use is the Bing Search API [which is free! though it has had some problems lately...]
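A rough sketch of the Wikipedia title heuristic is below, using the MediaWiki query endpoint and `java.net.http.HttpClient` (Java 11+). Several things here are assumptions rather than anything from the answer: the class and method names are invented, the `User-Agent` value is a placeholder, and the check for a missing page is a crude string match on the `formatversion=2` JSON response. Real code would want a proper JSON parser and error handling.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class WikipediaTitleCheck {
    // Builds a MediaWiki "query" URL asking about a single page title.
    static String buildQueryUrl(String word) {
        return "https://en.wikipedia.org/w/api.php?action=query&format=json&formatversion=2&titles="
                + URLEncoder.encode(word, StandardCharsets.UTF_8);
    }

    // Crude check (assumption): nonexistent or invalid titles are flagged in the JSON body.
    static boolean looksMissing(String responseBody) {
        return responseBody.contains("\"missing\":true")
                || responseBody.contains("\"invalid\":true");
    }

    // Performs the HTTP request; needs network access.
    static boolean hasArticle(String word) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(buildQueryUrl(word)))
                .header("User-Agent", "TextTwistWordCheck/0.1 (example)") // placeholder value
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() == 200 && !looksMissing(response.body());
    }

    public static void main(String[] args) throws Exception {
        for (String word : new String[]{"about", "name", "dsafasdf", "fix"}) {
            System.out.println(word + " -> " + hasArticle(word));
        }
    }
}
```

One request per word is slow, so caching results locally would probably be needed for anything interactive.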

amit
  • And what if it's just a desktop application and the user is not connected to the internet? What if the user is in the Himalayas and wants to use this application? – shift66 Jan 23 '12 at 15:04
  • Then throw an exception: "must be connected". This approach allows you to be much more dynamic, at the cost of having to be online to use it. I'll edit and add it explicitly to the "disadvantages". – amit Jan 23 '12 at 15:07
  • OK you're totally right IF it's a web application or is developed only for computers connected to the internet. – shift66 Jan 23 '12 at 15:10
  • @Ademiban: Engineering is the art of making the right tradeoff. You can never have it all :) – amit Jan 23 '12 at 15:12
0

First: Define your dictionary of English words.
Then: Put all those words in a Collection.
Finally: For each word in your array, check whether it is in the collection of English words.

This is not particularly fast (a list lookup scans every entry), but it should do the job:

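import java.util.Arrays;
import java.util.Collection;

// Example entries only; the elided rest would come from a real word source.
String[] englishWords = new String[]{"a", "all", "an", /* ... */};
Collection<String> dictionary = Arrays.asList(englishWords);
// Print every candidate that appears in the dictionary.
for (String candidate : myArray) {
  if (dictionary.contains(candidate)) {
    System.out.println(candidate);
  }
}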
Urs Reupke
  • do you want me to write all english words in `String[] englishWords = new String[]{"a", "all", "an",...};` ?? Don't tell me that please.... – Fahim Parkar Jan 23 '12 at 14:51
  • No, I want you to look up a source for English words - a free dictionary file, for example - and load that file into your list. The array above is just an example. Please clear up your question so we can see what it actually is you are asking for. – Urs Reupke Jan 23 '12 at 14:58
0

You could look the words up in a dictionary, or you can use a dedicated library for that: have a look at "How to check if a word is an English word with Python?"

Savino Sguera
0

I've implemented a version of TextTwist in Java myself, and I've found that reading a dictionary text file into a Set of Strings works great.

Here's my code, a Java Eclipse project, in case you're interested. Note that I implemented it with multiplayer functionality in mind so the code is split between client/server. https://github.com/fangsterr/Multiplayer-Text-Twist
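The Set-based approach described above could look something like the following. To be clear, this is not taken from the linked project; it is a minimal sketch that assumes a plain text file with one word per line. The file name `words.txt` and the names `DictionarySet`, `load`, and `filter` are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class DictionarySet {
    // Reads one word per line into a HashSet; blank lines are skipped.
    static Set<String> load(Reader source) throws IOException {
        Set<String> words = new HashSet<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim().toLowerCase(Locale.ROOT);
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    // Keeps only the candidates present in the dictionary, preserving order.
    static List<String> filter(Set<String> dictionary, String[] candidates) {
        List<String> found = new ArrayList<>();
        for (String candidate : candidates) {
            if (dictionary.contains(candidate.toLowerCase(Locale.ROOT))) {
                found.add(candidate);
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // "words.txt" is a hypothetical path to a one-word-per-line dictionary file.
        Set<String> dictionary =
                load(Files.newBufferedReader(Paths.get("words.txt"), StandardCharsets.UTF_8));
        String[] myArray = {"about", "name", "dsafasdf", "fix"};
        System.out.println("Words found are as below");
        for (String word : filter(dictionary, myArray)) {
            System.out.println(word);
        }
    }
}
```

The file is read once at startup; after that, each `contains` call on the HashSet is a constant-time lookup on average.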

fangsterr