Is there an open-source Java library for converting String numbers into their equivalent Integers (for example, converting "ten" into 10)? I know how to do it, but I'd rather not waste my customer's time writing one from scratch if there's already a library available.
-
I do not know of a proper "library", but there are many such academic projects and one-off classes floating about. I believe I saw one by Eric Lippert a while back (for C#). – Sep 25 '12 at 20:26
-
Maybe somewhere in [ICU](http://site.icu-project.org/home)? – Brendan Long Sep 25 '12 at 20:26
-
http://stackoverflow.com/q/3911966/106261 – NimChimpsky Sep 25 '12 at 20:39
-
@NimChimpsky Not a duplicate. That is the *opposite* direction. (I couldn't find one going this direction, but I have seen it, I am sure ..) – Sep 25 '12 at 20:47
2 Answers
I doubt that such a library exists.
If you're only looking to convert a limited set of numbers (such as zero through ten), then it would probably take you more time to ask this question here than to just implement it yourself.
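For that limited case, a lookup table is about all it takes. The following is only a minimal sketch, assuming Java 9+ (for `Map.ofEntries`) and English words; the class name `SmallNumberWords` and its `parse` method are made up for illustration and do not come from any existing library.

```java
import java.util.Locale;
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical helper: maps the words "zero" through "ten" to their int values.
final class SmallNumberWords {
    private static final Map<String, Integer> WORDS = Map.ofEntries(
        Map.entry("zero", 0), Map.entry("one", 1), Map.entry("two", 2),
        Map.entry("three", 3), Map.entry("four", 4), Map.entry("five", 5),
        Map.entry("six", 6), Map.entry("seven", 7), Map.entry("eight", 8),
        Map.entry("nine", 9), Map.entry("ten", 10));

    private SmallNumberWords() { }

    // Returns the value for a known word, or an empty OptionalInt otherwise.
    static OptionalInt parse(String text) {
        if (text == null) {
            return OptionalInt.empty();
        }
        // Ignore surrounding whitespace and letter case.
        Integer value = WORDS.get(text.trim().toLowerCase(Locale.ROOT));
        return value == null ? OptionalInt.empty() : OptionalInt.of(value);
    }
}
```

With this sketch, `SmallNumberWords.parse("ten")` yields an `OptionalInt` holding 10, and an unknown word such as `"eleven"` yields an empty result rather than an exception.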
If you're looking at converting more complex numbers, such as "one hundred twenty four and fifty-one hundredths", then what you're looking for is a natural language recognizer, which is extremely complicated and unlikely to have a good library in any language.
In the end, it's normally best not to couple back-end values to user-facing content.

-
No. There is *no* complex NL required for this task -- it is a much simpler problem with a much more refined scope. The given case is still relatively trivial to handle. (It is even simpler if there is no need to deal with fractional values, and this question is limited to integers.) – Sep 25 '12 at 20:33
-
@pst `Twenty-one Hundred` `twenty one Hundred` `two thousand one hundred` `one hundred and one` `one hundred one` `twenty thirteen` `forty two point five` `3 thousand forty five` `twenty k` – Sam I am says Reinstate Monica Sep 25 '12 at 20:35
-
Again, *those examples do not represent a complicated grammar* and are easily contained in a simple CFL. (While there are opposing arguments on whether NLs are CFLs or not, they are at the *extreme* end of complexity. This is not. Also, some of those forms could be excluded from accepted input in this case.) – Sep 25 '12 at 20:37
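To illustrate how small that grammar can be for plain integers, here is a rough sketch of a single-pass accumulator. It rests on several assumptions: English only, non-negative values up to the billions, "and" treated as filler, and hyphens treated like spaces. It deliberately excludes some of the forms listed above (digits such as `3 thousand`, `point`, `k`, and year-style readings such as `twenty thirteen`), and it does not check that the words appear in a sensible order. The class name `NumberWordParser` is hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: accumulates English number words into a long.
final class NumberWordParser {
    private static final Map<String, Integer> SMALL = new HashMap<>();
    private static final Map<String, Long> SCALES = new HashMap<>();

    static {
        String[] units = {"zero", "one", "two", "three", "four", "five", "six",
            "seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen",
            "fourteen", "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"};
        for (int i = 0; i < units.length; i++) {
            SMALL.put(units[i], i);
        }
        String[] tens = {"twenty", "thirty", "forty", "fifty", "sixty",
            "seventy", "eighty", "ninety"};
        for (int i = 0; i < tens.length; i++) {
            SMALL.put(tens[i], (i + 2) * 10);
        }
        SCALES.put("thousand", 1_000L);
        SCALES.put("million", 1_000_000L);
        SCALES.put("billion", 1_000_000_000L);
    }

    private NumberWordParser() { }

    static long parse(String text) {
        String normalized = text.trim().toLowerCase(Locale.ROOT);
        if (normalized.isEmpty()) {
            throw new IllegalArgumentException("Empty input");
        }
        long total = 0;    // sum of completed groups (thousands, millions, ...)
        long current = 0;  // value of the group currently being read
        for (String token : normalized.split("[\\s-]+")) {
            if (token.equals("and")) {
                continue;  // filler word, as in "one hundred and one"
            }
            Integer small = SMALL.get(token);
            Long scale = SCALES.get(token);
            if (small != null) {
                current += small;
            } else if (token.equals("hundred")) {
                // A bare "hundred" counts as one hundred.
                current = (current == 0 ? 1 : current) * 100;
            } else if (scale != null) {
                total += (current == 0 ? 1 : current) * scale;
                current = 0;
            } else {
                throw new IllegalArgumentException("Unrecognized number word: " + token);
            }
        }
        return total + current;
    }
}
```

Under these assumptions, `NumberWordParser.parse("twenty-one hundred")` and `NumberWordParser.parse("two thousand one hundred")` both return 2100, and `NumberWordParser.parse("one hundred and one")` returns 101. Anything outside the word tables raises an `IllegalArgumentException`, which is one way of excluding the unsupported forms from accepted input.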
-
Now, an example which *would* require an "extremely complicated natural language recognizer" might be: ["Charlotte's Web is a children's novel by American author E. B. White, about a pig named Wilbur who is saved from being slaughtered by an intelligent spider named Charlotte."](http://xkcd.com/1087/) – Sep 25 '12 at 20:45
-
@pst how did the pig get saved from the spider? – Sam I am says Reinstate Monica Sep 25 '12 at 20:54
For "twenty-seven" or "twenty and seven"? For "twenty seven" or "score and seven"? Baker's dozen anyone? A pair of dice, or two dice? One short of a six pack? The trifecta of number processing routines? The 21st century (year 20xx)?
Your requirements are a bit broader than I imagine you considered them to be. I'd recommend working with a framework that gives you the flexibility to add new representations instead of assuming a single one. Apache OpenNLP (Apache's open natural language processing framework) might be a good choice.
After a few attempts, you might build the trinity of number processing routines. Or at least have a plethora of ideas.
