
I am trying to extract spelled-out numbers from strings, together with the word that follows each number. I have managed to do this the laborious way, by listing the spelled-out numbers to search for myself (here is an example using stringr::sentences):

library(stringr)

# Number words padded with spaces, each followed by a capture of the next word
numbers <- str_c(c(" one ", " two ", " three ", " four ", " five ", " six ", " seven ", " eight ", " nine ", " ten "), "([^ ]+)")
number_match <- str_c(numbers, collapse = "|")

# Keep only the sentences that contain a match, then extract the match
sent <- sentences[str_detect(sentences, number_match)]
str_extract(sent, number_match)

These are the extracted strings:

 [1] " seven books"   " two met"       " two factors"   " three lists"   " seven is"      " two when"      " ten inches."   " one war"      
 [9] " one button"    " six minutes."  " ten years"     " two shares"    " two distinct"  " five cents"    " two pins"      " five robins." 
[17] " four kinds"    " three story"   " three inches"  " six comes"     " three batches" " two leaves."

As I cannot know up front whether I have covered every possible number, I was wondering whether R provides a tool that can identify spelled-out numbers. I have found similar questions, e.g. "Convert spelled out number to number", but unfortunately that one is not about R.
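To make the coverage concern concrete, here is a minimal sketch of how the hand-written list could be generalised. It is not a dedicated tool. The vocabulary vectors (`units`, `teens`, `tens`) and the helper `extract_number_phrase()` are hypothetical names introduced for illustration, and the word list is an assumption that would need extending (for example with "hundred" or "thousand"). It uses word boundaries instead of padding spaces, so numbers at the start of a sentence are also found.

```r
library(stringr)

# Assumed vocabulary: extend it to cover the numbers you expect to see.
units <- c("one", "two", "three", "four", "five", "six", "seven", "eight", "nine")
teens <- c("ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
           "sixteen", "seventeen", "eighteen", "nineteen")
tens  <- c("twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety")

# Longest words first, so that "sixteen" is tried before "six".
words <- c(units, teens, tens)
words <- words[order(nchar(words), decreasing = TRUE)]
alt <- str_c(words, collapse = "|")

# One or more number words joined by a space or hyphen, then the next word.
number_pattern <- regex(
  str_c("\\b((?:", alt, ")(?:[ -](?:", alt, "))*)\\b\\s+(\\S+)"),
  ignore_case = TRUE
)

# Hypothetical helper: returns the first number phrase and the word after it.
extract_number_phrase <- function(x) {
  m <- str_match(x, number_pattern)
  data.frame(number = m[, 2], next_word = m[, 3], stringsAsFactors = FALSE)
}

extract_number_phrase(c("twenty four books were sold",
                        "Sixteen archangels sang",
                        "no numbers here"))
```

Strings without a match come back as NA in both columns. Like the original approach, this also picks up "one" used as a pronoun, so the results still need checking.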

Any help is appreciated.

NelnewR
  • Can you give an example of your strings (you mention additions)? – Val Mar 13 '18 at 08:44
  • With addition I actually meant that I want to extract the spelled out number, and in addition the word that comes after the number. Sorry for not being clear. I have edited the question to clarify – NelnewR Mar 13 '18 at 08:45
  • What do you want to get for "twenty four book"? Or "two and a half cakes"? Or "sixteen archangels"? – Zahiro Mor Mar 13 '18 at 09:01
  • It is an example as I am trying to learn R. I realize the results might be strange. But the most important question is how to find the spelled-out numbers, and I think this is a relevant question. – NelnewR Mar 13 '18 at 09:03
  • There's also [this question](https://stackoverflow.com/questions/18332463/convert-written-number-to-number-in-r).... – A5C1D2H2I1M1N2O1R2T1 Mar 13 '18 at 10:19

0 Answers