0

I am trying to match a sentence that contains both English and non-English characters but does not contain purely numeric values, including decimals.

Example that should match:

Renforcé-Bettwäschegar BLUMIRA123

Should not match:

999.99

The following pattern matches runs of characters outside the ASCII range:

[^\u0000-\u0080]+

This is all I have at the moment. Any help will be much appreciated.

Thank you.

thinking_hydrogen
  • So it should match strings which contain at least one English letter ([a-zA-Z]) and one accented/diacritic letter ([àáâäåÀÁÂÃçÇêéëèÊËÉÈïíîìÍÌÎÏñÑöòõóÓÔÕÖÒšŠúüûùÙÚÜÛÿŸýÝžŽ])? Is there a need for non-ASCII characters? – mtanti Dec 19 '13 at 14:29
  • mtanti - Thanks for your comment. There's no need for non-ASCII characters. Thank you. – thinking_hydrogen Dec 19 '13 at 14:39

4 Answers

2

See if this works:

.*([a-zA-Z].*[àáâäåÀÁÂÃçÇêéëèÊËÉÈïíîìÍÌÎÏñÑöòõóÓÔÕÖÒšŠúüûùÙÚÜÛÿŸýÝžŽ]|[àáâäåÀÁÂÃçÇêéëèÊËÉÈïíîìÍÌÎÏñÑöòõóÓÔÕÖÒšŠúüûùÙÚÜÛÿŸýÝžŽ].*[a-zA-Z]).*
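A rough sketch for trying this pattern against the sample strings from the question. The names `accented`, `bothKinds` and `hasBothKinds` are made up for illustration, and the character class is copied from the pattern above, so letters it leaves out (ã or ô, for instance) would not be recognised.

```javascript
// Character class copied from the pattern above; it is not an exhaustive list of accented letters.
const accented = "[àáâäåÀÁÂÃçÇêéëèÊËÉÈïíîìÍÌÎÏñÑöòõóÓÔÕÖÒšŠúüûùÙÚÜÛÿŸýÝžŽ]";

// Either an English letter followed later by an accented one, or the other way round.
const bothKinds = new RegExp(
  ".*([a-zA-Z].*" + accented + "|" + accented + ".*[a-zA-Z]).*"
);

// Hypothetical helper name, used only for this illustration.
function hasBothKinds(s) {
  return bothKinds.test(s);
}

console.log(hasBothKinds("Renforcé-Bettwäschegar BLUMIRA123")); // true
console.log(hasBothKinds("999.99"));                            // false
console.log(hasBothKinds("Another test"));                      // false: no accented letter
```

Because the pattern needs at least one letter of each kind, a purely numeric string such as 999.99 fails without any extra check.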
mtanti
2

First of all I'll assume that you have split your text into sentences. Then try this:

!/(?:^| )[0-9]+(?:\.[0-9]+)?(?: |$)/.test(sentence);

For example, this is the result returned for each of the sentences below:

Renforcé-Bettwäschegar BLUMIRA123 //true
999.99                            //false
Another test                      //true
Hi this is a test 124             //false
Hi this is a test 124.23          //false
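The same idea can also be sketched by splitting the sentence on whitespace and testing each token on its own, which some may find easier to read than a single pattern. This is only an illustration: `numericToken` and `hasNoStandaloneNumber` are hypothetical names, and "pure numeric" is assumed to mean digits with an optional decimal part.

```javascript
// A token counts as "pure numeric" if it is only digits, optionally with a decimal part.
const numericToken = /^[0-9]+(?:\.[0-9]+)?$/;

// Hypothetical helper: true when no whitespace-separated token is purely numeric.
function hasNoStandaloneNumber(sentence) {
  return !sentence
    .trim()
    .split(/\s+/)
    .some((token) => numericToken.test(token));
}

const samples = [
  "Renforcé-Bettwäschegar BLUMIRA123",
  "999.99",
  "Another test",
  "Hi this is a test 124",
  "Hi this is a test 124.23",
];

// Print each sample next to the result of the check.
for (const s of samples) {
  console.log(s, "->", hasNoStandaloneNumber(s));
}
```

BLUMIRA123 is left alone because the digits are attached to letters, while 124 and 124.23 are rejected as standalone numbers.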
psxls
1

This should do the trick:

!/^[0-9.]+$/.test(s)

Note that the pattern itself matches only strings made up entirely of digits and dots, so the result has to be negated (hence the leading !).
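A small sketch of how this whole-string check behaves. The `loose` function wraps the expression above; `strict` is an assumed tighter variant (digits with at most one decimal part) added only for comparison, and both names are hypothetical.

```javascript
// Loose check from above: any mix of digits and dots counts as numeric.
const loose = (s) => !/^[0-9.]+$/.test(s);

// Stricter variant (an assumption, not part of the answer): digits with at most one decimal part.
const strict = (s) => !/^[0-9]+(?:\.[0-9]+)?$/.test(s);

console.log(loose("999.99"), strict("999.99"));           // false false
console.log(loose("1.2.3"), strict("1.2.3"));             // false true
console.log(loose("BLUMIRA 123"), strict("BLUMIRA 123")); // true true
```

Both versions only reject a string that is numeric from start to end, so a sentence with a number inside it still passes.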

seldon
  • I think the OP wants both accented and non-accented letters though so just non-accented letters for example wouldn't do. – mtanti Dec 19 '13 at 14:44
  • `!/^[0-9.]+$/.test("BLUMIRA 123");` will return "true", but since it contains "pure numeric" to quote OP, this is wrong. – psxls Dec 19 '13 at 15:11
  • I understood that a string is wrong only if it consists entirely of numerals – seldon Dec 19 '13 at 17:03
0

Thanks for the input. The regex below seems to work for me.

^([\x00-\xFF]+[a-zA-Z][\x00-\xFF]+)*
thinking_hydrogen
  • Are you sure? It doesn't look like it will match what you described. This will match any text at the beginning of a line that contains only ASCII characters with one English letter in the middle, not at the beginning or end. The text does not have to contain an accented letter. – mtanti Dec 20 '13 at 10:09
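The concern can be checked quickly in a console. This sketch assumes the character class in the posted pattern was meant to be [\x00-\xFF]; the `anchored` variant is not from the thread and is included only for comparison.

```javascript
// Pattern from the post, assuming the class was meant to be [\x00-\xFF].
const posted = /^([\x00-\xFF]+[a-zA-Z][\x00-\xFF]+)*/;

// The group is optional (*) and there is no end anchor, so an empty match always succeeds.
console.log(posted.test("999.99")); // true
console.log(posted.test(""));       // true

// Anchored variant for comparison (an assumption): the pattern must cover the whole string.
const anchored = /^[\x00-\xFF]+[a-zA-Z][\x00-\xFF]+$/;
console.log(anchored.test("999.99"));       // false
console.log(anchored.test("Another test")); // true, even without an accented letter
```

Even the anchored form does not require an accented letter, so it would need to be combined with a check like the one in the first answer.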