I want to check whether a line in a text file contains a special character, using a regex in a shell script. For example, given the line "assccÑasas", how can I detect the 'Ñ' so that the line is reported as an error? I also want to detect symbols such as '/', '^', '&', etc.

My code:

VALID='^[a-zA-Z_0-9&.<>/|\-]+$' #myregex

checkError(){
    if [[ $line =~ $VALID ]]; then
        echo "tes"
    else
        echo "not okay"
        exit 1
    fi
}

while read line
do
    checkRusak
done < $1

For example, the line "StÀck Overflow;" should output an error. The line "stack overflow;" should also output an error. Only "stack overflow" (no symbols or special characters) should output "tes".

So far it can detect symbols ('/', '\', etc.), but special characters are still a problem.

Any help is really appreciated. Thank you in advance.

Luthfan M
  • What's wrong with what you have (other than the names not matching)? – Ignacio Vazquez-Abrams Aug 10 '15 at 03:16
  • The problem is that the regex still accepts special characters (as I explained above), such as the ones shown here: http://tools.oratory.com/altcodes.html, for example ¿ or ñ. I want the program to detect special characters and some (not all) symbols such as '/', '*', '(' etc. – Luthfan M Aug 10 '15 at 03:18
  • The problem is that bash regex is highly dependent on the locale definition on the system and on the implementation of the regex engine (collation expansion or not). With locale expansion on, and in a locale where Ñ is defined to sort within the range `a-z` in the collation expansion, `[a-z]+` will match the whole string `"assccÑasas"`. For example, this code `[[ "assccÑasas" =~ ([a-Z]*) ]] && echo "${BASH_REMATCH[1]}"` outputs `assccÑasas` on my Cygwin installation. – nhahtdh Aug 10 '15 at 10:38

1 Answer

You could either

  • define which characters are valid and only allow those, or

  • define which characters are invalid and check whether one is present

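For the 2nd approach, you could use this regex: [^a-zA-Z_0-9\s]. Note that bash's [[ =~ ]] uses POSIX extended regular expressions, where \s is not a whitespace escape inside a bracket expression, so in bash write it as [^a-zA-Z_0-9[:space:]].

The ^ inside the square brackets negates the character class, so the regex matches any string that contains a character that is not a letter (A-Z, a-z), a digit, an underscore or whitespace.

Since you only need to detect a single such character, you don't need a quantifier.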
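
As a minimal sketch of the 2nd approach in bash (not a definitive implementation): the function names `has_invalid_char` and `check_file` are hypothetical, and the pattern uses the POSIX class `[[:space:]]` for whitespace. It assumes that forcing `LC_ALL=C` inside the function is acceptable, so that `a-z` and `A-Z` are plain ASCII ranges and accented letters such as Ñ are not silently accepted by the locale's collation order.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds (returns 0) if the argument contains any
# character other than an ASCII letter, digit, underscore or whitespace.
has_invalid_char() {
    # Assumption: the C locale is forced so that a-z / A-Z cover ASCII only.
    local LC_ALL=C
    local invalid='[^a-zA-Z_0-9[:space:]]'
    [[ $1 =~ $invalid ]]
}

# Hypothetical helper: reads the file named by $1 line by line and stops
# at the first line that contains an invalid character.
check_file() {
    local line
    while IFS= read -r line || [[ -n $line ]]; do
        if has_invalid_char "$line"; then
            echo "not okay: $line"
            return 1
        fi
    done < "$1"
    echo "okay"
}
```

Calling `check_file "$1"` from a script would then report the first offending line. To allow a few extra symbols, they could be added inside the brackets of the `invalid` pattern.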

CarHa