I made this regex but it is not working.

I have this example string: '/test/test/test/test-Test/TEST/test.sql'

My bash code:

if [[ ${arrayQuery[$i]} =~ ([a-z0-9]+)\/([a-z0-9]+)\/([a-z0-9]+)\/([a-z0-9-]+)\/([a-z0-9]+)\/([a-z0-9]+).([a-z0-9]+) ]]; then
        queryName=$1
        echo "test $queryName"
fi

It is not printing anything. Can anyone explain why this is not working?

I tried my regex on regex101.com, and it did work there.

  • It looks to me like you're only looking for lowercase letters and digits, but you have uppercase letters in the example string, so it should not match. – Eric Renouf Oct 15 '15 at 14:11
  • You don't need to escape the `/` in the regex. `bash` doesn't use them as a delimiter the way `sed`, for example, does, so they have no special meaning. – chepner Oct 15 '15 at 14:27
  • May I ask you _why_ you're using this regex? are you trying to split your path into its dir components? – gniourf_gniourf Oct 15 '15 at 14:40
  • Hi guys, sorry for the late reaction, I had a busy weekend :S Anyway, thanks for the answers! @gniourf_gniourf yes, I want to split the path into components. Is a regex a good way to do this? – kaspertje100 Oct 19 '15 at 10:39

5 Answers

2
  • You need to escape the dot; otherwise it matches any character.

  • Your example string contains uppercase letters, but your regex only accepts lowercase letters.

(edit: no quoting needed)
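
Both points can be checked in isolation. The following is a minimal sketch, not the asker's code: the sample strings `testXsql` and `TEST` and all variable names are made up for illustration, and `LC_ALL=C` is set on the assumption that you want `[a-z]` to mean plain ASCII lowercase (bracket ranges are locale-dependent).

```bash
#!/usr/bin/env bash
# Bracket ranges such as [a-z] are locale-dependent; the C locale keeps them ASCII-only.
LC_ALL=C

# Hypothetical patterns, chosen only to isolate each problem.
re_any_char='^[a-z]+.sql$'      # unescaped dot: matches any single character
re_literal_dot='^[a-z]+\.sql$'  # escaped dot: matches only a literal "."

# 'testXsql' has an X where the dot should be.
[[ 'testXsql' =~ $re_any_char ]]    && dot_unescaped=match || dot_unescaped=nomatch
[[ 'testXsql' =~ $re_literal_dot ]] && dot_escaped=match   || dot_escaped=nomatch

# 'TEST' is all uppercase.
[[ 'TEST' =~ ^[a-z]+$ ]]    && lower_only=match || lower_only=nomatch
[[ 'TEST' =~ ^[a-zA-Z]+$ ]] && both_cases=match || both_cases=nomatch

echo "unescaped dot: $dot_unescaped, escaped dot: $dot_escaped"
echo "lowercase class: $lower_only, mixed-case class: $both_cases"
```

Keeping the pattern in a variable and expanding it unquoted on the right-hand side of `=~` avoids most quoting surprises.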

marek.jancuska
1

Your example does not work in regex101 with the string you provided for several reasons:

  1. Your string starts with a '/' but your regex starts with ([a-z0-9]+)
  2. Your string contains Upper case letters yet you don't use [A-Z] in your regex
  3. Your string contains a '-', but your regex accepts it only in the fourth group; add it to every character class if any component may contain one
  4. You did not escape the dot; replace it with '\.'. By default, '.' matches any character

This regular expression would do the trick (link to regex101):

\/[a-zA-Z0-9\-]+\/[a-zA-Z0-9\-]+\/[a-zA-Z0-9\-]+\/[a-zA-Z0-9\-]+\/[a-zA-Z0-9\-]+\/[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-]+

I guess this string represents a SQL file on your hard drive. If so, the regex can be shortened to:

(\/[a-zA-Z0-9\-]+)+\.sql

and does not depend on how many folders you have in your directory tree.
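
For use inside bash itself, here is a hedged sketch of how the two patterns above might be applied. It assumes the path from the question; the variable names `seg`, `re`, `re_short`, `extension` and `lastSegment` are illustrative. The slashes are left unescaped and the hyphen is placed last in the bracket expression, since bash's `=~` uses POSIX extended regular expressions, where a backslash inside brackets is an ordinary character. Capture groups land in the `BASH_REMATCH` array rather than in `$1`.

```bash
#!/usr/bin/env bash
LC_ALL=C  # keep [a-zA-Z] ranges ASCII-only

path='/test/test/test/test-Test/TEST/test.sql'

# One path component: letters, digits, hyphen (hyphen last, so no backslash needed).
seg='[a-zA-Z0-9-]+'

# Fixed-depth form: six components, then a dot and an extension.
re="^/($seg)/($seg)/($seg)/($seg)/($seg)/($seg)\.($seg)$"
if [[ $path =~ $re ]]; then
    queryName=${BASH_REMATCH[6]}   # last component before the dot
    extension=${BASH_REMATCH[7]}
    echo "test $queryName.$extension"
fi

# Shortened form: any depth. A repeated group keeps only its last iteration.
re_short='^(/[a-zA-Z0-9-]+)+\.sql$'
if [[ $path =~ $re_short ]]; then
    lastSegment=${BASH_REMATCH[1]}
    echo "last segment: $lastSegment"
fi
```

The shortened form is convenient for validating a path, but because only the final repetition is captured, it is not a way to extract every directory name.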

guillaume guerin
  • So I changed my code to this: `echo "${arrayQuery[$i]}"; if [[ ${arrayQuery[$i]} =~ (\/([a-zA-Z0-9\-]+)\/([a-zA-Z0-9\-]+)\/([a-zA-Z0-9\-]+)\/([a-zA-Z0-9\-]+)\/([a-zA-Z0-9\-]+)\/([a-zA-Z0-9\-]+)\.([a-zA-Z0-9\-]+)) ]]; then queryName=$1; echo "test $queryName"; fi` – kaspertje100 Oct 19 '15 at 10:50
  • It does go into the if statement now (in the console I see it print "test"), but it doesn't print $queryName because the $1 variable is empty. I don't understand why $1 is empty. Could someone explain this to me, please? – kaspertje100 Oct 19 '15 at 10:52
  • Never mind, fixed it. I used ${BASH_REMATCH[1]} to print the result. Apparently $1 doesn't exist in bash?? – kaspertje100 Oct 19 '15 at 11:42
0

It should work if you add A-Z (capital letters) to all of the character sets.

reynoldsnlp
0

You can use [[:alnum:]] as a shortcut for the (correct) class [a-zA-Z0-9] to simplify your regex while fixing it. Simpler still, store the repeated piece in a parameter:

x='[[:alnum:]]+'
# Or, if you prefer
# x='[a-zA-Z0-9]+'
if [[ ${arrayQuery[$i]} =~ ($x)/($x)/($x)/($x)/($x)/($x)\.($x) ]]; then
    queryName=${BASH_REMATCH[1]}  # capture groups are stored in BASH_REMATCH, not $1
    echo "test $queryName"
fi

There is also an option to make regular expression matches case-insensitive, which would allow you to use your current regular expression (with a few minor fixes):

shopt -s nocasematch
if [[ ${arrayQuery[$i]} =~ ([a-z0-9]+)/([a-z0-9]+)/([a-z0-9]+)/([a-z0-9-]+)/([a-z0-9]+)/([a-z0-9]+)\.([a-z0-9]+) ]]; then
    queryName=${BASH_REMATCH[1]}  # capture groups are stored in BASH_REMATCH, not $1
    echo "test $queryName"
fi
shopt -u nocasematch  # Turn it off again
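
One caveat with the snippet above: `shopt -u nocasematch` turns the option off unconditionally, even if the caller had it enabled beforehand. A possible way around that is sketched below. The helper name `matches_nocase` and the shortened sample path are hypothetical, and the approach (saving the setting with `shopt -p` and replaying it) is only one option.

```bash
#!/usr/bin/env bash

# Hypothetical helper: case-insensitive regex match that leaves the caller's
# nocasematch setting as it found it.
matches_nocase() {
    local string=$1 regex=$2 status restore
    # shopt -p prints the command that recreates the current setting;
    # it returns non-zero when the option is off, hence the "|| true".
    restore=$(shopt -p nocasematch) || true
    shopt -s nocasematch
    [[ $string =~ $regex ]]
    status=$?
    $restore            # runs "shopt -s nocasematch" or "shopt -u nocasematch"
    return "$status"
}

# BASH_REMATCH is global, so the captures are still visible after the call.
if matches_nocase '/test/TEST/test.sql' '^/([a-z0-9]+)/([a-z0-9]+)/([a-z0-9]+)\.([a-z0-9]+)$'; then
    secondDir=${BASH_REMATCH[2]}
    fileName=${BASH_REMATCH[3]}.${BASH_REMATCH[4]}
    echo "dir: $secondDir, file: $fileName"
fi
```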
chepner
  • You missed the "-" character, which is needed to match the "test-Test" part. – marek.jancuska Oct 15 '15 at 14:37
  • Good point. Not sure if that should be in each component, or just in the component that contains one in the user's input. It's not clear exactly what the regular expression should be. – chepner Oct 15 '15 at 14:38
  • You also missed the first character '/' of the string '/test/test/test/test-Test/TEST/test.sql', because your regex starts with "([a-z0-9]+)". – guillaume guerin Oct 15 '15 at 14:59
  • The leading `/` is only necessary if you want to ensure that the path has *exactly* six components, rather than *at least* six components. As is, you will match and capture the last six in either case. – chepner Oct 15 '15 at 15:01
0

From your comment it seems that you want to split the path into its components. The best option, in Bash, is to use this:

mypath=/testa/testb/testc/test-Testd/TESTe/test.sql
IFS=/ read -r -d '' -a components < <(printf '%s\0' "$mypath")

Like so, you'll have an array components that will contain each component of your path:

gniourf$ declare -p components
declare -a components='([0]="" [1]="testa" [2]="testb" [3]="testc" [4]="test-Testd" [5]="TESTe" [6]="test.sql")'

You don't need (and don't want) to use a regex for this.
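
Once the path is split this way, the individual pieces can be picked out with ordinary array indexing and parameter expansion. A small sketch, assuming the same `mypath` as above; the names `last`, `fileName`, `queryName` and `extension` are illustrative and not part of the original answer:

```bash
#!/usr/bin/env bash
mypath=/testa/testb/testc/test-Testd/TESTe/test.sql

# Split on "/" into an array; IFS is changed only for this one read.
IFS=/ read -r -d '' -a components < <(printf '%s\0' "$mypath")

# Index of the last element (the file name).
last=$(( ${#components[@]} - 1 ))
fileName=${components[last]}

# Strip the extension, or keep only the extension, with parameter expansion.
queryName=${fileName%.*}
extension=${fileName##*.}

echo "file: $fileName, query: $queryName, extension: $extension"
echo "first directory: ${components[1]}"   # element 0 is empty because the path starts with "/"
```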

Also see this question: How do I split a string on a delimiter in Bash?

gniourf_gniourf