0

I'm trying to pull all MySQL queries from all of our scripts in native PHP. I'm looking to pull the table names ONLY from the queries below using grep. Below was my efforts wasted on what I came up with.

FROM 'tablename'

FROM tablename

FROM apn.tablename

FROM apn.table_name

FROM 'apn.tablename'

grep -ionER "(FROM)[[:space:]](.*[a-zA-Z\d_.\`])[\s]"

It's important the grep capture stops capturing text right after the table name ends, which my grep does not.

I need the results to show this information:

(script location) : (line number) : (table name)

/var/www/sites... : Line 31 : example_table_name

user2019528
  • 73
  • 1
  • 1
  • 6
  • What does "a backtick is affecting the code block" mean? – Ed Morton Mar 31 '13 at 14:36
  • it means I tried to make use of the code block code on stackoverflow but because there is a backtick in my example code it made an error. – user2019528 Apr 01 '13 at 17:08
  • 1
    just indent your code 4 spaces or use the {} directive on the SO toolbar. I'll tweak it for you to use that but I don't know what the actual code is intended to be so you'll have to modify it yourself afterwards. – Ed Morton Apr 01 '13 at 17:12

3 Answers3

4

Don't use grep. This is tailor-made job for Awk:

awk '$1 == "FROM" { print $2 }'

EDIT Thanks to @rojo for this suggestion

awk 'BEGIN{FS="from|FROM|where|WHERE"} /from|FROM/ {print $2}'

EDIT 2: WIth filename and line #

awk 'BEGIN{FS="from|FROM|where|WHERE"}
      /from|FROM/ {printf ("%s:%d:%s\n", FILENAME, NR, $2)}'
anubhava
  • 761,203
  • 64
  • 569
  • 643
  • This doesn't appear to do anything after piping into a simple select statement. `echo "SELECT * FROM 'apn.test_contacts'" | awk '$1 == "FROM" { print $2 }'` – user2019528 Apr 01 '13 at 17:07
  • It is because in your question you showed inputs as `FROM 'tablename'` instead of full SQL that starts from a `SELECT`. – anubhava Apr 01 '13 at 17:09
  • @anubhava - How about `awk 'BEGIN{FS="from|FROM|where|WHERE"} /from|FROM/ {print $2}'`? But then the surrounding apostrophes / quotes / backticks / whatever still need to be removed.... – rojo Apr 01 '13 at 17:38
  • @rojo: Thanks for your comments. Yes it can be done that way and I thought of suggesting that but an SQL query be in so many flavors that this type of parsing will fail eventually. – anubhava Apr 01 '13 at 17:50
  • @anubhava right. `INSERT INTO tablename`, `TRUNCATE TABLE tablename`, and `UPDATE tablename SET column='etc'` are a few that come to mind immediately. Sometimes it's fun to give the user exactly what he asks for when it might not be what he actually *needs*. – rojo Apr 01 '13 at 19:14
  • the awk statement does work, thanks. But it does not allow for me to retain the line numbers nor the script location that grep usually outputs. I need these things to be displayed, can you guide me? – user2019528 Apr 01 '13 at 20:16
  • @user2019528: Pls check the **EDIT 2** for printing filename and line # – anubhava Apr 01 '13 at 20:41
  • Results look like "-::". The lineNumber and match fields show correct, but the FILENAME function is not working. I have been looking into it and it looks like thats because its within the "BEGIN" statement, and you might have to define 'getline'? – user2019528 Apr 03 '13 at 16:01
  • 1
    FILENAME will work as long as you supply awk a file and not use awk in a piped command. You should use awk as `awk 'command' file` – anubhava Apr 03 '13 at 16:21
2

Would a lookbehind accomplish what you want?

grep -P -i -o '(?<=from )\S+' *.php | sed -r 's/^\W|\W$//g'

Update:

If you want the file name and the line number printed as well, you'll probably need a for loop:

for i in `grep -R --include=*.php -l -i 'FROM' /var/www/sites`; do grep -Pion '(?<=from )\S+' $i | sed -r -e "s/['\`\"]/ /g" -e 's#^#'$i'... : line #'; done

This works as follows:

  • for each file in
    • grep recursive, print file name, case insensitive search for FROM in *.php
  • do
    • look for non-spaces following "from ", print only line number and matching word
    • use sed to replace '"` with a space and insert the filename at the beginning of the line

Example session:

rojo@pico:~$ cat Desktop/test.php
' SELECT * FROM `contacts` WHERE 1=1' test data here that should be cut out'

rojo@pico:~$ for i in `grep -R --include=*.php -l -i 'FROM' .`; do grep -Pion '(?<=from )\S+' $i | sed -r -e "s/['\`\"]/ /g" -e 's#^#'$i'... : line #'; done
./Desktop/test.php... : line 1: contacts

Here's another alternative using awk:

find /var/www/sites -type f -iname '*.php' -print0 | xargs -0 awk 'BEGIN {FS="from|FROM|where|WHERE"} {++x;} /from|FROM/ {printf "%s... : line %d : %s%s", FILENAME, x, $2, ORS}'

... But I haven't figured out how to make it strip quotes / backticks / apostrophes surrounding the table names. I could probably pipe it through sed or tr if it's important, but there has to be a more graceful way to do it.

rojo
  • 24,000
  • 5
  • 55
  • 101
  • it doensn't pull anything, try using this to test it, and pipe the grep command after. `echo "' SELECT * FROM \`contacts\` WHERE 1=1' test data here that should be cut out'"` – user2019528 Apr 01 '13 at 16:58
  • That's because when you `echo something \`contacts\`` your shell is treating `\`contacts\`` as if it were written as `$(contacts)`. In other words, bash is trying to execute an application named `contacts` and insert its output into that part of the `echo` command. You have to backslash escape `\\`contacts\\`` in your `echo` line (as well as remove `*.php`). – rojo Apr 01 '13 at 17:14
  • 1
    @user2019528 - I made a more complete answer for you. Enjoy. `:P` – rojo Apr 01 '13 at 18:11
  • I tried to add /path/to/files and it didn't take, gives me an error "sed: -e expression #2, char 6: unknown option to \`s'". Here is what I added: `for i in \`grep -R --include=*.php -l -i 'FROM' /path/to/files/\`; do grep -Pion '(?<=from )\S+' $i | sed -r -e "s/['\`\"]/ /g" -e 's/^/'$i'... : line /'; done` – user2019528 Apr 01 '13 at 18:40
  • @user2019528 I made an edit after that line you grabbed. It's because the final sed statement has `/` as delimiters. I had to change those to `#` to avoid the filename slashes being treated as regexp delimiters. Your line should read as follows: `for i in \`grep -R --include=*.php -l -i 'FROM' /path/to/files\`; do grep -Pion '(?<=from )\S+' $i | sed -r -e "s/['\\`\"]/ /g" -e 's#^#'$i'... : line #'; done` – rojo Apr 01 '13 at 18:42
  • Hi Rojo, thanks for your diligence on this, it's such a relief. It now runs successfully, and it's finding things but the table names are blank?? On a side note, how long have you been writing these statements and what are some good resources to help myself in the future with regular expressions, sed, gawk, and the like. – user2019528 Apr 01 '13 at 19:33
  • @user2019528 - I'm not sure why the table names are coming up blank for you. I'm unable to recreate the issue. For your other questions, I've been hacking code for a couple of decades or so, but not really every day. The path you seek is winding, with many forks and detours. I got comfortable with regexps by figuring out how to scrape web pages with Greasemonkey, perl and TCL. sed, awk, grep and bash just trial, error and reading man pages. I didn't set out to teach myself from start to finish, but simply to enjoy the journey. I'm still learning, as I'm sure I will be in another 20 years. – rojo Apr 01 '13 at 20:04
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/27354/discussion-between-user2019528-and-rojo) – user2019528 Apr 01 '13 at 20:33
0

I tried the following line. It will take your examples with dots, dashes and single quotes and pull out the table name. You can take the grep/gawk/sed portion and loop over your PHP code.


    echo "select * from 'the_db.the_table' where the_result=1;" | grep -ioE "(from)[[:space:]]([a-zA-Z0-9\_\.\']*)[[:space:]]" | gawk '{ print $2 }' | sed -e s/\'//g

    the_db.the_table

MoChaMan
  • 1,219
  • 2
  • 8
  • 6
  • This worked well! However, it's useless if it doesn't include the line number, and the script location (/var/www/...:line 131: ). How can this be retained? Your answer was the best thus far so I'm hoping you can help, thank you!
    – user2019528 Apr 01 '13 at 17:36