1

Guy I have a list of string.

Select 
id AS "cusId ",
name as 'cusName', gendar as ' Gendar.',
isPaid as " is'Paid ", total, remarks FROM

I need a regex that returns:

Select
id
name
gendar
isPaid
total
remarks
FROM

And also ignore comma and 'AS' keyword.

So far from PHP I can use preg_match_all('/(?<![\S"])([^"\'\s]+)(?![\S"])/') and filter all query keywords later on, but came to JavaScript there is no lookbehind in regex.

Alan Moore
  • 73,866
  • 12
  • 100
  • 156
Tyler
  • 644
  • 1
  • 5
  • 18

1 Answers1

2

DISCLAIMER: The solution below is by no means a generic solution for parsing arbitrary SQL queries. To parse arbitrary SQL queries, you need to build or use an existing one. See also How to parse / tokenize an SQL statement in Node.js.

So, taking into account your specific input strings, you can use a regex that will match what you do not need, and then will capture what you need:

/"[^"]*"|'[^']*'|\s+AS\s+|\s*((?:(?!\sAS\s)[^,\s])+)/gi

See the regex demo

Explanation:

  • "[^"]*" - match a double-quoted substring that has no " inside (replace with "[^"\\]*(?:\\.[^"\\]*)*" if you need to support escaped " inside)
  • | - or
  • '[^']*' - match single-quoted substring having no ' inside (replace with '[^'\\]*(?:\\.[^'\\]*)*' if you need to support escaped ' inside)
  • | - or
  • \s+AS\s+ - "AS" word inside 1+ whitespaces
  • | - or
  • \s* - 0+ whitespaces
  • ((?:(?!\sAS\s)[^,\s])+) - Group 1 capturing one or more symbols other than , and whitespace (see [^,\s])+) that are not starting a sequence of a whitespace + AS + whitespace. It matches any text that is not space+AS+space.

JS demo:

var re = /"[^"]*"|'[^']*'|\s+AS\s+|\s*((?:(?!\sAS\s)[^,\s])+)/gi; 
var str = 'Select id AS "cusId ", name as \'cusName\', gendar as \' Gendar.\', isPaid as " is\'Paid " total , datetime FROM';
var res = [];
while ((m = re.exec(str)) !== null) {
    if (m[1]) { 
        res.push(m[1]); // Add the Capture group 1 to the resulting array
    }
}
document.body.innerHTML = "<pre>" + JSON.stringify(res, 0, 4) + "</pre>";
Community
  • 1
  • 1
Wiktor Stribiżew
  • 607,720
  • 39
  • 448
  • 563
  • I have re-checked the source of your question and it seems you need to get `Select` and `id` separated, right? Then, you should replace `[^,]` with `[^,\s]`. See https://regex101.com/r/eD8jR1/2 – Wiktor Stribiżew Jul 14 '16 at 09:52
  • wow, excellent , time to study regex again , hopefully one day become regex master like you : ) thanks dude – Tyler Jul 14 '16 at 09:52
  • 1
    Sorry, but do you need `[ "Select", "id", "name", "gendar", "isPaid", "total", "datetime", "FROM" ]` with the input like `Select id AS "cusId ", name as 'cusName', gendar as ' Gendar.', isPaid as " is'Paid " total , datetime FROM`? I think you do, I updated the answer. – Wiktor Stribiżew Jul 14 '16 at 09:53
  • Are you trying to provide a regex to work with SQL queries in general or just with that only one OP's sample? – revo Jul 14 '16 at 10:29
  • If it is a question to me, I am not providing a general regex for SQL queries, this is impossible to do with just a small regex like this. There are specific requirements in the question, that is what my regex is designed for. – Wiktor Stribiżew Jul 14 '16 at 10:34
  • As per question's title, *strings not between quotes*, you are trying to provide a regex to work with SQL queries in general which OP himself threw one type of PCRE regex for it already: `(?<![\S"])([^"\'\s]+)(?![\S"])` and it [almost works](https://regex101.com/r/eD8jR1/3) but [yours doesn't](https://regex101.com/r/eD8jR1/4). If you're going to reply, please mention me. – revo Jul 14 '16 at 10:53
  • @revo: I never mentioned this is a generic pattern. Only the one that matches OP's specific sample input. I would not post a regex answer for a request to parse arbitrary SQL queries. We have a C# SQL query parser at work, and it is very comprehensive, I know what this all means. – Wiktor Stribiżew Jul 14 '16 at 10:56
  • Yes, and you never mentioned it is not. I think it is considered *good* if you update your answer with an explicit point to this. – revo Jul 14 '16 at 11:01
  • I added the disclaimer with a reference to the SO thread discussing ways to parse SQL queries. – Wiktor Stribiżew Jul 14 '16 at 11:38
  • GJ, now to me it makes sense. – revo Jul 14 '16 at 11:43