1

I am currently trying to figure out, how to find all columns of a table within an SQL statement using Regex in notepad++.

Lets take this query:

select
  a.id,
  a.id || a.name,
  a.age,
  b.id
From a,b

Now, I wat to retrieve all columns for a using regex - the problem the query itself is much larger and I do not want to have to go through the whole query.

The desired result is:

id
name
age

I already figured out that with

(?<=a\.)(\S+)

I match the desired strings, but Notepad++ still returns the whole lines and not only the words I need.

Can anyone help me here?

Wiktor Stribiżew
  • 607,720
  • 39
  • 448
  • 563
bublitz
  • 888
  • 2
  • 11
  • 21

2 Answers2

1

Maybe "matching non greedy" using "?" and looking for word boundaries can help? The expression would look like this (add a ? in the last bracket):

(?<=a\.)(\S+?\b) 

This just came into my mind as I read the question, didn't check it on functionality.

More information on non-greedy modifier can be found here.

actc
  • 672
  • 1
  • 9
  • 23
  • Could you please explain why you think it may help? Do you know how lazy subpatterns at the end of patterns behave? They match the least amount of chars possible. `+?` matches 1, `*?` match 0. – Wiktor Stribiżew Feb 27 '18 at 10:05
  • As the TO wants to get "only the words" and not "the whole lines" this would be my approach - to just select as few characters as possible – actc Feb 27 '18 at 10:18
  • You do not get as few as possible, with `\S+?` at the end of the pattern, you only get 1, even if there are more. – Wiktor Stribiżew Feb 27 '18 at 10:19
1

You may use this 2 step approach to extract values after a.:

Find:         \ba\.(\w+)|(?s:.)
Replace With: (?1$1\n:)

Then, you need to remove duplicate lines to get the expected results.

Details

  • \ba\. - a a. substring as a whole word
  • (\w+) - Group 1: one or more word chars (the group value will be kept + an LF will be appended in the replacement pattern)
  • | - or
  • (?s:.) - any char (it will be removed).

The (?1$1\n:) replacement means that the Group 1 value will be output and a line ending LF symbol will be appended to the result if Group 1 matches, else, empty string will be used as a replacement.

Wiktor Stribiżew
  • 607,720
  • 39
  • 448
  • 563