I have found a couple of related threads: "Regular expression - match all words but match unique words only once" and "get unique regex matcher results (without using maps or lists)". There are a few others, but I could not get their solutions to solve my issue.
I've been reading up on lookarounds and backreferences, but I'm still missing something.
I need to search through several large code-bases, and find all unique occurrences of data source names or variables for them.
I tried the following regular expressions:
(datasource=\"(.*?)\")(?!.+\1)
(datasource=\"(.*?)\")(?!.*\1)
(datasource=\"(.*?)\")(?!.+\2)
(datasource=\"(.*?)\")(?!.*\2)
(datasource=\"(.*?)(?!.+\1)\")
(datasource=\"(.*?)(?!.*\1)\")
(datasource=\"(.*?)(?!.+\2)\")
(datasource=\"(.*?)(?!.*\2)\")
against sample input like this:
datasource="someDSN"
datasource="anotherDNS"
datasource = "anotherDNS"
datasource="someDSN"
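One likely reason the attempts above misbehave is that `.` does not match line breaks by default, so `(?!.*\1)` only inspects the rest of the current line. In the first four patterns `\1` is also the whole `datasource="..."` attribute, so the spaced variant `datasource = "..."` never counts as a repeat. Below is a minimal TypeScript sketch of a lookahead that spans lines and compares only the quoted value. The function name `lastOccurrences` is hypothetical, and note that this technique keeps the last occurrence of each value rather than the first.

```typescript
// Hypothetical helper: returns each datasource value once, keeping the LAST occurrence.
function lastOccurrences(text: string): string[] {
  // [\s\S]* crosses line breaks (unlike . without the s flag), and \1 is only the quoted value.
  const pattern = /datasource\s*=\s*"([^"]*)"(?![\s\S]*datasource\s*=\s*"\1")/g;
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    names.push(match[1]);
  }
  return names;
}

const sampleLines = [
  'datasource="someDSN"',
  'datasource="anotherDNS"',
  'datasource = "anotherDNS"',
  'datasource="someDSN"',
].join("\n");

console.log(lastOccurrences(sampleLines)); // ["anotherDNS", "someDSN"]
```

Because the lookahead rescans the remainder of the input for every candidate, this may be slow on very large files.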
The code can be complex, but basically it looks something like this:
<cfquery name="qry_getEvent" datasource="#APPLICATION.firstDSN#">
SELECT *
FROM events
WHERE id = 1
</cfquery>
<cfquery name="qry_getPlayers" datasource="#APPLICATION.firstDSN#">
SELECT *
FROM players
WHERE event_id = 1
</cfquery>
<cfquery name="qry_getLocation" datasource="secondDSN">
SELECT *
FROM locations
WHERE event_id = 1
</cfquery>
The result should look something like:
#APPLICATION.firstDSN#
secondDSN
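If a pure regex is not a hard requirement, the distinct list can be produced by matching every occurrence and removing duplicates in code. This is only a sketch under some assumptions: the function name `uniqueDatasources` is made up, the `i` flag assumes the attribute name may appear in any letter case, and the values are assumed never to contain a double quote.

```typescript
// Hypothetical helper: collects every datasource value, keeping first-seen order.
function uniqueDatasources(source: string): string[] {
  // \s* tolerates spaces around "=", [^"]* captures the quoted value.
  const pattern = /datasource\s*=\s*"([^"]*)"/gi;
  const seen = new Set<string>();
  const ordered: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const name = match[1];
    if (!seen.has(name)) {
      seen.add(name);
      ordered.push(name);
    }
  }
  return ordered;
}

const cfmlSample = `
<cfquery name="qry_getEvent" datasource="#APPLICATION.firstDSN#">
SELECT * FROM events WHERE id = 1
</cfquery>
<cfquery name="qry_getPlayers" datasource="#APPLICATION.firstDSN#">
SELECT * FROM players WHERE event_id = 1
</cfquery>
<cfquery name="qry_getLocation" datasource="secondDSN">
SELECT * FROM locations WHERE event_id = 1
</cfquery>
`;

console.log(uniqueDatasources(cfmlSample)); // ["#APPLICATION.firstDSN#", "secondDSN"]
```

To cover a whole code-base, the same function could be called on the concatenated contents of every file, or once per file with the results merged.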
The only semi-solution I've discovered is to run (datasource=\"([^"]*)\") repeatedly, each time prefixing it with a negative lookahead that excludes the values already found. For example:
(?!datasource="dsnname1"|datasource="dsnname2")(datasource=\"([^"]*)\")
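The manual loop above could be automated roughly as follows. This is a sketch, not a polished tool: `escapeRegExp`, `nextUnknownDatasource` and `collectByExclusion` are hypothetical names, and the pattern keeps the exact `datasource="..."` form from the expression above, so it does not handle spaces around `=`.

```typescript
// Escapes regex metacharacters so a found name can be embedded literally in a pattern.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns the first datasource value that is not in `known`, or null if none is left.
function nextUnknownDatasource(source: string, known: string[]): string | null {
  const alternatives = known
    .map((name) => 'datasource="' + escapeRegExp(name) + '"')
    .join("|");
  // Same shape as the hand-written version: a negative lookahead listing known values.
  const exclusion = known.length > 0 ? "(?!" + alternatives + ")" : "";
  const pattern = new RegExp(exclusion + 'datasource="([^"]*)"');
  const match = pattern.exec(source);
  return match ? match[1] : null;
}

// Repeats the search, growing the exclusion list until nothing new is found.
function collectByExclusion(source: string): string[] {
  const known: string[] = [];
  let next = nextUnknownDatasource(source, known);
  while (next !== null) {
    known.push(next);
    next = nextUnknownDatasource(source, known);
  }
  return known;
}

const exclusionSample =
  'datasource="dsnname1" datasource="dsnname2" datasource="dsnname1" datasource="dsnname3"';
console.log(collectByExclusion(exclusionSample)); // ["dsnname1", "dsnname2", "dsnname3"]
```

Each pass rescans the input from the start, so this mirrors the manual process rather than improving on its cost.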
This helped me narrow down all the DSN names in a few minutes, but it would have been much easier if I could get all the distinct results automatically. Maybe this needs a little Node.js work added to streamline the process.