I am attempting to extract Schema and Table information from connection string data. The Schema and Table information is in the format "Schema.Table" (e.g FROM EDWP_D2PM.SN_INC_RPTG_SCRUBBED in string below) . Multiple Schema and Tables can exist in the connection strings, and they always follow the words FROM or JOIN (Including INNER JOIN, LEFT JOIN etc.). I only want to extract Schema and Tables for Connections from a specific database, in the attached example this is DB2.
"let
Source = DB2.Database(""69.69.69.69"", ""bcudb"", [HierarchicalNavigation=true, Implementation=""Microsoft"", Query=""SELECT i.ASSIGNMENT_GROUP, i.BUSINESS_SERVICE, i.CATEGORY, i.CAUSED_BY, i.CLOSE_CODE, i.CLOSED_ON, i.COMPANY, i.CONTACT_TYPE, i.DESCRIPTION, i.NUMBER, i.PARENT_INCIDENT, i.PRIORITY, i.RESOLVED_ON, i.SHORT_DESCRIPTION, i.CREATED_ON, i.CAUSE_CODE, i.CLOSED, i.CREATED, i.RESOLVED, i.RESOLUTION_MET, p.problem_id FROM EDWP_D2PM.SN_INC_RPTG_SCRUBBED i LEFT OUTER JOIN EDWP_D2PM.SN_INCIDENTS_CUST_RPTG p on p.number = i.number WHERE i.PRIORITY INLEFT OUTER JOIN EDWP_D2PM.SN_INCIDENTS_CUST_RPTG ('1 - Critical', '2 - High') AND TO_CHAR(i.created_on,'YYYY-MM') > (select to_char((CURRENT DATE - 12 MONTHS),'YYYY-MM') from SYSIBM.SYSDUMMY1) AND (i.CATEGORY <> 'Alert' OR (i.CATEGORY IS NULL)) AND i.PARENT_INCIDENT IS NULL AND i.EXCLUSIONS <> 'R' AND i.CLOSE_CODE <> 'Duplicate - No Action Taken' with ur""]), RIGHT JOIN EDWP.TEMP blaha blah
#""Changed Type"" = Table.TransformColumnTypes(Source,{{""RESOLVED_ON"", type datetime}})
in
#""Changed Type"""
I have a REGEX expression that is correctly returning the 4 Schemas.Tables in the sample text as expected (link to regex 101 with working example :
(?:\s+JOIN\s+)(\w+\.\w+)|(?:\s+FROM\s+)(\w+\.\w+)
I want to improve the REGEX so that that I only get Schemas & Tables back for strings that begin with the text "DB2.Database" . How can I do this?
I have attempted adding in the prefix :
(?:DB2.Database)(?:\s|\S)*
but that stops the 4 Schema.Tables from being returned.
Can anyone suggest a fix? If you do provide an answer, an explaination of the REGEX logic would be appreciated.
Once I have the REGEX working I will run it in Python using the re module.