I have a table (Data) that includes nine long text fields. It currently has 1,000,000 records and it's growing, so I would like to speed things up a bit.
I'm using a table-valued RegEx function (master.dbo.RegExMatches) to parse words out of these nine fields and place them into a table (DataWordFullMap) that has the record ID (Id in the example below), the word (wordtoadd), the column name (DE87 in the first query, DE150 in the second) and the starting character position within the column (MatchIndex).
The function master.dbo.RegExMatches takes the parameters regex, column name, options. The current setup has to scan the table once for each field (each field is handled in a separate query), rather than applying the function to all nine columns in one scan.
Is there an efficient way of consolidating these CROSS APPLY statements into one query, perhaps by adding an extra column to the CROSS APPLY result that holds the name of the column passed to the regex function? Some of the columns are mostly NULL and others have no NULL values, so it seems wasteful to do a full scan for each of the nine columns when some of them will produce no results for most of the scan.
The following snippet shows two of the nine queries involved:
INSERT INTO DataWordFullMap
SELECT Id,
       CAST(Match AS nvarchar(255)) AS wordtoadd,
       'DE87',
       MatchIndex
FROM Data
CROSS APPLY master.dbo.RegExMatches('[\w-[0-9ÿ_]]{2,}(-[\w-[0-9ÿ_]]{2,})?(''[\w-[0-9ÿ_]])?', DE87, master.dbo.RegExOptionEnumeration(0, 0, 1, 1, 0, 0, 0, 0, 0));
INSERT INTO DataWordFullMap
SELECT Id,
       CAST(Match AS nvarchar(255)) AS wordtoadd,
       'DE150',
       MatchIndex
FROM Data
CROSS APPLY master.dbo.RegExMatches('[\w-[0-9ÿ_]]{2,}(-[\w-[0-9ÿ_]]{2,})?(''[\w-[0-9ÿ_]])?', DE150, master.dbo.RegExOptionEnumeration(0, 0, 1, 1, 0, 0, 0, 0, 0));
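What I have in mind is something along these lines: an untested sketch that uses CROSS APPLY (VALUES ...) to unpivot the nine columns and tag each value with its column name, so master.dbo.RegExMatches is applied once per row-column pair in what I hope is a single pass over Data (the aliases v, ColumnName and ColumnValue are just placeholder names I made up). I don't know whether this actually avoids the repeated scans, or whether there is a better pattern:

INSERT INTO DataWordFullMap
SELECT d.Id,
       CAST(rm.Match AS nvarchar(255)) AS wordtoadd,
       v.ColumnName,
       rm.MatchIndex
FROM Data AS d
CROSS APPLY (VALUES ('DE87', d.DE87),
                    ('DE150', d.DE150)
                    -- ... the other seven columns
            ) AS v (ColumnName, ColumnValue)
CROSS APPLY master.dbo.RegExMatches('[\w-[0-9ÿ_]]{2,}(-[\w-[0-9ÿ_]]{2,})?(''[\w-[0-9ÿ_]])?', v.ColumnValue, master.dbo.RegExOptionEnumeration(0, 0, 1, 1, 0, 0, 0, 0, 0)) AS rm
WHERE v.ColumnValue IS NOT NULL;

The WHERE clause is there to skip the mostly-NULL columns up front, but I'm not sure how the optimizer handles it in combination with the CLR function.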