Complexity
Your task sounds like it is at least O(N), where N is the number of records you have to process. If you are selecting N records out of M total (e.g. 1M records out of 1B total), then it will be O(M), since every record has to be examined, unless the M records are already indexed or stored in an appropriate data structure (e.g. a trie).
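To make the difference concrete, here is a minimal sketch that assumes the filter is a simple prefix test. The function names and the sorted list standing in for an index are my own illustration, not something from your setup; a trie would play the same role.

```python
import bisect

def scan_prefix(records, prefix):
    """O(M): every record is inspected, whether or not it matches."""
    return [r for r in records if r.startswith(prefix)]

def indexed_prefix(sorted_records, prefix):
    """O(log M + N): binary-search a pre-sorted list, then walk the matches.

    Assumes sorted_records is already sorted; in sorted order all strings
    sharing a prefix sit next to each other, starting at bisect_left(prefix).
    """
    lo = bisect.bisect_left(sorted_records, prefix)
    hi = lo
    while hi < len(sorted_records) and sorted_records[hi].startswith(prefix):
        hi += 1
    return sorted_records[lo:hi]
```

Building the sorted list costs O(M log M) once, so this only pays off if you run many queries against the same data.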
Wall clock time
The good news is that processing 1M records using string operations should take relatively little wall clock time (seconds at worst).
I recommend you extract 10K records and time your analysis on those in order to get a rough estimate of the total processing cost.
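A rough way to do that, as a sketch: time the filter on the sample and scale linearly. The helper name, the synthetic records, and the pattern below are placeholders I made up; substitute your own sample and your own test.

```python
import re
import time

def estimate_total_seconds(sample, total_count, predicate):
    """Time predicate over sample and extrapolate linearly to total_count.

    Returns (estimated_seconds, number_of_matches_in_sample).
    """
    start = time.perf_counter()
    matched = sum(1 for record in sample if predicate(record))
    elapsed = time.perf_counter() - start
    per_record = elapsed / len(sample)
    return per_record * total_count, matched

# Hypothetical data: 10K synthetic records and a made-up pattern.
sample = ["id-%d" % i for i in range(10_000)]
pattern = re.compile(r"id-\d*7$")

estimate, hits = estimate_total_seconds(
    sample, 1_000_000, lambda r: pattern.search(r) is not None
)
print("matches in sample:", hits)
print("estimated seconds for 1M records: %.2f" % estimate)
```

The extrapolation assumes the cost per record is roughly constant, which holds for simple string operations but not if record lengths vary wildly or the data does not fit in memory.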
Caveats
A regex is a good filtering mechanism if the "language" of valid (or invalid) records is well defined.
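For example, if valid records had a fixed shape such as three uppercase letters, a hyphen, and four digits (a format I am inventing purely for illustration), the filter could look like this:

```python
import re

# Hypothetical record format: three uppercase letters, "-", four digits.
VALID_RECORD = re.compile(r"[A-Z]{3}-\d{4}")

def filter_valid(records):
    """Keep only records whose entire text matches the format."""
    return [r for r in records if VALID_RECORD.fullmatch(r)]
```

Using `fullmatch` rather than `search` means a record with extra leading or trailing characters is rejected instead of partially matched.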
If you are applying a regex on unsanitized client data, beware of Regular Expression Denial of Service. See my post here for details.
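As a sketch of the problem and two common mitigations: the patterns and the length cap below are illustrative assumptions, not taken from your code. Python's built-in `re` module is a backtracking engine and has no timeout option, so the defence has to come from the pattern and the input.

```python
import re

# Vulnerable: the nested quantifier lets a backtracking engine try
# exponentially many ways to split a run of "a"s before giving up on
# input such as "aaaa...ab". Compiled here only for comparison; do not
# run it on long untrusted input.
VULNERABLE = re.compile(r"(a+)+")

# Accepts the same strings without the nested quantifier.
SAFE = re.compile(r"a+")

MAX_LEN = 256  # hypothetical cap; pick one that suits your records

def is_valid(text):
    """Reject oversized input first, then match with the safe pattern."""
    if len(text) > MAX_LEN:
        return False
    return SAFE.fullmatch(text) is not None
```

Capping the input length bounds the worst case even if a risky pattern slips through, and rewriting nested quantifiers removes the exponential behaviour at its source.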