This question is strictly DQS-performance related.

The ‘customers’ table I need to clean has 40,000,000 rows. I created a matching policy using a subset (no issues there; I just used the top 10,000 rows).

Now I want to run a data quality project, but I can’t process the entire table in one project: it just stops responding. I have only managed to handle 400,000 rows at a time, and even then it takes almost 2 hours. It’s also not the best solution, because I have to run the project on a view restricted to id between 1 and 400,000.
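To make the batching concrete, here is a rough sketch of how the id ranges and per-batch views could be generated. It assumes the ids run contiguously from 1 to 40,000,000, and the names `dbo.customers`, `customers_batch_NNN` and the `id` column are placeholders rather than my real schema. It only builds the SQL text and does not touch DQS itself.

```python
def id_batches(total_rows, batch_size):
    """Yield inclusive (first_id, last_id) ranges covering 1..total_rows."""
    for start in range(1, total_rows + 1, batch_size):
        yield start, min(start + batch_size - 1, total_rows)


def batch_view_sql(n, first_id, last_id):
    # View name, table name and id column are assumptions for illustration.
    return (
        f"CREATE VIEW dbo.customers_batch_{n:03d} AS "
        f"SELECT * FROM dbo.customers "
        f"WHERE id BETWEEN {first_id} AND {last_id};"
    )


# 40,000,000 rows in batches of 400,000 rows each.
batches = list(id_batches(40_000_000, 400_000))
statements = [
    batch_view_sql(n, first_id, last_id)
    for n, (first_id, last_id) in enumerate(batches, start=1)
]

print(len(batches))
print(statements[0])
```

That works out to 100 separate projects, so at almost 2 hours each the full table would take close to 200 hours, which is why I'm looking for something better.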

Is there any solution to this, guys?

I am also wondering where the bottleneck is. Is it CPU or disk?

Regards.

Chicago1988
