
I have an MS SQL database holding at least 39 GB of fingerprint data, and I need to find the duplicates within it. Each fingerprint record has the following structure (reduced here to the essentials):

[EMP ID] [fingerprint IMAGE] [fingerprint TEMPLATE (ISO)]

I'm using a C# program that does a 1-to-1 comparison of the ISO templates, with an algorithm based on Ratha's algorithm. The algorithm works and is able to detect duplicates, but the problem is the time required: comparing every record against every other record costs O(n^2). Can anyone give me an idea for reducing the time cost of a fingerprint matching algorithm?
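To make the O(n^2) cost concrete, here is a minimal sketch of the brute-force all-pairs search in Python (not the C# program from the question). The `match` callable and the `threshold` value are placeholders for whatever matcher and score cutoff are actually used; they are assumptions for illustration, not Ratha's algorithm.

```python
from itertools import combinations


def count_pairs(n):
    """Number of unordered pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def find_duplicates(records, match, threshold):
    """Brute-force duplicate search over every unordered pair.

    records:   list of (emp_id, template) tuples
    match:     placeholder scoring function, match(tpl_a, tpl_b) -> number
    threshold: placeholder cutoff; scores at or above it count as duplicates
    """
    duplicates = []
    for (id_a, tpl_a), (id_b, tpl_b) in combinations(records, 2):
        if match(tpl_a, tpl_b) >= threshold:
            duplicates.append((id_a, id_b))
    return duplicates
```

Because every unordered pair is scored once, 100 records need 4,950 matcher calls, 500 need 124,750, and 1233 need 759,528. The pair count grows quadratically even if each individual match is as fast as it can be.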

I read about MS SQL SSIS, but that is for ETL. I have to apply the matching algorithm here, and that can't be done with SSIS.

The sample benchmark so far is approximately as follows:

    SampleSpace  Compared  Time
 1. 100          100       ~53 sec
 2. 500          500       ~3.50 min
 3. 1233         1233      ~1 hr 48 min

I found other approaches that rely on categorized feature extraction, but how can I categorize based on the ISO template? Can anyone give some advice?

I think Hadoop might be an option, but has anyone come across a fingerprint matching integration with Hadoop?

Micky C002
  • Can you give us some links to pertinent information, like the ISO Templates you're talking about, and Ratha's algorithm? You definitely don't want to be using the O(n^2) algorithm, so you do need some kind of feature extraction that will let you more easily reduce the number of potential duplicates. – Jim Mischel Oct 22 '14 at 13:07
  • The template is in ISO/IEC 19794-2:2005 format. As for Ratha's algorithm, I can't find an online paper on it. https://graphics.stanford.edu/~vaibhav/pubs/fingerprint.pdf – Micky C002 Oct 22 '14 at 14:18
  • Your question has several problems. First, you're asking how to improve the performance of the code, but you've provided no code. Second, you've provided no sample data, so even if you had provided code, it would be impossible for us to verify your results. Third, you're pulling random technologies (SSIS, Hadoop) into the mix to make it run faster, but a poorly implemented algorithm is going to run poorly regardless of whether it's written in a .NET language, Java, assembly, Python or REXX. – billinkc Oct 22 '14 at 15:05
  • I'm not complaining about the speed of the implemented algorithm. Fingerprint matching algorithms of this kind are always time eaters, and it is as fast as it can be. The problem is how I can apply the algorithm only to certain templates, rather than doing a one-to-one match of all templates. – Micky C002 Oct 22 '14 at 17:08
  • Have you seen this question and the accepted answer? It discusses some ways of narrowing the search: http://stackoverflow.com/questions/4817467/iso-19794-2-fingerprint-format – Jim Mischel Oct 23 '14 at 01:35
  • I'm worried about these lines from the "Handbook of Fingerprint Recognition": "Unfortunately, the distribution of fingerprints even into these five categories is not uniform, and there are many “ambiguous” fingerprints that cannot be accurately classified even by human experts. About 17% of the 4000 images in the NIST Special Database 4 (Watson and Wilson, 1992a) have two different ground truth labels! Therefore, in practice, fingerprint classification is not immune to errors and does not offer much selectivity for fingerprint searching in large databases." Any ideas on template matching rather than image matching? – Micky C002 Oct 24 '14 at 05:12
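The narrowing idea raised in these comments (run the expensive matcher only on plausible candidate pairs) can be sketched as follows. This is a minimal Python illustration under loud assumptions: it supposes each template has been reduced to one cheap integer feature, such as the number of minutiae stored in the ISO/IEC 19794-2 record, and that duplicates differ in that feature by at most a hypothetical `tolerance`. Two impressions of the same finger can yield different minutiae counts, so a tight tolerance will miss true duplicates; the feature choice and the tolerance are not part of Ratha's algorithm and would need to be validated on real data.

```python
def candidate_pairs(features, tolerance):
    """Yield only pairs whose cheap feature values are close.

    features:  dict mapping emp_id -> integer feature (hypothetical,
               e.g. a minutiae count read from the template)
    tolerance: hypothetical maximum allowed difference between two
               feature values for the pair to be worth full matching
    """
    # Sort by feature value so near values are adjacent in the list.
    items = sorted(features.items(), key=lambda kv: kv[1])
    for i, (id_a, f_a) in enumerate(items):
        for id_b, f_b in items[i + 1:]:
            if f_b - f_a > tolerance:
                break  # every later record is even further away
            yield (id_a, id_b)
```

Only the pairs produced here would then be passed to the full template matcher. How much this saves depends entirely on how selective the chosen feature is for the data at hand.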

0 Answers