0

I have a thousand fingerprints stored in my database in a Byte[] array, and I'm trying to make a 1 to N verification of the fingerprint, which means that I need to compare the fingerprint given by the sensor and the ones in the array.

But the process is taking too long, I'm using a forEach loop to iterate through all of the fingerprints in the array and calling the verification method to compare the 2 arrays to find the match.

Is there a way I can make the process of finding the match faster? In worst case scenario where the match is the last item in the array. Or near the bottom.

Fingerprint List

List<Huellas> ListaHuellas = new List<Huellas>();
public class Huellas 
{
    public int idUsuario;
    public Byte[] Huella;
}

Searching the match

foreach (Huellas h in ListaHuellas) {
    // Por cada huella... la almacenamos en un MemoryStream como arreglo de bytes.
     MemoryStream fingerprintData = new MemoryStream(h.Huella);
     // Creamos una plantilla a partir de esos bytes...
     DPFP.Template templateIterando = new DPFP.Template(fingerprintData);
     // Extraemos las caracteristicas de la plantilla
     DPFP.FeatureSet features = ExtractFeatures(Sample, DPFP.Processing.DataPurpose.Verification);
     // Verificamos que las caracteristicas sean buenas
     if (features != null) {
        // Compare the feature set with our template
         DPFP.Verification.Verification.Result result = new DPFP.Verification.Verification.Result();
         Verificator.Verify(features, templateIterando, ref result);
         // Y vemos si el resultado es valido o no, (verified)
         // Si es verified, significa que el dedo escaneado ya existia en la base de datos.
         if (result.Verified) {
             MessageBox.Show(new Form { TopMost = true }, "Usuario encontrado: ID " + h.idUsuario);
             // Por ultimo se cierra el programa.
             this.Invoke(new MethodInvoker(delegate { this.Close(); })); 
         }
     }
 }

Sorry 'bout the spanish comments.

  • 3
    I'd suggest looking into [`Parallel.ForEach`](https://msdn.microsoft.com/en-us/library/system.threading.tasks.parallel.foreach(v=vs.110).aspx). – juharr Aug 18 '17 at 19:31
  • 3
    You're storing the raw fingerprint data and not the result of feature extraction? IIRC (been a long time and probably a different product) you'd extract features from the fingerprint and get a hash result, and you could search a database by the hashes. You might want to review your fingerprint software's documentation for an example of storing fingerprint data for identification. Or, if this is how they do it, get a different dev kit. –  Aug 18 '17 at 19:34
  • If the `List` you're using is sorted, you may be able to use some form of binary search algorithm to reduce the number of iterations during search. The most efficient way to iterate over lists in general is using a `for` or `foreach` statement (Source: https://stackoverflow.com/questions/15204/what-is-the-best-way-to-iterate-through-a-strongly-typed-generic-listt) – 0liveradam8 Aug 18 '17 at 19:36
  • @Will I'm storing a base64 string of the byte array in the database, but decoding back to byte array for its verification 'cause the method provided by the SDK receives that as parameters – Antonio Díaz de León Aug 18 '17 at 19:41
  • Have you profiled this code to see which step is taking all the time, and looked at how to reduce that? The actual looping will be very, very fast. But the loop calls a lot of other methods. If for example, the `ExtractFeatures` function takes a long time, this could be the limiting step, and you could look for options to improve it. Is this the fingerprint you're trying to match? If so, do you need to run this `ExtractFeatures` on the same Sample every time you loop? Couldn't it be moved outside the loop? – ainwood Aug 18 '17 at 19:59
  • yes, `this.Invoke(new MethodInvoker(delegate { this.Close(); }));` does that. The ExtractFeatures method and the Verification one, are provided by the SDK itself and that's what is taking a long time, and I can't modify them so I was hoping for a way to make the looping faster. Because I must make those verification each iteration. – Antonio Díaz de León Aug 18 '17 at 20:04
  • Sorry - I edited my comment, and not sure if you are responding to the update. Where is the code to analyse the fingerprint sample from the Sensor? is that the `DPFP.FeatureSet features = ExtractFeatures(Sample, DPFP.Processing.DataPurpose.Verification);`? If so, can this be done only once, outside the loop? – ainwood Aug 18 '17 at 20:28
  • @ainwood Thanks a lot mate! I don't know why I was extracting the features of the same sample each iteration. I moved said code to the outside of the loop and the process went from 14 seconds to 2. How can I make your answer the selected one? – Antonio Díaz de León Aug 18 '17 at 22:06
  • You can't as its not officially an answer. Just click an up arrow next to my comment to mark it as "helpful". – ainwood Aug 18 '17 at 22:36

3 Answers3

2

You could use a 'Dictionary< long, List< Huellas>>':

For each fingerprint you calculate a 'long' value from a suitable hash-function, and store the fingerprint in the list associated with the hash-value (You need a list if you cannot guarantee there will be no collisions between known fingerprints).

When you want to search a fingerprint, you calculate the hash, retrieve the associated list from the dictionary, which you can then search sequentially (or using parallel.foreach).

If you use a decent hash function, there will be few collisons and the lists will mostly contain one element or at most a few, so the sequential search would not take long.

Note: even if the list for the (hash of the) unknown fingerprint only contains one result, you still have to verify the actual byte-arrays (or extracted features): there is always the chance that the fingerprint is actually a new one that just happens to produce the same hash as one of the fingerprints in your database.

Johan Donne
  • 3,104
  • 14
  • 21
  • 1
    Instead of a `Dictionary>` you can use a `Lookup`, just by doing `var lookup = ListaHuellas.ToLookup(element => )` – Jakub Dąbek Aug 18 '17 at 20:25
  • A Lookup is certainly a sensible alternative! for the difference, see https://stackoverflow.com/questions/13362490/difference-between-lookup-and-dictionaryof-list. – Johan Donne Aug 18 '17 at 20:31
0

Reduce the number of items you have to search by removing the ones that would never match. There are a big number of ways to do that, but an example could be:

  1. Interate for every byte on the your target byte array (The one you want to find).
  2. On each iteration, remove any entries from the source list in which the byte on the same index on the item array doesn't match the byte you're iterating.
  3. Do this until you have one single item.

This way, you are not searching entire arrays, you start by searching a byte at a time and reducing the number of items in which you search.

If you want to reduce this time even more, first sort the byte arrays on the list and search them by dictionary. That is, compare only with the item on the middle and divide the list by half, taking only the half where you think your match is. Do this until you have a half of only one single item. This is the same as you would do if you try to search for a word on a physical dictionary. First, you try to open the dictionary on the first letter of the word you want to find; when you find words starting with this letter you try to find only the words that also match the second letter of your word and so on. For this to work, you need to first sort your list (with byte arrays of {0, 0, 0, ...} on the beginning and byte arrays of {255, 255, 255, ...} on the end);

This algorithm would need to compare only up to 16 bytes in a list with 65536 items, for example, no matter the length of the arrays. This would be pretty fast.

This is just a talk example without any code, but I think you got the idea. An easy way of doing this, but without the guarantee of a perfect match and much slower would be to create a class called FingerPrint representing the byte array, override its GetHashCode() and Equals() method and just use a HashTable instance to perform the search.

Unknown Coder
  • 330
  • 2
  • 16
0

My problem was that I was extracting the features of the Sample fingerprint given by the sensor every iteration of the forEach loop. Thanks to ainwood's comment I moved the feature extraction to the outside of the loop as I only need to extract the features of the sample once, and compare to the fingerprints in the List and that reduced the worst case scenario from 15 seconds to 2.