I am trying to use Google's FaceNet for face verification. Since users will use my app several times a day, I might be able to improve its accuracy on their faces in particular by further training the model on the images I gather each time a user's face is verified.

What I have found so far by searching on Google is that transfer learning is used when you want your model to perform a task that is very similar to the task that the pre-trained model is used for. However, what I want to do is improve the model's accuracy for my particular input data.

So, will training the model with my own data give any tangible improvement in accuracy? If so, how could I go about doing it? I'm unsure because I expect to collect only a few images of each user every day or so, which isn't a lot of data. I am new to machine learning, so any help would be greatly appreciated!

  • I would say that in most cases it won't give any tangible improvement in accuracy, especially if your user database is rather small and you use a state-of-the-art system. On the contrary, it might reduce accuracy for new users. I have heard about some systems that train a separate small classifier per user over the embeddings, but I would not personally opt for that. Other systems try to improve per-user performance by keeping track of successful authentication attempts and using those embeddings as references in future authentication attempts. – i regular Feb 17 '23 at 16:44
  • The 'classifier per user' solution would make use of binary classifiers, right? That did occur to me; maybe I should go for that. Using the embeddings from successful authentication attempts in future attempts is also part of the plan. Could you give some suggestions for the 'classifier per user' solution? – TheLastAirbender Feb 18 '23 at 17:34
  • Yes. I have not done this myself, but one way to start could be to fine-tune an existing face recognition (FR) system: freeze the initial network weights, add a new fully connected layer with two output classes after the embedding layer, and train only that new layer for binary classification per user. Your main problem will be avoiding overfitting so that the classifier generalizes to new images. I would probably start with something like this and see how it performs. Alternatively, something like SphereFace2 could be an interesting read if the simple approach does not work out well. – i regular Feb 20 '23 at 23:25
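
The per-user classifier idea from the comments can be sketched roughly as follows. This is a minimal illustration under several assumptions, not a tested recipe: the embeddings are synthetic (`fake_embedding` is a hypothetical stand-in for the output of the frozen FaceNet model), and the new layer is a single sigmoid output trained as logistic regression, which plays the same role as a two-class fully connected layer. The L2 penalty is one simple guard against the overfitting concern raised above; all names and hyperparameters here are illustrative.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_user_head(embeddings, labels, lr=1.0, l2=0.01, epochs=300):
    """Fit one logistic-regression head on top of frozen embeddings.

    embeddings: list of equal-length float lists (frozen network outputs)
    labels:     1 for the enrolled user, 0 for anyone else
    """
    dim = len(embeddings[0])
    n = len(embeddings)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(embeddings, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        # Full-batch gradient step; the L2 term shrinks the weights
        # to limit overfitting on the small per-user dataset.
        w = [wi - lr * (gw / n + l2 * wi) for wi, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def score(w, b, embedding):
    # Estimated probability that the embedding belongs to the enrolled user.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, embedding)) + b)


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def fake_embedding(center, rng, noise=0.05):
    # ASSUMPTION: hypothetical stand-in for a real embedding; a real system
    # would run the face image through the frozen network instead.
    return normalize([c + rng.gauss(0.0, noise) for c in center])


rng = random.Random(0)
dim = 32  # toy size, chosen only to keep the example fast
user_center = normalize([rng.gauss(0, 1) for _ in range(dim)])
other_centers = [normalize([rng.gauss(0, 1) for _ in range(dim)])
                 for _ in range(20)]

# A few images of the enrolled user versus images of other identities.
positives = [fake_embedding(user_center, rng) for _ in range(10)]
negatives = [fake_embedding(c, rng) for c in other_centers for _ in range(3)]

w, b = train_user_head(positives + negatives,
                       [1] * len(positives) + [0] * len(negatives))

# Score one unseen image of the user and one of an unseen identity.
genuine_score = score(w, b, fake_embedding(user_center, rng))
impostor_center = normalize([rng.gauss(0, 1) for _ in range(dim)])
impostor_score = score(w, b, fake_embedding(impostor_center, rng))
print(genuine_score, impostor_score)
```

With real data, it would probably be worth holding out some of each user's images, and some identities the head never saw during training, to check that it generalizes before relying on its scores.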
