I'm working on a program that should record audio in a recognizable form, so that it can later be easily compared with other audio files. The audio files will contain something like speech, so I was wondering which would be easier to do:
- implementing a speech recognition algorithm and saving/comparing its output,
- or implementing an algorithm that creates something like an audio fingerprint, e.g. with a Fast Fourier Transform, and compares those fingerprints?
Does anyone have experience in this area? I'm wondering whether the second solution would be feasible within a relatively short period of time. Maybe there is a solution that is easier to code and I'm just not finding it?
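
To make the second option more concrete, here is a minimal sketch of the kind of fingerprint I have in mind. It rests on several assumptions: the function names (`dft_magnitudes`, `fingerprint`, `similarity`, `tone`) and the frame size of 64 are hypothetical, the naive DFT only stands in for a real FFT library, and "strongest frequency bin per frame" is just one simple feature that is probably far too crude for real speech.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the first half of a naive O(n^2) DFT of `frame`.

    A real program would use an FFT library here; this is only a stand-in.
    """
    n = len(frame)
    mags = []
    for k in range(n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def fingerprint(samples, frame_size=64):
    """Hypothetical fingerprint: index of the strongest frequency bin per frame."""
    peaks = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        mags = dft_magnitudes(samples[start:start + frame_size])
        # Skip bin 0 (DC offset) so a constant bias does not dominate.
        peaks.append(max(range(1, len(mags)), key=mags.__getitem__))
    return peaks


def similarity(fp_a, fp_b):
    """Fraction of frames whose dominant bin matches (0.0 to 1.0)."""
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        return 0.0
    return sum(a == b for a, b in zip(fp_a, fp_b)) / n


def tone(freq_bin, frame_size=64, frames=4):
    """Synthetic test signal: a sine wave centered on one DFT bin."""
    return [math.sin(2 * math.pi * freq_bin * t / frame_size)
            for t in range(frame_size * frames)]


if __name__ == "__main__":
    a = fingerprint(tone(5))
    b = fingerprint(tone(9))
    print(a, b, similarity(a, a), similarity(a, b))
```

Because only the position of the peak is stored, the fingerprint does not change when the same signal is recorded louder or quieter, but it would still be sensitive to time shifts, noise, and different speakers.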