I want to train my neural network on different sounds. However, the sounds differ in length. Does anyone know how to train a neural network on inputs of different sizes? Thanks.
-
What do you mean by "each sounds are different"? If I understand correctly, your input is a sound, i.e. an array of samples, and you want that to be the input of the neural network. One possible solution is to extract features from the sound (or from some part of it), which gives you a vector of values with a constant size. For example, if you wanted to extract features for speech recognition you could use something like MFCC: https://en.wikipedia.org/wiki/Mel-frequency_cepstrum – Nikola Stojiljkovic Dec 08 '16 at 20:10
-
The sounds differ in length, which means the size of the input vector will be different for each labeled sound. For example, I want to train the model to identify sounds A, B and C, but these three types of sounds have different lengths. What if I want to train the model on the original sound, instead of extracting features first? – Pelican Dec 08 '16 at 22:45
-
I think you might be interested in checking this: http://stats.stackexchange.com/questions/127542/convolutional-neural-network-for-time-series – HRgiger Dec 09 '16 at 09:32
-
Related https://stackoverflow.com/questions/19419098/how-to-train-on-and-make-a-serialized-feature-vector-for-a-neural-network – Nikolay Shmyrev Aug 19 '18 at 07:53
1 Answer
2
A classifier with a fixed-size input layer cannot take inputs of different sizes directly, but you can transform your signal into a sequence of fixed-size feature vectors (or into a sequence of fixed-size pieces of the original sound). For sound we usually use MFCCs or just a spectrogram. You then need methods that operate on sequences: either a recurrent neural network, or a feed-forward network applied to each frame, with its per-frame outputs post-processed somehow (for example averaged) to get one decision per sound.
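
To make the framing idea concrete, here is a minimal NumPy sketch, not a definitive implementation. It cuts a sound of any length into fixed-size frames, turns each frame into a log-power spectrum, and averages the per-frame class probabilities. The names `frame_signal`, `log_spectrogram` and `classify_clip`, the frame length of 400 samples, the hop of 160 samples, and the random linear "classifier" are all assumptions made for illustration; a real model would be trained, and MFCCs could replace the spectrogram.

```python
import numpy as np

def frame_signal(signal, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping fixed-size frames, zero-padding the tail."""
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + int(np.ceil((len(signal) - frame_len) / hop))
    padded_len = (n_frames - 1) * hop + frame_len
    signal = np.pad(signal, (0, padded_len - len(signal)))
    # Row i holds samples [i * hop, i * hop + frame_len).
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def log_spectrogram(signal, frame_len=400, hop=160):
    """Return one fixed-size log-power spectrum per frame."""
    frames = frame_signal(signal, frame_len, hop) * np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + 1e-10)

def classify_clip(features, weights, bias):
    """Hypothetical per-frame linear classifier; averages frame probabilities."""
    logits = features @ weights + bias
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)    # softmax per frame
    return probs.mean(axis=0)                    # one distribution per clip

rng = np.random.default_rng(0)
short_clip = rng.standard_normal(8000)    # stand-ins for two sounds
long_clip = rng.standard_normal(20000)    # of different lengths

feats_short = log_spectrogram(short_clip)  # shape (49, 201)
feats_long = log_spectrogram(long_clip)    # shape (124, 201)

# Untrained placeholder weights for 3 classes (A, B, C).
weights = rng.standard_normal((feats_short.shape[1], 3)) * 0.01
bias = np.zeros(3)

print(classify_clip(feats_short, weights, bias))
print(classify_clip(feats_long, weights, bias))
```

Both clips give a different number of frames, but every frame has the same 201 values, so the same network can process them. Averaging is only the simplest form of post-processing; a recurrent network would consume the same frame sequence directly.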

Dmytro Prylipko