Are there any approaches to producing audio with convolutional neural networks?
There are a lot of approaches to producing images with convnets, but I have not found any articles or posts about producing audio.
According to this topic on Stack Overflow, the poster says:
"I have found out the audio can be represented as spectrograms."
So why can't it be done?
To do this with convnets:
a) Should I use an LSTM together with the conv layers?
b) What should the output be? Considering the spectrogram...
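To make (b) concrete, here is a minimal NumPy sketch of what I mean by the spectrogram representation. It assumes a Hann-windowed short-time Fourier transform; the helper names `stft` and `istft`, the frame size of 512, the hop of 128 and the 440 Hz test tone are my own illustrative choices and not something from the linked post. The idea is that a network could output a 2-D array like `magnitude` (frequency bins by time frames), which looks a lot like an image.

```python
import numpy as np


def stft(signal, n_fft=512, hop=128):
    """Hann-windowed short-time Fourier transform, shape (freq_bins, frames)."""
    window = np.hanning(n_fft + 1)[:-1]  # periodic Hann window
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1).T


def istft(spec, n_fft=512, hop=128):
    """Inverse STFT by weighted overlap-add (needs the complex spectrogram)."""
    window = np.hanning(n_fft + 1)[:-1]
    frames = np.fft.irfft(spec.T, n=n_fft, axis=1)
    length = n_fft + hop * (frames.shape[0] - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        start = i * hop
        out[start : start + n_fft] += frame * window
        norm[start : start + n_fft] += window ** 2
    # Guard against division by ~0 at the very first and last samples.
    return out / np.maximum(norm, 1e-8)


# Hypothetical test signal: one second of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
audio = np.sin(2 * np.pi * 440.0 * t)

spec = stft(audio)                 # complex array, shape (257, 122)
magnitude = np.abs(spec)           # the "image" a convnet could be trained on
reconstructed = istft(spec)        # waveform recovered from the complex STFT
```

With the complex `spec`, the waveform comes back essentially unchanged apart from the first and last few samples. If the network only produced `magnitude`, the phase would be missing and would have to be estimated separately before inverting, which is part of why I am unsure what the output should be.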