
I am working on a project in which I need to classify WAV audio files with a neural network. To produce valid input for training, I first have to convert each audio file into a mel-spectrogram. I currently use librosa.feature.melspectrogram for this, and it works fine.

Part of the project is also to classify unknown WAV audio files in a web browser. I am using ONNX.js for this task, and that also works fine.

The problem is that the browser needs to produce exactly the same mel-spectrogram that the server produces in Python with librosa. Otherwise the network input would differ, and so would the output/prediction.

So my question is: is there a JavaScript library that produces a mel-spectrogram identical to the server-side (Python) one for the same WAV audio file, so that client (JS) and server agree?

Is there a port of librosa for JS? Any other ideas are also welcome, e.g. replacing the server-side library with one that works in both languages (JS and Python) and gives exactly the same result for a given WAV file.

I already considered TensorFlow.js, but it implements only a subset of TensorFlow (tfio.experimental.audio.melscale, for example, exists only in Python on the server side).
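For reference, here is a minimal sketch of the part that has to match on both sides: the mel filterbank. It is written in plain Python (only the math module) so that it could be ported line by line to JavaScript. It assumes librosa's default settings as I understand them (Slaney-style mel scale, i.e. htk=False, and norm="slaney") and an even n_fft; the function names are my own and not part of librosa. The STFT itself (window, hop length, padding) would have to be matched separately and is not covered here.

```python
import math

# Constants of the Slaney-style mel scale (assumed to match librosa's htk=False default)
F_SP = 200.0 / 3.0                 # Hz per mel in the linear region
MIN_LOG_HZ = 1000.0                # frequency where the log region starts
MIN_LOG_MEL = MIN_LOG_HZ / F_SP    # the same point expressed in mels (15)
LOGSTEP = math.log(6.4) / 27.0     # step size in the log region


def hz_to_mel(hz):
    """Convert a frequency in Hz to mels (linear below 1 kHz, logarithmic above)."""
    if hz >= MIN_LOG_HZ:
        return MIN_LOG_MEL + math.log(hz / MIN_LOG_HZ) / LOGSTEP
    return hz / F_SP


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    if mel >= MIN_LOG_MEL:
        return MIN_LOG_HZ * math.exp(LOGSTEP * (mel - MIN_LOG_MEL))
    return mel * F_SP


def mel_filterbank(sr, n_fft, n_mels, fmin=0.0, fmax=None):
    """Return n_mels rows of (1 + n_fft // 2) triangular filter weights.

    Each triangle is scaled by 2 / (upper edge - lower edge), which is the
    area normalization assumed to correspond to norm="slaney".
    """
    if fmax is None:
        fmax = sr / 2.0
    n_bins = 1 + n_fft // 2
    # Centre frequency of every FFT bin (assumes an even n_fft)
    fft_freqs = [k * sr / n_fft for k in range(n_bins)]

    # n_mels + 2 band edges, evenly spaced on the mel axis
    mel_min, mel_max = hz_to_mel(fmin), hz_to_mel(fmax)
    step = (mel_max - mel_min) / (n_mels + 1)
    edges = [mel_to_hz(mel_min + i * step) for i in range(n_mels + 2)]

    bank = []
    for i in range(n_mels):
        lo, center, hi = edges[i], edges[i + 1], edges[i + 2]
        enorm = 2.0 / (hi - lo)
        row = []
        for f in fft_freqs:
            rising = (f - lo) / (center - lo)
            falling = (hi - f) / (hi - center)
            row.append(max(0.0, min(rising, falling)) * enorm)
        bank.append(row)
    return bank


def apply_filterbank(bank, power_frame):
    """Project one power-spectrum frame (length 1 + n_fft // 2) onto the mel bands."""
    return [sum(w * p for w, p in zip(row, power_frame)) for row in bank]
```

Even with identical formulas, I would expect small floating-point differences between the two runtimes (for example if the browser computes in Float32), so comparing the outputs with a tolerance is probably more realistic than expecting bit-for-bit equality.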

  • You want to embed some python code into your javascript - https://stackoverflow.com/questions/13175510/call-python-function-from-javascript-code – David Thery Apr 10 '21 at 07:22
  • Embedding python code in JavaScript is not possible (technically). Running server-side code that is requested by client is also not possible, because the audio wav-file must be uploaded to server and this is not allowed for privacy reasons. – Alexander Schmidt Apr 10 '21 at 08:29
  • So, isn't it possible to preprocess your wav files, and store the images (mel-spec) in the cloud? – David Thery Apr 10 '21 at 08:39
  • Sorry, this was my mistake: the wav-file must NOT be uploaded to the server due to privacy reasons. This is part of the project specification. – Alexander Schmidt Apr 10 '21 at 08:48
  • What about that? https://github.com/torch-js/torch-js - Torchaudio provides all the required functions for computing MelSpectrogram – David Thery Apr 10 '21 at 08:51
  • Btw, be aware that with a high resolution mel-spectrogram it may be possible to recover privacy sensitive information - such as speech. – Jon Nordby Apr 11 '21 at 11:57
  • Jon, this is not a problem at all because the audio-file remains completely client-side. The wav-file is transformed into a mel-spectrogram only in the browser and therefore never hits the server. – Alexander Schmidt Apr 11 '21 at 16:14
  • The mel-spectrogram never leaves the browser either? Then it is no problem of course – Jon Nordby Apr 12 '21 at 11:20
  • Alex, did you find a solution? – Xavier Sep 29 '22 at 06:22
