Create same mel-spectrogram on server (Python) and client (JavaScript) with librosa/TensorFlow

Question

I am working on a project in which I need to create mel-spectrograms in order to classify WAV audio files with a neural network. To get valid input for training the network, I first have to convert the audio files into mel-spectrograms. I currently use librosa.feature.melspectrogram for this, and it works fine.

Part of the project is also to classify unknown WAV audio files in a web browser. I am using ONNX.js for this task, and that also works fine.

The problem is that in the browser I need to create exactly the same mel-spectrogram that the server creates in Python with librosa. Otherwise the input would differ, and therefore the output/prediction would differ as well.
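To illustrate one place where two implementations can diverge: the Hz-to-mel conversion itself exists in more than one convention. As far as I can tell, librosa defaults to the Slaney-style scale (`htk=False`), while some other libraries use the HTK formula. The sketch below uses only the Python standard library; the function names are my own and are not part of any library's API. It just shows that the two conventions give different mel values for the same frequency.

```python
import math


def hz_to_mel_slaney(freq_hz: float) -> float:
    """Slaney-style mel scale: linear below 1 kHz, logarithmic above.

    Hypothetical helper written for illustration; it follows the formula
    that librosa documents for its default setting (htk=False).
    """
    f_sp = 200.0 / 3.0                  # width of the linear region per mel
    min_log_hz = 1000.0                 # start of the logarithmic region
    min_log_mel = min_log_hz / f_sp     # mel value at 1 kHz
    logstep = math.log(6.4) / 27.0      # step size in the log region
    if freq_hz >= min_log_hz:
        return min_log_mel + math.log(freq_hz / min_log_hz) / logstep
    return freq_hz / f_sp


def hz_to_mel_htk(freq_hz: float) -> float:
    """HTK-style mel scale: a single logarithmic formula."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


# Compare both conventions at a few frequencies.
for f in (250.0, 1000.0, 4000.0):
    print(f, hz_to_mel_slaney(f), hz_to_mel_htk(f))
```

So even with the same WAV file, a JavaScript implementation would have to match the mel scale and, presumably, every other setting as well (sample rate, FFT size, hop length, window, padding, power, filter normalization) to reproduce the librosa output.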

So my question is: is there a JavaScript library that can create a mel-spectrogram that is identical for the same WAV audio file on both the client side (JS) and the server side (Python)?

Is there a port of librosa for JS? Any other ideas are also welcome, e.g. switching the server-side library to one that works in both languages (JS and Python) and gives exactly the same result for a given WAV file.

I already considered TensorFlow.js, but it implements only a subset of TensorFlow in JavaScript (tfio.experimental.audio.melscale exists only in Python, on the server side).

You want to embed some python code into your javascript - https://stackoverflow.com/questions/13175510/call-python-function-from-javascript-code — David Thery, Apr 10 '21 at 07:22
Embedding Python code in JavaScript is not (technically) possible. Running server-side code on a client request is also not possible, because the WAV audio file must be uploaded to the server and this is not allowed for privacy reasons. — Alexander Schmidt, Apr 10 '21 at 08:29
So, isn't it possible to preprocess your wav files, and store the images (mel-spec) in the cloud ? — David Thery, Apr 10 '21 at 08:39
Sorry, this was my mistake: the WAV file must NOT be uploaded to the server due to privacy reasons. This is part of the project specification. — Alexander Schmidt, Apr 10 '21 at 08:48
What about that? https://github.com/torch-js/torch-js - Torchaudio provides all the required functions for computing MelSpectrogram — David Thery, Apr 10 '21 at 08:51
Btw, be aware that with a high resolution mel-spectrogram it may be possible to recover privacy sensitive information - such as speech. — Jon Nordby, Apr 11 '21 at 11:57
Jon, this is not a problem at all because the audio file remains completely client-side. The WAV file is transformed into a mel-spectrogram only in the browser and therefore never hits the server. — Alexander Schmidt, Apr 11 '21 at 16:14
The mel-spectrogram never leaves the browser either? Then it is no problem of course — Jon Nordby, Apr 12 '21 at 11:20

0 Answers