
I'm trying to use the Superpowered SDK to apply real-time time stretching and pitch shifting to an MP3 file that is being played back and recorded at the same time. The problem is that no matter what I do, the output sound quality is terrible, to the point of being distorted.
I suspect it's caused by a mismatch in the number of samples per frame. Here is the complete source code of my cpp file:

static SuperpoweredAndroidAudioIO *audioIO;
static SuperpoweredTimeStretching *stretching;
static SuperpoweredAudiopointerList *outputBuffers;
static SuperpoweredDecoder *decoder;
static SuperpoweredRecorder *recorder;
const char *outFilePath;
const char *tempFilePath;

static short int *intBuffer;
static float *playerBuffer;

bool audioInitialized = false;
bool playing = false;

static bool audioProcessing(
        void *__unused clientData, // custom pointer
        short int *audio,           // buffer of interleaved samples
        int numberOfFrames,         // number of frames to process
        int __unused sampleRate     // sampling rate
) {

    if (playing) {
        unsigned int samplesDecoded = decoder->samplesPerFrame;
        if (decoder->decode(intBuffer, &samplesDecoded) == SUPERPOWEREDDECODER_ERROR) return false;
        if (samplesDecoded < 1) {
            playing = false;
            return false;
        }



        SuperpoweredAudiobufferlistElement inputBuffer;
        inputBuffer.samplePosition = decoder->samplePosition;
        inputBuffer.startSample = 0;
        inputBuffer.samplesUsed = 0;
        inputBuffer.endSample = samplesDecoded;
        inputBuffer.buffers[0] = SuperpoweredAudiobufferPool::getBuffer(samplesDecoded * 8 + 64);
        inputBuffer.buffers[1] = inputBuffer.buffers[2] = inputBuffer.buffers[3] = NULL;


        SuperpoweredShortIntToFloat(intBuffer, (float *) inputBuffer.buffers[0], samplesDecoded);

        stretching->process(&inputBuffer, outputBuffers);

        if (outputBuffers->makeSlice(0, outputBuffers->sampleLength)) {

            while (true) { 
                int numSamples = 0;
                float *timeStretchedAudio = (float *) outputBuffers->nextSliceItem(&numSamples);
                if (!timeStretchedAudio) break;

                SuperpoweredFloatToShortInt(timeStretchedAudio, intBuffer,
                                            (unsigned int) numSamples);
                SuperpoweredShortIntToFloat(intBuffer, playerBuffer, (unsigned int) numSamples);

                recorder->process(playerBuffer, (unsigned int) numSamples);
                SuperpoweredFloatToShortInt(playerBuffer, audio, (unsigned int) numSamples);

            };
            outputBuffers->clear();
            return true;
        };
    }
    return false;
}


extern "C" JNIEXPORT void
Java_com_example_activities_DubsmashActivity_InitAudio(
        JNIEnv  __unused *env,
        jobject  __unused obj,
        jint bufferSize,
        jint sampleRate,
        jstring outputPath,
        jstring tempPath
) {

    decoder = new SuperpoweredDecoder();

    outputBuffers = new SuperpoweredAudiopointerList(8, 16);

    outFilePath = env->GetStringUTFChars(outputPath, 0);
    tempFilePath = env->GetStringUTFChars(tempPath, 0);

}

extern "C" JNIEXPORT jdouble
Java_com_example_activities_DubsmashActivity_OpenFile(
        JNIEnv *env,
        jobject  __unused obj,
        jstring filePath) {
    const char *path = env->GetStringUTFChars(filePath, 0);
    decoder->open(path);
    intBuffer = (short int *) malloc(decoder->samplesPerFrame * 2 * sizeof(short int) + 32768);
    playerBuffer = (float *) malloc(decoder->samplesPerFrame * 2 * sizeof(short int) + 32768);
    audioIO = new SuperpoweredAndroidAudioIO(
            decoder->samplerate,           // sample rate
            decoder->samplesPerFrame,      // buffer size (frames per callback)
            false,                         // enable audio input
            true,                          // enable audio output
            audioProcessing,               // process callback
            NULL,                          // custom pointer passed to the callback
            -1, -1,                        // input/output stream types (defaults)
            decoder->samplesPerFrame * 2   // latency in samples
    );

    stretching = new SuperpoweredTimeStretching(decoder->samplerate);

    stretching->setRateAndPitchShift(1, 0);

    recorder = new SuperpoweredRecorder(
            tempFilePath,         // path for the temporary file
            decoder->samplerate,  // sample rate
            1,                    // minimum length of a recording (seconds)
            2,                    // number of channels
            false,                // apply fade in/fade out
            recorderStopped,      // called when the recorder finishes writing
            NULL                  // custom pointer passed to the callback
    );

    return 0;
}

Some notes to consider:

  1. This is not a duplicate of this question, since the solution in that thread doesn't work for me
  2. I have tried playing with `decoder->samplesPerFrame` and `numSamples`, but I can't get decent output.
  3. If I set the time stretching rate to 1 and the pitch shift to 0, the sound plays seamlessly.

UPDATE 1:
After some more tinkering and trying different values for the number of samples, I figured the problem must be the difference between the number of samples the audio output (DAC MAN) expects and the number that `outputBuffers->nextSliceItem` actually provides.
Having said that, one way I can think of to mitigate this is to append the output of `outputBuffers->nextSliceItem` to a temporary buffer and then, once it reaches the threshold the audio output expects, direct it to the audio output.

Hence my second question: Is there a way in C++ to append a buffer to another buffer?
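
To make the idea concrete, here is a minimal sketch of what I mean, using a `std::vector` as the temporary buffer (the names are placeholders I made up, not Superpowered APIs):

#include <vector>
#include <cstring>

// Interleaved stereo floats waiting to be sent to the audio output.
static std::vector<float> pending;

// Append one time-stretched slice (numFrames stereo frames) to the temporary buffer.
static void appendSlice(const float *slice, int numFrames) {
    pending.insert(pending.end(), slice, slice + numFrames * 2);
}

// Once enough frames have been collected, move exactly numberOfFrames to the output.
static bool drainTo(float *output, int numberOfFrames) {
    if ((int)pending.size() < numberOfFrames * 2) return false; // threshold not reached yet
    std::memcpy(output, pending.data(), numberOfFrames * 2 * sizeof(float));
    pending.erase(pending.begin(), pending.begin() + numberOfFrames * 2);
    return true;
}

(I realize a preallocated ring buffer would avoid allocating on the audio thread, but this shows the appending part I'm asking about.)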

2hamed
  • That's a *lot* of code. Any chance you could reduce it down to a [mcve] / [SSCCE](http://www.sscce.org) ? – Jesper Juhl Oct 19 '18 at 18:27
  • @JesperJuhl, Ok, just removed all redundant code from the sample. It should now be much more readable. – 2hamed Oct 19 '18 at 18:32
  • Why are you using `malloc` in a C++ program? Besides the obvious "manual memory management is a code smell" comment, it (potentially) allocates from a different heap than `new`, and you are just setting yourself up for future problems if you ever compare pointers or similar. But just use smart pointers already; all those raw pointers are really painful and would never pass (*my*) code review. – Jesper Juhl Oct 19 '18 at 18:43
  • @JesperJuhl, That's because I'm not a C++ programmer and honestly I don't know what you're talking about. I just followed the published samples from the author of the SDK. – 2hamed Oct 19 '18 at 18:46
  • @JesperJuhl "Allocates from a different heap" ???? There is just one heap... Smart pointers would definitely not pass my code review, because real-time audio needs precise control on memory allocation (because of blocking), and old-school allocations provide the best way to see what happens exactly. – Gabor Szanto Oct 26 '18 at 06:27
  • @GaborSzanto: Using smart pointers doesn't equal giving up control of memory allocation (in fact, if you really want to, you can create your objects any way you prefer). Smart pointers are about managing an object's lifetime and ownership, and for that they are quite useful tools. – hoffmale Oct 26 '18 at 06:43
  • @HamedMomeni have you figured it out and can you maybe share your final code? – HTron Jan 22 '19 at 13:11
  • @HTron I ended up using the SP's AdvancedPlayer which provides TimeStretching and PitchShifting out of the box. And if you need the recording part as well you direct the output of the player to the recorder. – 2hamed Jan 22 '19 at 13:27
  • @HamedMomeni ah ok, I see! Thanks for your quick reply – HTron Jan 23 '19 at 17:55
  • How to get the value of `samplesPerFrame`? – Adnan Arshad Oct 15 '21 at 08:43

1 Answer


You need to output `numberOfFrames` frames from `audioProcessing()`. Therefore, in `outputBuffers->makeSlice` you need to ask for `numberOfFrames`, not `outputBuffers->sampleLength` (with the latter you are basically asking for "whatever is in outputBuffers", not for `numberOfFrames`).

Then you convert from float to int, then back to float, which makes no sense. You already have floating-point audio in `timeStretchedAudio`, which can be processed by your recorder directly.

After that, you forgot to step `audio` forward after converting some floating-point samples into it, so every slice overwrites the previous one.

And finally, you remove all audio from `outputBuffers`, while you need to remove only the number of frames you output to `audio`.
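
Putting these points together, the output part of the callback could look roughly like this. It's an untested sketch that reuses the variable names from the question; `truncate(..., true)` is the call that comes up in the comments below and may need adjusting for your SDK version:

if (outputBuffers->makeSlice(0, numberOfFrames)) { // ask for exactly numberOfFrames
    int framesOutput = 0;

    while (true) {
        int numSamples = 0;
        float *timeStretchedAudio = (float *) outputBuffers->nextSliceItem(&numSamples);
        if (!timeStretchedAudio) break;

        // The time-stretched audio is already 32-bit float, so it can go
        // straight to the recorder; no float -> int -> float round trip.
        recorder->process(timeStretchedAudio, (unsigned int) numSamples);

        // Convert to 16-bit for the audio output, then step the output
        // pointer forward so the next slice doesn't overwrite this one.
        SuperpoweredFloatToShortInt(timeStretchedAudio, audio, (unsigned int) numSamples);
        audio += numSamples * 2; // interleaved stereo: 2 short ints per frame
        framesOutput += numSamples;
    }

    // Remove only the frames that were actually sent to the output.
    outputBuffers->truncate(framesOutput, true);
    return true;
}
return false; // not enough time-stretched audio buffered yet

If `makeSlice` can't provide `numberOfFrames` yet, the callback returns false for that round; you may need to decode more than one MP3 frame per callback to keep enough time-stretched audio buffered, especially at rates above 1.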

Gabor Szanto
  • Thanks. But what do you mean by "step audio forward"? I haven't seen this term in any of your samples. And the other thing about clearing buffers. Should I use `outputBuffers->truncate(numberOfFrames, true);`? – 2hamed Oct 27 '18 at 13:39
  • You have a buffer called "audio". After you used/copied some samples from it, you need to increase its value, because you are using it in an iteration, and if you don't increase it, you will use/copy the same data again. – Gabor Szanto Oct 30 '18 at 14:15
  • Could you please demonstrate what you mean with a little bit of code? – 2hamed Oct 31 '18 at 10:17
  • audio += numSamples * 2; // (2 because stereo) – Gabor Szanto Nov 01 '18 at 13:48