
I changed the playback speed of my output.wav file by doubling its frame rate with the Python wave module's setframerate method, but the new output.wav sounds high-pitched and I want to keep the pitch the same. How can I do this? Below is the snippet I use to read and write output.wav. I am looking for a simple solution and would like to avoid installing external libraries; the wave library is fine.

Thanks.

import wave

wf = wave.open('output.wav', 'rb')
RATE = wf.getframerate()
signal = wf.readframes(-1)
channels = wf.getnchannels()
width = wf.getsampwidth()
wf.close()

spf = wave.open('output.wav', 'wb')
spf.setnchannels(channels)
spf.setsampwidth(width)
spf.setframerate(RATE*2)
spf.writeframes(signal)
spf.close()

Gjison Giocf
There really is no simple solution without using other libraries, unless you think doing your own digital signal processing is simple. If you can, I would recommend using Rubber Band as per this [answer](https://stackoverflow.com/a/59269959/3589122). – GordonAitchJay Mar 20 '20 at 05:14

1 Answer


I have written a pitch detection function, shown below. It needs at least numpy, but it does not rely on that library much (numpy is used only for speed), so you may want to adapt it.

Here is the code. As you can see, the function does not advance the window by its full size; instead, consecutive windows overlap. You will need to adjust these parts for your own data. The numpy calls could easily be replaced, but I leave that to you.
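The overlapping windows mentioned above could be built roughly as follows. This is only a sketch: the helper name `frame_signal` and the hop of 256 samples are assumptions for illustration, not something taken from the function in this answer; only the 512-sample frame length matches the constant it uses.

```python
import numpy as np


def frame_signal(signal, frame_len=512, hop=256):
    """Split a 1-D signal into overlapping frames, one frame per row.

    frame_len and hop are assumed values; pick whatever suits your audio.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Each row starts `hop` samples after the previous one, so
    # consecutive rows share frame_len - hop samples.
    starts = hop * np.arange(n_frames)
    return np.stack([signal[s:s + frame_len] for s in starts])


frames = frame_signal(np.arange(2048), frame_len=512, hop=256)
print(frames.shape)
```

With a hop smaller than the frame length, the second half of one row is the first half of the next, which is what "shifting the window with some overlap" means here.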

Signal processing has many rules, and I have implemented some of them. For example, if the energy of a frame is too low, the frame has no pitch, which is indicated by returning -1 instead of a pitch.
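In isolation, that rule might look like the sketch below. The helper name `pick_pitch_index` is hypothetical; the 0.3 factor is the one used in the function in this answer, where the strongest correlation is compared against 30% of the frame energy.

```python
import numpy as np


def pick_pitch_index(frame, ccf, threshold=0.3):
    """Return the index of the strongest correlation, or -1 if it is too weak.

    `ccf` holds one correlation value per candidate lag (hypothetical helper).
    """
    frame = np.asarray(frame, dtype=float)
    ccf = np.asarray(ccf, dtype=float)
    energy = np.sum(frame ** 2)
    best = int(np.argmax(ccf))
    # Accept the peak only if it exceeds the given fraction of the frame energy.
    return best if ccf[best] > threshold * energy else -1


frame = [1.0, 1.0, 1.0, 1.0]  # energy = 4, so the peak must exceed 1.2
strong = pick_pitch_index(frame, [0.5, 3.0, 1.0])
weak = pick_pitch_index(frame, [0.1, 0.2, 0.3])
```

A silent frame has zero energy and zero correlation, so the strict comparison fails and it is reported as -1 as well.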

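import operator

import numpy as np


def pitch_detection(self, frame_matrix, frame_number, lag_vector, frequency):
    # Taken from a class, hence `self`; `self` and `frequency` are unused here.
    # Silence numpy warnings about divisions by zero and invalid values.
    np.seterr(divide='ignore', invalid='ignore')
    pitch_freq_vector = []
    for frame in range(frame_number):
        ccf = []  # cross-correlation value for each candidate lag
        # Note: for frames 0 and 1 the negative indices wrap around to the last rows.
        frame_expand_1 = frame_matrix[frame - 1, :]
        frame_expand_2 = frame_matrix[frame - 2, :]
        temp_corr_1 = frame_matrix[frame, :]
        # Prepend samples from the two previous frames so the current frame
        # can be compared against lagged history (frames are 512 samples long).
        temp_corr_2 = np.append(frame_expand_1[256:], temp_corr_1, axis=0)
        temp_corr_2 = np.append(frame_expand_2[192:256], temp_corr_2, axis=0)
        len_tc2 = len(temp_corr_2)
        for lag in lag_vector:  # the pitch lag is the one with the highest correlation
            ccf.append(np.sum(temp_corr_1 * temp_corr_2[len_tc2 - lag - 512:len_tc2 - lag]))
        max_index, max_value = max(enumerate(ccf), key=operator.itemgetter(1))
        # Accept the peak only if it exceeds 30% of the frame energy.
        if max_value > 0.3 * np.sum(np.power(temp_corr_1, 2)):
            pitch_freq_vector.append(max_index)  # index into lag_vector
        else:
            pitch_freq_vector.append(-1)  # no pitch detected
    return pitch_freq_vector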
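The values returned above are positions in the list of candidate lags rather than frequencies. If `lag_vector` holds lags measured in samples (an assumption, since the answer does not say), they could be turned into Hz roughly like this; `lag_indices_to_hz` and the 16000 Hz sample rate are made up for the example.

```python
def lag_indices_to_hz(pitch_indices, lag_vector, sample_rate):
    """Map lag indices to frequencies in Hz, keeping -1 for unvoiced frames.

    Assumes every entry of lag_vector is a non-zero lag in samples.
    """
    result = []
    for idx in pitch_indices:
        if idx == -1:
            result.append(-1.0)  # no pitch was detected for this frame
        else:
            # A period of `lag` samples corresponds to sample_rate / lag Hz.
            result.append(sample_rate / lag_vector[idx])
    return result


lags = list(range(40, 321))  # assumed candidate lags, in samples
pitches = lag_indices_to_hz([60, -1, 0], lags, sample_rate=16000)
print(pitches)  # [160.0, -1.0, 400.0]
```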

Sadegh