I'm developing a SIP client in Python, based on pjsua2. I have a custom Call class derived from the Python wrapper's Call, and my code is able to establish an active call. In my custom onCallMediaState, I have access to the audio conference bridge:

    def onCallMediaState(self, prm):
        """
        Manage call media state callbacks.

        - Autoconnect audio
        """
        ci = self.getInfo()

        logger.info("onCallMediaState", media_size=ci.media.size())
        self._print_call_info("onCallMediaState")

        for media_index, media in enumerate(ci.media):
            if media.type == pj.PJMEDIA_TYPE_AUDIO:
                if ci.stateText == "CONFIRMED":
                    # This seems to be a bug with the callbacks: CONFIRMED
                    # is sent both at start and at disconnect, so recording
                    # must be stopped manually; DISCONNECTED cannot be used.
                    logger.info("Call CONFIRMED")

At this point I can use media_index to record or play the incoming audio from the call. For example, for recording:

    def record_call(self, media_index):
        """
        Record the incoming audio from the call to a WAV file
        """
        audio_media = pj.AudioMedia.typecastFromMedia(self.getMedia(media_index))
        port_id = audio_media.getPortId()
        rx_level = audio_media.getRxLevel()
        tx_level = audio_media.getTxLevel()
        filename = "file.wav"
        logger.info("Recording audio media", port_id=port_id, rx_level=rx_level, tx_level=tx_level)
        self._recorder = pj.AudioMediaRecorder()
        self._recorder.createRecorder(filename)
        self._is_recording = True
        # Transmit the call's audio (not the capture device) into the recorder
        audio_media.startTransmit(self._recorder)

And a file.wav is created. Or, to play it on the default audio device:

    def play_call(self, media_index):
        """
        Play the audio incoming from call using default playback device
        """
        playback_media = pj.Endpoint_instance().audDevManager().getPlaybackDevMedia()
        audio_media = pj.AudioMedia.typecastFromMedia(self.getMedia(media_index))
        port_id = audio_media.getPortId()
        rx_level = audio_media.getRxLevel()
        tx_level = audio_media.getTxLevel()
        logger.info("Playing audio media", port_id=port_id, rx_level=rx_level, tx_level=tx_level)
        audio_media.startTransmit(playback_media)

Both examples work, and according to the PJSUA2 Media documentation and the Audio media documentation it is possible to transmit and receive audio, and to play and record WAV files. But according to the pjsip Media Port documentation, other media ports are possible. It seems that the SWIG wrapper does not support them.

Finally, my question: is it possible to handle the audio frames in memory without recording a WAV file?

I do not want to write the audio to the hard disk, just use it in memory, and for this purpose I need the raw data directly. One workaround is to write WAV files in chunks and read them one by one, but this is a dirty solution with a big overhead. According to the Media flow documentation it is possible to get the callbacks, but I cannot find how to do this in Python. The typedef void *MediaPort does not exist in the Python wrapper, so I cannot use it to bypass the callbacks.

vgonisanz

2 Answers

1

I have been looking into the same thing, but for pjsua rather than pjsua2. Out of the box, pjsip does not support this for pjsua (I'm not sure about pjsua2), but I found a project on GitHub, "UFAL-DSG/alex", that has a customization of pjproject 2.2 with buffered streaming.

I ported the customization to Python 3 and pjproject 2.9 in "nicolaipre/python3-pjsip-memory-buffer".

I know it may not be of use to you for pjsua2, but maybe it can help someone else looking for something similar.

Nicolai Prebensen
1

pjsua2_demo.cpp offers some hints on how to do this.

In python:

    import struct

    import pjsua2 as pj


    class MyCall(pj.Call):

        def onCallMediaState(self, prm):
            ci = self.getInfo()
            for media_info in ci.media:
                if media_info.status == pj.PJSUA_CALL_MEDIA_ACTIVE:
                    if media_info.type == pj.PJMEDIA_TYPE_AUDIO:
                        print("-----------------------------------> OnCallMediaState: Audio media is active")

                        # Describe the PCM format this port works with:
                        # 16 kHz, mono, 16 bits per sample, 20 ms frames
                        fmt = pj.MediaFormatAudio()
                        fmt.type = pj.PJMEDIA_TYPE_AUDIO
                        fmt.clockRate = 16000
                        fmt.channelCount = 1
                        fmt.bitsPerSample = 16
                        fmt.frameTimeUsec = 20000

                        # Keep a reference on self so the port stays alive during the call
                        self.med_port = MyAudioMediaPort()
                        self.med_port.createPort("med_port", fmt)

                        # Route the call's incoming audio into the custom port
                        media = pj.AudioMedia.typecastFromMedia(self.getMedia(media_info.index))
                        media.startTransmit(self.med_port)


    class MyAudioMediaPort(pj.AudioMediaPort):

        def onFrameRequested(self, frame):
            # Called when a frame is requested from this port; no audio data is supplied here
            frame.type = pj.PJMEDIA_FRAME_TYPE_AUDIO

        def onFrameReceived(self, frame):
            # Process the incoming frame here
            print("frame received")
            print(frame.size)
            byte_data = [frame.buf[i] for i in range(frame.buf.size())]
            # Convert each pair of bytes to a signed 16-bit little-endian value
            int_data = [struct.unpack('<h', bytes(byte_data[i:i+2]))[0] for i in range(0, len(byte_data), 2)]

            print(int_data)

Please note: the conversion to signed 16-bit integers is only there so that the printed numbers make sense. I haven't tested any further processing, and most libraries will expect the unconverted little-endian buffer, so you probably won't need the conversion. The output looks reasonable, though.
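
If you do want the samples as integers, the per-pair unpacking can be done in a single struct call. The following is only a sketch in plain Python, assuming the buffer holds little-endian signed 16-bit PCM as configured in the format above. The helper names expected_frame_bytes and decode_pcm16le are made up for illustration and are not part of pjsua2.

```python
import struct


def expected_frame_bytes(clock_rate, channel_count, bits_per_sample, frame_time_usec):
    """Bytes in one PCM frame for the given format (hypothetical helper)."""
    samples_per_channel = clock_rate * frame_time_usec // 1_000_000
    return samples_per_channel * channel_count * bits_per_sample // 8


def decode_pcm16le(raw):
    """Decode little-endian signed 16-bit PCM bytes into a list of ints."""
    raw = bytes(raw)
    usable = len(raw) - (len(raw) % 2)  # ignore a trailing odd byte, if any
    return list(struct.unpack("<%dh" % (usable // 2), raw[:usable]))


if __name__ == "__main__":
    # Format values taken from the example above: 16 kHz, mono, 16-bit, 20 ms
    print(expected_frame_bytes(16000, 1, 16, 20000))
    print(decode_pcm16le(bytes([0x01, 0x00, 0xFF, 0xFF, 0x00, 0x80])))
```

Inside onFrameReceived, something like decode_pcm16le(byte_data) would then replace the list comprehension, under the same assumption about the buffer contents.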

TinkerTank