
I'm having trouble measuring the frequency of voice input from my microphone. Can anyone help me with this?

I'm supposed to capture audio input from my microphone and measure its frequency.

Here is my code, to show how I did it, in case anyone can spot a faulty implementation.

package STLMA;

/*
 * To change this template, choose Tools | Templates
 * and open the template in the editor.
 */


/**
 *
 * @author CATE GABRIELLE
 */

import java.io.*;
import javax.sound.sampled.*;

public class SpeechDetection {
boolean stopCapture = false;
ByteArrayOutputStream byteArrayOutputStream;
TargetDataLine targetDataLine; // This is the object that acquires data from
                               // the microphone and delivers it to the program

// the declaration of three instance variables used to create a SourceDataLine
// object that feeds data to the speakers on playback
AudioFormat audioFormat;    
AudioInputStream audioInputStream;
SourceDataLine sourceDataLine;    

double voiceFreq = 0;    

FileOutputStream fout;
AudioFileFormat.Type fileType;
public static String closestSpeaker;

public SpeechDetection(){
    captureAudio();
}       

private void captureAudio(){
    try{
        audioFormat = getAudioFormat();
        // Object that describes the data line needed to handle the acquisition
        // of the audio data from the microphone. The first parameter makes the
        // audio data readable.
        DataLine.Info dataLineInfo =
            new DataLine.Info(TargetDataLine.class, audioFormat);
        // Object to handle data acquisition from the microphone that matches
        // the information encapsulated in the DataLine.Info object.
        targetDataLine = (TargetDataLine)AudioSystem.getLine(dataLineInfo);
        targetDataLine.open(audioFormat);
        targetDataLine.start();
        Thread captureThread = new Thread(new CaptureThread());
        captureThread.start();
    } catch (Exception e) {
        System.out.println(e);
        System.exit(0);
    }
}

private AudioFormat getAudioFormat(){
    // The number of samples that will be acquired each second for each
    // channel of audio data. Options: 8000, 11025, 16000, 22050, 44100
    float sampleRate = 8000.0F;
    // The number of bits that will be used to describe the value of each
    // audio sample. Options: 8, 16
    int sampleSizeInBits = 16;
    // One channel for mono, two channels for stereo. Options: 1, 2
    int channels = 1;
    // Whether each audio sample consists of both positive and negative
    // values (true), or positive values only (false).
    boolean signed = true;
    // Byte order of each sample. Options: true, false
    boolean bigEndian = false;
    return new AudioFormat(sampleRate,sampleSizeInBits,channels,signed,bigEndian);
}

//Inner class to capture data from microphone
class CaptureThread extends Thread {
    // byte buffer variable to contain the raw audio data
    byte tempBuffer[] = new byte[8000];
    // counter variable to count the number of zeros
    int countzero;
    // short array used to collect the audio input for processing
    short convert[] = new short[tempBuffer.length];

    //        public void start(){
    //            Thread voices = new Thread(this);
    //            voices.start();
    //        }

    @Override
    public void run(){
        // a continuous thread to process the continuous audio input

        // the object to write the raw audio input to
        byteArrayOutputStream = new ByteArrayOutputStream();
        stopCapture = false;
        try{
            while(!stopCapture){
                // reads the raw audio input and returns the number of
                // bytes actually read
                int cnt = targetDataLine.read(tempBuffer,0,tempBuffer.length);
                // writes the bytes that were read to the container
                byteArrayOutputStream.write(tempBuffer, 0, cnt);
                try{
                    countzero = 0;

                    // the loop stores the whole audio data in the convert
                    // variable, which is a short data type, then counts
                    // the number of zeros
                    for(int i=0; i < tempBuffer.length; i++){
                        convert[i] = tempBuffer[i];
                        if(convert[i] == 0){countzero++;}
                    }
                    // calculates the frequency and stores it in the
                    // voiceFreq variable
                    voiceFreq = (countzero/2)+1;
                    if(voiceFreq>=80 && voiceFreq<=350)
                        System.out.println("Voice"+voiceFreq);
                    else
                        System.out.println("Unvoice"+voiceFreq);
                }catch(StringIndexOutOfBoundsException e)
                {System.out.println(e.getMessage());}
                Thread.sleep(0);
            }
            byteArrayOutputStream.close();
        }catch (Exception e) {
            System.out.println(e);
            System.exit(0);
        }
    }
}

public static void main(String [] args){
    SpeechDetection voiceDetector1 = new SpeechDetection();        
    //        voiceDetector1.setSize(300,100);
    //        voiceDetector1.setDefaultCloseOperation(EXIT_ON_CLOSE);
    //        voiceDetector1.setVisible(true);
}



}

By the way, "voiceFreq" stands for voice frequency. My goal here is to determine whether the input is a voice or noise. I hope someone can help me with my problem. Thank you, and happy New Year.

jpeter723
1 Answer


I would think that to detect whether something is a potential voice or noise, one would want to run an FFT on a section of the data and check whether the frequency components fall within some range of "normal voice".

Maybe see "Reliable and fast FFT in Java" for some FFT information.
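As a rough illustration of that idea, here is a minimal sketch, not a tested voice detector. It runs a small radix-2 FFT over one block of samples and reports what fraction of the spectral energy lies in a band. The class name VoiceBandCheck, the helper names, the 1024-sample block size, and the reuse of the question's 80-350 Hz range are all assumptions made for illustration; any threshold on the resulting fraction would have to be tuned on real recordings.

```java
class VoiceBandCheck {
    // In-place iterative radix-2 FFT; the array length must be a power of two.
    static void fft(double[] re, double[] im) {
        int n = re.length;
        // Bit-reversal permutation.
        for (int i = 1, j = 0; i < n; i++) {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) j ^= bit;
            j ^= bit;
            if (i < j) {
                double t = re[i]; re[i] = re[j]; re[j] = t;
                t = im[i]; im[i] = im[j]; im[j] = t;
            }
        }
        // Butterfly passes over blocks of size 2, 4, 8, ...
        for (int len = 2; len <= n; len <<= 1) {
            double ang = -2 * Math.PI / len;
            for (int i = 0; i < n; i += len) {
                for (int k = 0; k < len / 2; k++) {
                    double wr = Math.cos(ang * k), wi = Math.sin(ang * k);
                    int a = i + k, b = i + k + len / 2;
                    double xr = re[b] * wr - im[b] * wi;
                    double xi = re[b] * wi + im[b] * wr;
                    re[b] = re[a] - xr; im[b] = im[a] - xi;
                    re[a] += xr; im[a] += xi;
                }
            }
        }
    }

    // Fraction of spectral energy (DC bin excluded) between loHz and hiHz.
    static double bandEnergyFraction(short[] samples, float sampleRate,
                                     double loHz, double hiHz) {
        int n = samples.length;
        double[] re = new double[n], im = new double[n];
        for (int i = 0; i < n; i++) re[i] = samples[i];
        fft(re, im);
        double inBand = 0, total = 0;
        for (int k = 1; k < n / 2; k++) {
            double freq = k * sampleRate / n;   // centre frequency of bin k
            double power = re[k] * re[k] + im[k] * im[k];
            total += power;
            if (freq >= loHz && freq <= hiHz) inBand += power;
        }
        return total == 0 ? 0 : inBand / total;
    }

    // Hypothetical test signal: a pure sine tone as 16-bit samples.
    static short[] tone(double hz, float sampleRate, int n) {
        short[] s = new short[n];
        for (int i = 0; i < n; i++)
            s[i] = (short) Math.round(10000 * Math.sin(2 * Math.PI * hz * i / sampleRate));
        return s;
    }

    public static void main(String[] args) {
        float rate = 8000f;
        // A 200 Hz tone should put most of its energy inside 80-350 Hz,
        // a 2000 Hz tone almost none.
        System.out.println(bandEnergyFraction(tone(200, rate, 1024), rate, 80, 350));
        System.out.println(bandEnergyFraction(tone(2000, rate, 1024), rate, 80, 350));
    }
}
```

With microphone data, the short[] block would come from decoding the captured bytes into 16-bit samples first; the raw byte buffer cannot be passed in directly.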

vextorspace
  • Yes, I have considered FFT, but I'm having a hard time understanding how it works and how to implement it together with my code. I'm still working on it. Do you have some other way of doing this? – jpeter723 Jan 03 '12 at 13:19
  • +1: Counting frequency works for pure tones. You can just count the number of samples between the peaks. Recorded audio has a mix of frequencies, and an FFT gives you the strength of every frequency (or bucket of frequencies). From the transformation you can pick the strongest, take the weighted average, or use some other approach. FFT makes removing noise and infrasound trivial (you can just ignore those frequencies). – Peter Lawrey Jan 03 '12 at 13:26
  • So if I were to implement it using FFT, which part of my code would I need to feed into the FFT? As I understand it, FFT needs three inputs: X, Y and Z. And I don't know what these variables stand for. – jpeter723 Jan 03 '12 at 13:59
  • But looking back at my code, I am using the targetDataLine with a sample rate of 8000. If I'm not mistaken, it contains the amplitude of the audio input. Counting zeros/zero crossings may help count the frequency, but without any noise removal technique, so it will give me a result for whatever audio data my microphone actually captured. What concerns me is whether my implementation is correct or not. Because I cannot really absorb the FFT technique, I have to stick to my technique. – jpeter723 Jan 03 '12 at 14:00
  • You will not get a useful measurement if you just count zero crossings for an arbitrary waveform - you really need to think this through properly instead of coding wildly with an unsound theoretical foundation in the vain hope that it might just work. – Paul R Jan 03 '12 at 14:09
  • If you do not want to use FFTs, you can use filtering techniques. This is similar in theory to FFTs but very different in practice. If you compute an FIR (finite impulse response) filter and apply it to your data, you can remove all the frequencies that are not in the bandwidth of your filter. It has been a few years since I have done this sort of work, or I'd suggest a set of coefficients. This will remove low- and high-frequency noise. If these filters had a very narrow bandwidth, you could check several frequency ranges and use that to decide whether there was a "voice". – vextorspace Jan 03 '12 at 14:44
  • OK, thank you for getting back to me. You've really cleared things up for me. I will consider your suggestions; I know FFT gives a desirable outcome. So I just have to give it time, right guys? Thanks a lot. – jpeter723 Jan 03 '12 at 15:01
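For reference, the zero-crossing approach discussed in the comments can be sketched as follows. As far as I can tell, the loop in the question copies each raw byte into a short and counts bytes equal to 0, which is neither sample decoding nor zero-crossing counting. The sketch below first decodes 16-bit signed little-endian PCM (the format returned by getAudioFormat()) and then counts sign changes. The class and method names are hypothetical, and as noted above this only gives a meaningful frequency for a reasonably pure tone, not for arbitrary speech or noise.

```java
class ZeroCrossingSketch {
    // Decode 16-bit signed little-endian PCM bytes into samples.
    static short[] decode(byte[] raw, int byteCount) {
        short[] out = new short[byteCount / 2];
        for (int i = 0; i < out.length; i++) {
            int lo = raw[2 * i] & 0xFF;   // low byte, treated as unsigned
            int hi = raw[2 * i + 1];      // high byte, keeps the sign
            out[i] = (short) ((hi << 8) | lo);
        }
        return out;
    }

    // Estimate frequency from sign changes: a pure tone crosses zero
    // twice per cycle, so frequency = crossings / (2 * duration).
    static double zeroCrossingHz(short[] s, float sampleRate) {
        int crossings = 0;
        for (int i = 1; i < s.length; i++) {
            if ((s[i - 1] < 0) != (s[i] < 0)) crossings++;
        }
        double seconds = s.length / (double) sampleRate;
        return crossings / (2.0 * seconds);
    }

    // Hypothetical test signal: a sine tone encoded as little-endian bytes,
    // standing in for what targetDataLine.read() would fill in.
    static byte[] toneBytes(double hz, float sampleRate, int samples) {
        byte[] raw = new byte[samples * 2];
        for (int i = 0; i < samples; i++) {
            short v = (short) Math.round(10000 * Math.sin(2 * Math.PI * hz * i / sampleRate));
            raw[2 * i] = (byte) (v & 0xFF);
            raw[2 * i + 1] = (byte) ((v >> 8) & 0xFF);
        }
        return raw;
    }

    public static void main(String[] args) {
        float rate = 8000f;
        byte[] raw = toneBytes(200, rate, 4000);   // 8000 bytes, half a second
        short[] samples = decode(raw, raw.length);
        // Should print a value close to 200 for this pure tone.
        System.out.println(zeroCrossingHz(samples, rate));
    }
}
```

In the capture loop, decode would be called with the byte count returned by targetDataLine.read() rather than the full buffer length, so that stale bytes from a short read are not analysed.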