How can the sound level at specific timestamps of a .wav file be measured using the Sound API in Java?

Question

After searching for over 12 hours, I was unable to find anything regarding this. ALl I could find is how to use functions from the Sound API to measure and change the volume of the device, not the .wav file. It would be great if someone could advise us/tell us how to get and/or change the volume from specific timestamps of a .wav file itself, thank you very much!

Even if it is not possible to change the audio of the .wav file itself, we need to know at least how to measure the volume level at the specific timestamps.

You can read the .wav file and examine the data. What definition of "volume" are you using? — tgdavies, Nov 11 '22 at 22:16
How can I read the .wav file? For the definition, i mean like how loud the sound is, either in Decibels or a linear scale (whichever one the computer offers). — anti_waxxer, Nov 12 '22 at 00:32
See https://stackoverflow.com/questions/3297749/java-reading-manipulating-and-writing-wav-files — tgdavies, Nov 12 '22 at 00:40

score 0 · Answer 1 · answered Nov 12 '22 at 20:12

To deal with the amplitude of the sound signal, you will have to inspect the PCM data held in the .wav file. Unfortunately, the Java Clip does not expose the PCM values. Java makes the individual PCM data values available through the AudioInputStream class, but you have to read the data points sequentially. A code example is available at The Java Tutorials: Using Files and Format Converters.

Here's a block quote of the relevant portion of the page:

Suppose you're writing a sound-editing application that allows the user to load sound data from a file, display a corresponding waveform or spectrogram, edit the sound, play back the edited data, and save the result in a new file. Or perhaps your program will read the data stored in a file, apply some kind of signal processing (such as an algorithm that slows the sound down without changing its pitch), and then play the processed audio. In either case, you need to get access to the data contained in the audio file. Assuming that your program provides some means for the user to select or specify an input sound file, reading that file's audio data involves three steps:

Get an AudioInputStream object from the file.

Create a byte array in which you'll store successive chunks of data from the file.

Repeatedly read bytes from the audio input stream into the array. On each iteration, do something useful with the bytes in the array (for example, you might play them, filter them, analyze them, display them, or write them to another file).

The following code snippet outlines these steps:

int totalFramesRead = 0;
File fileIn = new File(somePathName);
// somePathName is a pre-existing string whose value was
// based on a user selection.
try {
  AudioInputStream audioInputStream = 
    AudioSystem.getAudioInputStream(fileIn);
  int bytesPerFrame = 
    audioInputStream.getFormat().getFrameSize();
    if (bytesPerFrame == AudioSystem.NOT_SPECIFIED) {
    // some audio formats may have unspecified frame size
    // in that case we may read any amount of bytes
    bytesPerFrame = 1;
  } 
  // Set an arbitrary buffer size of 1024 frames.
  int numBytes = 1024 * bytesPerFrame; 
  byte[] audioBytes = new byte[numBytes];
  try {
    int numBytesRead = 0;
    int numFramesRead = 0;
    // Try to read numBytes bytes from the file.
    while ((numBytesRead = 
      audioInputStream.read(audioBytes)) != -1) {
      // Calculate the number of frames actually read.
      numFramesRead = numBytesRead / bytesPerFrame;
      totalFramesRead += numFramesRead;
      // Here, do something useful with the audio data that's 
      // now in the audioBytes array...
    }
  } catch (Exception ex) { 
    // Handle the error...
  }
} catch (Exception e) {
  // Handle the error...
}

END OF QUOTE

The values themselves will need another conversion step before they are PCM. If the file uses 16-bit encoding (most common), you will have to concatenate two bytes to make a single PCM value. With two bytes, the range of values is from -32778 to 32767 (a range of 2^16).

It is very common to normalize these values to floats that range from -1 to 1. This is done by float division using 32767 or 32768 in the denominator. I'm not really sure which is more correct (or how much getting this exactly right matters). I just use 32768 to avoid getting a result less than -1 if the signal has any data points that hit the minimum possible value.

I'm not entirely clear on how to convert the PCM values to decibels. I think the formulas are out there for relative adjustments, such as, if you want to lower your volume by 6 dBs. Changing volumes is a matter of multiplying each PCM value by the desired factor that matches the volume change you wish to make.

As far as measuring the volume at a given point, since PCM signal values can range pretty widely as they zig zag back and forth across the 0, the usual operation is to take an average of the absolute value of many PCM values. The process is referred to as getting a root mean square. The number of values to include in a RMS calculation can vary. I think the main consideration is to make the number of values in the rolling average to be large enough such that they are greater than the period of the lowest frequency included in the signal.

There are some good tutorials at the HackAudio site. This link is for the RMS calculation.

How can the sound level at specific timestamps of a .wav file be measured using the Sound API in Java?

1 Answers1