15

I have a bunch of different audio recordings in WAV format (all different instruments and pitches), and I want to "normalize" them so that they all sound approximately the same volume when played.

I've tried measuring the average sample magnitude (the sum of all absolute values divided by the number of samples), but normalizing by this measurement doesn't work very well. I think this method isn't working because it doesn't take into account the frequency of the sounds, and I know that higher-frequency recordings sound louder than lower-frequency sounds of the same amplitude.

Does anyone know a good method for measuring the loudness of a sound?

MusiGenesis
  • It seems like this depends on a lot of factors outside your control, one of the biggest being the listener's relative sensitivity to various frequencies. That varies quite a bit from one individual to the next. – Rex M Jun 12 '09 at 02:26
  • Just kidding. Yeah, a lot of this will vary by person, but I'd like to generally normalize as well as possible. – MusiGenesis Jun 12 '09 at 02:28
  • @Nosredna: I assume by compression you mean range compression, not mp3-type compression? Although mp3 compression kind of messes up music, too. – MusiGenesis Jul 19 '09 at 01:55
  • I was too pithy. I should have said, "The Loudness Wars Killed Music." – Nosredna Jul 19 '09 at 02:14

5 Answers

15

Root mean square (RMS) is often used to estimate the loudness of sound files. Because RMS averages the signal power over time, a sound that is very loud but very short does not dominate the measurement, which fits the fact that such a sound might not be perceived as loud. Also remember that power is proportional to the square of amplitude.
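A minimal sketch of RMS measurement and RMS-based normalization, assuming the WAV data has already been decoded into a list of floats in the range -1.0 to 1.0. The function names and the `target_rms` default are illustrative choices, not part of any particular library.

```python
import math

def rms(samples):
    """Root mean square: square each sample, average, then take the root."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def rms_dbfs(samples):
    """RMS level in decibels relative to full scale (amplitude 1.0)."""
    level = rms(samples)
    return 20.0 * math.log10(level) if level > 0 else float("-inf")

def normalize_to_rms(samples, target_rms=0.1):
    """Scale samples so their RMS equals target_rms (assumed default)."""
    current = rms(samples)
    if current == 0:
        return list(samples)  # silence: nothing to scale
    gain = target_rms / current
    return [s * gain for s in samples]

# Example: a full-scale sine tone, 100 samples per cycle, 10 cycles.
tone = [math.sin(2 * math.pi * k / 100) for k in range(1000)]
quieter = normalize_to_rms(tone, target_rms=0.1)
```

Note that the gain can push individual samples past ±1.0 when the target is high relative to the peaks, so a clipping check is probably worth adding before writing the result back to a file.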

The audio geeks at Hydrogen Audio know a ton about this stuff...check out their free Replay Gain software. You may not need to do any programming at all.

EDIT: Included comment feedback on power vs. amplitude.

PeterAllenWebb
  • Worked like a charm, thanks. My undergraduate degree was in Physics, so I'm kind of embarrassed that I didn't remember this. I had done something really stupid before like multiplying n samples and taking the nth root, thinking that's what root mean square was. Thanks for saving me from myself. – MusiGenesis Jun 12 '09 at 02:52
  • You might want to pay attention to the fact that not all frequencies are perceived the same by the listener. A certain RMS level of very low frequencies might be perceived as sounding much louder than the same RMS level of high frequencies. – sthg Jun 12 '09 at 03:21
  • Loudness perception is indeed frequency dependent, and follows the equal loudness contours (http://en.wikipedia.org/wiki/Loudness). – Emile Vrijdags Jun 12 '09 at 03:36
  • Correction: Power increases *as the square* of amplitude, not exponentially (i.e. P=kA^2). Otherwise, RMS is indeed the right way to measure average loudness. – Noldorin Jul 19 '09 at 01:41
  • I agree on the Root Mean Square. To decide on the number of samples: "...time-weightings have been standardised, 'S' (1 s) originally called Slow, 'F' (125 ms) originally called Fast and 'I' (35 ms) originally called Impulse." taken from http://en.wikipedia.org/wiki/Sound_level_meter – AudioDroid Nov 18 '10 at 16:27
4

To add to PeterAllenWebb's response:

Before you calculate the RMS, you should "center" your samples first, that is, remove any DC offset (think of a 5-minute .wav where every sample sits at the maximum positive amplitude: its RMS is huge, yet it is silent). The best way to do that is to apply a high-pass filter with a subsonic cutoff frequency.
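Two ways to do the centering, sketched under the assumption that the samples are a list of floats. The first simply subtracts the mean; the second is a common one-pole DC-blocking high-pass filter. The coefficient `r = 0.999` is an assumed value that puts the cutoff at roughly (1 - r) * fs / (2 * pi), about 7 Hz at 44.1 kHz.

```python
def remove_dc_mean(samples):
    """Subtract the average value so the signal is centered on zero."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]

def dc_block(samples, r=0.999):
    """One-pole DC blocker: y[n] = x[n] - x[n-1] + r * y[n-1].

    r close to 1.0 gives a very low (subsonic) cutoff; 0.999 is an
    assumed default, not a standardized value.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out

# Example: a constant (pure DC) signal decays toward zero after filtering.
stuck = [1.0] * 20000
filtered = dc_block(stuck)
```

Subtracting the mean is enough when the offset is constant over the whole file; the filter also handles an offset that drifts slowly over time.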

That would still not take into account which frequencies humans are most sensitive to. To do that, you could use A-weighting. There's a page where you can calculate it online: http://www.diracdelta.co.uk/science/source/a/w/aweighting/source.html

The code seems to be here: http://www.diracdelta.co.uk/science/source/a/w/aweighting/multicalc.js
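For reference, the A-weighting curve itself can be sketched with the commonly published analog formula; this is not the code from the linked page. The helper `a_weighted_tone_rms` is a hypothetical shortcut that only makes sense for a recording dominated by a single known pitch; a real implementation would weight the whole spectrum (for example with a filter) rather than one frequency.

```python
import math

def a_weighting_db(freq_hz):
    """A-weighting gain in dB at freq_hz, roughly 0 dB at 1 kHz."""
    if freq_hz <= 0:
        return float("-inf")
    f2 = freq_hz * freq_hz
    numerator = (12194.0 ** 2) * f2 * f2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.0 dB offset normalizes the curve to about 0 dB at 1 kHz.
    return 20.0 * math.log10(numerator / denominator) + 2.0

def a_weighted_tone_rms(rms_value, fundamental_hz):
    """Hypothetical shortcut: scale a measured RMS by the A-weighting
    gain at the recording's fundamental frequency only."""
    return rms_value * 10.0 ** (a_weighting_db(fundamental_hz) / 20.0)

# Low notes are attenuated heavily, notes near 1 kHz hardly at all.
low_gain = a_weighting_db(100.0)
mid_gain = a_weighting_db(1000.0)
```

At 100 Hz the curve sits about 19 dB below its 1 kHz value, which is the frequency dependence that plain RMS misses.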

Wouter van Nifterick
  • I'm finding that normalizing by RMS works a lot better than normalizing by peak value in terms of getting sounds at the same pitch to be roughly equal in volume, but the RMS measurement seems relatively insensitive to pitch, so it's not doing what I want (which is to lower the volume for high-pitched sounds). Webb's wikipedia link showed the frequency response curve for human hearing, but thank you especially for the link to the formula - it's going into code tonight. – MusiGenesis Jul 19 '09 at 01:46
  • How is RMS measure different from log(amplitude)? https://stackoverflow.com/questions/2445756/how-can-i-calculate-audio-db-level (I mean of course practically, not mathematically) Can log-ging the RMS give us a logarithmic scale (decibels)? – Tomasz Gandor Jan 28 '20 at 18:48
3

Well, not being an expert on audio, and adding to the previous comment: first decide what you consider the "shortest amount of time for peak power". Then convert the wave to raw floating point, compute the RMS over consecutive chunks of that length, and take the maximum. That gives you your highest peak power.

2

To reiterate what some other people have said, use RMS value to estimate the "loudness" of a passage of sound.

But, if you're dealing with impulsive sounds like plucking or drum hits, you'd want to do a sliding RMS value and pick out only the peak RMS value. Measure 100 ms of the sound, slide the window, measure again, etc. and then normalize according to the largest value you find.
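The sliding measurement could look roughly like this, assuming float samples and a known sample rate. The 100 ms window comes from the text above; the 50 ms hop is an assumed value, and `peak_windowed_rms` is an illustrative name.

```python
import math

def peak_windowed_rms(samples, sample_rate, window_s=0.1, hop_s=0.05):
    """Largest RMS found over sliding windows of window_s seconds.

    hop_s (assumed 50 ms) is how far the window slides each step.
    """
    if not samples:
        return 0.0
    window = max(1, int(round(window_s * sample_rate)))
    window = min(window, len(samples))  # short clips: one window over everything
    hop = max(1, int(round(hop_s * sample_rate)))
    peak_mean_square = 0.0
    for start in range(0, len(samples) - window + 1, hop):
        chunk = samples[start:start + window]
        mean_square = sum(s * s for s in chunk) / window
        peak_mean_square = max(peak_mean_square, mean_square)
    return math.sqrt(peak_mean_square)

# Example: a short burst surrounded by silence, at a 1 kHz sample rate.
burst = [0.0] * 1000 + [0.5] * 200 + [0.0] * 1000
peak = peak_windowed_rms(burst, sample_rate=1000)
```

For the burst above, the whole-file RMS is pulled down by the surrounding silence, while the windowed peak reflects the level of the burst itself, which is the behavior wanted for plucks and drum hits.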

Definitely remove any DC value before doing the RMS, and A-weighting will make it more like how we hear. Here's code for A-weighting in MATLAB/Octave and Python.

endolith
-1

I might be way off here, but if you have WavePad, you can load in multiple files and adjust the volumes a little so they are all the same. Also, if certain sections of a file are louder, you can select a section and lower the volume for just that section.

EDIT: And sorry, it's not really a "method" for measuring volume, but if you just need to make them all the same this should work fine.