
I am writing a music player and I want to normalize the audio volume across different songs.

I could think of some different ways to do this, e.g.:

  1. Go through all PCM samples (assume floating point from -1 to 1) and compute m = max(abs(sample)). Then apply the factor 1/m to all the PCM samples. This puts the peak at 1.

  2. Go through the PCM stream and, for each position, take a Hann (Hanning) window of some width around it and calculate the windowed average of the absolute sample values. From those averages, pick the maximum and normalize everything by it.

  3. The same as 2 but some other way to get some sort of averaged value.
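
To make options 1 and 2 concrete, here is a minimal sketch in plain Python. The function names (`peak_normalize`, `hann_weights`, `max_windowed_level`) and the `width` parameter are my own placeholders, not taken from any library. The window drops the zero-valued endpoints of the textbook Hann window so that very short windows still have non-zero weight. It is a naive O(n * width) loop, meant only to illustrate the idea.

```python
import math


def peak_normalize(samples, target=1.0):
    """Option 1: scale so the largest absolute sample equals `target`."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    factor = target / peak
    return [s * factor for s in samples]


def hann_weights(width):
    """Hann-shaped weights without the zero endpoints (assumes width >= 1)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * (i + 1) / (width + 1))
            for i in range(width)]


def max_windowed_level(samples, width):
    """Option 2: largest Hann-weighted average of |sample| over all positions.

    If the stream is shorter than `width`, the single partial window is
    still divided by the full weight sum, so the level is underestimated.
    """
    weights = hann_weights(width)
    total_weight = sum(weights)
    best = 0.0
    for start in range(max(1, len(samples) - width + 1)):
        chunk = samples[start:start + width]
        level = sum(w * abs(s) for w, s in zip(weights, chunk)) / total_weight
        best = max(best, level)
    return best


# Option 1 applied directly; option 2 only yields a level to normalize by.
normalized = peak_normalize([0.5, -0.25, 0.1])
level = max_windowed_level([0.5, -0.25, 0.1, 0.0], width=2)
```

For option 2, the gain would then be some target level divided by `max_windowed_level(...)`, which is exactly where peaks can exceed 1.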

2 and 3 have the disadvantage that some peaks may end up above 1, so I might need some clipping and thus lose some quality. By normalizing not to 1 but to 0.95 or so, I could maybe avoid this to some degree. On the other hand, I think 2 and 3 have the advantage that the result might be the more natural normalization for the user. Wikipedia also has some information about this and mentions RMS, ReplayGain and EBU R128 as ways to measure the loudness of a song.
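
One way to look at the clipping trade-off is to compute an RMS-based gain and then cap it so the peak stays below a ceiling such as 0.95, instead of clipping afterwards. This is only a sketch under my own assumptions: `target_rms=0.1` is an arbitrary illustrative value, and `loudness_gain` is a hypothetical helper, not part of ReplayGain or EBU R128.

```python
import math


def rms(samples):
    """Root mean square of the samples (0.0 for an empty stream)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def loudness_gain(samples, target_rms=0.1, peak_ceiling=0.95):
    """Gain that moves the RMS level toward `target_rms` without letting
    the peak exceed `peak_ceiling` (assumed values, chosen for illustration)."""
    level = rms(samples)
    if level == 0.0:
        return 1.0  # silence: leave untouched
    peak = max(abs(s) for s in samples)
    wanted = target_rms / level          # gain the loudness measure asks for
    allowed = peak_ceiling / peak        # largest gain that avoids clipping
    return min(wanted, allowed)


gain = loudness_gain([0.5, -0.5, 0.25, -0.25])
```

With the cap, a quiet track with one loud transient ends up quieter than the target rather than distorted. Which of the two is preferable is a design decision.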

How do other popular music players (such as iTunes) do this?

Albert
  • #1 is the definition of normalizing. forget #2 and #3. – Bjorn Roche Sep 19 '12 at 01:50
  • @BjornRoche: I'm not sure what you mean by that comment. From what I have read so far, most music players use ReplayGain or some similar method to measure the loudness. You can read everywhere that #1 is very much not a solution you want. – Albert Sep 19 '12 at 08:40
  • Sorry, let me try again: When A/V professionals speak of normalizing audio, they are referring to #1. It doesn't fully accomplish what you want (though it might be a reasonable first approximation) but it does have meaning in the A/V world. #2 and #3 have no meaning. – Bjorn Roche Sep 19 '12 at 14:13
  • @BjornRoche: It seems you have some professional audio background, so I might just believe you there. Note that Wikipedia however calls this specifically **peak normalization** and uses *normalization* itself generically as I have ([here](http://en.wikipedia.org/wiki/Audio_normalization)). – Albert Sep 30 '12 at 22:12
  • We're both right :) If you say "the files are normalized" without any other qualifiers, you are talking about peak normalization, to 0dBFS or +/- 1, depending on the file type. To get the other meanings you have to say something like: "normalized, by which I mean...." or "loudness normalized (to XXXX)", but it's all normalizing. – Bjorn Roche Oct 01 '12 at 02:07

1 Answer


iTunes uses the Sound Check technology. "Sound Check is a proprietary Apple Inc. technology similar in function to ReplayGain. It is available in iTunes and on the iPod." (from Wikipedia) Since it is proprietary, it is not an option for me.

It seems that ReplayGain is the most common technique. The algorithm is explained here. Sample implementations are mp3gain (GPL) and ffmpeg-replaygain (GPL, derived from mp3gain). I now have my own implementation in my MusicPlayer project (BSD license).
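
The analysis part of ReplayGain (measuring the loudness) is too long to show here, but applying its result at playback time is simple. ReplayGain yields a gain in dB and a peak value per track or album. The sketch below converts that into a linear factor and optionally limits it so the peak does not exceed full scale. The function name and parameters are my own, not taken from mp3gain or any other implementation.

```python
def replaygain_scale(gain_db, peak=None, prevent_clipping=True):
    """Convert a ReplayGain value in dB to a linear sample multiplier.

    `peak` is the stored peak amplitude of the track (1.0 = full scale).
    If it is given and clipping prevention is on, the factor is reduced
    so that peak * factor never exceeds 1.0.
    """
    scale = 10.0 ** (gain_db / 20.0)  # dB to linear amplitude ratio
    if prevent_clipping and peak:
        scale = min(scale, 1.0 / peak)
    return scale


def apply_gain(samples, scale):
    """Multiply every floating-point PCM sample by the factor."""
    return [s * scale for s in samples]


# Hypothetical values for one track: -6 dB gain, stored peak of 0.9.
factor = replaygain_scale(-6.0, peak=0.9)
quieter = apply_gain([0.9, -0.45], factor)
```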

See also these projects with implementations:

Albert
  • Link to the algorithm leads to a parked domain. Alternative sources are [Replay Gain on Wikipedia](http://en.wikipedia.org/wiki/ReplayGain) and the official [ReplayGain 1.0 specification](http://wiki.hydrogenaud.io/index.php?title=ReplayGain_specification) – cod3monk3y Oct 09 '14 at 16:08