FFT and decibel scales

Question

I take audio data on an iPhone (i.e. real data), perform an FFT and then take the squared magnitudes (Re^2 + Im^2).

These vary from >0 to some large numbers, so I do 10log(n) to get it in dB.

This gives me outputs that are negative (for the inputs that were < 1) to positive.

But the examples I've seen for this (and also drawing the spectrum in Sonic Visualiser) always have positive spectrums when measured in dB.

So what have I missed?!

On a wider note, as I understand it decibels are a ratio, so in this context when turning the FFT magnitudes into dB, what are they a ratio to?

possible duplicate: http://stackoverflow.com/questions/2445756/how-can-i-calculate-audio-db-level — Robert Harvey, Feb 14 '13 at 18:13
In short, you're right; it is a ratio. You have to establish a "reference level," as described in the [Wikipedia article](http://en.wikipedia.org/wiki/Decibel). For example, with Compact Discs, zero dB is considered the highest possible magnitude, so all other values are going to be negative, down to about -92 dB. — Robert Harvey, Feb 14 '13 at 18:15
It's not a dupe of stackoverflow.com/questions/2445756/ , I'm specifically asking about the outputs from FFT bins. That is a general question measuring sound levels with microphones. — Mark, Feb 14 '13 at 18:17
OK, but the principle is the same; the answers provided there explain how to establish a "reference level." — Robert Harvey, Feb 14 '13 at 18:18
I read the Wikipedia article on this as well, which says the reference level for audio is defined as 1dB = 20 micropascals. So presumably for audio you would go `10log(20/val)` where val is one's measurement in pascals? I can't find anywhere what the reference level for FFTs is, or why I get a mix of negative and positive numbers when other sources always show spectrums as positive. There's probably a really simple answer I've just missed! — Mark, Feb 14 '13 at 18:21

tom10 · Accepted Answer · 2013-02-27T16:52:12.300

The simple answer is that, for the most part, you can add an arbitrary number to the dB value to make the values all positive, or all negative, or whatever you prefer. With an uncalibrated microphone, like on the iPhone, this is all that makes sense anyway, since all you know are relative values.
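
As a rough illustration of the point above, the following sketch converts the same power spectrum to dB against two different references. The test signal, the helper names `power_spectrum` and `to_db`, and the `floor` value are all assumptions made for the example, and the naive DFT stands in for whatever FFT routine is actually used. It only aims to show that the two dB spectra differ by a constant offset.

```python
import cmath
import math

def power_spectrum(samples):
    """Naive DFT; returns Re^2 + Im^2 for bins 0..N/2 of a real signal."""
    n = len(samples)
    powers = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        powers.append(acc.real ** 2 + acc.imag ** 2)
    return powers

def to_db(powers, reference, floor=1e-12):
    """10*log10(power / reference); `floor` avoids taking the log of zero."""
    return [10 * math.log10(max(p, floor) / reference) for p in powers]

# Hypothetical test signal: two sinusoids in 64 samples, amplitudes below 1.
N = 64
signal = [0.5 * math.sin(2 * math.pi * 4 * t / N)
          + 0.01 * math.sin(2 * math.pi * 10 * t / N) for t in range(N)]

powers = power_spectrum(signal)
db_raw = to_db(powers, reference=1.0)           # mix of positive and negative
db_peak = to_db(powers, reference=max(powers))  # all <= 0, peak bin at 0 dB

# Changing the reference only shifts every bin by the same amount.
offset = 10 * math.log10(max(powers))
```

Here `db_raw[k] - db_peak[k]` equals `offset` for every bin, so the shape of the spectrum is the same either way; only the zero point moves.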

For a more advanced technical approach, using a calibrated microphone, you could reference everything using dB (SPL), as a reasonable standard, but this is a hassle, and not meaningful in your use case anyway.

Rationale:
The main reason that shifting by an arbitrary amount is valid is that the log doesn't carry the units of measurement. For example, even if you know the input amplitude is 0.1 Pascal, it's completely valid to say this is 100 milliPascal, where you'd be taking the log of 100 rather than 0.1 (so the log values are either 2 or -1). Both are completely valid and the choice is entirely arbitrary. When comparing to a standard reference, as in dB SPL, note that it's done as a ratio, log(P/P_ref), which removes the impact of changing the units.
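
The unit example can be checked numerically. This is a minimal sketch: the helper name `level_db` is made up for the illustration, the factor of 20 assumes an amplitude-like quantity such as pressure, and 20 micropascals is the usual dB SPL reference in air.

```python
import math

def level_db(value, reference):
    """Level in dB of an amplitude-like quantity relative to `reference`."""
    return 20 * math.log10(value / reference)

# The same pressure and the same reference, each written in two units.
p_pa, p_mpa = 0.1, 100.0          # 0.1 Pa is 100 mPa
ref_pa, ref_mpa = 20e-6, 20e-3    # 20 micropascals is 0.02 mPa

# Taking the log of the bare number depends on the unit chosen
# (about -20 for pascals, about +40 for millipascals)...
bare_pa = 20 * math.log10(p_pa)
bare_mpa = 20 * math.log10(p_mpa)

# ...but the ratio to a reference in the same unit does not.
spl_from_pa = level_db(p_pa, ref_pa)
spl_from_mpa = level_db(p_mpa, ref_mpa)
```

Both `spl_from_pa` and `spl_from_mpa` come out at roughly 74 dB SPL, while the two bare values differ by 60 dB, which is just the factor of 1000 between the units.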

Thanks for the answer, I guess that makes sense. For others reading this, this idea confused me initially. However, adding/subtracting values from logs is the same as scaling by a multiple before taking logs (i.e. log(a/b) = log(a) - log(b)), and so essentially you are just scaling your inputs (e.g. divide everything by the value in the smallest bin, so your minimum value is 1). You'll need to add a small value to all bins so they are all non-zero. — Mark, Feb 15 '13 at 09:39
@Mark: see my edit above for more on the rationale for shifting. (I added this in long after the answer, but it's been nagging at me that I needed to clarify this for subsequent readers.) — tom10, Feb 27 '13 at 16:54

score 0 · Answer 2 · answered Feb 15 '13 at 03:53

Since an FFT is a linear operator, the scale of the output of an FFT is proportional to the scale of the data input to the FFT. The scale of the input to the FFT on your iPhone depends on the gain of the mic, audio filters, potentially the AGC, and the ADC reference. Since these are all undocumented and can vary (by position of the mic, model of device, input gain which may depend on the audio session configuration, etc.), you won't know the ratio unless you perform some sort of calibration against a known reference.
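
A small sketch of what linearity means for the dB values, under stated assumptions: the decaying test signal, the `gain` of 0.25, and the helper `magnitude_db` are hypothetical, and a naive DFT is used in place of a real FFT routine. Note that 20*log10(|X|) is the same quantity as 10*log10(Re^2 + Im^2).

```python
import cmath
import math

def magnitude_db(samples):
    """20*log10(|X[k]|) for bins 0..N/2, using a naive DFT."""
    n = len(samples)
    out = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        out.append(20 * math.log10(abs(acc)))
    return out

# Hypothetical signal: a decaying exponential, which has energy in every bin.
N = 32
signal = [0.9 ** t for t in range(N)]

# Stand-in for an unknown input gain somewhere in the capture chain.
gain = 0.25
scaled = [gain * x for x in signal]

db_original = magnitude_db(signal)
db_scaled = magnitude_db(scaled)

# Every bin moves by the same amount, 20*log10(gain), about -12 dB here.
expected_shift = 20 * math.log10(gain)
shifts = [b - a for a, b in zip(db_original, db_scaled)]
```

Every entry of `shifts` matches `expected_shift`, so an unknown gain shows up as an unknown constant offset in dB, which is why calibration against a known reference is needed for absolute levels.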

hotpaw2


My main confusion though, is why do I get a mix of +ve and -ve values, whereas when I draw dB values using other packages, the values are always +ve? tom10 above has provided one suggestion, that you can just add an arbitrary number to make them all +ve, which sounds a bit odd! – Mark Feb 15 '13 at 09:31
Easy. Those other packages just scale the inputs, the outputs, and/or the dB results into the desired range, which depends on how they relate to the desired reference value. – hotpaw2 Feb 15 '13 at 21:05
