I have an audio file and I want a float variable to be synced with it. The float variable will then be used to create a graphical indication of the audio file being played.

What I would like to happen:

On every other beat in the audio file (first beat, third beat, fifth beat, and so on), the variable should be 0.0f. On the remaining beats (second beat, fourth beat, sixth beat, and so on), the variable should be 1.0f.

On top of that, I would like the variable to "slide" between 0.0 and 1.0 over time, between the beats. My first thought is to use the sin function from the C++ standard library.

The information I have about the audio file:

  • The tempo / BPM of the file
  • The length of the file (in seconds)
  • How many beats in total the audio file consists of
  • The position of the file, while it's being played back. I know where in the song, in seconds, I'm currently at. For example, if the song has played for 3 and a half seconds, I get 3.5f from the function I'm using

Besides this, I also have a deltatime and a lifetime, which tells me how long (in seconds) the application has been running.

Since the sin-function takes a float (or double) as parameter, what I need help with is the calculation, which can then be passed as the parameter to the function, which will then be used to generate a sine wave synced with the audio file.

Baum mit Augen
Daniel_1985
  • Do you have access to the sample rate of the file and the sample size in bytes? (It's usually located in the header.) If you do, you could read out the audio data block and extract the stereo/mono values, which you could then analyze (using your BPM information) so that you could update your float variable while the music is playing – tbvanderwoude Jul 10 '16 at 11:40
  • Yes, I have access to the sample rate (which is 44,100 Hz) and the sample size (which is 16). In the music program I'm using, they call it 'Bit Depth' but I think that's the same thing as the sample size you're referring to. Any tips on how the analysis of the data block can be done? I have the opportunity to get something called FFT data from the audio file, can this be of any help for me? – Daniel_1985 Jul 10 '16 at 11:51
  • I also have the opportunity to get a float pointer containing "256 samples of the currently playing sound (post-clipping)". – Daniel_1985 Jul 10 '16 at 12:17
  • That's indeed what it's actually called and I believe the FFT data is just the binary data block including the audio values you need. In order to correctly sync your sine wave, you will need to know the exact 'peak' of the first beat, so that you could then just modify your sine function by adding an offset and modifying the frequency according to the given BPM. Detecting this first beat is the tricky part since just detecting highs (talking about volume) in an audio file doesn't always suffice. By the way, you are referring to some sort of program: Is this actually about a c++ implementation? – tbvanderwoude Jul 10 '16 at 12:27
  • Sounds a bit tricky, yes. When I'm talking about a program, I'm just referring to the music program I used to create the song. In my project (c++ project), I then use an audio library called SoLoud, which has functions that return the length of a song, the position of the currently played song, FFT data etc. The float value (i.e., the sine wave variable) will then be used in the game I'm currently making. Not sure if I misunderstand your question but it is indeed a c++ implementation I'm working on. – Daniel_1985 Jul 10 '16 at 12:39
  • Would you like to help me a bit with the beat detection-code and modifying the sine function with an offset-code? Perhaps if you could write some pseudo code on how I should start and how to think, because I'm kinda lost here on how to solve it all. – Daniel_1985 Jul 10 '16 at 13:56
  • Yes I'll try, but I'm sure my solution won't work for all audio files (Meanwhile, keep looking for ways to detect a beat (check this question+answer out http://stackoverflow.com/questions/657073/how-to-detect-the-bpm-of-a-song-in-php)) – tbvanderwoude Jul 10 '16 at 14:10

1 Answer

While this solution is not rock-solid, it should be able to detect the beat of a simple audio file.

[Image: waveform of a simple audio file in which the beats are clearly visible]

Our goal is to find the offset value so that we can create our cos function. One way to find it is to first calculate the average volume of the file, taking the absolute value of each sample, since the sample values swing below zero, as the waveform shows. This average serves as a threshold that filters out low-level noise. Next, we loop through all of the samples (you can probably just pick a single audio channel) and find the first sample whose absolute value is higher than this average. This yields an offset value in samples (NOT SECONDS).
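The threshold search described above could be sketched in C++ roughly as follows. This is only a sketch under assumptions: the decoded samples of one channel are assumed to be available as a `std::vector<float>`, and the function names `averageAbsLevel`, `firstOnsetSample` and `samplesToSeconds` are made up for illustration (they are not part of SoLoud or any other library).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Mean of |sample| over the whole buffer; returns 0 for an empty buffer.
double averageAbsLevel(const std::vector<float>& samples) {
    if (samples.empty()) return 0.0;
    double sum = 0.0;
    for (float s : samples) sum += std::fabs(s);
    return sum / static_cast<double>(samples.size());
}

// Index of the first sample whose absolute value exceeds the average level.
// Returns samples.size() if no sample does (e.g. silence or a constant signal).
std::size_t firstOnsetSample(const std::vector<float>& samples) {
    const double threshold = averageAbsLevel(samples);
    for (std::size_t i = 0; i < samples.size(); ++i) {
        if (std::fabs(samples[i]) > threshold) return i;
    }
    return samples.size();
}

// Converts an offset in samples to seconds (sampleRate in Hz, e.g. 44100).
double samplesToSeconds(std::size_t offsetInSamples, double sampleRate) {
    return static_cast<double>(offsetInSamples) / sampleRate;
}
```

Calling `samplesToSeconds(firstOnsetSample(samples), 44100.0)` would then give the offset in seconds. As said, this will probably only behave well on simple files with a quiet lead-in and a clear first beat.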

Since we know the current playback position in seconds, we can now calculate the value of the cos function that visualizes the audio by plugging in the frequency and the offset we calculated previously.

Please correct me if I'm wrong, I suck at math.

A regular beat with a given BPM should look like this.

BPS=BPM/60

g(x)=cos((x*BPS)*360)

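A beat with an offset should then look like this (the offset is subtracted from the time before multiplying by BPS, so that both are in seconds and the first peak lands on the detected beat):

OFFSET_IN_SECONDS=OFFSET_IN_SAMPLES/SAMPLERATE

g(x)=cos(((x-OFFSET_IN_SECONDS)*BPS)*360)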

IMPORTANT: THESE FUNCTIONS USE DEGREES, AS OPPOSED TO THE RADIANS THAT C++ USES. When calling std::cos, replace 360 with 2*pi.
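Here is a rough C++ sketch of how the float from the question might be computed in radians. Note that it is a variation on g(x), not the same function: it uses half a cosine cycle per beat and rescales the result to the range 0..1, so that the value is 0.0 on beats 1, 3, 5, ... and 1.0 on beats 2, 4, 6, ... as the question asks. The name `beatPulse` and its parameters are made up for illustration, and `offsetSeconds` is assumed to be the time of the first beat.

```cpp
#include <cmath>

// Returns a value in [0, 1] that is 0 on the first, third, fifth... beat
// and 1 on the second, fourth, sixth... beat, sliding smoothly in between.
//   positionSeconds: current playback position in seconds
//   bpm:             tempo of the track in beats per minute
//   offsetSeconds:   time of the first beat in seconds (assumed known)
float beatPulse(double positionSeconds, double bpm, double offsetSeconds) {
    const double pi = std::acos(-1.0);
    const double beatsPerSecond = bpm / 60.0;
    // Number of beats elapsed since the first beat (may be fractional).
    const double beats = (positionSeconds - offsetSeconds) * beatsPerSecond;
    // cos(pi * beats) is +1 on even beat counts and -1 on odd ones;
    // 0.5 * (1 - cos) maps that to 0 and 1 respectively.
    return static_cast<float>(0.5 * (1.0 - std::cos(pi * beats)));
}
```

In the game loop you would pass in whatever playback position your audio library reports each frame. For example, at 120 BPM with no offset, a position of 0.25 seconds is halfway between the first and second beat, so the value is 0.5.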

tbvanderwoude
  • Really cool, thank you! One question: what is g(x) and the x inside the cos-function(s)? – Daniel_1985 Jul 10 '16 at 15:03
  • @Naith X is the current position in the song in seconds, g(x) could be (if it's actually valid) the function you use to generate the float you need for the visualization (float g(float time); in code) – tbvanderwoude Jul 10 '16 at 15:46
  • Instead of calculating the volume with absolute values you should calculate the root-mean-square (RMS) instead. This gives a far more sensible value and also is robust against being set off by transients (beats are transients, BTW). – datenwolf Jul 10 '16 at 15:58
  • @datenwolf After I looked up some information about RMS, it seems like the RMS of the final function equals 2^-0.5. Is this correct or am I doing it wrong? – tbvanderwoude Jul 10 '16 at 16:51
  • @thomw2o0o: You'd apply the RMS on the incoming audio signal. For a sine the RMS is amplitude·sqrt(1/2) – note that amplitude is not peak to peak but peak to zero. There's probably very little sense in determining the RMS of the final function. – datenwolf Jul 10 '16 at 17:52
  • Sorry to be a noob, and to ask all these questions, but I have no clue how to calculate the average volume of my audio file. I understand everything else that you've written, and have inserted that into my code, and it's only that one detail on calculating the average volume left. Math is not my strongest side. – Daniel_1985 Jul 10 '16 at 19:32
  • It's really simple: You loop through all of your samples while adding the absolute (negative values get multiplied by -1 basically) value of sample[i] to an int outside of the loop. Next you divide this int by the total number of samples you previously looped through and then you've got your average volume. While math is apparently not your strongest side, I do suggest that you check out @datenwolf's way using RMS – tbvanderwoude Jul 10 '16 at 20:28
  • Now I understand and I will be able to get it to work in code. Big thanks for all your help, much appreciated! – Daniel_1985 Jul 10 '16 at 20:34
  • You're welcome :). Feel free to pm me whenever you end up stuck – tbvanderwoude Jul 10 '16 at 21:21
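
For reference, the RMS level suggested in the comments could be computed along these lines. Again this is just a sketch: it assumes the samples are available as floats in a `std::vector<float>`, and `rmsLevel` is a made-up name, not a library function.

```cpp
#include <cmath>
#include <vector>

// Root-mean-square level of the buffer: sqrt(mean(sample^2)).
// Returns 0 for an empty buffer.
double rmsLevel(const std::vector<float>& samples) {
    if (samples.empty()) return 0.0;
    double sumSquares = 0.0;
    for (float s : samples) {
        // Accumulate in double to limit rounding error on long buffers.
        sumSquares += static_cast<double>(s) * static_cast<double>(s);
    }
    return std::sqrt(sumSquares / static_cast<double>(samples.size()));
}
```

The result could be used as the threshold in place of the plain average of absolute values. Unlike the int accumulator mentioned above, the sum here is kept in a double, which avoids overflow and truncation on long files.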