This is a fairly generic RTFM DSP question from someone very comfortable with audio production and software, but new to audio software development: why is there such a large difference between the size of an uncompressed file (wav, caf, aiff; 44.1 kHz sample rate, 16-bit) on disk and the size of that same audio as float values in memory?
For example, I have a test WAV file that, according to macOS, is seven minutes and fourteen seconds long (7:14) and 83.4 MB in size.
If I import this file into my project, open it as an AKAudioFile, and inspect the .floatChannelData property (an array of two arrays, one per channel, two being standard for a stereo file), this particular file comes out to roughly 23 million floats and around 180 megabytes on the heap. The general expansion makes sense, since the 16-bit samples on disk get decoded to Float in memory. What I can't fully account for is the exact size: Swift's standard Float is a 32-bit type at 4 bytes per value, so 23 million floats should be closer to 90 MB, and the ~180 MB I measure suggests the samples are either held twice somewhere or stored as 64-bit Doubles.
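For reference, here is roughly what that inspection looks like in my test project (this assumes AudioKit 4's AKAudioFile API, and the file path is a placeholder):

```swift
import Foundation
import AudioKit

// Placeholder path to the 7:14 test file described above
let testFileURL = URL(fileURLWithPath: "/path/to/test.wav")

do {
    let file = try AKAudioFile(forReading: testFileURL)

    // floatChannelData should be [[Float]]: one Float array per channel
    if let channels = file.floatChannelData {
        let totalFloats = channels.reduce(0) { $0 + $1.count }
        // MemoryLayout<Float>.size is 4 on Apple platforms, so I expected
        // roughly totalFloats * 4 bytes, not the ~180 MB I see in Instruments
        print("floats:", totalFloats,
              "approx bytes:", totalFloats * MemoryLayout<Float>.size)
    }
} catch {
    print("failed to open file:", error)
}
```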
I understand where the size comes from; however, I am hoping to work with something closer to 16-bit. In my applications I am only analyzing this audio, not processing it in any way, and even after some basic optimizations and avoiding deep copies, any audio exceeding ten or so minutes ends up using gigabytes of memory on the heap.
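To put rough numbers on that (back-of-the-envelope, stereo at 44.1 kHz):

```swift
// Per minute of stereo audio at 44.1 kHz:
let samplesPerMinute = 60 * 44_100 * 2           // 5,292,000 samples

let bytesPerMinuteInt16 = samplesPerMinute * 2   // ~10.6 MB as 16-bit ints
let bytesPerMinuteFloat = samplesPerMinute * 4   // ~21.2 MB as 32-bit Floats

// Ten minutes of raw Float samples is therefore ~212 MB before any copies;
// every accidental deep copy of the channel arrays doubles that again.
print(bytesPerMinuteInt16, bytesPerMinuteFloat)
```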
According to this SO question there are some novel ways to convert 32-bit to 16-bit, but honestly that feels like the wrong, or at least overkill, approach for what I want to do. As a point of comparison, if I simply reference floatChannelData on my AKAudioFile, it immediately adds around 300 MB to the heap, even without copying, appending, etc.
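For context, this is the kind of direct 16-bit read I was hoping was possible, using plain AVFoundation rather than AudioKit (untested sketch, placeholder path; I may be missing a reason this doesn't fit AudioKit's pipeline):

```swift
import AVFoundation

let url = URL(fileURLWithPath: "/path/to/test.wav")

do {
    // Ask AVAudioFile to present the samples as 16-bit integers,
    // so nothing ever gets expanded to Float32
    let file = try AVAudioFile(forReading: url,
                               commonFormat: .pcmFormatInt16,
                               interleaved: false)

    let frameCount = AVAudioFrameCount(file.length)
    guard let buffer = AVAudioPCMBuffer(pcmFormat: file.processingFormat,
                                        frameCapacity: frameCount) else {
        fatalError("could not allocate buffer")
    }
    try file.read(into: buffer)

    // int16ChannelData is a raw pointer per channel; reading through it does
    // not create the Swift Array copies I seem to get from floatChannelData
    if let channels = buffer.int16ChannelData {
        let left = UnsafeBufferPointer(start: channels[0],
                                       count: Int(buffer.frameLength))
        print("frames read:", buffer.frameLength,
              "first sample:", left.first ?? 0)
    }
} catch {
    print("read failed:", error)
}
```

If something like this works, the in-memory footprint should stay close to the on-disk size rather than doubling into Float territory.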
For the more experienced DSP audio developers out there: are there any good resources on heap/stack management for large floating-point buffers? Can AudioKit record to 16-bit? I am currently doing my processing in C and C++, so I am very comfortable doing any math or conversions there if that is more performant. Any leads are much appreciated, thank you!