Hello Stack Overflow users,
I have a function that opens a .wav file and returns its sample rate, length, and samples. It works perfectly with small files, but when I try to load a 1 GB .wav file, I get a "MemoryError was unhandled by user code" window. Here is my function:

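import wave
import numpy as np

def OpenWavFile(fileName):
    waveFile = wave.open(fileName, 'r')
    sampFreq = waveFile.getframerate()
    length = waveFile.getnframes()  # number of frames, not bytes

    # readframes returns the whole file as one bytes object; fromstring then copies it
    byteList = np.fromstring(waveFile.readframes(length), dtype = np.int16)

    return sampFreq, length, byteList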

Using breakpoints, I saw that the value of the length variable is 472289280, what fits in int range. I have also tried different types in place of dtype = np.int16.

Is there a limitation of numpy? Or where is the problem?

My laptop has 8 GB of RAM.

HEADLESS_0NE
Andrey Mazur
  • Have you tried with the syntax `with open(fileName, 'r') as wavefile: ...`? – HEADLESS_0NE May 24 '17 at 14:15
  • Have you caught the exception and inspected it? https://docs.python.org/2/library/exceptions.html#exceptions.MemoryError – Attie May 24 '17 at 14:17
  • How are you internally storing the audio samples? If they are not stored in a raw binary format, that can greatly increase the amount of memory used. – pcarter May 24 '17 at 14:18
  • "what fits in int range" What do you mean? Python variable sizes are extendable. – gonczor May 24 '17 at 14:29
  • The data is at least duplicated in memory. It is returned from `readframes`, and `byteList` needs to be allocated before that can be freed again. It should still fit in memory, but who knows what else is going on in `fromstring`. (Are you possibly running 32-bit Python?) You could check whether [`scipy.io.wavfile.read`](https://docs.scipy.org/doc/scipy-0.19.0/reference/generated/scipy.io.wavfile.read.html) works for you. It even supports memory mapping if you don't want to have the file completely in memory. – MB-F May 24 '17 at 14:33
  • Post the full traceback. – John Zwinck May 24 '17 at 15:06
  • How many channels does the file have? Have you verified that the sample width is what you expect? Check `waveFile.getnchannels()` and `waveFile.getsampwidth()`. – Warren Weckesser May 24 '17 at 15:42
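The checks suggested in the comments above can be sketched roughly as follows. This is only an illustration, not the asker's code: `describe_wav` and `read_wav_chunked` are hypothetical helper names, and the sketch assumes 16-bit little-endian PCM data. It reads the header fields mentioned by Warren Weckesser and then fills a preallocated array chunk by chunk, so the full raw byte string and the array never have to coexist in memory.

```python
import wave
import numpy as np


def describe_wav(file_name):
    """Return the header fields that determine how much memory the samples need."""
    with wave.open(file_name, 'rb') as wav_file:
        return {
            'channels': wav_file.getnchannels(),
            'sample_width': wav_file.getsampwidth(),  # bytes per sample
            'frame_rate': wav_file.getframerate(),
            'frames': wav_file.getnframes(),
        }


def read_wav_chunked(file_name, chunk_frames=1 << 16):
    """Read 16-bit PCM samples into one preallocated array, a chunk at a time."""
    with wave.open(file_name, 'rb') as wav_file:
        if wav_file.getsampwidth() != 2:
            raise ValueError('this sketch only handles 16-bit samples')
        total = wav_file.getnframes() * wav_file.getnchannels()
        samples = np.empty(total, dtype='<i2')  # WAV data is little-endian
        pos = 0
        while pos < total:
            raw = wav_file.readframes(chunk_frames)
            if not raw:
                break  # file is shorter than the header claims
            chunk = np.frombuffer(raw, dtype='<i2')
            samples[pos:pos + chunk.size] = chunk
            pos += chunk.size
        return wav_file.getframerate(), samples[:pos]
```

Only one chunk of raw bytes is alive at any time, so peak memory stays close to the size of the final array. If SciPy is available, the `scipy.io.wavfile.read` function linked by MB-F accepts `mmap=True` and may be the simpler route.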

1 Answer

Following the recommendations in the comments above, I checked my Python installation and found it was the 32-bit version. I switched to 64-bit Python and it works now.
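For anyone hitting the same error, here is a minimal sketch of how to check the interpreter and estimate the memory involved. The helper names `interpreter_bits` and `peak_bytes` are made up for illustration, and the estimate assumes a mono file with 16-bit samples, which is consistent with the frame count and the roughly 1 GB file size in the question but was not confirmed. It counts the raw bytes returned by `readframes` plus the copy made by `np.fromstring`, as described in the comments.

```python
import struct


def interpreter_bits():
    """Pointer size of the running Python interpreter, in bits (32 or 64)."""
    return struct.calcsize('P') * 8


def peak_bytes(frames, channels=1, sample_width=2):
    """Rough peak memory: the raw byte string plus the array copied from it."""
    raw = frames * channels * sample_width
    return 2 * raw


print('Python is %d-bit' % interpreter_bits())

# Frame count taken from the question; mono 16-bit is an assumption.
estimate = peak_bytes(472289280)
print('estimated peak: %.2f GiB' % (estimate / 2.0 ** 30))
```

A 32-bit process can usually address only 2 to 4 GB in total, and each of the two buffers has to fit in one contiguous block, so an estimate in this range can fail even on a machine with 8 GB of RAM.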

Andrey Mazur