
In Anaconda Python 3.6.7 with PyTorch installed, on Windows 10, I do this sequence:

conda install -c conda-forge librosa
conda install -c groakat sox

then in a fresh download from https://github.com/pytorch/audio I do

python setup.py install

and it runs for a while and ends like this:

torchaudio/torch_sox.cpp(3): fatal error C1083: Cannot open include file: 'sox.h': No such file or directory
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2017\\Community\\VC\\Tools\\MSVC\\14.15.26726\\bin\\HostX86\\x64\\cl.exe' failed with exit status 2

I am trying to reproduce this OpenNMT-py speech training demo on Windows: http://opennmt.net/OpenNMT-py/speech2text.html

Lars Ericson

2 Answers


I managed to compile torchaudio with sox on Windows 10, but it is a bit tricky.

Unfortunately, sox_effects is not usable; it fails with this error:

RuntimeError: Error opening output memstream/temporary file

The rest of torchaudio's functionality works, though.

The steps I followed for Windows 10 64bit are:

TORCHAUDIO WINDOWS10 64bit

Note: some of the command lines below use Unix-like syntax; you can use File Explorer or any equivalent tool instead.

Preliminary arrangements

  1. Download sox sources

$ git clone git://git.code.sf.net/p/sox/code sox

  2. Download a second copy of the sox sources to get lpc10

$ git clone https://github.com/chirlu/sox sox2
$ cp -R sox2/lpc10 sox

  3. IMPORTANT: install Visual Studio 2019 with the Build Tools

lpc10 lib

4.0. Create a VisualStudio CMake project for lpc10 and build it

Start window -> open local folder -> sox/lpc10
(it reads CMakeLists.txt automatically)
Build->build All

4.1. Copy lpc10.lib to sox

$ mkdir -p sox/src/out/build/x64-Debug
$ cp sox/lpc10/out/build/x64-Debug/lpc10.lib sox/src/out/build/x64-Debug

gsm lib

5.0. Create a CMake project for libgsm and build it the same way as lpc10

5.1. Copy gsm.lib to sox

$ mkdir -p sox/src/out/build/x64-Debug
$ cp sox/libgsm/out/build/x64-Debug/gsm.lib sox/src/out/build/x64-Debug

sox lib

6.0. Create a CMake project for sox in VS

6.1. Edit some files:

CMakeLists.txt: (add at the very beginning)

project(sox)

sox_i.h: (add under stdlib.h include line)

#include <wchar.h> /* For off_t not found in stdio.h */
#define UINT16_MAX  ((uint16_t)-1)
#define INT32_MAX  ((int32_t)0x7fffffff)

sox.c: (add under time.h include line)

#include <sys/timeb.h>
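
The two limit macros added to sox_i.h are meant to stand in for the <stdint.h> values UINT16_MAX (65535) and INT32_MAX (2147483647). Casting -1 gives the maximum only for an unsigned type; for a signed type it stays -1. The snippet below is a small sketch that uses Python's ctypes to mimic those C casts. It is only a sanity check under that assumption and is not part of the build.

```python
import ctypes

# Casting -1 to an unsigned type wraps around to the largest value.
uint16_all_ones = ctypes.c_uint16(-1).value

# Casting -1 to a signed type leaves it at -1, which is not a maximum.
int16_all_ones = ctypes.c_int16(-1).value
int32_all_ones = ctypes.c_int32(-1).value

# Limits as defined by <stdint.h>.
UINT16_MAX = 2**16 - 1
INT32_MAX = 2**31 - 1

print("uint16 cast of -1 equals UINT16_MAX:", uint16_all_ones == UINT16_MAX)
print("int16 / int32 cast of -1:", int16_all_ones, int32_all_ones)
print("INT32_MAX:", INT32_MAX)
```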

6.2. Build sox with VisualStudio

6.3. Copy the libraries to where Python will find them (I use a conda environment):

$ cp sox/src/out/build/x64-Debug/libsox.lib envs\<envname>\libs\sox.lib
$ cp sox/src/out/build/x64-Debug/gsm.lib envs\<envname>\libs
$ cp sox/src/out/build/x64-Debug/lpc10.lib envs\<envname>\libs

torchaudio

$ activate <envname>

7.0. Download torchaudio from github

$ git clone https://github.com/pytorch/audio thaudio

7.1. Update setup.py, after the "else:" statement of "if IS_WHEEL..."

$ vi thaudio/setup.py

if IS_WHEEL...

else:
    audio_path = os.path.dirname(os.path.abspath(__file__))

    # Add include path for sox.h, I tried both with the same outcome
    include_dirs += [os.path.join(audio_path, '../sox/src')]
    #include_dirs += [os.path.join(audio_path, 'torchaudio/sox')]

    # Add more libraries

    #libraries += ['sox']
    libraries += ['sox','gsm','lpc10']
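
Before running the build, it can save time to confirm that the paths added above actually resolve. The following is a minimal sketch and not part of torchaudio's setup.py: the helper name `missing_build_inputs` is hypothetical, and it only checks that files exist on disk, assuming MSVC-style `<name>.lib` library files.

```python
import os


def missing_build_inputs(include_dirs, library_dirs, libraries, header="sox.h"):
    """Return the build inputs that cannot be found on disk.

    Hypothetical helper: it only looks for files and never invokes the compiler.
    """
    missing = []
    # Error C1083 means the header is in none of the include directories.
    if not any(os.path.isfile(os.path.join(d, header)) for d in include_dirs):
        missing.append(header)
    # MSVC links against <name>.lib, so look for that file in each library dir.
    for name in libraries:
        lib_file = name + ".lib"
        if not any(os.path.isfile(os.path.join(d, lib_file)) for d in library_dirs):
            missing.append(lib_file)
    return missing
```

Called with the include directory from step 7.1, the libs folder of the conda environment, and `['sox', 'gsm', 'lpc10']`, an empty list would suggest the compiler and linker inputs are in place.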

7.2. Edit torch_sox.cpp from torchaudio, because MSVC does not allow variable-length arrays:

$ vi thaudio/torchaudio/torch_sox.cpp

 //char* sox_args[max_num_eopts];
 char* sox_args[20]; //Value of MAX_EFFECT_OPTS

7.3. Build and install

$ cd thaudio
$ python setup.py install

It will print tons of warnings about type conversions and a library conflict with MSVCRTD, but it "works".

And that's all.

maremoto007

Bad news, I am afraid: you won't get PyTorch Audio on Windows without putting in significant effort. The problem is libsox-dev, which is one of the dependencies. You might have installed sox, but the development version is a whole different beast: the error complains precisely about the missing header file, sox.h. There's a ticket open for Windows support.

Long story short, building libsox as a static library for Windows is tough. You might try your luck with Cygwin, or use Docker or a VM.

Lukasz Tracewski
  • Thanks, but I need GPU with my Torch and Docker/VM is not so good for GPU. On the other hand, I think I can work around by rewriting the code so it uses some other library to bring in the audio. Or bail on Torch and use a Tensorflow equivalent. – Lars Ericson Feb 25 '19 at 20:53
  • @LarsEricson You're right, somehow I thought I had seen Docker being used on a Windows host with 2 cards (GPU + integrated), but now I can't find it... I'd recommend looking at `keras` / `tensorflow`, as both provide some extra convenience methods for working with audio. Besides, `pytorch/audio` isn't that fancy. You can use `librosa` directly for transformations and I/O. – Lukasz Tracewski Feb 25 '19 at 21:36
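
The workaround discussed in these comments, bringing in the audio without torchaudio, can be sketched as follows. This is an illustration only: it swaps in Python's standard-library `wave` module rather than `librosa`, the function name `load_wav_mono16` is hypothetical, and it assumes uncompressed 16-bit mono PCM input.

```python
import struct
import wave


def load_wav_mono16(path):
    """Read a 16-bit PCM mono WAV file and return (samples, sample_rate).

    Samples are floats scaled to [-1.0, 1.0). Assumption: uncompressed
    16-bit mono input; other formats need a fuller library such as librosa.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM")
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # WAV stores samples as little-endian signed 16-bit integers.
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return [s / 32768.0 for s in ints], rate
```

The returned list can then be handed to `torch.tensor(samples)` to get a tensor for training, which keeps PyTorch and the GPU in the loop without needing sox at all.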