I am trying to use an AMLCompute instance to preprocess my data. To do so, I need to be able to write the processed data back to the datastore. I am taking this approach because the cluster shuts down automatically when the job is complete, so I can let it run until it is done without worrying about paying for more time than is needed.
The problem is that when I try to write back to the datastore (which is mounted as a dataset), I get the following error:
OSError: [Errno 30] Read-only file system: '/mnt/batch/tasks/shared/LS_root/jobs/[...]/wav_test'
I have set the access policy for my datastore to allow read, add, create, write, delete, and list, but I don't think that is the issue because I can already write to the datastore from the Microsoft Azure File Explorer.
Is there a way to mount a datastore, either directly or through a dataset, with write privileges from the AzureML Python SDK?
Alternatively, is there a better way to preprocess this (audio) data on Azure for machine learning?
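For what it's worth, I'd also be fine with writing the processed files locally and then uploading them back explicitly, rather than having a writable mount. A rough sketch of what I have in mind is below (the datastore name 'birdsongs_store' and the local folder are placeholders, not my real setup):

from azureml.core import Workspace, Datastore

ws = Workspace.from_config()

# placeholder name for the datastore backing the dataset
datastore = Datastore.get(ws, 'birdsongs_store')

# upload locally written preprocessing results back into the datastore
datastore.upload(src_dir='/tmp/preprocessed',  # local folder with processed files
                 target_path='wav_test',       # folder inside the datastore
                 overwrite=True)

But I'd prefer to write directly to the mounted dataset if that's possible.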
Thanks!
EDIT: I'm adding an example that illustrates the problem.
from azureml.core import Workspace, Dataset, Datastore
import os
ws = Workspace.from_config()
ds = Dataset.get_by_name(ws, name='birdsongs_alldata')
mount_context = ds.mount()
mount_context.start()
os.listdir(mount_context.mount_point)
output:
['audio_10sec', 'mp3', 'npy', 'resources', 'wav']
So the file system is mounted and visible.
# try to write to the mounted file system
outfile = os.path.join(mount_context.mount_point, 'test.txt')
with open(outfile, 'w') as f:
    f.write('test')
Error:
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-9-1b15714faded> in <module>
      1 outfile = os.path.join(mount_context.mount_point, 'test.txt')
      2 
----> 3 with open(outfile, 'w') as f:
      4     f.write('test')

OSError: [Errno 30] Read-only file system: '/tmp/tmp8ltgsx6x/test.txt'