We're using Google Cloud Dataproc for quick data analysis, and we use Jupyter notebooks a lot. A common case for us is to generate a report that we then want to download as a CSV file.
In a local Jupyter environment this is possible using FileLink, for example:
    from IPython.display import FileLink
    df.to_csv(path)  # path is where the report is written
    FileLink(path)   # renders a clickable link to that single file
This doesn't work on Dataproc because the notebooks are stored in a Google Cloud Storage bucket, and the generated links are relative to that bucket prefix, for example http://my-cluster-m:8123/notebooks/my-notebooks-bucket/notebooks/my_csv.csv
Does anyone know how to overcome this? Of course we can scp the file from the machine, but we're looking for something more convenient.
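To illustrate the kind of convenience we have in mind: one idea that would avoid server-side paths altogether is to embed the CSV in the cell output as a base64 data URI, so the browser downloads it directly. The following is only a sketch and we have not verified it on Dataproc; the helper name csv_download_link and its parameters are made up for illustration, and df is assumed to be a pandas DataFrame.

```python
import base64
import html


def csv_download_link(csv_text, filename="report.csv", label="Download CSV"):
    """Return an HTML anchor that embeds csv_text as a base64 data URI.

    Hypothetical helper: the link carries the file contents itself, so it
    does not depend on where the notebook server stores its files.
    """
    payload = base64.b64encode(csv_text.encode("utf-8")).decode("ascii")
    return (
        f'<a download="{html.escape(filename, quote=True)}" '
        f'href="data:text/csv;base64,{payload}">{html.escape(label)}</a>'
    )


# In a notebook cell (assumes a pandas DataFrame named df):
#   from IPython.display import HTML
#   HTML(csv_download_link(df.to_csv(index=False), "my_csv.csv"))
```

Because the whole file is stored inside the notebook output, this would presumably only be practical for small reports.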