I want to speed up a task on a DataFrame that, for each row, saves an image to a local folder, so the function doesn't return anything. I tried to run that function using Dask, but Dask seems to require that the function return something, and I cannot make .apply work.
Is there any other way to make this work?
Update: Minimal reproducible example
import dask.dataframe as dd
import pandas as pd

doc = pd.DataFrame({'file_name': ['Bob', 'Jane', 'Alice', 'Allan'],
                    'text': ['text1', 'text2', 'text3', 'text4']})

def func(row):
    with open(row['file_name'] + '.txt', 'w') as f:
        f.write(row['text'])

ddf = dd.from_pandas(doc, npartitions=2)
k = ddf.apply(func, axis=1, meta=(None, 'object'))
k.compute()
The meta argument is (None, 'object') only because that is what Dask itself suggested when I ran similar code without a meta argument.
This doesn't produce any errors, and it runs correctly. I can no longer reproduce my original mistake, since I fixed it yesterday with Michael Delgado's answer.