
With pandas, this code works perfectly (but very slowly). When I switch to Modin and concatenate the DataFrames, I get an error:

# pd is modin.pandas here; with plain pandas the same code runs without error.
# files (list of CSV file names) and mainpath (their directory) are assumed to be defined earlier.
contador = 0
datos = {}  # maps each file name to its DataFrame

for usefile in files:
    print("File number " + str(contador) + " of " + str(len(files)) + " files")
    print("Found " + str(usefile) + ", adding it to the DataFrame")
    contador = contador + 1
    ruta = mainpath + "/" + str(usefile)
    df = pd.read_csv(ruta)
    datos[usefile] = df
data = pd.concat(datos.values(), keys=datos.keys() , sort='True')

I expect a single DataFrame with all the files from the dict concatenated, but instead I receive the following (with pandas, everything works perfectly):

<ipython-input-4-e5a361476e76> in <module>
     12     df = pd.read_csv(ruta)
     13     datos[usefile] = df
---> 14 data = pd.concat(datos.values(), keys=datos.keys() , sort='True')
     15 

~/anaconda3/lib/python3.7/site-packages/modin/pandas/concat.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, sort, copy)
     98         new_idx_labels = {
     99             keys[i]: objs[i].index if axis == 0 else objs[i].columns
--> 100             for i in range(len(objs))
    101         }
    102         print(new_idx_labels)

~/anaconda3/lib/python3.7/site-packages/modin/pandas/concat.py in <dictcomp>(.0)
     98         new_idx_labels = {
     99             keys[i]: objs[i].index if axis == 0 else objs[i].columns
--> 100             for i in range(len(objs))
    101         }
    102         print(new_idx_labels)

TypeError: 'dict_keys' object is not subscriptable
Ophir Yoktan
zkittlez

1 Answer

Modin (as of version 0.4) unintentionally does not support this yet. Its concat implementation assumes that the keys and objs parameters are subscriptable, and a dict view such as dict_keys is not.

The last line in your code can be changed as a workaround until it is fixed in Modin:

data = pd.concat(list(datos.values()), keys=list(datos.keys()) , sort='True')
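To see why the list() conversion helps, here is a minimal sketch in plain Python that needs neither Modin nor pandas. The helper build_labels and the sample contents of datos are made up for illustration. The helper only imitates the keys[i] / objs[i] indexing pattern visible in the traceback and is not Modin's actual code.

```python
# Plain-Python sketch; the values are stand-ins for the DataFrames in the question.
datos = {"a.csv": [0, 1], "b.csv": [0, 1, 2]}


def build_labels(objs, keys):
    # Hypothetical helper: indexes both arguments by position,
    # like the dict comprehension shown in the traceback.
    return {keys[i]: objs[i] for i in range(len(objs))}


# A dict_keys view supports len() and iteration, but not indexing,
# so keys[0] raises TypeError.
try:
    build_labels(list(datos.values()), datos.keys())
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Wrapping the views in list() makes them subscriptable.
labels = build_labels(list(datos.values()), list(datos.keys()))
print(labels)
```

Because dicts preserve insertion order, list(datos.values()) and list(datos.keys()) line up position by position, so each key is still paired with its own DataFrame.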

I created an issue on the Modin repo to track this: https://github.com/modin-project/modin/issues/557

Devin