
The only way I know to apply ast.literal_eval to an entire dataframe is to loop over each column like so:

for col in df.columns:
    df[col] = df[col].apply(ast.literal_eval)

Is there a more efficient way to do this, for example with multiprocessing.Pool?

Amin Selim
  • @Karl IIUC that wasn't really the question; OP asked whether there is a multiprocessing approach, not how to evaluate every cell with a loop – mozway Sep 19 '22 at 13:57
  • "more efficient" is ambiguous and it isn't at all clear why OP expects multiprocessing to be helpful here. If the question is "how do I divide up a task so that I can give separate parts to processes in a `multiprocessing.Pool`?", then that is already nowhere near focused enough - it essentially reduces to "how do I use multiprocessing?". The *standard* way is `.applymap`, as in the linked duplicate and also the answer that was attracted here; if that isn't sufficient then alternative answers *belong on the linked duplicate anyway* (such as the `np.vectorize` approach). – Karl Knechtel Sep 19 '22 at 14:00
  • For sure the question lacks details, but I doubt `applymap` is what OP is looking for ;) – mozway Sep 19 '22 at 14:08
  • you need to use lambda inside apply: `df[col] = df[col].apply(lambda x: ast.literal_eval(x))` or `df[col] = df[col].map(lambda x: ast.literal_eval(x))` – khaled koubaa Sep 19 '22 at 14:47
  • So, in the end, this is what worked for me: `pool = multiprocessing.Pool(multiprocessing.cpu_count())` followed by `df = pd.DataFrame(np.array([pool.map(ast.literal_eval, df[col]) for col in df.columns]).transpose().astype(np.uint8), columns = df.columns)` @mozway applymap was too slow – Amin Selim Sep 19 '22 at 16:31
  • @Amin have you found a question that duplicates the answer you gave as comment? If yes please provide it and I'll update the duplicate link. NB. I know that `applymap` is slow, thus my initial comment ;) – mozway Sep 19 '22 at 17:01
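
For readers who want to try the pool-based approach from the comments, here is a minimal sketch rather than a definitive implementation. The helper name `parse_frame` and its `pool` and `chunksize` parameters are hypothetical, introduced only for illustration. It assumes every cell is a string holding a valid Python literal and that column names are unique. Unlike the one-liner in the comments, it does not cast the result to `np.uint8`, so non-numeric literals such as lists survive.

```python
import ast
import multiprocessing

import pandas as pd


def parse_frame(df, pool=None, chunksize=1024):
    """Return a new DataFrame with every cell parsed by ast.literal_eval.

    Hypothetical helper: if `pool` is given, each column's values are sent
    to the pool's worker processes in chunks; otherwise they are parsed
    serially in the current process.
    """
    parsed = {}
    for col in df.columns:
        values = df[col].tolist()
        if pool is None:
            parsed[col] = [ast.literal_eval(v) for v in values]
        else:
            # pool.map preserves input order, so rows stay aligned.
            parsed[col] = pool.map(ast.literal_eval, values, chunksize=chunksize)
    # Assumes unique column names; duplicates would collapse in the dict.
    return pd.DataFrame(parsed, index=df.index)


if __name__ == "__main__":
    raw = pd.DataFrame({"a": ["1", "2", "3"],
                        "b": ["[1, 2]", "(3, 4)", "{'k': 5}"]})
    with multiprocessing.Pool() as pool:
        out = parse_frame(raw, pool=pool)
    print(out)
```

Whether this beats the plain loop depends on the data: each value has to be pickled and sent to a worker and the result sent back, so it is worth timing both variants on a realistic frame before committing to the pool.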

0 Answers