I am planning to use mincemeat.py for a MapReduce task on a ~100GB file. Judging from the mincemeat example code, the data source has to be supplied as an in-memory dictionary, which is not feasible for a file this size. So, what is the right way to provide my huge file as the data source for mincemeat?
Link to mincemeat: https://github.com/michaelfairley/mincemeatpy
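
To make the question concrete, here is a minimal sketch of one possible direction: a dict-like object that reads fixed-size chunks of the file on demand instead of holding everything in memory. It assumes (not verified here) that mincemeat only needs to iterate over the data source's keys and look up each value by key. The class name `ChunkedFileSource` and the `chunk_size` parameter are hypothetical and are not part of mincemeat.

```python
import os


class ChunkedFileSource(object):
    """Hypothetical dict-like view of a large file.

    Keys are chunk indices (0, 1, 2, ...); values are the raw bytes of
    each chunk, read from disk only when that key is requested.
    """

    def __init__(self, path, chunk_size=64 * 1024 * 1024):
        self.path = path
        self.chunk_size = chunk_size
        self.size = os.path.getsize(path)

    def __len__(self):
        # Number of chunks, rounding up for a partial final chunk.
        return (self.size + self.chunk_size - 1) // self.chunk_size

    def __iter__(self):
        # Iterating yields the keys, as iterating a dict would.
        return iter(range(len(self)))

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise KeyError(index)
        # Only one chunk is held in memory per lookup.
        with open(self.path, "rb") as f:
            f.seek(index * self.chunk_size)
            return f.read(self.chunk_size)
```

If that assumption holds, such an object could presumably be assigned to the server's `datasource` in place of the dictionary. Note that fixed-size chunks can split a line across two chunks, so a map function working on lines would need to handle the boundaries.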