I am trying to merge two large datasets (around 3.5 million rows each) using dplyr::inner_join.
I am working on a powerful machine with 40+ cores, but I am not sure I am taking advantage of it, since I am not parallelizing the task in any way.
The join is taking a long time to run. How should I tackle this?
Best