I have a library that calculates the best sequence for a list of groups, where each group contains a list of SKUs to be compared. The comparison logic is simple: on each iteration it compares the SKUs using a calculated field. The number of sequences is the factorial of the number of SKUs in a group, i.e. every permutation of that group. The order is the most important part of the comparison, and the loop is the most crucial part of the sequencer, which is why we use parallel loops to speed it up.

So imagine I have 1 group with 4 SKUs: 4 x 3 x 2 x 1 = 24 sequences that I need to loop over to find the best one. The sequences look like this:

  1. A > B > C > D
  2. A > B > D > C
  3. A > C > B > D
  4. A > C > D > B
  5. A > D > B > C
  6. A > D > C > B
  7. B > A > C > D
  8. B > A > D > C
  9. B > C > A > D
  10. B > C > D > A
  11. B > D > A > C
  12. B > D > C > A
  13. C > A > B > D
  14. C > A > D > B
  15. C > B > A > D
  16. C > B > D > A
  17. C > D > A > B
  18. C > D > B > A
  19. D > A > B > C
  20. D > A > C > B
  21. D > B > A > C
  22. D > B > C > A
  23. D > C > A > B
  24. D > C > B > A

The above is for 4 SKUs inside one group. Now imagine I have several groups, each with more than 4 SKUs. The number of permutations my library has to go through grows as follows:

  • 5 SKUs = 120 sequences
  • 6 SKUs = 720 sequences
  • 7 SKUs = 5,040 sequences
  • 8 SKUs = 40,320 sequences
  • 9 SKUs = 362,880 sequences
  • 10 SKUs = 3,628,800 sequences
  • 11 SKUs = 39,916,800 sequences
  • 12 SKUs = 479,001,600 sequences
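
The counts in both lists are factorials: a group of n SKUs has n! possible orderings. As a quick check, here is a minimal Python sketch that reproduces the 24 orderings and extends the table to 13 SKUs. Python is used only for illustration, and `orderings` and `sequence_count` are made-up helper names, not part of the library.

```python
import math
from itertools import permutations


def orderings(skus):
    """Return every ordering of the SKUs, formatted like the numbered list."""
    return [" > ".join(p) for p in permutations(skus)]


def sequence_count(n):
    """Number of orderings of n distinct SKUs, i.e. n!."""
    return math.factorial(n)


if __name__ == "__main__":
    print(len(orderings("ABCD")))  # 24
    for n in range(5, 14):
        print(f"{n} SKUs = {sequence_count(n):,} sequences")
    # The last line printed is: 13 SKUs = 6,227,020,800 sequences
```

Each extra SKU multiplies the count by the new group size, so a 13-SKU group has exactly 13 times as many sequences as a 12-SKU group.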

I had a job running on AWS with 36 CPU cores and 64 GB of memory. The job contained 2 groups: group A with 3 SKUs and group B with 13 SKUs. It ran for more than 2 days and was still calculating. The usage I collected from the server was:

  • CPU Usage : 18 CPU cores used
  • Memory Usage : 40% used

Questions:

  • Is there any possibility that memory can help the CPU with the looping?
  • How can I make use of all CPU cores? Currently I can only run the parallel loop for each sequence on 1 CPU. Is there any way to use multiple CPUs for 1 loop?
  • Any other recommendations?
Stone

1 Answer

Is there any possibility that memory can help the CPU with the looping?

Not by itself. Memory usage affects performance in two ways:

  1. If you're accessing too much memory, it has a negative effect on performance, because the CPU can't use its cache well.

  2. If you know you have lots of memory, you can sometimes store the results of a computation so that you don't have to recompute them later. But the computer generally won't do that automatically; you have to change your program to do it.

    I don't know exactly how you calculate the best sequence, but maybe when computing "A > B > C > D", you could remember the result for "C > D" and reuse it when computing "B > A > C > D", or something like that.

    This is called a space-time tradeoff.
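
To make the tradeoff concrete, here is a minimal Python sketch of that idea. It rests on a big assumption that the question does not confirm: that the score of a sequence is a sum of costs between neighbouring SKUs (`pair_cost` below is a made-up placeholder, and lower is taken to be better). Under that assumption, the best way to order the remaining SKUs depends only on the last SKU placed and the set still to place, so each such result can be cached and reused instead of being recomputed for every prefix.

```python
from functools import lru_cache
from itertools import permutations


def pair_cost(a, b):
    # Hypothetical placeholder for the real comparison of two adjacent SKUs.
    return (ord(a) * 7 + ord(b) * 3) % 11


def sequence_cost(seq):
    """Total cost of a sequence: the sum of costs of adjacent pairs."""
    return sum(pair_cost(a, b) for a, b in zip(seq, seq[1:]))


def brute_force_cost(skus):
    """Reference answer: try all n! orderings (only feasible for small n)."""
    return min(sequence_cost(p) for p in permutations(skus))


def best_order(skus):
    """Cheapest ordering, caching the best tail for each (last, remaining)."""

    @lru_cache(maxsize=None)
    def best_tail(last, remaining):
        # remaining is a frozenset, so it is hashable and usable as a cache key.
        if not remaining:
            return 0, ()
        options = []
        for nxt in sorted(remaining):
            cost, tail = best_tail(nxt, remaining - {nxt})
            options.append((pair_cost(last, nxt) + cost, (nxt,) + tail))
        return min(options)

    all_skus = frozenset(skus)
    candidates = []
    for first in sorted(all_skus):
        cost, tail = best_tail(first, all_skus - {first})
        candidates.append((cost, (first,) + tail))
    return min(candidates)


if __name__ == "__main__":
    print(best_order("ABCDEFGH"))
```

With 13 SKUs this caches at most 13 x 2^12 = 53,248 states instead of walking 13! (about 6.2 billion) sequences, at the price of keeping those states in memory. If the real scoring depends on the whole order and not just on neighbours, this shortcut does not apply as written.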

How can I make use of all CPU cores?

I don't think this can be answered without seeing your code and understanding why it doesn't use all CPUs.

Currently I can only run the parallel loop for each sequence on 1 CPU. Is there any way to use multiple CPUs for 1 loop?

Maybe, it depends on what exactly the loop does.
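
For example, if every sequence can be scored independently of the others, one common approach is to cut the single big loop into chunks and give each chunk to a different worker. The Python sketch below splits the search by the first SKU. It is only an illustration under that independence assumption: `toy_score` is a made-up stand-in for your comparison (lower is better), and none of the names come from your library.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import permutations


def toy_score(sequence):
    # Hypothetical stand-in for the real comparison; lower is better.
    return sum(position * ord(sku) for position, sku in enumerate(sequence))


def best_with_first(task):
    """Search only the sequences that start with one given SKU."""
    first, rest = task
    candidates = ((first,) + tail for tail in permutations(rest))
    return min((toy_score(seq), seq) for seq in candidates)


def best_sequence(skus, map_fn=map):
    """Split the search by first SKU and combine the per-chunk winners.

    map_fn defaults to the serial built-in map; pass pool.map to spread
    the chunks over worker processes.
    """
    tasks = [(sku, tuple(s for s in skus if s != sku)) for sku in skus]
    return min(map_fn(best_with_first, tasks))


if __name__ == "__main__":
    skus = ("A", "B", "C", "D")
    # The number of worker processes defaults to the number of CPUs.
    with ProcessPoolExecutor() as pool:
        print(best_sequence(skus, pool.map))
```

Splitting by the first SKU gives only as many chunks as there are SKUs (13 for group B), which is fewer than 36 cores. Splitting by the first two SKUs would give 13 x 12 = 156 chunks and keep more cores busy.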

Any other recommendations?

Yes, ask a more specific question, ideally including your code.

svick