Questions tagged [parallelism-amdahl]

Amdahl's law, also known as Amdahl's argument, is used to find the maximum expected improvement to an overall system when only part of the system is improved. It is often used in parallel computing to predict the theoretical maximum speedup using multiple processors. The law is named after computer architect Gene Amdahl, and was presented at the AFIPS Spring Joint Computer Conference in 1967.

The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program. For example, if a program needs 20 hours using a single processor core, and a particular portion of the program which takes one hour to execute cannot be parallelized, while the remaining 19 hours (95%) of execution time can be parallelized, then regardless of how many processors are devoted to a parallelized execution of this program, the minimum execution time cannot be less than that critical one hour. Hence the theoretical speedup is limited to at most 20×.
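The 20× bound in this example follows directly from the formula. A minimal sketch in Python, assuming the usual statement of the law with a parallel fraction p and processor count n (the function name is ours):

    def amdahl_speedup(p, n):
        """Maximum overall speedup when a fraction p of the work
        can be parallelized across n processors (Amdahl's law)."""
        return 1.0 / ((1.0 - p) + p / n)

    # The 20-hour program above: 95% parallelizable, one hour serial.
    for n in (1, 2, 16, 256, 4096):
        print(f"n = {n:4d}  speedup = {amdahl_speedup(0.95, n):5.2f}")

    # As n grows, the speedup approaches 1 / (1 - 0.95) = 20, so no
    # number of processors can push it past 20x.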

106 questions
12 votes · 1 answer

Python multiprocessing performance only improves with the square root of the number of cores used

I am attempting to implement multiprocessing in Python (Windows Server 2012) and am having trouble achieving the degree of performance improvement that I expect. In particular, for a set of tasks which are almost entirely independent, I would expect…
KPM
9 votes · 2 answers

Why isn't N independent calculations N times faster on N threads?

I have an N-core processor (4 in my case). Why isn't N totally independent function calls on N threads roughly N times faster (of course there is an overhead of creating threads, but read further)? Look at the following code: namespace ch =…
krispet krispet
8 votes · 2 answers

Amdahl's law and GPU

I have a couple of doubts regarding the application of Amdahl's law with respect to GPUs. For instance, I have a kernel code that I have launched with a number of threads, say N. So, in Amdahl's law, the number of processors will be N, right? Also,…
Anirudh Kaushik
7 votes · 1 answer

Chapel-Python integration questions

I'm trying to see if I can use Chapel for writing parallel code for use in a Python-based climate model: https://github.com/CliMT/climt I don't have any experience with Chapel, but it seems very promising for my use-case. I had a few questions about…
7 votes · 2 answers

How to find an optimum number of processes in GridSearchCV( ..., n_jobs = ... )?

I'm wondering which is better to use with GridSearchCV( ..., n_jobs = ... ) to pick the best parameter set for a model: n_jobs = -1, or n_jobs with a big number like n_jobs = 30? Based on the Sklearn documentation: n_jobs = -1 means that the…
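For context, a hedged sketch of the choice this question asks about, using scikit-learn's documented convention that n_jobs=-1 means "use all available processors" (the estimator, grid, and dataset here are only illustrative):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)
    param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}

    # n_jobs=-1 lets joblib use every core it can find; a fixed
    # n_jobs far above the physical core count adds scheduling and
    # memory overhead without creating any extra parallelism.
    search = GridSearchCV(SVC(), param_grid, cv=5, n_jobs=-1)
    search.fit(X, y)
    print(search.best_params_)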
7 votes · 2 answers

pathos: parallel processing options - Could someone explain the differences?

I am trying to run parallel processes under python (on ubuntu). I started using multiprocessing and it worked fine for simple examples. Then came the pickle error, and so I switched to pathos. I got a little confused with the different options and…
6 votes · 2 answers

Poor scaling of multiprocessing Pool.map() on a list of large objects: How to achieve better parallel scaling in python?

Let us define: from multiprocessing import Pool import numpy as np def func(x): for i in range(1000): i**2 return 1 Notice that func() does something and it always returns a small number 1. Then, I compare an 8-core parallel…
6 votes · 1 answer

OpenCL code in MQL5 does not distribute jobs to each GPU core

I have created a GPU-based indicator for the MetaTrader Terminal platform, using OpenCL and MQL5. I have tried hard to get my [ MetaTrader Terminal: Strategy Tester ] optimization job to offload as much of the work as possible onto the GPU. Most of the calculations are…
Jaffer Wilson
5 votes · 3 answers

Amdahl's Law examples

Amdahl's Law states that the maximal speedup of a computation, where the fraction S of the computation must be done sequentially, going from a 1-processor system to an N-processor system is at most 1 / (S + [(1 - S) / N]). Does…
OTO
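The formula quoted in this question is easy to tabulate. A short illustrative sketch (values chosen by us) showing how quickly the bound saturates toward its limit 1/S as N grows:

    def max_speedup(S, N):
        # Amdahl's law as quoted: serial fraction S, N processors.
        return 1.0 / (S + (1.0 - S) / N)

    for S in (0.5, 0.1, 0.01):
        row = [round(max_speedup(S, N), 1) for N in (2, 8, 64, 1024)]
        print(f"S = {S:4}: {row}  (limit 1/S = {1.0 / S:.0f})")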
5 votes · 0 answers

cv::parallel_for_: not a very big improvement

I'm testing the class cv::ParallelLoopBody for image processing code. I started by implementing normalization, where I have to divide all the pixels by certain values for each channel, which is a nice, easily parallelized piece of code. However, when…
Ja_cpp
5 votes · 2 answers

improving bigint write to disk performance

I am working with really large bigint numbers and I need to write them to disk and read them back later because they won't all fit in memory at a time. The current Chapel implementation first converts the bigint to a string and then writes that…
zx228
5 votes · 2 answers

Expected speedup from embarrassingly parallel task using Python Multiprocessing

I'm learning to use Python's Multiprocessing package for embarrassingly parallel problems, so I wrote serial and parallel versions for determining the number of primes less than or equal to a natural number n. Based on what I read from a blog post…
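A minimal serial-versus-parallel sketch of the experiment this question describes, using multiprocessing.Pool; the helper names and the trial-division primality test are illustrative, not taken from the question:

    import time
    from multiprocessing import Pool

    def is_prime(k):
        """Trial division; deliberately CPU-bound per call."""
        if k < 2:
            return False
        i = 2
        while i * i <= k:
            if k % i == 0:
                return False
            i += 1
        return True

    def count_serial(n):
        return sum(is_prime(k) for k in range(2, n + 1))

    def count_parallel(n, workers=4):
        # chunksize batches tasks so IPC overhead stays small
        # relative to the useful work in each batch.
        with Pool(workers) as pool:
            return sum(pool.map(is_prime, range(2, n + 1), 1000))

    if __name__ == "__main__":  # required for Windows process spawning
        n = 200_000
        t0 = time.perf_counter(); s = count_serial(n)
        t1 = time.perf_counter(); p = count_parallel(n)
        t2 = time.perf_counter()
        assert s == p
        print(f"serial {t1 - t0:.2f}s  parallel {t2 - t1:.2f}s")

With genuinely independent tasks like these, the measured speedup should approach the worker count once n is large enough to amortize pool startup; a markedly worse curve usually points at a hidden serial fraction or at communication overhead.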
4 votes · 1 answer

Efficient collection and transfer of scattered sub-arrays in Chapel

Recently, I came across Chapel. I liked the examples given in the tutorials but many of them were embarrassingly parallel in my eyes. I'm working on Scattering Problems in Many-Body Quantum Physics and a common problem can be reduced to the…
4 votes · 2 answers

CyclicDist goes slower on multiple locales

I tried doing an implementation of matrix multiplication using the CyclicDist module. When I test with one locale vs two locales, the one-locale run is much faster. Is it because the time to communicate between the two Jetson nano boards is really big, or is…
4 votes · 1 answer

Why does joblib.Parallel() take much more time than a non-paralleled computation? Shouldn't Parallel() run faster than a non-paralleled computation?

The joblib module provides a simple helper class to write parallel for loops using multiprocessing. This code uses a list comprehension to do the job: import time from math import sqrt from joblib import Parallel, delayed start_t =…
user11566345
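For reference, a minimal version of the comparison this question makes, using joblib's documented Parallel/delayed API (the workload size is illustrative). Because each sqrt call does only microseconds of work, dispatch and pickling overhead dominates, which is exactly the behaviour the question observes:

    import time
    from math import sqrt
    from joblib import Parallel, delayed

    N = 100_000

    # Plain list comprehension: everything stays in one process.
    t0 = time.perf_counter()
    serial = [sqrt(i ** 2) for i in range(N)]
    t_serial = time.perf_counter() - t0

    # joblib: every tiny sqrt call is shipped to a worker, so the
    # per-call overhead swamps the actual computation.
    t0 = time.perf_counter()
    parallel = Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(N))
    t_parallel = time.perf_counter() - t0

    assert serial == parallel
    print(f"serial {t_serial:.3f}s  joblib {t_parallel:.3f}s")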