I did a simple performance comparison focused on floating point operations using C#, targeted for Raspberry Pi 3 Model 2 with Windows 10 IoT and I have compare it to Intel Core i7-6500U CPU @ 2.50GHz.
Raspberry Pi 3 Model B V1.2 - Test Results - Chart
Intel Core i7-6500U CPU @ 2.50GHz - x64 Test Results - Chart
Intel Core i7 is only twelve times faster (x64) than Raspberry Pi 3! - according to those tests.
The factor is 11.67 to be exact and was calculated for best performance achieved within those tests on each platform. Both platforms achieved best performance for four threads run in parallel making computations (very simple, independent calculations).
Question: what is the correct method to measure and compare computational performance such platforms? The intention is to compare computational performance in the area of optimization algorithms, machine learning algorithms, statistical analysis, etc. Thus my focus was on floating point operations.
There are some benchmarks (like MWIPS) and measures like MIPS or FLOPS. But I didn't found one way to compare different CPU platforms in terms of computational power.
I found one comparison by Roy Longbottom's (Google "Roy Longbottom's Raspberry Pi, Pi 2 and Pi 3 Benchmarks" - I can not post more links here) but according to his benchmark Raspberry Pi 3 is only four times faster than Intel Core i7 (x64 architecture, MFLOPS comparison). So very different than my results.
Here are details of the tests that I performed:
The test was build around simple operation supposed to be executed iteratively:
private static float SingleAverageCalc(float seed, long nTimes)
{
float x1 = seed, x2 = 0;
long n = 0;
for (; n < nTimes; ++n)
{
x2 = x2 + x1 * n;
}
return x2 / n;
}
where seed is generated randomly in the calling function and nTimes is the number of iterations. Intention is to avoid simple compile-time optimisations.
This test function has been called several times with various iteration number (1M, 10M, 100M and 1B) in single thread and for multiple threads. Multithreaded test look as below:
private static async void RunTestMT(string name, long n, int tn, Func<float, long, float> f)
{
float seed = (float)new Random().NextDouble();
DateTime s1 = DateTime.Now;
List<IAsyncAction> threads = new List<IAsyncAction>();
for (int i = 0; i < tn; i++)
{
threads.Add( ThreadPool.RunAsync((operation) => { f(seed, n/tn); }, WorkItemPriority.High));
}
for (int i = 0; i < tn; i++)
{
threads[i].AsTask().Wait();
}
TimeSpan dt = DateTime.Now - s1;
Debug.WriteLine(String.Format("{0} ({1:N0}; {3}T): {2:mm\\:ss\\.fff}", name, n, dt, tn));
}
Test have been run in Debug mode. Application was built as UWP (Universal Windows Platform). ARM architecture for Raspberry Pi and x86 for Intel.