12

To my surprise, I cannot find a comparison of these products using open source OpenCL benchmark suites, such as Rodinia and SHOC. Such a comparison could be more interesting than the comparisons I have been able to find, which cover only theoretical peak performance or performance in simple matrix multiplication kernels.

Does anyone know where such results might be available? Failing that, do any Stack Overflow users have access to one or both products, and the time and inclination to run the benchmarks and share the results? Results for any version of either card would be interesting.

Boppity Bop
Matt
  • Does Xeon Phi support OpenCL yet? I haven't seen any announcements. Plus, I'd expect Xeon Phi to be really slow because its architecture is better suited to message passing applications. – Tim Child Jan 19 '13 at 03:24
  • @Tim Xeon Phi does support OpenCL, though it's still in beta phase: http://software.intel.com/en-us/vcsource/tools/opencl-sdk-xe – Oak Jan 19 '13 at 11:26
  • I have access to both at my workplace. Are you looking for OpenCL performance on the K20, or CUDA performance? – Pavan Yalamanchili Jan 19 '13 at 19:56
  • It would be quite interesting to compare the OpenCL performance on the Rodinia benchmarks on Xeon Phi with the results for both sets of benchmarks on Tesla. – Matt Jan 19 '13 at 20:05
  • Building them is becoming more of a pain for me on the weekend (not playing nice with CUDA 5.0). I will try to get back to this later this week. – Pavan Yalamanchili Jan 19 '13 at 20:27
  • @Pavan Have you had time to run the benchmarks on Xeon Phi? (I think not benchmarking CUDA is fine.) – Aleksandr Dubinsky Feb 16 '13 at 10:36

4 Answers

7

CLBenchmark.com now has some results for the Xeon Phi, and a complete set for the K20c.

Here is a side-by-side comparison.

Oak
Matt
5

Here is a comparison of the Xeon Phi with a GTX Titan.

http://clbenchmark.com/compare.jsp?config_0=14470292&config_1=15887974

The Xeon Phi basically gets completely destroyed in 10 of the 12 benchmarks and is on par in the other 2. So the 300 watt, 22 nm Phi part does not fare well against the 250 watt, 28 nm GPU.

The Phi seems to be having major trouble utilizing its bandwidth capacity, and vectorizing the code seems to be another issue.

Jimmy Pettersson
4

Here is a benchmark comparing sparse matrix multiplication performance:

http://uk.arxiv.org/abs/1302.1078

It partly answers my question, but I would rather see more than one algorithm, and I would like to see how portable OpenCL performance is. I will still accept any answer that can provide that information.

Matt
2

SHOC benchmark suite for Xeon Phi is on github here:

Intel Xeon Phi SHOC Benchmark Suite

Plenty of benchmark postings starting to go public and "googlable", but here is the standard Intel communication on benchmarking of Xeon Phi versus a dual socket E5-2670:

Intel Xeon Phi Performance Doc.

When comparing the performance of Xeon Phi to a regular Xeon, or to any other platform, make sure you take into account the power envelope of the platform (dual socket Xeon) and whether the application was already tuned for a Xeon. One of the big selling points of Xeon Phi is that tuning for it typically improves performance on Xeon as well. Pretty sweet.
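As a rough illustration of what taking the power envelope into account can look like, here is a minimal Python sketch that normalizes a benchmark score by platform power before comparing two platforms. The `Result` type, the function names, and every number in it are hypothetical placeholders, not measurements from any of the benchmarks linked in this thread; plug in your own scores and power figures.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One benchmark result for one platform (all fields are user-supplied)."""
    name: str
    throughput: float   # benchmark score in any consistent unit, e.g. GFLOP/s
    power_watts: float  # power envelope of the whole platform, in watts


def perf_per_watt(r: Result) -> float:
    """Return throughput divided by power, i.e. score per watt."""
    if r.power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return r.throughput / r.power_watts


def relative_efficiency(a: Result, b: Result) -> float:
    """Return how many times more score per watt platform a delivers than b."""
    return perf_per_watt(a) / perf_per_watt(b)


# Placeholder values only, chosen to make the arithmetic easy to follow.
platform_a = Result("platform A", throughput=100.0, power_watts=200.0)
platform_b = Result("platform B", throughput=150.0, power_watts=400.0)

ratio = relative_efficiency(platform_a, platform_b)
print(f"{platform_a.name} delivers {ratio:.2f}x the score per watt of {platform_b.name}")
```

With these placeholder inputs, platform B has the higher raw score but platform A comes out ahead per watt, which is the kind of reversal a raw-score comparison would hide.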

MikeWade
  • Nice to have an answer from Intel! Thanks. I noticed "Intel Xeon Phi SHOC Benchmark Suite" does not seem to use OpenCL any more. Is that right? If so, it is a bit of a shame, as it would be nice to compare OpenCL performance. – Matt May 22 '13 at 13:00
  • I'm certain this is not the case longer term... It's more likely that someone is working on it in a local branch and they'll push it to github soon. – MikeWade May 31 '13 at 01:10