Questions tagged [xeon-phi]

a co-processor/accelerator from Intel

Intel Many Integrated Core Architecture, or Intel MIC (pronounced "Mike"), is a multiprocessor computer architecture developed by Intel. It incorporates earlier work on the Larrabee many-core architecture, the Teraflops Research Chip multicore research project, and the Intel Single-chip Cloud Computer multicore microprocessor.

188 questions
44
votes
11 answers

What is the fastest way to return the positions of all set bits in a 64-bit integer?

I need a fast way to get the position of all one bits in a 64-bit integer. For example, given x = 123703, I'd like to fill an array idx[] = {0, 1, 2, 4, 5, 8, 9, 13, 14, 15, 16}. We can assume we know the number of bits a priori. This will be…
Andrew
  • 867
  • 7
  • 20
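
For readers skimming the question above, the task it describes can be sketched in a few lines of C. This is only a minimal illustration, not an answer taken from the thread: the function name `set_bit_positions` and the `uint8_t` output array are hypothetical choices, and `__builtin_ctzll` is a GCC/Clang builtin, so a compatible compiler is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the positions of the set bits of x into idx
   in ascending order and returns how many positions were written.
   The caller must provide room for up to 64 entries. */
static int set_bit_positions(uint64_t x, uint8_t *idx)
{
    int n = 0;
    assert(idx != NULL);
    while (x != 0) {
        /* Assumes a GCC/Clang-compatible compiler for this builtin. */
        idx[n++] = (uint8_t)__builtin_ctzll(x); /* index of the lowest set bit */
        x &= x - 1;                              /* clear that lowest set bit */
    }
    return n;
}
```

Called with x = 123703, the sketch writes 11 entries, {0, 1, 2, 4, 5, 8, 9, 13, 14, 15, 16}, matching the example in the question. The loop runs once per set bit rather than once per bit position, which is why this pattern is a common starting point; whether it is the fastest option on Xeon Phi is exactly what the question asks and is not claimed here.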
12
votes
3 answers

What is the most efficient way to clear a single or a few ZMM registers on Knights Landing?

Say, I want to clear 4 zmm registers. Will the following code provide the fastest speed? vpxorq zmm0, zmm0, zmm0 vpxorq zmm1, zmm1, zmm1 vpxorq zmm2, zmm2, zmm2 vpxorq zmm3, zmm3, zmm3 On AVX2, if I wanted to clear ymm registers, vpxor was…
Maxim Masiutin
  • 3,991
  • 4
  • 55
  • 72
12
votes
4 answers

Benchmarks comparing Intel Xeon Phi and Nvidia Tesla K20

To my surprise, I cannot find a comparison of these products using open source OpenCL benchmark suites, such as rodinia and SHOC. Such a comparison could be more interesting than comparisons of theoretical peak performance, or of performance in…
Matt
  • 569
  • 1
  • 4
  • 16
9
votes
2 answers

Fast popcount on Intel Xeon Phi

I'm implementing an ultra fast popcount on Intel Xeon® Phi®, as it's a performance hotspot of various bioinformatics software. I've implemented five pieces of code, #if defined(__MIC__) #include __attribute__((align(64))) static const…
8
votes
1 answer

Why Intel compiler ignores the non-temporal prefetch pragma directive for Intel MIC?

Intel compiler generates the following prefetch instruction within a loop for accessing an array by a_ptr pointer: 400e93: 62 d1 78 08 18 4c 24 vprefetch0 [r12+0x80] If I manually change (by hex-editing the executable) this to non-temporal…
Daniel Langr
  • 22,196
  • 3
  • 50
  • 93
8
votes
1 answer

How to use GCC 5.1 and OpenMP to offload work to Xeon Phi

Background We have been trying unsuccessfully to use the new GCC 5.1 release to offload OpenMP blocks to the Intel MIC (i.e. the Xeon Phi). Following the GCC Offloading page, we've put together the build.sh script to build the "accel" target…
grumpy_robot
  • 141
  • 1
  • 6
8
votes
1 answer

How to program Intel Xeon Phi with C#?

I am a C# programmer with some C++ experience, all on Windows. With this skill set, are there any options for developing for the Intel Xeon Phi processor? I found this link, but I'm not sure if that's the best/only way. Thanks for your advice.
user1044169
  • 2,686
  • 6
  • 35
  • 64
8
votes
3 answers

Using Xeon Phi with JVM-based language

Is it possible to use Xeon Phi from a JVM-based language such as Scala? Are there any examples?
Kokizzu
  • 24,974
  • 37
  • 137
  • 233
7
votes
0 answers

AVX512 log2 or pow instructions

I need an AVX512 double pow(double, int n) function (I need it for a binomial distribution calculation which needs to be exact). In particular I would like this for Knights Landing which has AVX512ER. One way to get this is x^n =…
Z boson
  • 32,619
  • 11
  • 123
  • 226
6
votes
0 answers

Getting max FLOPS for dense matrix multiplication with the Xeon Phi Knights Landing

I recently started working with a Xeon Phi Knights Landing (KNL) 7250 computer (http://ark.intel.com/products/94035/Intel-Xeon-Phi-Processor-7250-16GB-1_40-GHz-68-core). This has 68 cores and AVX 512. The base frequency is 1.4 GHz and the Turbo…
Z boson
  • 32,619
  • 11
  • 123
  • 226
6
votes
1 answer

Will Knights Landing CPU (Xeon Phi) accelerate byte/word integer code?

The Intel Xeon Phi "Knights Landing" processor will be the first to support AVX-512, but it will only support "F" (like SSE without SSE2, or AVX without AVX2), so floating-point stuff mainly. I'm writing software that operates on bytes and words…
user1649948
  • 651
  • 4
  • 12
6
votes
2 answers

Is there a simulator/emulator of Xeon Phi?

I am going to offload some computation to Xeon Phi but would like to test different APIs and different approaches to parallel programming first. Is there a simulator/emulator for Xeon Phi (on either Windows or Linux)?
Boppity Bop
  • 9,613
  • 13
  • 72
  • 151
5
votes
2 answers

Installing R `forecast` package on a Linux Cluster: compiler issues?

I am looking to test performance of R, more specifically some routines in the forecast package on an HPC cluster with Intel Xeon Phi co-processors. The sysadmin has, I understand, built R/3.2.5 from source following the instructions on Intel's…
Matt Weller
  • 2,684
  • 2
  • 21
  • 30
5
votes
1 answer

R Parallel Processing with Xeon Phi, minimal code changes?

I'm looking at buying a couple of Xeon Phi 5110P cards, but I'm trying to estimate how much code I'd have to change and what other software is needed. Currently I make good use of R on a multi-core Windows machine (24 cores) by using the foreach package, passing it other…
Zachary
  • 319
  • 1
  • 7
5
votes
2 answers

Xeon Phi coprocessor vs Xeon Phi host processor?

What is the difference between a host processor and coprocessor? Specifically Xeon Phi coprocessor and Xeon Phi host processor? I have some performance results on these machines (a parallelized OpenMP code of diffusion equation was being run) which…
Amir
  • 637
  • 1
  • 6
  • 11