
I am trying to implement an equivalent of the x86 SSE4.2 pcmpestrm and pcmpestri instructions on PowerPC.

The x86 code looks like this:

    /* pcmpestrm writes its mask to xmm0 implicitly, so pin result there */
    register volatile __m128i result asm("xmm0");
    __asm__ volatile ("pcmpestrm %5, %2, %1"
        : "=x"(result)
        : "x"(str1), "xm"(str2), "a"(len1), "d"(len2), "i"(MODE)
        : "cc");

Has anyone written something like this for PowerPC? Any ideas on how I could implement something similar for the Power Architecture?

  • It should be fairly simple to take the [description of this intrinsic from the Intel manual](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=pcmpestrm&expand=787) and re-implement it using scalar code. Is there a specific problem you are facing with doing this? – Paul R Dec 31 '15 at 08:49
  • Did you look at QEMU? https://github.com/qemu/qemu/blob/54c54f8b56047d3c2420e1ae06a6a8890c220ac4/target-i386/ops_sse.h#L2084 – Marat Dukhan Dec 31 '15 at 09:35
  • @Paul This is my first time working with assembly. Any simple sample would be a great help. – Deepali Chourasia Dec 31 '15 at 12:22
  • @DeepaliChourasia: great suggestion from Marat in the comment above. Did you take a look at that link? – Paul R Dec 31 '15 at 13:35
  • pcmpestrm can do many different things, depending on the immediate byte (which has several fields). Unless you're actually writing an emulator, you'll get better PowerPC performance from implementing your specific algorithm directly with PowerPC instructions, potentially using AltiVec vector instructions. A single function that supports every mode of pcmpestrm will probably perform badly. Also, if you're just trying to learn some asm as a beginner, GNU inline asm is a bad choice. See my answer on http://stackoverflow.com/q/34520013/224132 for some beginner hints. – Peter Cordes Jan 01 '16 at 06:12
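
To make the scalar re-implementation suggested in the comments concrete, here is a minimal sketch in portable C of just one pcmpestrm configuration: unsigned byte elements, "equal any" aggregation, positive polarity, bit-mask output (an immediate of 0x00). The helper names `clamp_len` and `equal_any_mask` are hypothetical, and the mapping of `set` to str1 (length in EAX) and `str` to str2 (length in EDX) is an assumption based on the operand order in the question's inline asm. Other modes (ranges, equal each, equal ordered, word elements, negated polarity, byte-mask output) are not covered.

```c
#include <stdint.h>

/* Hypothetical helper: pcmpestrm uses the absolute value of each explicit
   length and saturates it at 16 for byte elements. */
static int clamp_len(int len)
{
    if (len > 16 || len < -16)
        return 16;
    return len < 0 ? -len : len;
}

/* Hypothetical helper: scalar model of pcmpestrm with imm8 = 0x00.
   Bit j of the result is set when str[j] is a valid element (j < str_len)
   and equals at least one valid element of set (index < set_len).
   Assumption: set corresponds to str1/len1 and str to str2/len2
   in the question's inline asm. */
static uint16_t equal_any_mask(const uint8_t set[16], int set_len,
                               const uint8_t str[16], int str_len)
{
    uint16_t mask = 0;

    set_len = clamp_len(set_len);
    str_len = clamp_len(str_len);

    for (int j = 0; j < str_len; j++) {
        for (int i = 0; i < set_len; i++) {
            if (str[j] == set[i]) {
                mask |= (uint16_t)(1u << j);
                break;              /* one match is enough for this bit */
            }
        }
    }
    return mask;
}
```

For example, with the set "aeiou" (length 5) and the string "hello world" (length 11), the returned mask has bits 1, 4 and 7 set, one for each vowel position. Once the scalar version matches the x86 results for the mode actually needed, the inner loop is a natural candidate for AltiVec, but that step is left out here.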
