Hi to all C coders.

I looked for similar questions first but couldn't find any.

How can I fetch/compare 4 bytes in a portable way (without memcpy/memcmp, of course)?

I have never learned C properly, and because of that I am living proof that without knowing the basics everything becomes a nasty mess afterwards. Anyway, when you are already writing words it is too late to be told 'start with the alphabet'.

    ulHashPattern = *(unsigned long *)(pbPattern);
    for (a=0; a < ASIZE; a++) bm_bc[a]=cbPattern;
    for (j=0; j < cbPattern-1; j++) bm_bc[pbPattern[j]]=cbPattern-j-1;
    i=0;
    while (i <= cbTarget-cbPattern) {
        if ( *(unsigned long *)&pbTarget[i] == ulHashPattern ) {

The above fragment works as it should with a 32-bit Windows compiler. I want all such 4-vs-4 comparisons to work under 64-bit Windows and Linux as well. I often need 2-, 4- and 8-byte transfers; in the example above I need exactly 4 bytes from some pbTarget offset. Here is the actual question: what type should I use instead of unsigned long? (I guess something close to UINT16, UINT32, UINT64 will do.) In other words, what 3 types do I need in order to ALWAYS represent 2, 4 and 8 bytes, independently of the environment?

I believe this basic question causes a lot of trouble, so it should be clarified.

Add-on 2012-Jan-16:

@Richard J. Ross III
I am doubly confused, since I don't know whether Linux uses 1] or 2], i.e. whether _STD_USING is defined on Linux. In other words, which group is portable: the types uint8_t,...,uint64_t or the _CSTD uint8_t,...,_CSTD uint64_t?

1] An excerpt from MVS 10.0 stdint.h

typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
typedef unsigned int uint32_t;
typedef _ULonglong uint64_t;

2] An excerpt from MVS 10.0 stdint.h

 #if defined(_STD_USING)
...
using _CSTD uint8_t; using _CSTD uint16_t;
using _CSTD uint32_t; using _CSTD uint64_t;
...

With Microsoft C 32bit there is no problem:

; 3401 :           if ( *(_CSTD uint32_t *)&pbTarget[i] == *(_CSTD uint32_t *)(pbPattern) )

  01360 8b 04 19     mov     eax, DWORD PTR [ecx+ebx]
  01363 8b 7c 24 14  mov     edi, DWORD PTR _pbPattern$GSCopy$[esp+1080]
  01367 3b 07        cmp     eax, DWORD PTR [edi]
  01369 75 2c        jne     SHORT $LN80@Railgun_Qu@6

But when targeting 64-bit code, this is what happens:

D:\_KAZE_Simplicius_Simplicissimus_Septupleton_r2-_strstr_SHORT-SHOWDOWN_r7>cl /Ox /Tcstrstr_SHORT-SHOWDOWN.c /Fastrstr_SHORT-SHOWDOWN /w /FAcs
Microsoft (R) C/C++ Optimizing Compiler Version 15.00.30729.01 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

strstr_SHORT-SHOWDOWN.c
strstr_SHORT-SHOWDOWN.c(1925) : fatal error C1083: Cannot open include file: 'stdint.h': No such file or directory

D:\_KAZE_Simplicius_Simplicissimus_Septupleton_r2-_strstr_SHORT-SHOWDOWN_r7>

How about Linux's stdint.h, is it always present?

I didn't give up and commented it out: //#include <stdint.h>, after which compilation went OK:

; 3401 :           if ( !memcmp(&pbTarget[i], &ulHashPattern, 4) ) 
  01766 49 63 c4     movsxd  rax, r12d
  01769 42 39 2c 10  cmp     DWORD PTR [rax+r10], ebp
  0176d 75 38        jne     SHORT $LN1@Railgun_Qu@6

; 3401 :           if ( *(unsigned long *)&pbTarget[i] == ulHashPattern ) 
  01766 49 63 c4     movsxd  rax, r12d
  01769 42 39 2c 10  cmp     DWORD PTR [rax+r10], ebp
  0176d 75 38        jne     SHORT $LN1@Railgun_Qu@6

This very 'unsigned long *' troubles me since gcc -m64 will fetch a QWORD not DWORD, right?
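One way to check this on a given compiler is sketched below. The helper names are made up for illustration. On typical 64-bit Linux targets (the LP64 model) unsigned long is 8 bytes, whereas 64-bit Windows (LLP64) keeps it at 4 bytes, which matches the DWORD compare in the x64 listing above.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: number of bits an unsigned long occupies on this target. */
size_t unsigned_long_bits(void)
{
    return sizeof(unsigned long) * CHAR_BIT;
}

/* Hypothetical helper: nonzero when *(unsigned long *)p would read
   64 bits (a QWORD) rather than 32 bits (a DWORD). */
int unsigned_long_reads_qword(void)
{
    return unsigned_long_bits() == 64;
}
```

Calling unsigned_long_reads_qword() under each compiler of interest shows whether the cast in the fragment would silently widen the comparison.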

@Mysticial
Just wanted to show the three different translations done by Microsoft CL 32bit v16:
1]

; 3400 :           if ( !memcmp(&pbTarget[i], pbPattern, 4) )
  01360 8b 04 19     mov     eax, DWORD PTR [ecx+ebx]
  01363 8b 7c 24 14  mov     edi, DWORD PTR _pbPattern$GSCopy$[esp+1080]
  01367 3b 07        cmp     eax, DWORD PTR [edi]
  01369 75 2c        jne     SHORT $LN84@Railgun_Qu@6

2]

; 3400 :           if ( !memcmp(&pbTarget[i], &ulHashPattern, 4) )
  01350 8b 44 24 14  mov     eax, DWORD PTR _ulHashPattern$[esp+1076]
  01354 39 04 2a     cmp     DWORD PTR [edx+ebp], eax
  01357 75 2e        jne     SHORT $LN83@Railgun_Qu@6

3]

; 3401 :           if ( *(uint32_t *)&pbTarget[i] == ulHashPattern )
  01350 8b 44 24 14  mov     eax, DWORD PTR _ulHashPattern$[esp+1076]
  01354 39 04 2a     cmp     DWORD PTR [edx+ebp], eax
  01357 75 2e        jne     SHORT $LN79@Railgun_Qu@6

The initial goal was to extract 4 bytes (with a single mov instruction, i.e. *(uint32_t *)&pbTarget[i]) and compare them against a 4-byte register variable, i.e. one RAM access and one comparison in a single instruction. Unfortunately I only managed to reduce memcmp()'s 3 RAM accesses (when applied to pbPattern, which points to 4 or more bytes) down to 2, thanks to the inlining. Now, if I want to use memcmp() on the first 4 bytes of pbPattern (as in 2]), ulHashPattern must not be declared register, whereas 3] has no such restriction.

; 3400 :           if ( !memcmp(&pbTarget[i], &ulHashPattern, 4) )

The line above gives an error (ulHashPattern is defined as: register unsigned long ulHashPattern; ):

strstr_SHORT-SHOWDOWN.c(3400) : error C2103: '&' on register variable

Yes, you are right: memcmp() saves the situation (but with a limitation); fragment 2] is identical to 3], which is my dirty style. Obviously my inclination not to use a function when it can be coded manually is a thing of the past, but I like it.

Still, I am not fully happy with the compilers: I have defined ulHashPattern as a register variable, yet it is loaded from RAM each time?! Maybe I am missing something, but this very line (mov eax, DWORD PTR _ulHashPattern$[esp+1076]) degrades performance; ugly code in my view.

Georgi

1 Answer

To be strictly pedantic, the only type you can use is char. This is because you are violating strict-aliasing with the following type-puns:

*(unsigned long *)(pbPattern);
*(unsigned long *)&pbTarget[i]

char* is the sole exception to this rule as you can alias any data-type with char*.
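To illustrate the char exception, here is a minimal sketch that assembles a 32-bit value one byte at a time through unsigned char. The helper name load_u32_le and the little-endian byte order are assumptions made for the example, not something taken from the question's code.

```c
#include <stdint.h>

/* Hypothetical helper: build a 32-bit value from 4 bytes,
   treating p[0] as the least significant byte (little-endian order).
   Reading through unsigned char does not violate strict aliasing. */
uint32_t load_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Because the byte order is fixed explicitly, the result is the same on any host, but it only equals a raw *(uint32_t *) read on little-endian machines.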

If you turn up your warnings on GCC, you should be getting a strict-aliasing warning with your code snippet. (AFAIK, MSVC doesn't warn about strict-aliasing.)


I can't quite tell exactly what you are trying to do in that code snippet, but the idea still holds: you should not be using unsigned long or any other data type to load and compare larger chunks of data that are of different types.

In all reality, you really should be using memcmp(), as it's straight-forward and will let you bypass the inefficiencies of forcing everything down to a char*.

Is there a reason you can't use memcmp()?
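For the 4-vs-4 check, the memcmp() route could look like the sketch below; equal4 is a hypothetical helper name. With a constant length of 4, compilers commonly inline the call, as the MSVC listings in the question show.

```c
#include <string.h>

/* Hypothetical helper: nonzero when the first 4 bytes at a and b are equal.
   Works for any alignment and does not depend on the size of unsigned long. */
int equal4(const unsigned char *a, const unsigned char *b)
{
    return memcmp(a, b, 4) == 0;
}
```

In the question's loop this would be called as equal4(&pbTarget[i], pbPattern).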


If you're OK with violating strict-aliasing, you can use the fixed integer types (such as uint32_t) defined in <stdint.h>. However, be aware that these are fixed to the # of bits rather than the # of bytes.
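As a sketch of how the fixed-width types might be used without the pointer cast, the helpers below copy the bytes into a uint16_t, uint32_t or uint64_t with memcpy(). This swaps the type-pun for memcpy(), so it is not the cast-based approach described above. The fetch_u16/fetch_u32/fetch_u64 names are hypothetical, and the code assumes a compiler that ships <stdint.h> and provides these exact-width types.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: read 16, 32 or 64 bits from any byte offset.
   memcpy into a local avoids both the aliasing violation and
   alignment problems; the value is in host byte order. */
uint16_t fetch_u16(const unsigned char *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

uint32_t fetch_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

uint64_t fetch_u64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

With these, the comparison in the question could be written as fetch_u32(&pbTarget[i]) == ulHashPattern, with ulHashPattern declared as uint32_t and initialised by fetch_u32(pbPattern).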

Mysticial
  • Thanks for your help, a pun ha-ha, if you only knew how many of them I produce - they should call me punman. I am afraid my style of coding is full of violations (not only with regard to strict-aliasing). As for your question: memcmp() is out of the question because I want speed; in fact this very 4vs4 comparison is the beginning of an inlined memcmp(). – Georgi Jan 15 '12 at 19:16
  • Agreed, but in this case, after much trial and error (and many benchmarks), I came up with code that is faster than memcmp: [link](http://www.sanmayce.com/Railgun/index.html) – Georgi Jan 15 '12 at 19:23
  • Ah, if you've benchmarked it (properly), then fine. It is certainly possible to do better than the compiler. Though it's usually safer and more readable just to issue the `memcpy()`. See my answer [here](http://stackoverflow.com/questions/8528590/what-is-the-advantage-of-using-memset-in-c). – Mysticial Jan 15 '12 at 19:26
  • Also, a great handicap of mine: despite my good feelings toward open-source OSes and tools I am still using Windows, thus I have almost zero experience in Linux and with GCC, grumble. – Georgi Jan 15 '12 at 19:28
  • Same, I still use Windows as primary. But I dual-boot Linux for when I need it. But yes, I've never seen any Windows compiler generate strict-aliasing warnings. So I can see how you got into this situation in the first place. Though to date, the only time I've seen strict-aliasing violation actually cause undesired behavior was when mixing with inline-assembly. – Mysticial Jan 15 '12 at 19:33
  • You felt my entrapment; in the near future (some days) I may ask a similar question about the GCC warnings which I am getting like a torrential rain (for simple code that passes without any warning with CL), hope you will help me there again. – Georgi Jan 15 '12 at 19:48
  • Oh Mysticial, I just visited your profile and I am struck by lightning, man! Recently I used your fantastic accomplishment as an example of infinity in one of my threads at [link](http://www.chinahistoryforum.com/index.php?/topic/20387-underdog-way-lao-zi-dao-de-jing/page__view__findpost__p__5008277) Our world is a small place for sure, you are the real deal - my best regards. – Georgi Jan 15 '12 at 19:56