0

I was just examining the code of a simple executable packer which writes a section into the executable that unpacks it during startup, when I stumbled upon this piece of code:

void setDistance( unsigned long size )
{
    char* set = (((char *)I)+pUnpacker->VirtualAddress);
    union
    {
        short sh[2];
        long  l;
    } conv;
    conv.l      = size;
    conv.sh[0]  = 0;

    unpacker_set(set, (char *)(&conv.l), 4, TEXT_DISTANCE);
}

Size is the distance from the unpacker code in memory to the beginning of the Section that is supposed to be unpacked. In the loader code it is defined as a unsigned long. unpacker_set on the other hand has this code:

void inline unpacker_set( char* at, char* what, size_t size, unsigned long sig )
{
    DWORD oldprotect;
    unsigned char *set  = (unsigned char *)at;

    while(*((unsigned long*)(set)) != sig)
        set++;

    if(VirtualProtect(set, size, PAGE_READWRITE, &oldprotect) == TRUE)
        for(unsigned i=0; i<size; i++)
            *(set+i) = *(what+i);
}

Although I understand that second routine replaces the value from the unpacker code, but I would like to know why the hassle with a union is done. Any help would be appreciated.

Leon
  • 37
  • 1
  • 4
  • 2
    Lol this is truly horrible code – David Apr 01 '13 at 12:50
  • 1
    The code has undefined behavior. You are not allowed to *read* out of a member of an `union` other than the *active* one, which is the last one to which you wrote... it works in practice, but the compiler could decide to generate code that does not yield the expected result and it would still be right. – David Rodríguez - dribeas Apr 01 '13 at 13:12

1 Answers1

1

Probably the best way to understand the code would be to write a very minimal test case and see what it does:

#include <iostream>

void f()
{
  union 
  {
    short sh[2];
    long l ;
  } conv ;
   conv.l = 100000000 ;

   std::cout << std::hex << conv.l << std::endl ;

  conv.sh[0] = 0 ;

  std::cout << std::hex << conv.l << std::endl ;
}

int main()
{
  f() ;
}

The output I see for this is as follows:

5f5e100
5f50000

So the code intention looks like it is trying to mask out the higher order bits of the size, although this is very ugly and it is unlikely to be portable.

As David pointed out you should be aware of strict aliasing. This article Type-punning and strict-aliasing is even better since it has some solid examples of real world issues using unions. So in order to assure that this code works as expected assuming gcc or clang you would need to pass in the following command line argument -fno-strict-aliasing.

Community
  • 1
  • 1
Shafik Yaghmour
  • 154,301
  • 39
  • 440
  • 740
  • So would you have an idea what code I could use instead in that place, as I've tried just copying the memory wich simply failed. – Leon Apr 01 '13 at 14:18