Is there any comprehensive information on how binary files can be read in PHP? I found the documentation for pack on the PHP website (http://www.php.net/manual/en/function.pack.php), but I am struggling to understand how to handle "typedef struct" definitions and the places where those structs are used.

I have a long binary file with many blocks, and each block can be represented as a C struct. This C struct contains various "typedef struct" members similar to what I have come up with below:

typedef struct
{ 
 unsigned char Day;
 unsigned char Month;
 unsigned char Year;
} DATE_OF_BIRTH;
#define USER_TYPE 5
DATE_OF_BIRTH Birth[2];

EDIT:

I have the structure below, which is part of a bigger structure:

typedef struct FILE_HEADER_tag
{
    int Type;
    int Version;
    unsigned long Model;
    unsigned long Number;
    int Class;

    int TemplateLoaded;
    char TemplateName[32];
    RTC_TIME_DATE StartTime;
    RTC_TIME_DATE CurrentCal;
    RTC_TIME_DATE OriginalCal;
    TEMPLATE_SETTINGS;
    int EndType;
} FILE_HEADER;

typedef struct
{
 unsigned char Second;
 unsigned char Minute;
 unsigned char Hour;
 unsigned char Day;
 unsigned char Month;
 unsigned char Year;
} RTC_TIME_DATE;

The binary file is full of line breaks. I was able to decode the first line of it, which returned the correct Type, Version, Model, Number and Class. I think I have also decoded the next two variables, but I am not sure, because StartTime returns gibberish.

At the moment I am looping through the lines from the binary file and trying to unpack each one:

$i = 1;
while (($line = fgets($handle)) !== false) {
    // process the line read.
    var_dump($line);
    if($i == 1) {
        $unpacked = unpack('iType/iVersion/LModel/LNumber/iClass/iTemplateLoaded', $line );
    }if($i == 2) {
        $i++;
        continue;
    }if($i == 3) { 
        $unpacked = unpack('C32TemplateName/CStartTime[Second]/CStartTime[Minute]/CStartTime[Hour]/CStartTime[Day]/CStartTime[Month]/CStartTime[Year]', $line);
    }

    print "<pre>";
    var_dump($unpacked);
    print "</pre>";

    $i++;

    if($i == 4) { exit; }
}
Vlad Vladimir Hercules
  • `$unpacked = unpack('15C', $binary_string);`? – Cyclonecode Aug 09 '16 at 20:18
  • php doesn't provide anything like structs. you can't suck bytes directly into a php memory location and then start treating that memory as a struct, because PHP's memory internals are nowhere near as simplistic as C's. you can read in all the binary data you want into a string, but you can't apply C-style struct-like meaning to that byte sequence. – Marc B Aug 09 '16 at 20:25

1 Answer

I'm not really sure what you are trying to achieve here. If you have a binary file generated from the above C code, then you could read and unpack its content like this:

// get size of the binary file
$filesize = filesize('filename.bin');
// open file for reading in binary mode
$fp = fopen('filename.bin', 'rb');
// read the entire file into a binary string
$binary = fread($fp, $filesize);
// finally close the file
fclose($fp);

// unpack the data - notice that we create a format code using 'C%d'
// that will unpack the size of the file in unsigned chars
$unpacked = unpack(sprintf('C%d', $filesize), $binary);

// reset array keys
$unpacked = array_values($unpacked);

// this variable holds the size of *one* structure in the file
$block_size = 3;
// figure out the number of blocks in the file
$block_count = $filesize / $block_size;

// you now should have an array where each element represents a
// unsigned char from the binary file, so to display Day, Month and Year
for ($i = 0, $j = 0; $i < $block_count; $i++, $j+=$block_size) {
   print 'Day: ' . $unpacked[$j] . '<br />';
   print 'Month: ' . $unpacked[$j+1] . '<br />';
   print 'Year: ' . $unpacked[$j+2] . '<br /><br />';
}

Of course you could also create an object to hold the data:

class DATE_OF_BIRTH {
  public $Day;
  public $Month;
  public $Year;

  public function __construct($Day, $Month, $Year) {
      $this->Day = $Day;
      $this->Month = $Month;
      $this->Year = $Year;
  }
}

$Birth = [];

for ($i = 0, $j = 0; $i < $block_count; $i++, $j+=$block_size) {
   $Birth[] = new DATE_OF_BIRTH(
       $unpacked[$j], 
       $unpacked[$j+1], 
       $unpacked[$j+2]
   );
}

Another approach would be to slice it at each third element:

$Birth = [];    

for ($i = 0; $i < $block_count; $i++) {
  // slice one read structure from the array
  $slice = array_slice($unpacked, $i * $block_size, $block_size);

  // combine the extracted array containing Day, Month and Year
  // with the appropriate keys
  $slice = array_combine(array('Day', 'Month', 'Year'), $slice);

  $Birth[] = $slice;
}
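The slicing above can also be folded into unpack itself, because the text after each format code becomes the array key. What follows is only a minimal sketch, assuming the file is nothing but back-to-back 3-byte DATE_OF_BIRTH records; the helper name parse_birth_records is hypothetical, and the sample bytes are built with pack() rather than read from a real file.

```php
<?php
// Hypothetical helper: split a binary string into 3-byte records and
// unpack each one into an array keyed by Day, Month and Year.
function parse_birth_records(string $binary): array
{
    $block_size = 3; // three unsigned chars per DATE_OF_BIRTH
    $records = [];
    // Only walk over complete blocks; trailing bytes are ignored.
    $usable = strlen($binary) - (strlen($binary) % $block_size);
    for ($offset = 0; $offset < $usable; $offset += $block_size) {
        // 'C' = unsigned char; the name after each code becomes the array key
        $records[] = unpack('CDay/CMonth/CYear', substr($binary, $offset, $block_size));
    }
    return $records;
}

// Sample data built with pack(), standing in for the file contents
$binary = pack('C6', 9, 8, 16, 11, 11, 16);
print_r(parse_birth_records($binary));
```

Passing an explicit length to substr keeps each unpack call limited to exactly one record.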

You should also be aware that this can become much more complicated depending on what data your structure contains. Consider this small C program:

#include <stdio.h>
#include <stdlib.h>

// pack structure members with a 1 byte alignment
struct __attribute__((__packed__)) person_t {
  char name[5];
  unsigned int age;
};

struct person_t persons[2] = {
  {
    { 
      'l', 'i', 's', 'a', 0 
    },
    16
  },
  {
    { 
       'c', 'o', 'r', 'n', 0 
    },
    52
  }
};

int main(int argc, char** argv) {
  FILE* fp = fopen("binary.bin", "wb");
  fwrite(persons, sizeof(persons), 1, fp);
  fclose(fp);
  return 0;
}

The above writes both packed structures into the file binary.bin, so its size will be exactly 18 bytes (2 × 9). To get a better grasp of alignment and packing, check out this Stack Overflow post: Structure padding and packing
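If the structure had not been packed, a typical compiler would most likely insert 3 padding bytes after name so that age starts on a 4-byte boundary, giving 12 bytes per record. A hedged sketch of reading such a layout in PHP follows; the 12-byte layout, the little-endian 4-byte int (format code V) and the helper name parse_padded_persons are all assumptions for illustration, not something confirmed about any particular file.

```php
<?php
// Hypothetical helper: read person_t records written WITHOUT the packed attribute.
// Assumed layout: 5 name bytes, 3 padding bytes, then a 4-byte little-endian age.
function parse_padded_persons(string $binary): array
{
    $block_size = 12;
    $persons = [];
    foreach (str_split($binary, $block_size) as $block) {
        if (strlen($block) < $block_size) {
            break; // incomplete trailing block
        }
        // Z5 = NUL-terminated string in 5 bytes, x3 = skip 3 padding bytes,
        // V = unsigned 32-bit little-endian integer
        $persons[] = unpack('Z5name/x3/Vage', $block);
    }
    return $persons;
}

// Sample data built with pack(), mimicking the assumed padded layout
$binary = pack('a5x3V', 'lisa', 16) . pack('a5x3V', 'corn', 52);
print_r(parse_padded_persons($binary));
```

The real padding depends on the compiler and platform that produced the file, so the skip count should be checked against the actual file size.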

Then in your PHP code you could read each block in a loop like so:

$filesize = filesize("binary.bin");
$fp = fopen("binary.bin", "rb");
$binary = fread($fp, $filesize);
fclose($fp);

// this variable holds the size of *one* structure
$block_size = 9;
$num_blocks = $filesize/$block_size;

// extract each block in a loop from the binary string
for ($i = 0, $offset = 0; $i < $num_blocks; $i++, $offset += $block_size) {
   $unpacked_block = unpack("C5char/Iint", substr($binary, $offset));
   $unpacked_block = array_values($unpacked_block);

   // walk over the 'name' part and get the ascii value
   array_walk($unpacked_block, function(&$item, $key) {
      if($key < 5) {
        $item = chr($item);
      }
   });
   $name = implode('', array_slice($unpacked_block, 0, 5));
   $age = implode('', array_slice($unpacked_block, 5, 1));
   print 'name: ' . $name . '<br />';
   print 'age: ' . $age . '<br />';
}
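As a variation on the loop above, the string format codes can return the name directly, and the records can be read from the file handle in fixed-size chunks. This is a sketch under stated assumptions: 9-byte packed blocks, a 4-byte little-endian int (V instead of the machine-dependent I), and a hypothetical helper name read_persons.

```php
<?php
// Hypothetical helper: stream packed person_t records from an open handle.
// Assumes 9-byte blocks: char name[5] followed by a 4-byte little-endian int.
function read_persons($fp): array
{
    $block_size = 9;
    $persons = [];
    // fread() counts bytes, so a 0x0A byte inside a record is not treated as a line end
    while (($block = fread($fp, $block_size)) !== false && strlen($block) === $block_size) {
        // Z5 returns the name up to its first NUL byte, V returns the age as an integer
        $persons[] = unpack('Z5name/Vage', $block);
    }
    return $persons;
}

// Demo: build two records with pack() and read them back from a temp file.
// The first age is 10 (0x0A), the same byte value as a newline.
$path = tempnam(sys_get_temp_dir(), 'bin');
file_put_contents($path, pack('a5V', 'lisa', 10) . pack('a5V', 'corn', 52));
$fp = fopen($path, 'rb');
print_r(read_persons($fp));
fclose($fp);
unlink($path);
```

Reading by byte count instead of with fgets matters for binary data, because a newline byte may simply be part of a value rather than a record boundary.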
Cyclonecode
  • The binary file was generated in C using a structure similar to what I have posted in the main question. I need to unpack it in PHP; the idea is that a user would upload it online. – Vlad Vladimir Hercules Aug 09 '16 at 20:40
  • The above shows how you can unpack it using php. Every third element in the `$unpacked` array will hold `Day`, `Month` and `Year` for every structure. – Cyclonecode Aug 09 '16 at 20:42
  • After looking at your code and at the structure that I have (150+ lines of it), I have noticed that the completed binary file has data everywhere, and it seems to be up to me to decide how many lines have structure A and which lines have structure B. I like the way you have a class and put data into it. – Vlad Vladimir Hercules Aug 09 '16 at 21:15
  • I have tried using your later code but still not much luck... My file has some line breaks; might that have something to do with it? While I was able to decode the first few strings, I am struggling to decode everything after a line break in the binary file. – Vlad Vladimir Hercules Nov 11 '16 at 10:04
  • @user3402600 - Can you edit your question and add a complete example of your data? Have you also packed your structure so the data does not get padded? Check out the link about padding and packing that I added above: http://stackoverflow.com/questions/4306186/structure-padding-and-packing – Cyclonecode Nov 11 '16 at 10:57
  • Sure, I will update the question. The complete data is 160 lines long, so I will attach a snippet. The binary file is created by someone else and I have no control over it. However, based on the link you sent, I believe the data is most likely padded. – Vlad Vladimir Hercules Nov 11 '16 at 11:51
  • @user3402600 - You can easily see if the file is padded if you know how many entries it should hold. Then you simply compare the filesize against the `sizeof(FILE_HEADER_tag) * NUM_ENTRIES` - Notice that this structure **must** be packed. If each entry is padded in the file, you'll need to figure out how its padded =) – Cyclonecode Nov 11 '16 at 12:19
  • @user3402600 - It would be best if you could give me the binary file so I can check how it's structured. – Cyclonecode Nov 11 '16 at 15:57
  • I've managed to get it done. The issue was with pointers: the file had blocks of code, each block had a different meaning and a specific type, and the type defined the start and the end of the block. So essentially I grabbed each block from its starting point to its ending point and decoded it the way you described. – Vlad Vladimir Hercules Nov 19 '16 at 15:47