I have a binary file that I need to be able to parse through. I'm looking to specify an offset and then have the program return the byte value at that location.

I'm not sure how to go about this. I can already open the file, but I don't know how to make the program jump to a given location.

Peter Mortensen
Jason W.
  • Can you show what you've tried so far? – gpojd Mar 08 '13 at 16:42
  • [`perldoc -f seek`](http://search.cpan.org/perldoc?perlfunc#seek) – mob Mar 08 '13 at 16:47
  • I was looking at seek, but I didn't know what to do with it. Does its result go into a variable to print, or do I need something else to actually pull out the byte data? – Jason W. Mar 08 '13 at 16:50

2 Answers

use Fcntl qw(:seek);

my($fh, $filename, $byte_position, $byte_value);

$filename      = "/some/file/name/goes/here";
$byte_position = 42;

open($fh, "<", $filename)
  || die "can't open $filename: $!";

binmode($fh)
  || die "can't binmode $filename";

sysseek($fh, $byte_position, SEEK_SET)  # NB: 0-based, measured from start of file
  || die "couldn't seek to byte $byte_position in $filename: $!";

sysread($fh, $byte_value, 1) == 1
  || die "couldn't read byte from $filename: $!";

printf "read byte with ordinal value 0x%02x at position %d\n",
     ord($byte_value), $byte_position;
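
If the lookup is needed in more than one place, the same steps can be wrapped in a subroutine. The following is only a sketch under a few assumptions: `byte_at` is a hypothetical name (not part of any module), it uses buffered `seek`/`read` rather than `sysseek`/`sysread` (the two families should not be mixed on one handle), and it opens the file with the `:raw` layer instead of calling `binmode` separately.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);

# byte_at() is a hypothetical helper name used for illustration only.
# Returns the ordinal value (0..255) of the byte at a 0-based offset,
# or undef if the offset is at or beyond the end of the file.
sub byte_at {
    my ($filename, $offset) = @_;

    # ":raw" disables any newline or encoding translation, like binmode().
    open(my $fh, "<:raw", $filename)
        or die "can't open $filename: $!";

    seek($fh, $offset, SEEK_SET)
        or die "couldn't seek to byte $offset in $filename: $!";

    # read() returns undef on error and 0 at end of file.
    my $got = read($fh, my $byte, 1);
    die "couldn't read byte from $filename: $!" unless defined $got;
    close($fh);

    return $got == 1 ? ord($byte) : undef;
}

# Usage: inspect the first byte of this script itself.
my $value = byte_at($0, 0);
printf "byte 0 of %s is 0x%02x\n", $0, $value if defined $value;
```

Returning `undef` at end of file lets the caller tell "no such byte" apart from a byte whose value is 0.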
tchrist
  • tchrist, you are awesome. That is exactly what I needed. – Jason W. Mar 08 '13 at 17:12
  • You may rely on and teach autodie and cut that SLOC in half. https://rt.cpan.org/Public/Bug/Display.html?id=54777 is fixed. – daxim Mar 09 '13 at 18:57
  • @daxim Only if you are really careful to write `use autodie v2.12` every time, which I don’t really trust people to do, nor have. Remember that this has to run on a basic vendor Perl installation. I’m also worried about folks snipping pieces of it without that, in which case they have really big trouble. It’s a cute feature for quickies, but you have to be very careful with such advice. – tchrist Mar 09 '13 at 19:26

Use seek and the handy constants from the Fcntl module, as in:

#! /usr/bin/env perl

use bytes;
use strict;
use warnings;

use Fcntl ':seek';

open my $fh, "<", $0 or die "$0: open: $!";

seek $fh, 0, SEEK_END or die "$0: seek: $!";

my $last = tell $fh;
die "$0: tell: $!" if $last < 0;

for (1 .. 20) {
    my $offset = int rand $last;    # valid offsets are 0 .. $last - 1
    seek $fh, $offset, SEEK_SET or die "$0: seek: $!";
    defined read $fh, my $byte, 1 or die "$0: read: $!";
    $byte = "\\$byte" if $byte eq "'" || $byte eq "\\";
    printf "offset %*d: \\x%02x%s\n",
      length $last, $offset,
      unpack("C", $byte),
      $byte =~ /[[:print:]]/a ? " '$byte'" : "";
}
__DATA__
          
 :   ℞:      
                          
                                      
 ¡ƨdləɥ ƨᴉɥʇ ədoɥ puɐ ʻλɐp əɔᴉu ɐ əʌɐɥ ʻʞɔnl poo⅁ 

Sample output:

offset   47: \x65 'e'
offset  392: \x20 ' '
offset  704: \xf0
offset  427: \x5c '\\'
offset  524: \x61 'a'
offset 1088: \x75 'u'
offset  413: \x20 ' '
offset 1093: \xbb
offset 1112: \xc9
offset  377: \x24 '$'
offset   64: \x46 'F'
offset  361: \x62 'b'
offset  898: \xf0
offset  566: \x5d ']'
offset  843: \xf0
offset 1075: \xc9
offset  280: \x20 ' '
offset    3: \x2f '/'
offset  673: \x8a
offset  153: \x20 ' '

The contents of the __DATA__ section were borrowed from Tom’s excellent suggestions for dealing with UTF-8 in Perl programs.

Greg Bacon
  • Probably shouldn’t print those bytes as characters: you don’t know the encoding. Plus if you have PERL_UNICODE set to something with S in it, Perl treats those as code points and may end up writing out multiple bytes where you thought you were getting just 1. – tchrist Mar 08 '13 at 17:14
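
One way to act on that comment is to print only the numeric value of each byte, so that everything written to the output is plain ASCII regardless of the file's encoding. This is a minimal sketch, not the answer's original code; `hex_at` is a hypothetical helper name, and the loop over offsets 0 to 3 is just an illustrative choice.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);

# hex_at() is a hypothetical helper name used for illustration only.
# Returns a string such as '\x2f' for the byte at a 0-based offset,
# or undef if the offset is at or beyond the end of the file.
sub hex_at {
    my ($fh, $offset) = @_;

    seek($fh, $offset, SEEK_SET) or die "seek: $!";

    # read() returns undef on error and 0 at end of file.
    my $got = read($fh, my $byte, 1);
    die "read: $!" unless defined $got;

    # Format the byte's numeric value; the byte itself is never printed.
    return $got ? sprintf("\\x%02x", unpack("C", $byte)) : undef;
}

# Usage: show the first few bytes of this script as hex escapes.
open(my $fh, "<:raw", $0) or die "$0: open: $!";
for my $offset (0 .. 3) {
    my $hex = hex_at($fh, $offset);
    last unless defined $hex;    # stop at end of file
    print "offset $offset: $hex\n";
}
close($fh);
```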