3

I have a struct with some properties (like int A1, int A2,...). I store a list of struct as binary in a file.

Now, I'm reading the bytes from file using binary reader into Buffer and I want to apply a filter based on the struct's properties (like .A1 = 100 & .A2 = 12).

The performance is very important in my scenario, so I convert the filter criteria to byte array (Filter) and then I want to mask Buffer with Filter. If the result of masking is equal to Filter, the Buffer will be converted to the struct.

The question: What is the fastest way to mask and compare two byte arrays?

Update: The Buffer size is more than 256 bytes. I'm wondering if there is a better way rather than iterating in each byte of Buffer and Filter.

Amir Pournasserian
  • 1,600
  • 5
  • 22
  • 46
  • How large is the buffer, and what is the nature of the mask? For example, one common approach is to use unsafe code to treat a byte[] as a ulong*, so that you can process 8 bytes at a time instead of 1 (plus the last few bytes manually if it isn't an exact multiple of 8) – Marc Gravell Nov 24 '13 at 10:58
  • The buffer size is more than 256 bytes. I'm wondering if there is a better way rather than iterating in each byte of Buffer and Filter. – Amir Pournasserian Nov 24 '13 at 11:02

3 Answers3

3

Try a simple loop with System.BitConverter.ToInt64(). Something Like this:

byte[] arr1;
byte[] arr2;

for (i = 0; i < arr1.Length; i += 8) 
{
    var P1 = System.BitConverter.ToInt64(arr1, i);
    var P2 = System.BitConverter.ToInt64(arr2, i);

    if((P1 & P2) != P1) //or whatever
        //break the loop if you need to.
}

My assumption is that comparing/masking two Int64s will be much faster (especially on 64-bit machines) than masking one byte at a time.

dotNET
  • 33,414
  • 24
  • 162
  • 251
  • Don't you think that using bit shifting to fill p1 and p2 has better performance? – Amir Pournasserian Nov 24 '13 at 11:10
  • @MarcGravell Please see ToInt64 method in: http://www.dotnetframework.org/default.aspx/4@0/4@0/DEVDIV_TFS/Dev10/Releases/RTMRel/ndp/clr/src/BCL/System/BitConverter@cs/1305376/BitConverter@cs – Amir Pournasserian Nov 24 '13 at 11:24
  • @Amir I'm very familiar with the method (I do a lot of binary protocol work); you haven't said how shifting would help here – Marc Gravell Nov 24 '13 at 11:27
  • @MarcGravell sorry, you are right. I was thinking of the performance cost of 2 function (Int64) call. But it reduces the iteration to Length/8 (rather than working on a byte). – Amir Pournasserian Nov 24 '13 at 11:39
  • @MarcGravell Is there any P/Invoke function to do that faster? I read memcmp is the fastest way to compare 2 byte arrays, but I want to do AND operation on the arrays. – Amir Pournasserian Nov 24 '13 at 11:46
  • @Amir only for direct equality compare, which this isn't. – Marc Gravell Nov 24 '13 at 11:54
  • 1) `BitConverter` is *slow*. Using bitshifts and binary or is faster. 2) It uses native endianness. A file should have fixed endianness. So IMO this is a bad solution. – CodesInChaos Nov 24 '13 at 13:48
3

The way I would usually approach this is with unsafe code. You can use the fixed keyword to get a byte[] as a long*, which you can then iterate in 1/8th of the iterations - but using the same bit operations. You will typically have a few bytes left over (from it not being an exact multiple of 8 bytes) - just clean those up manually afterwards.

Marc Gravell
  • 1,026,079
  • 266
  • 2,566
  • 2,900
  • Which can cause issues with endianness (file usually has fixed endianness, `long*` has native endianness) and on some architectures unaligned memory access can be slow or even broken. – CodesInChaos Nov 24 '13 at 13:46
  • 2
    @CodeInChaos if you are using bit operations, endianness doesnt usually matter. That is more of a concern when thinking about the value *an an integer*. Just applying a mask: not a problem. The trick is to build the mask using bit ops rather than integer ops, so that the same endianness rules are applied to both. Then you dont need to know which it is. – Marc Gravell Nov 24 '13 at 14:58
0

Once you've got the two arrays - one from reading the file and one from the filter, all you then need is a fast comparison for the arrays. Check out the following postings which are using unsafe or PInvoke methods.

What is the fastest way to compare two byte arrays?

Comparing two byte arrays in .NET

Community
  • 1
  • 1
Matthias
  • 5,574
  • 8
  • 61
  • 121
  • Thanks, but first of all I want to AND the Filter and Buffer. The posts check for equality. – Amir Pournasserian Nov 24 '13 at 10:58
  • Try [this one](http://stackoverflow.com/questions/5251355/compare-byte-arrays-with-masking) or [this second example](http://stackoverflow.com/questions/6868680/how-to-combine-two-image-byte-arrays-together-quickly-utilizing-a-mask-array-i). – Matthias Nov 24 '13 at 11:03