18

I am building a class to represent an IPv4 subnet. I am storing the network address and the subnet mask as 4 byte binary strings, which are built during the constructor based on the arguments. One of the representations I would like the constructor to accept is CIDR notation.

My bitwise operations are a little rusty, and where I have got stuck is in converting a decimal integer CIDR representation of the subnet mask into a 4 byte binary string, and vice versa. I have also found I am unable to perform left/right shifts on string - which I'm sure I have successfully done before?


I have managed to get the conversion to the binary string to work with the following code:

// An example input value.
$mask = 24; // 255.255.255.0

if ($mask < 0 || $mask > 32) {
  // Invalid prefix size
  throw new RangeException('Invalid CIDR prefix size');
} else if ($mask === 0) {
  // Handle 0
  $mask = "\x00\x00\x00\x00";
} else {
  // Left-pad a 4-byte string with $mask set bits
  $mask = pack('N', (0x01 << 31) >> ($mask - 1));
}

I don't like this logic for two reasons:

  • I don't like treating 0 as a special case
  • I don't like the right-shift followed by a left-shift

I feel sure there is a way to do this more efficiently in a way that will handle 0 correctly without treating it as a special case.


When converting a binary string back to the decimal representation of a CIDR prefix size, I am currently using the code below. I have another very similar block of code when validating a subnet mask provided in other formats to ensure the set bits are contiguous.

// An example input value.
$mask = "\xff\xff\xff\x00"; // /24

// Convert the binary string to an int so bit shifts will work
$mask = current(unpack('N', $mask));

// A counter to represent the CIDR
$cidr = 0;

// Loop and check each bit
for ($i = 31; $i > 0; $i--) {
  if (($mask >> $i) & 0x01) {
    $cidr++;
  } else {
    break;
  }
}

// Return the result
return $cidr;

I don't like this because of the loop - I feel sure there is a more intelligent bitwise way to do it.


Is there a more intelligent way to do either of these tasks?

Thoughts/suggestions/general abuse please...


EDIT:

Any solutions need to work on PHP 4.3.10 and upwards, and must work on both 32- and 64-bit platforms. Please keep in mind that all integers in PHP are signed, and on a 32-bit platform anything >= 0x80000000 will be stored as a double (and will therefore not play nice with bitwise operations).

DaveRandom
  • 87,921
  • 11
  • 154
  • 174

5 Answers5

8

Your second problem can be alternatively seen as finding the first set bit in the inverted number (instead of finding the first not set bit in the non-inverted number), which is equivalent to finding the integer log2 of the number.

This is a fairly common problem in the bitwise world and there are numerous speed-optimized algorithms for it. You are using the (slow) obvious algorithm: http://www-graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious

But I assume that you don't really care about speed, but rather about brevity, in that case you could do something like this:

$cidr = (int) (32 - log(~current(unpack('N', $mask)) & 0xffffffff, 2));

The & 0xffffffff is necessary to be compatible with 64 bit integers.

NikiC
  • 100,734
  • 37
  • 191
  • 225
  • Ahh logarithms - that is definitely a more intelligent approach. I'm having a little difficulty wrapping my head around the `& 0xffffffff` though - in particular, won't this potentially break it on 32 bit platforms, since `0xffffffff` will be a float and this can produce unexpected results from bitwise operations (IIRC)? – DaveRandom Aug 23 '12 at 23:51
  • @DaveRandom probably http://chat.stackoverflow.com/transcript/message/5062806#5062806 – Dejan Marjanović Aug 23 '12 at 23:56
  • @DaveRandom Yeah, I was testing this on 64bit. I'm not sure whether the float would have affected anything (as it is still in the exact range of it). In any case the `log` solution is rather unstable; applying float operations in bitwise context doesn't seem to work out well. I'll delete the answer. – NikiC Aug 24 '12 at 00:01
  • Right, after playing around with this it seems that even with the fix for `0xffffffff` this still doesn't work, because `log(2, 2) == 1` and `log(1, 2) == 0` - which I think is what you were saying yesterday, I just didn't get it at the time. – DaveRandom Aug 30 '12 at 08:27
6

The second problem may be solved by a textual approach:

$mask = "\xff\xff\xff\x00";

$cidr = strspn(sprintf('%b', current(unpack('N', $mask))), 1);

It uses sprintf() to convert the integer into binary text representation and strspn() counts the number of initial ones.

Update

On 64-bit machines, the binary text representation is left-padded with 32 zeroes, so the code needs to be patched with ltrim() like so:

$cidr = strspn(ltrim(sprintf('%b', current(unpack('N', $mask))), 0), 1);

Update 2

The first problem can be solved with a textual approach as well, albeit necessitating the use of str_split() (which doesn't work in PHP 4.x):

$mask = vsprintf('%c%c%c%c', array_map('bindec', str_split(str_pad(str_repeat(1, $mask), 32, 0), 8)));

Update 3

What works for me is the following (tested on both 32 and 64 bit):

$mask = pack('N', 0xffffffff << (32 - $mask));

In the process, the number becomes a float but retains enough accuracy to deal with the bit shifting.

Ja͢ck
  • 170,779
  • 38
  • 263
  • 309
  • In it's current form this won't work on 64 bit, `strspn()` will return `0` because the string will have 32 high order `0`s. However, adding an `ltrim($s , '0')` will fix that and this will indeed be a loopless 1-liner that should work everywhere, thanks for the answer. I'm going to hang fire with the accept until I've got some more opinions on the first block, but your definitely worthy of an upvote. – DaveRandom Aug 24 '12 at 08:14
  • @DaveRandom thanks for testing :) I ran it on my EC2 instance (which should be 64bit), but apparently the PHP build is not 64bit aware? Anyway, I've updated the answer with your suggestion :) – Ja͢ck Aug 24 '12 at 08:57
  • @DaveRandom regarding the first block, I would have suggested using `str_repeat()` and `str_pad()` to create the binary representation and then use `str_split()` and `bindec()` to create an array of bytes which are then packed ... but `str_split()` won't work on 4.3 =( – Ja͢ck Aug 24 '12 at 22:38
  • Well you can easily mimic `str_split()` in PHP4 with `preg_split()`, I'll have a play around with it. – DaveRandom Aug 25 '12 at 09:28
  • @DaveRandom not sure how you could mimic `str_split()` actually; in any case, I've updated my answer to include a workable solution assuming that it's possible :) – Ja͢ck Aug 30 '12 at 01:57
  • Cheers for the update. FTR, [here](http://codepad.viper-7.com/EuGKfz) is how you could mimic `str_split()` with `preg_split()`, min PHP version 4.0.5. – DaveRandom Aug 30 '12 at 08:08
  • @DaveRandom ugh, of course :) nice touch with the `str_repeat()` clone hehe – Ja͢ck Aug 30 '12 at 08:20
  • As much as it goes against the original question (a *bitwise* way of doing it) the `sprintf('%b')` approach does seem to make everything a lot easier, and solves a couple of other problems as well. I just find it a little ridiculous that there is no bitwise solution to this, especially since (quoting [Wikipedia](http://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_blocks)) `CIDR is principally a bitwise, prefix-based standard for the interpretation of IP addresses`. – DaveRandom Aug 30 '12 at 08:29
  • @DaveRandom I whole heartedly agree with that; but handling big numbers (mainly) and binary representations (to some degree) in PHP are just not trustworthy enough :) – Ja͢ck Aug 30 '12 at 08:32
  • By the way, I do now have a pure-bitwise solution to the first one, a slightly modified version of the solution provided by @sixeightzero: `$intMask = ~((~0 << 16) << 16) << (32 - $decCidr);`. The left side is a slightly nasty but definitely works way to get only the 32 least significant bits set in a way that works on 32 and 64 bit, then left shift it `32 - $cidr` bits, and then when `pack('N')`ed it gives the right result. – DaveRandom Aug 30 '12 at 08:40
  • @DaveRandom wouldn't that be the same as `(0xffffffff << (32 - $mask)) & 0xffffffff` though? – Ja͢ck Aug 30 '12 at 09:05
  • Doesn't work on 32 bit, `0xffffffff` ends up as a float because PHP doesn't have an unsigned integer type. The `& 0xffffffff` is not necessary because `pack('N')` only looks at the 32 least significant bits, regardless of platform. – DaveRandom Aug 30 '12 at 09:21
  • @DaveRandom Yes, it becomes a float, but it has enough significant bits to represent that value; see here: http://codepad.viper-7.com/QzMly3 – Ja͢ck Aug 30 '12 at 09:28
  • Hmmm... interesting. When I run that code on the target platform for this specific task is get `bool(false)` on the 13th iteration. Although I would be the first to admit I don't really understand double precision floating point in terms of how it is stored, I do know it is heavily platform dependent. This is somewhat unique I suspect - it's a little-endian MIPS processor. – DaveRandom Aug 30 '12 at 10:24
  • @DaveRandom perhaps an issue with the `precision` php.ini setting? – Ja͢ck Aug 30 '12 at 10:25
3

Why not do:

$netmask = ( (1<<32) -1 ) << ( 32 - $cidr);

You said you don't like the left then right shifts, how about two left shifts ;)

After this, I toss it into ip2long or long2ip. To go from mask to CIDR, I'll do:

$mask = ip2long($mask);
$base = ( ( 1 << 32 ) - 1 );
$cidr = 32 - log( ( $mask ^ $base ) + 1 , 2);

Of course, you can pack or dechex, or unpack the above as needed to fit your storage type.

Mike Mackintosh
  • 13,917
  • 6
  • 60
  • 87
  • 1
    `$netmask = ((1 << 32) - 1) << (32 - $cidr);` doesn't work on 32-bit platforms because (ridiculously) PHP won't let you shift 32 bits. However (equally ridiculously) you can work around this with `$netmask = (((1 << 31) << 1) - 1) << (32 - $cidr);`. I don't particularly like the `- 1` approach, I'd rather do `~((~0 << 31) << 1)` to get the 32 least significant bits set. But the `<< (32 - $cidr)` is definitely better. – DaveRandom Aug 29 '12 at 09:26
  • The same "can't shift 32 bits" issue also applies to the inverse, but can be worked around in the same way, the key idea is the `log(2)` which does work. I'll ignore the fact that `ip2long()` requires a dotted decimal :-P - +1 – DaveRandom Aug 29 '12 at 09:26
  • The other solution i use instead of `ip2long` is `inet_pton` and `ntop`. Have you looked into them before? – Mike Mackintosh Aug 29 '12 at 16:52
  • `PHP 5 >= 5.1.0` - aside from that, the pack operation is not a problem, it's specifically converting the integer CIDR that's causing me the problem. I am quite familiar with `pton`/`ntop` already, but thanks for the suggestion :-) – DaveRandom Aug 29 '12 at 21:48
  • Can you install custom extensions? Im working on one now for ipv6 and ill add ipv4 support. – Mike Mackintosh Sep 01 '12 at 17:12
  • If I could build anything for the box I'd just put PHP5.4 on it. Unfortunately the manufacturer don't supply any compilers and I have yet to successfully build a cross compiler for it - although I have tried and continue to try. – DaveRandom Sep 01 '12 at 18:33
3

Why compute it at all? Just create an array of 32 subnet masks.

$cidr2mask = array( "\x00\x00\x00\x00", "\x80\x00\x00\x00", "\xc0\x00\x00\x00", "\xe0\x00\x00\x00",
                    "\xf0\x00\x00\x00", "\xf8\x00\x00\x00", "\xfc\x00\x00\x00", "\xfe\x00\x00\x00", 
                    "\xff\x00\x00\x00", "\xff\x80\x00\x00", "\xff\xc0\x00\x00", "\xff\xe0\x00\x00", 
                    "\xff\xf0\x00\x00", "\xff\xf8\x00\x00", "\xff\xfc\x00\x00", "\xff\xfe\x00\x00", 
                    "\xff\xff\x00\x00", "\xff\xff\x80\x00", "\xff\xff\xc0\x00", "\xff\xff\xe0\x00", 
                    "\xff\xff\xf0\x00", "\xff\xff\xf8\x00", "\xff\xff\xfc\x00", "\xff\xff\xfe\x00", 
                    "\xff\xff\xff\x00", "\xff\xff\xff\x80", "\xff\xff\xff\xc0", "\xff\xff\xff\xe0", 
                    "\xff\xff\xff\xf0", "\xff\xff\xff\xf8", "\xff\xff\xff\xfc", "\xff\xff\xff\xfe");

$mask2cidr = array_flip($cidr2mask);

Then just use $cidr2mask[$cidr]; and $mask2cidr[$mask].

Mike Mackintosh
  • 13,917
  • 6
  • 60
  • 87
Barmar
  • 741,623
  • 53
  • 500
  • 612
  • While this is not a completely stupid idea, I'd like to avoid this if possible - mainly on the principle of the thing, but also if I ever decide to expand this to support IPv6 it will become much messier. +1 for effort though. – DaveRandom Sep 01 '12 at 11:44
  • It's just 4 times bigger for IPv6, not messier at all. Whereas the bit-twiddling will be 4 times as expensive. Using arrays like this is O(1), algorithms will likely be O(n). It's too easy to get hung up on trying to find "clever" algorithms when a simple approach like this will work. Check any reference on program optimization, they should all recommend solutions like this when they're applicable. – Barmar Sep 01 '12 at 11:51
  • It's a fair point and I don't disagree with you - I just find it a little ridiculous that since (to quote Wikipedia) `CIDR is principally a bitwise, prefix-based standard for the interpretation of IP addresses` there is not a nice, simple, bitwise way to do it. But I take your point, I may simply be over thinking it. I admit that part of the point of this is a technical exercise to try and improve my bitwise skillz/understanding. – DaveRandom Sep 01 '12 at 12:49
1

(Shameless self promotion)

I've built a PHP library which does very similar things for IP addresses.

Here's how I build an IPv4 subnetmask:

<?php
$mask = (~0) << (32 - $cidr);
$binary_mask = pack('N', $mask);
echo implode('.', unpack('C4', $binary_mask));

It won't work on older versions of PHP due to namespaces, but have branches from before their addition to which I'd be happy to accept pull requests to fix compatibility issues. The code is (almost) 100% covered by unit tests :)

The only dependency is the pear Math_BigInteger package.

Lethargy
  • 1,859
  • 14
  • 15