4

I am attempting this challenge (http://cryptopals.com/sets/1/challenges/1) and am having trouble completing it in C#. I cannot seem to parse the number into a BigInteger.

My code looks like this:

        string output = "";
        BigInteger hexValue = BigInteger.Parse("49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6");

        output = Convert.ToBase64String(hexValue.ToByteArray());
        Console.WriteLine(hexValue);
        Console.WriteLine(output);
        Console.ReadKey();
        return "";

When I run the program, it fails with the following error, and I am not sure why:

System.FormatException: 'The value could not be parsed.'

So, what is the appropriate way to get a large integer from a string into a BigInt?

nvoigt
John

3 Answers

9

The initial problem

The BigInteger.Parse method expects the value to be decimal, not hex. You can "fix" that by passing in NumberStyles.HexNumber.

The bigger problem with using BigInteger for this

If you're just trying to convert a string of hex digits into bytes, I would avoid using BigInteger at all. For one thing, you could end up with problems if the original byte array started with zeroes, for example. The zeroes wouldn't be in the resulting byte array. (Sample input: "0001" - you want to get two bytes out, but you'll only get one, after persuading it to parse hex.)
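As a rough illustration of the leading-zero problem, the sketch below parses "0001" and "01" through BigInteger and counts the bytes that come back. The class and method names (`LeadingZeroDemo`, `ByteCountViaBigInteger`) are made up for this example; only `BigInteger.Parse`, `NumberStyles.HexNumber` and `ToByteArray` come from the framework.

```csharp
using System;
using System.Globalization;
using System.Numerics;

public static class LeadingZeroDemo
{
    // Hypothetical helper: parses hex via BigInteger and reports how many bytes come back.
    public static int ByteCountViaBigInteger(string hex)
    {
        BigInteger value = BigInteger.Parse(hex, NumberStyles.HexNumber, CultureInfo.InvariantCulture);
        return value.ToByteArray().Length;
    }

    public static void Main()
    {
        // "0001" describes two bytes (0x00, 0x01), but the integer 1 needs only one byte.
        Console.WriteLine(ByteCountViaBigInteger("0001")); // 1
        Console.WriteLine(ByteCountViaBigInteger("01"));   // 1
    }
}
```

Both inputs collapse to the same integer, so the leading zero byte cannot be recovered afterwards.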

Even if you don't lose any information, the byte[] you receive from BigInteger.ToByteArray() probably isn't what you were expecting. For example, consider this code, which just converts the data to byte[] and back to hex via BitConverter:

BigInteger bigInt = BigInteger.Parse("1234567890ABCDEF", NumberStyles.HexNumber);
byte[] bytes = bigInt.ToByteArray();
Console.WriteLine(BitConverter.ToString(bytes));

The output of that is "EF-CD-AB-90-78-56-34-12" - because BigInteger.ToByteArray returns the data in little-endian order:

The individual bytes in the array returned by this method appear in little-endian order. That is, the lower-order bytes of the value precede the higher-order bytes.

That's not what you want - because it means the last part of the original string is the first part of the byte array, etc.

Avoiding BigInteger altogether

Instead, parse the data directly to a byte array, as in this question, or this one, or various others. I won't reproduce the code here, but it's simple enough, with different options depending on whether you're trying to create simple source code or an efficient program.
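For reference, one simple way to do the direct conversion might look like the sketch below. It is not taken from the linked questions; `HexToBase64` and `HexToBytes` are hypothetical names, and the code favours readability over speed by assuming the input is a plain string of hex digit pairs with no "0x" prefix or separators.

```csharp
using System;

public static class HexToBase64
{
    // Hypothetical helper: converts each pair of hex digits into one byte, in input order.
    public static byte[] HexToBytes(string hex)
    {
        if (hex.Length % 2 != 0)
            throw new ArgumentException("Hex string must contain an even number of digits.", nameof(hex));

        byte[] bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            // Convert.ToByte with base 16 throws if the pair is not valid hex.
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }

    public static void Main()
    {
        byte[] bytes = HexToBytes("0001");
        Console.WriteLine(BitConverter.ToString(bytes));   // 00-01
        Console.WriteLine(Convert.ToBase64String(bytes));  // AAE=
    }
}
```

The leading zero byte survives and the bytes stay in the same order as the input. On .NET 5 and later, the built-in `Convert.FromHexString` should do the same job without a hand-written loop.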

General advice on conversions

In general it's a good idea to avoid intermediate representations of data unless you're absolutely convinced that you won't lose information in the process - and going via BigInteger does lose information here. It's fine to convert the hex string to a byte array before converting the result to base64, because that's not a lossy transformation.

So your conversions are:

  • String (hex) to BigInteger: lossy (in the context of leading 0s being significant, as they are in this situation)
  • BigInteger to byte[]: not lossy
  • byte[] to String (base64): not lossy

I'm recommending:

  • String (hex) to byte[]: not lossy (assuming you have an even number of nybbles to convert, which is generally a reasonable assumption)
  • byte[] to String (base64): not lossy
Jon Skeet
  • I am terribly confused: hex string to BigInteger is **not lossy**. Nor is BigInteger to byte array. The sign is not lost, but encoded in the top byte (that is why sometimes, for positive values, an extra top 0 byte must be returned, or for negative values, an extra 0xFF byte). And `-0x80000000` is not the same as `0x80000000`, when converted to BigInteger. – Rudy Velthuis Apr 08 '18 at 16:37
  • @RudyVelthuis: Yes, it *is* lossy. "00000001" is parsed to the same value as "01", which is a problem when you're trying to parse a hex string as binary data. The first value represents 4 bytes, the second value represents 1 byte. They both parse to the same `BigInteger`, therefore information has been lost. You're right about the `ToByteArray` not losing information though - I'll edit for that. – Jon Skeet Apr 08 '18 at 16:39
  • 1
    @RudyVelthuis it's not lossy if hex string represents a number, but in this case hex string represents byte array (because the goal is to convert it to base64 string). Of course it doesn't make sense to use big integer in this case at all, but that's what this answer talks about. – Evk Apr 08 '18 at 16:41
  • I see that in the challenge he linked to, he must convert to Base64. Then he is doing it the wrong way, indeed. – Rudy Velthuis Apr 08 '18 at 16:46
5

Use NumberStyles.HexNumber:

BigInteger.Parse("49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6", 
                 NumberStyles.HexNumber,
                 CultureInfo.InvariantCulture);

If your number is supposed to be always positive, add a leading zero to your string.
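To see why the leading zero matters, here is a small sketch (the `HexSignDemo` class and `ParseHex` helper are hypothetical names for illustration). With `NumberStyles.HexNumber`, a string whose first digit has its high bit set is read as a two's complement negative number:

```csharp
using System;
using System.Globalization;
using System.Numerics;

public static class HexSignDemo
{
    // Hypothetical helper wrapping the Parse call shown above.
    public static BigInteger ParseHex(string hex) =>
        BigInteger.Parse(hex, NumberStyles.HexNumber, CultureInfo.InvariantCulture);

    public static void Main()
    {
        Console.WriteLine(ParseHex("FF"));   // -1: the high bit of the first digit is set
        Console.WriteLine(ParseHex("0FF"));  // 255: the leading zero forces a positive value
    }
}
```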

nvoigt
  • That's still potentially going to lose information though, because the OP is fundamentally trying to parse to a byte array, not an integer - they've just chosen to go *via* an integer, which is a bad choice IMO. For example, with your code "0001" will end up producing a single byte. – Jon Skeet Apr 08 '18 at 15:56
3

The problem is that the input is not decimal but hexadecimal, therefore you need to pass an additional parameter for parsing:

BigInteger number = BigInteger.Parse(
    hexString,
    NumberStyles.AllowHexSpecifier);
Maaaaa
  • 2
    @CodesInChaos: It seems to, despite the name. (Modulo the problems I mention in my answer.) In fact, the documentation states "Strings that are parsed using this style cannot be prefixed with "0x" or "&h"." So it's really, really badly named, as it *doesn't* allow a hex specifier. – Jon Skeet Apr 08 '18 at 16:14