Assume I have this method:

private static void Example(string data)
{
    Console.WriteLine("Initial : {0}", data);

    data = data.PadRight(data.Length + 1, '0');
    Console.WriteLine("Step 1 : {0}", data);

    data = data.PadRight(data.Length + 4 - data.Length % 4, '=');
    Console.WriteLine("Step 2 : {0}", data);

    byte[] byteArray = Convert.FromBase64String(data);
    string newData = Convert.ToBase64String(byteArray);
    Console.WriteLine("Step 3 : {0}", newData);
}

Given the input string "1", I expect the output to be as follows:

Initial : 1
Step 1 : 10
Step 2 : 10==
Step 3 : 10==

Instead, the output is this:

Initial : 1
Step 1 : 10
Step 2 : 10==
Step 3 : 1w==

And I have no idea why. I would expect the output to be the same as the input but it isn't. I have tried replacing

data = data.PadRight(data.Length + 1, '0'); 

with

data = data + "0";

The problem also appears with longer input strings, for example strings with a length of 5 or 9. It works fine if I add "=", but then I exceed the padding limit of Convert.FromBase64String().

So my question is really: what is going on, and how can I get my expected output? What am I doing wrong?

Edit: For those confused as to why I'm using base64, it is related to this question: "PHP decrypting data with RSA Private Key".

  • this sounds like an ENCODING issue – MethodMan Aug 21 '14 at 21:58
  • would this work for you `string newData = System.Text.Encoding.Default.GetString(byteArray);` – MethodMan Aug 21 '14 at 22:00
  • @DJKRAZE: No, absolutely not. This is nothing to do with that kind of encoding. – Jon Skeet Aug 21 '14 at 22:03
  • @DJKRAZE your suggestion resulted in `Step 3 : x`. But the thing is, if the input is 2 characters long and I add 2 padding "=" then it's fine. Also, if I add multiple characters to the string that aren't padding, then any added characters become "w" – That Homeless Guy Aug 21 '14 at 22:09
  • @ThatHomelessGuy: What are you actually trying to *achieve*? You should be trying to encode *binary* data using base64... why would you take a string and pad it like this? – Jon Skeet Aug 21 '14 at 22:11
  • I'm padding the string because `Convert.FromBase64String` only works with strings whose length is divisible by 4. – That Homeless Guy Aug 21 '14 at 22:25
  • Garbage in, garbage out. Each base64 character encodes 6 bits. You are supplying 12 bits of data with "10", that's not a multiple of 8. So it can only be decoded to a single byte, 0xd7. Which is encoded as "1w==". – Hans Passant Aug 21 '14 at 22:31

1 Answer

Basically, there's no byte array which would be encoded to 10==.

If a base64 string ends with ==, that means that the final 4 characters only represent a single byte. So only the first character and the first 2 bits of the second character are relevant. Looking at the Wikipedia table, 10 means values of:

'1' = 53  '0' = 52
110101    110100

So that's encoding a byte of 1101 0111, and then the final four bits (0100) are ignored. When you re-encode the data, it's using 0s for the final four bits instead, giving:

'1' = 53  'w' = 48
110101    110000

Fundamentally, it's not clear what you're trying to do - but if your input is part of a base64-encoded value, that's pretty odd. The code is behaving the way I'd expect it to - it's just not useful code...
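
The bit arithmetic above can be checked with a small sketch. This is only an illustration: the class `Base64BitsDemo` and its helpers `SixBits` and `Reencode` are hypothetical names, not part of the original code, and the alphabet string is the standard base64 table.

```csharp
using System;

public static class Base64BitsDemo
{
    // Standard base64 alphabet; the index of a character is its 6-bit value.
    private const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Hypothetical helper: the 6-bit value of a base64 character, in binary.
    public static string SixBits(char c)
    {
        int value = Alphabet.IndexOf(c);
        return Convert.ToString(value, 2).PadLeft(6, '0');
    }

    // Hypothetical helper: decode a base64 string and encode the bytes again.
    public static string Reencode(string base64)
    {
        byte[] bytes = Convert.FromBase64String(base64);
        return Convert.ToBase64String(bytes);
    }

    public static void Main()
    {
        byte[] decoded = Convert.FromBase64String("10==");
        Console.WriteLine(decoded.Length);             // 1
        Console.WriteLine(decoded[0].ToString("X2"));  // D7 (1101 0111)

        // '1' contributes 6 bits; only the first 2 bits of '0' are used.
        Console.WriteLine(SixBits('1') + " " + SixBits('0'));  // 110101 110100

        // Re-encoding pads the unused 4 bits with zeros, which gives 'w'.
        Console.WriteLine(Reencode("10=="));                   // 1w==
        Console.WriteLine(SixBits('1') + " " + SixBits('w'));  // 110101 110000
    }
}
```

A string that was produced by `Convert.ToBase64String` in the first place, such as "1w==", comes back unchanged from `Reencode`, because its unused trailing bits are already zero.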

Jon Skeet
  • I am using base64 as I am encrypting the bytes and then converting them to a hex string (for easy transport). Then I am sending the hex string to a PHP script on a web server, which converts the hex into a binary string and decrypts it before converting it to text again. This all works up to the point of padding non-uniform-length inputs. – That Homeless Guy Aug 21 '14 at 22:15
  • @ThatHomelessGuy: What does encryption and hex have to do with base64? If you're encrypting, the result of that encryption should be *binary* data, not text - so you can pass that binary data to `Convert.ToBase64String`. If you're expecting base64 to be encryption, it's simply not, and shouldn't be thought of as such. It's still not really clear what you're doing - you should show *all* the steps (what you've started with, where you're encrypting, every conversion basically) - and put that **in the question**. I've a sneaking suspicion you've just misunderstood the purpose of base64 though. – Jon Skeet Aug 21 '14 at 22:18
  • I am replacing the padding character with "/". I think I misunderstood base64 padding. Thanks. – That Homeless Guy Aug 21 '14 at 22:20
  • @ThatHomelessGuy: That really doesn't explain anything... Again, please give more detail in the *question*. – Jon Skeet Aug 21 '14 at 22:21
  • http://stackoverflow.com/questions/25222881/php-decrypting-data-with-rsa-private-key-solved – That Homeless Guy Aug 21 '14 at 22:22
  • @ThatHomelessGuy: Well that question assumes you're starting with a valid base64 string containing data to encrypt. Why are you accepting a base64 string rather than just a byte array? And why would you need to add padding yourself? What is the input to your `Encrypt` method? Because if it's *not* already a base64 string - if it's arbitrary text - then this is *not* the right approach. You should be using something like `Encoding.UTF8.GetBytes(data)` instead. – Jon Skeet Aug 21 '14 at 22:25
  • I'll try that solution (I was previously using `Encoding.Unicode.GetBytes(data)`), but now I need to figure out how to convert from a binary string to UTF-8 in PHP. I was pointed at base64 as it was apparently better for lossless data conversion with bytes, a problem I had encountered previously. – That Homeless Guy Aug 21 '14 at 22:31
  • @ThatHomelessGuy: You use base64 for a lossless conversion *from* arbitrary binary data *to* text data. You use something like UTF-8 for a lossless conversion *from* arbitrary text data *to* binary data. It's really important to understand all the conversions you perform on your data - if you use an inappropriate one, you're almost bound to lose information. In PHP, I suspect you just want [`utf8_decode`](http://php.net/manual/en/function.utf8-decode.php) by the way... – Jon Skeet Aug 21 '14 at 22:32
  • Thanks, I'll go rebuild my authentication around the info you have given me. I am not very experienced with PHP, and that very much shows in how I interact with it. I got lost trying to compromise with PHP. Oh, and thanks a lot. – That Homeless Guy Aug 21 '14 at 22:37
  • `Encoding.UTF8.GetBytes(data)` was the perfect solution, and `utf8_decode` on the PHP side did the job. Again, thanks. – That Homeless Guy Aug 21 '14 at 22:45
  • After seeing your answer, @Jon Skeet, this makes total sense. It was hard to determine from the OP's original question what he was truly trying to achieve. I should have asked that question first – MethodMan Aug 22 '14 at 13:52
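
The two conversions discussed in the comments above (UTF-8 for text to bytes, base64 for bytes to text) can be sketched as follows. This is a minimal illustration, not the OP's actual code: `TextToBase64Demo`, `Encode`, and `Decode` are hypothetical names, and the encryption step that would sit between the two conversions in the real workflow is left out.

```csharp
using System;
using System.Text;

public static class TextToBase64Demo
{
    // Arbitrary text -> bytes via UTF-8, then bytes -> text via base64.
    // No manual padding is needed: ToBase64String adds "=" itself.
    public static string Encode(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        return Convert.ToBase64String(bytes);
    }

    // The reverse path: base64 text -> bytes, then bytes -> original text.
    public static string Decode(string base64)
    {
        byte[] bytes = Convert.FromBase64String(base64);
        return Encoding.UTF8.GetString(bytes);
    }

    public static void Main()
    {
        string encoded = Encode("1");
        Console.WriteLine(encoded);          // MQ==
        Console.WriteLine(Decode(encoded));  // 1
    }
}
```

Because the base64 string here is always produced from real bytes, it has a valid length and canonical padding, so `Convert.FromBase64String` returns exactly the bytes that went in.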