I have a string representing a number in hex and I want to convert it to base-64. How do I do this, and how does the suggested approach work? I need to understand how it works.

The first thing I thought of (I'm a noob) was implementing a simple algorithm that proceeds as one would with pen and paper, though I imagine Python has this sort of thing built in.

I could implement this by searching the internet, but I need to understand how it works.

Some sample questions to give you an idea of the explanations I need:

  • If your answer requires strings of the type b'...', could you please explain what they are and why b64encode() seems to accept only objects of this type as arguments?
  • Why does int() only work up to base 36, how can I handle conversion between different bases tidily in general, and could you explain further if your solution involves this function?

So if anyone could give me some pointers here, I'd appreciate it. I can't extract much from the documentation, as it seems to assume this kind of knowledge already.

Thanks.

AER
scarlett
  • As a quick guess, the base-36 limit may just be down to the encoding: 10 numerals plus 26 letters (with `A` = 10) are all the digit symbols available. – AER Oct 05 '16 at 22:43
  • Are there any other functions such as int() that I can use to convert between bases >36 as succinctly? – scarlett Oct 05 '16 at 22:46
  • With regards to your question about strings of the form `b"hi scarlett"` see this question: http://stackoverflow.com/questions/6269765/what-does-the-b-character-do-in-front-of-a-string-literal – juanpa.arrivillaga Oct 05 '16 at 22:47
  • You're asking many different questions in one Stack Overflow question. Please keep it limited to one question per question. – Colonel Thirty Two Oct 05 '16 at 22:48
  • "What's with the `b'...'`?" Those are [`bytes`](https://docs.python.org/3/whatsnew/2.6.html?#pep-3112-byte-literals) – OneCricketeer Oct 05 '16 at 22:49
  • @Colonel kind of. I wasn't really looking for answers to those many questions in particular; I mentioned them as examples of the type of question I had, and I was hoping someone could point me to somewhere I could learn about those. Anyway, I'm sorry, I wasn't really sure how to phrase it. – scarlett Oct 05 '16 at 22:54

2 Answers

Here is some code that walks you through the process of converting a hex string to a Base64-encoded string.

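import base64

# Parse the hex string into an ordinary Python integer.
x = int('0xABCDEF01', base=16)
print("x  : ", x)
# Get the integer's raw bytes: 4 bytes, most significant byte first.
b = x.to_bytes(length=4, byteorder='big')
print("b  : ", b)
# Base64-encode the bytes; the result is ASCII text held in a bytes object.
e = base64.b64encode(b)
print("e  : ", e)
# Reverse the process: Base64 -> bytes -> hex string -> integer.
b2 = base64.b64decode(e)
print("b2 : ", int(b2.hex(), base=16))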

Output:

x  :  2882400001
b  :  b'\xab\xcd\xef\x01'
e  :  b'q83vAQ=='
b2 :  2882400001

Some (lengthy) explanations: we start with a hex number in a string, nothing special. int takes it along with the base and turns it into a regular integer x. To Python, x is a bunch of bits representing a number, which is printed in base 10 most of the time. We can ask for its byte representation using x.to_bytes. The result is a sequence of bytes that prints as b'...'. Note that printing shows each byte as its ASCII character when it has a printable one, and as an escape such as \xab otherwise. We then feed the bytes to b64encode, which is usually used to process files (hence the bytes-object requirement), and it returns an ASCII string as a bytes object. From that result, the reverse process is similar: Base64 ASCII string -> raw bytes -> hex -> int.
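To make the "how does it work" part concrete, here is a minimal sketch of what Base64 encoding does to the bytes, written by hand and checked against the standard library. The names `manual_b64encode` and `B64_ALPHABET` are made up for this illustration, and the bit-string approach is chosen for readability, not speed. The idea: write the bytes as one long run of bits, cut it into 6-bit groups, and use each group as an index into a 64-character alphabet.

```python
import base64

# The standard Base64 alphabet: indices 0..63.
B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def manual_b64encode(data: bytes) -> str:
    """Illustrative Base64 encoder (hypothetical helper, not a library API)."""
    # Write every byte as 8 bits and join them into one bit string.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Pad with zero bits on the right so the length is a multiple of 6.
    bits += "0" * (-len(bits) % 6)
    # Each 6-bit group selects one character of the alphabet.
    chars = [B64_ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # '=' pads the output to a multiple of 4 characters.
    return "".join(chars) + "=" * (-len(chars) % 4)

raw = bytes.fromhex("ABCDEF01")  # hex string -> raw bytes, any even length
print(manual_b64encode(raw))
print(base64.b64encode(raw).decode("ascii"))
```

Both lines should print the same text. Using bytes.fromhex instead of int(...).to_bytes(length=4, ...) also avoids hard-coding the length, so the same code handles hex strings of any even length.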

jadsq
  • `b64encode` returns an ASCII-string? So, for instance the first `q` in `e` is a byte holding the value 113? For the computer, that string is a different number than the one it happens to represent for us (a number in base-64)? – scarlett Oct 05 '16 at 23:36
  • Base64's original purpose is to take raw bytes of data (that may not be printable as ASCII) and turn them into an ASCII sequence that can be copy/pasted into a text message, to give an example. So **the real/expected output of Base64 is the sequence of characters you can read**. But as you have noted, for the computer (and for Python) those characters are just a sequence of bytes; to make Python understand that they are indeed a string of ASCII characters you would have to call `e.decode("ascii")`. – jadsq Oct 05 '16 at 23:54
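A small sketch of the bytes-versus-text distinction discussed in these comments, reusing the value `e` from the answer above (the variable name `text` is just for illustration):

```python
import base64

e = base64.b64encode(bytes.fromhex("ABCDEF01"))

print(type(e))   # a bytes object, not a str
print(e[0])      # indexing bytes gives an int: 113, the ASCII code of "q"

# Decoding tells Python to treat those bytes as ASCII text.
text = e.decode("ascii")
print(type(text), text)

# Encoding the text back to ASCII gives the original bytes again.
print(text.encode("ascii") == e)
```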

Assume an input of 'aaccffdde5e5ff':

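import base64
import binascii

input_str = 'aaccffdde5e5ff'
# Turn each pair of hex digits into one raw byte.
dehexed_str = binascii.unhexlify(input_str)
# Base64-encode the raw bytes; the result is itself a bytes object.
base64_str = base64.b64encode(dehexed_str)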

b'...' is just a bytestring. You can encode a normal Unicode string to bytes with:

as_bytes = u'hello world'.encode('utf-8')

To handle arbitrary base conversion to base 10, see this tutorial:

http://mathbits.com/MathBits/CompSci/Introduction/tobase10.htm

Here is a function that converts a string written in any alphabet to a base-10 integer:

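def int10(s, alphabet):
    # The number of symbols in the alphabet is the base.
    base = len(alphabet)
    # Walk the digits from the rightmost (least significant) one,
    # weighting the digit at position i by base**i.
    return sum(alphabet.index(c) * base**i for i, c in enumerate(s[::-1]))

hex_alphabet = "0123456789abcdef"
print(int10('f3', hex_alphabet))  # 15*16 + 3 = 243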
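For the other direction (the pen-and-paper method from the question), here is a rough sketch of repeated division: divide by the base, keep the remainder as the next digit, and repeat. The names `to_base`, `from_base`, and `B64_DIGITS` are hypothetical helpers for this example. int() stops at base 36 because it only knows 10 numerals plus 26 letters as digit symbols; with an explicit alphabet, any base works.

```python
import string

def to_base(n, alphabet):
    """Write the non-negative integer n using the digit symbols in alphabet."""
    if n < 0:
        raise ValueError("n must be non-negative")
    base = len(alphabet)
    if n == 0:
        return alphabet[0]
    digits = []
    while n > 0:
        # The remainder is the next digit, least significant first.
        n, remainder = divmod(n, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))

def from_base(s, alphabet):
    """Inverse of to_base: read s as a number written in the given alphabet."""
    base = len(alphabet)
    value = 0
    for c in s:
        value = value * base + alphabet.index(c)
    return value

# Assumption: reuse the Base64 character set as the 64 digit symbols.
B64_DIGITS = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

n = int("f3", 16)                # 243
print(to_base(n, B64_DIGITS))    # 243 = 3*64 + 51, so the digits are "D" and "z"
print(from_base("Dz", B64_DIGITS))
```

Note that writing a number in base 64 this way is not the same thing as Base64 encoding: b64encode groups the bits of the raw bytes from the left and adds padding, so the two generally produce different strings for the same input.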
Joran Beasley