I am trying to port to C a Python script that uses scrypt to generate a password.

import pylibscrypt
scrypt = pylibscrypt.scrypt

password = 'test123'.encode("utf_16_be", "replace")
salt = '384eaed91035763e'.decode('hex')

key = scrypt(password, salt, 16384, 8, 1, 32)
print (key.encode('hex'))

The result should be: 9f2117416c7b2de9419a81c4625a28e40c16ac92aaa3000bb2c35844b3839eb1

In C, I use libsodium's crypto_pwhash_scryptsalsa208sha256_ll() function.

If I try to pass the password from that variable:

const uint8_t password[] = "test123"; 

I do not get the same key. If I remove `.encode("utf_16_be", "replace")` from the Python script, both programs produce the same result.

So how can I convert my C variable so that it holds the same bytes as the Python variable?

EDIT: The original software is written in Java and seems to encode strings as UTF-16 big endian. I now need to encode a C string in the same format.

01BTC10
  • Why not encode it as ASCII? Is there a chance the password will contain any non-ascii characters? – Eugene Sh. Feb 15 '17 at 15:11
  • I think the Python software must support Unicode, which is why it is not ASCII. – 01BTC10 Feb 15 '17 at 15:12
  • Since it is a password, you may restrict it to be ascii-only.. – Eugene Sh. Feb 15 '17 at 15:13
  • I am converting a Python script to C, and the problem is that if I use ASCII it doesn't work, since it gives a different key. – 01BTC10 Feb 15 '17 at 15:14
  • Well.. on Linux you can use [`iconv`](https://linux.die.net/man/3/iconv) and friends to convert between encodings. – Eugene Sh. Feb 15 '17 at 15:17
  • Thanks I am reading the docs. – 01BTC10 Feb 15 '17 at 15:23
  • Why `utf_16_be` and not UTF-8? – zaph Feb 15 '17 at 16:16
  • @zaph I have never encountered services allowing passwords outside of ASCII range... Not saying there are none though. – Eugene Sh. Feb 15 '17 at 16:16
  • Do you think the Chinese use ASCII for passwords or that they do not use passwords? ASCII died a decade or more ago. – zaph Feb 15 '17 at 16:18
  • @zaph I don't know about Chinese, but I have lived in two countries that use non-Latin-based alphabets. And yes, their websites still use ASCII passwords (if you don't like the term ASCII, they say "English letters, numbers and special characters such as @#$"), simply because you can type these things on *any* keyboard. – Eugene Sh. Feb 15 '17 at 16:20
  • NIST recommendation ([Special Publication 800-63-3: Digital Authentication Guidelines](https://pages.nist.gov/800-63-3/)): "Applications must allow all printable ASCII characters, including spaces, and **should accept all UNICODE characters**, too, including emoji!" – zaph Feb 15 '17 at 16:22
  • I understand my problem now... The original software is in Java and I think it encodes strings in UTF-16 big endian format. Now I need to figure out how to do the conversion in C so I get the same output. – 01BTC10 Feb 15 '17 at 16:39
  • If I use the UTF-16 variable then it works: const uint8_t password[] = {0x00, 0x74, 0x00, 0x65, 0x00, 0x73, 0x00, 0x74, 0x00, 0x31, 0x00, 0x32, 0x00, 0x33}; – 01BTC10 Feb 15 '17 at 16:58
  • I just posted a solution if any of you want to review it. Thanks. – 01BTC10 Feb 15 '17 at 18:13
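
The `iconv` suggestion from the comments could look roughly like the sketch below. It assumes a POSIX system whose `iconv` recognises the encoding names "UTF-8" and "UTF-16BE" (glibc does), and that the C string is UTF-8. The helper name `iconv_utf8_to_utf16be` is made up for illustration.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string to UTF-16BE
   (no byte order mark). Returns the number of bytes written to out,
   or (size_t)-1 if the conversion fails or out is too small. */
size_t iconv_utf8_to_utf16be(const char *in, unsigned char *out, size_t outcap) {
  /* iconv_open takes the target encoding first, then the source encoding. */
  iconv_t cd = iconv_open("UTF-16BE", "UTF-8");
  if (cd == (iconv_t)-1) {
    return (size_t)-1;
  }

  char *inptr = (char *)in; /* iconv wants char ** even for read-only input */
  size_t inleft = strlen(in);
  char *outptr = (char *)out;
  size_t outleft = outcap;

  size_t rc = iconv(cd, &inptr, &inleft, &outptr, &outleft);
  iconv_close(cd);

  if (rc == (size_t)-1) {
    return (size_t)-1; /* invalid input or not enough room in out */
  }
  return outcap - outleft; /* bytes actually produced */
}
```

For "test123" this should produce the same 14 bytes as the hand-written array in the comment above, which can then be passed to the scrypt function together with that length.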

1 Answer

I found a solution inspired by: Create UTF-16 string from char*

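#include <stddef.h>
#include <stdint.h>

/* Encode an ASCII byte string as UTF-16 big endian.
   inlen is the number of input bytes; output must hold 2 * inlen bytes.
   Returns the number of bytes written, or 0 if a non-ASCII byte is found. */
size_t uint8_t_UTF_16(const uint8_t *input, uint8_t *output, size_t inlen) {
  size_t outlen = 0;

  for (size_t i = 0; i < inlen; i++) {
    if (input[i] > 0x7F) {
      return 0; /* not ASCII: needs a real converter such as iconv */
    }
    output[outlen++] = 0x00;     /* high byte is always zero for ASCII */
    output[outlen++] = input[i]; /* low byte is the ASCII code */
  }

  return outlen;
}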

It works for ASCII input, but I'm not sure that this solution is safe.
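
Since the function above only covers ASCII, here is a minimal sketch of a hand-written converter that also handles multi-byte UTF-8 and surrogate pairs, with an explicit output capacity check. It assumes the C string is UTF-8; the names `decode_utf8` and `utf8_to_utf16be_manual` are hypothetical, and this has not been checked against the Java side for non-ASCII passwords.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence from s (len bytes available).
   Stores the code point in *cp and returns the bytes consumed, 0 if malformed. */
static size_t decode_utf8(const uint8_t *s, size_t len, uint32_t *cp) {
  size_t need;  /* number of continuation bytes expected */
  uint32_t min; /* smallest code point allowed for this length */

  if (len == 0) return 0;
  if (s[0] < 0x80) { *cp = s[0]; return 1; }
  else if ((s[0] & 0xE0) == 0xC0) { need = 1; *cp = s[0] & 0x1F; min = 0x80; }
  else if ((s[0] & 0xF0) == 0xE0) { need = 2; *cp = s[0] & 0x0F; min = 0x800; }
  else if ((s[0] & 0xF8) == 0xF0) { need = 3; *cp = s[0] & 0x07; min = 0x10000; }
  else return 0;

  if (len < need + 1) return 0; /* truncated sequence */
  for (size_t i = 1; i <= need; i++) {
    if ((s[i] & 0xC0) != 0x80) return 0; /* not a continuation byte */
    *cp = (*cp << 6) | (uint32_t)(s[i] & 0x3F);
  }
  /* Reject overlong forms, out-of-range values and encoded surrogates. */
  if (*cp < min || *cp > 0x10FFFF || (*cp >= 0xD800 && *cp <= 0xDFFF)) return 0;
  return need + 1;
}

/* Convert inlen bytes of UTF-8 to UTF-16BE without a byte order mark.
   Returns bytes written, or (size_t)-1 on malformed input or a full buffer. */
size_t utf8_to_utf16be_manual(const uint8_t *in, size_t inlen,
                              uint8_t *out, size_t outcap) {
  size_t i = 0, o = 0;

  while (i < inlen) {
    uint32_t cp;
    size_t used = decode_utf8(in + i, inlen - i, &cp);
    if (used == 0) return (size_t)-1;
    i += used;

    if (cp < 0x10000) { /* one 16-bit unit */
      if (outcap - o < 2) return (size_t)-1;
      out[o++] = (uint8_t)(cp >> 8);
      out[o++] = (uint8_t)(cp & 0xFF);
    } else { /* surrogate pair */
      if (outcap - o < 4) return (size_t)-1;
      cp -= 0x10000;
      uint32_t hi = 0xD800 | (cp >> 10);
      uint32_t lo = 0xDC00 | (cp & 0x3FF);
      out[o++] = (uint8_t)(hi >> 8);
      out[o++] = (uint8_t)(hi & 0xFF);
      out[o++] = (uint8_t)(lo >> 8);
      out[o++] = (uint8_t)(lo & 0xFF);
    }
  }
  return o;
}
```

Returning an error on malformed input differs from Python's "replace" error handler, which substitutes a replacement character instead; for a password, failing loudly seemed the safer choice here.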

01BTC10