
I need to generate a hash string (MD5, SHA-1, or whatever) that doesn't contain a comma or a newline, because I'm storing it alongside other data in a CSV file. After generating the MD5 hash normally, I tried replacing all occurrences of ',' and '\n' with String.replace, but somehow there are still some newline characters, or characters that behave like newlines. Is there a way to produce only letters and digits in the string when hashing?

Edit: I realised that what I'm asking for is a bad idea, because it would make 'abcd,' and ',abcd' the same, as pointed out by GhostCat. Actually, this isn't a production environment; I just need some kind of hashing that produces a consistent result for each input and that I can save in CSV files.

This is my code.

    public static String hashPassword(String password)
    {
        MessageDigest md=null;
        try {
            md=MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            e.printStackTrace();
        }
        String passwd=new String (md.digest(password.getBytes()));

        //To convert it to ',' so that it'll be removed with other ',''s
        passwd=passwd.replace('\n',',');
        passwd=passwd.replace(",","");
        return passwd;
    }
Naveen Attri

2 Answers


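Don't use new String() or getBytes() without specifying an encoding. As soon as you move to a different platform (with a different default encoding), that will blow up spectacularly. Besides, a digest is arbitrary binary data rather than encoded text, so passing it to new String() can produce control characters, including ones that act as line breaks.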

The safest way would be to convert the bytes to a hex string, although that doubles the size needed to store them (two characters per byte). For example, using UTF-8 as the encoding:

    BigInteger foo = new BigInteger(md.digest(password.getBytes("UTF-8")));
    String hex = foo.toString(16);

Note that the example code is not "production grade"; see the comments.
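
As a rough sketch of a variant that avoids the sign and length issues raised in the comments, the digest can be hex-encoded byte by byte, so an MD5 hash always comes out as 32 characters from `0-9a-f`. The class and method names (`HexDigest`, `toHex`, `md5Hex`) are made up for illustration and are not part of any library.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, for illustration only.
final class HexDigest {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Encodes every byte as exactly two lowercase hex digits,
    // so leading zero bytes are preserved and no sign is involved.
    static String toHex(byte[] bytes) {
        char[] out = new char[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            int v = bytes[i] & 0xFF;        // treat the byte as unsigned
            out[2 * i] = HEX[v >>> 4];      // high nibble
            out[2 * i + 1] = HEX[v & 0x0F]; // low nibble
        }
        return new String(out);
    }

    // Hashes the UTF-8 bytes of the input with MD5 and returns the hex form.
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return toHex(md.digest(input.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            // Fail loudly instead of continuing with a null digest.
            throw new IllegalStateException("MD5 is not available", e);
        }
    }
}
```

With this, `HexDigest.md5Hex("abc")` gives `900150983cd24fb0d6963f7d28e17f72`, which contains no commas or newlines and is safe to drop into a CSV field as-is.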

Kayaman
  • I used `passwd= HexBin.encode(md.digest(password.getBytes()));`. Though it's not exactly your solution, I got the idea from your answer, where you wrote about converting to a hex string. Thanks a lot. – Naveen Attri Nov 27 '16 at 20:55
  • I would prefer `new BigInteger(sign, magnitude)`, with a sign of 1. And in any case the returned string should have the same length, which `BigInteger.toString` doesn't guarantee. – Roland Illig Nov 27 '16 at 21:00
  • @NaveenAttri Yes, the `BigInteger` was just the first method that came to mind without using 3rd party libraries. – Kayaman Nov 27 '16 at 21:05
  • ...and even that was faulty, thanks @RolandIllig for the pointers. – Kayaman Nov 27 '16 at 21:06
  • Java 8 has a [Base64](https://docs.oracle.com/javase/8/docs/api/java/util/Base64.html) class which should be used in preference to hex encoding solutions. – President James K. Polk Nov 28 '16 at 00:34
  • @JamesKPolk True, but that will allow those commas that we were trying to get rid of. – Kayaman Nov 28 '16 at 06:54
  • There are no commas in the base64 alphabet. However, if you really only want letters and numbers then hex encoding is probably best. – President James K. Polk Nov 28 '16 at 13:09
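
To make the Base64 point from the comments concrete, here is a small sketch, assuming Java 8 or later. `Base64.getEncoder()` returns the basic encoder, whose output uses only `A-Z`, `a-z`, `0-9`, `+`, `/` and `=` padding and contains no line separators (the MIME encoder from `Base64.getMimeEncoder()` can insert them in longer output). The names `Base64Digest` and `md5Base64` are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
final class Base64Digest {
    // Hashes the UTF-8 bytes of the input with MD5 and Base64-encodes the digest.
    static String md5Base64(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            // Basic encoder: no commas and no line breaks in the output.
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }
}
```

A 16-byte MD5 digest becomes 24 characters this way instead of 32 with hex, at the cost of possibly containing `+`, `/` and `=`, so it is not strictly letters and digits.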

Sounds like the wrong approach to me.

The point is: when hashing things, every bit of input and output is important! Just removing some arbitrary output elements because they give you trouble in your file format representation ... is not the way to go.

You see, at some point you want to compare two hashes, and I am pretty sure that you want

hash1: abcdf,

and

hash2: ,abcdf

to be different. When you just drop those values you don't like ... you won't be able to do that any more!

So, when you look here, the real answer is:

Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes.

In other words: your code that reads/writes values to CSV should be able to deal with those commas and newlines. If not, your CSV parser is the problem!
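
The quoted rule can be sketched as a small helper on the writing side. This is an illustration, not a full CSV writer: `CsvField`, `escape` and `joinRow` are hypothetical names, and the doubling of embedded double quotes follows RFC 4180, which I am assuming is the document the quote comes from.

```java
// Hypothetical helper class, for illustration only.
final class CsvField {
    // Wraps the field in double quotes if it contains a comma, a double
    // quote, CR or LF; embedded double quotes are doubled.
    static String escape(String field) {
        boolean needsQuotes = field.indexOf(',') >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Joins already-unescaped field values into one CSV record (no line ending).
    static String joinRow(String... fields) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                row.append(',');
            }
            row.append(escape(fields[i]));
        }
        return row.toString();
    }
}
```

With this, `abcdf,` is written as `"abcdf,"` and `,abcdf` as `",abcdf"`, so the two values stay distinct in the file. In real code an established CSV library is probably the better choice, since the reading side has to undo the same quoting.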

GhostCat
  • Yeah, you're right, the approach I was using was wrong. I used Kayaman's solution for the hashing, because I was already taking care of not getting a comma anywhere. – Naveen Attri Nov 27 '16 at 21:01