
When I compute the MD5 hash of the same text in Python and in PowerShell, I get different results. The Python code appears to return the 'correct' value.

With single-line strings the results match: if I set xml = 'test', both give the same hash.

I suspect it has something to do with formatting or newline characters, but there may be something else wrong with my PowerShell code.

This is the PowerShell I use to compute the hash:

Function Get-StringHash([String] $String, $HashName = "MD5")
{
    # Collect each hash byte as a lowercase two-digit hex value
    $StringBuilder = New-Object System.Text.StringBuilder
    [System.Security.Cryptography.HashAlgorithm]::Create($HashName).ComputeHash([System.Text.Encoding]::UTF8.GetBytes($String)) | ForEach-Object {
        [Void]$StringBuilder.Append($_.ToString("x2"))
    }
    $StringBuilder.ToString()
}
$xml = @"
<?xml version='1.0' encoding='UTF-8' standalone='no'?>
<!DOCTYPE OPS_envelope SYSTEM 'ops.dtd'>
<OPS_envelope>
    <header>
        <version>0.9</version>
    </header>
    <body>
        <data_block>
            <dt_assoc>
                <item key="protocol">XCP</item>
                <item key="action">get</item>
                <item key="object">nameserver</item>
                <item key="domain">domainname</item>
                <item key="attributes">
                    <dt_assoc>
                        <item key="name">all</item>
                    </dt_assoc>
                </item>
            </dt_assoc>
        </data_block>
    </body>
</OPS_envelope>
"@
$key = '12345'

$obj = $xml + $key
$signature = Get-StringHash $obj "MD5"
$signature

It returns the result: 1680ea9b5d8b09ef6c9bd02641246fc4

When I use Python:

import hashlib

xml = '''
<?xml version='1.0' encoding='UTF-8' standalone='no'?>
<!DOCTYPE OPS_envelope SYSTEM 'ops.dtd'>
<OPS_envelope>
    <header>
        <version>0.9</version>
    </header>
    <body>
        <data_block>
            <dt_assoc>
                <item key="protocol">XCP</item>
                <item key="action">get</item>
                <item key="object">nameserver</item>
                <item key="domain">domainname</item>
                <item key="attributes">
                    <dt_assoc>
                        <item key="name">all</item>
                    </dt_assoc>
                </item>
            </dt_assoc>
        </data_block>
    </body>
</OPS_envelope>
'''
key = '12345'
md5_obj = hashlib.md5()
md5_obj.update((xml + key).encode('utf-8'))  # hashlib requires bytes on Python 3
signature = md5_obj.hexdigest()
print("SIGNATURE: " + signature)

It results in: d2faf89015178b2ed50ed4a90cbab9ff

LK86
  • I would also think it has something to do with the newline characters. Why don't you check? – that other guy Mar 26 '18 at 19:06
  • @thatotherguy I couldn't figure out how to check that, but any advice on how to check for such things in the future is appreciated. Thank you. – LK86 Mar 26 '18 at 19:16
  • Get a hex dump of both inputs and compare. Your PS code already has code for formatting bytes as hex, while Python code for the same is [easy to find](https://stackoverflow.com/questions/12214801/print-a-string-as-hex-bytes). – that other guy Mar 26 '18 at 19:21
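
The hex-dump check suggested in the comments can be sketched in Python as follows. This is a minimal illustration, assuming Python 3; `hex_dump` is a hypothetical helper name, not part of the original code, and the two sample strings stand in for the real inputs.

```python
def hex_dump(text, encoding="utf-8"):
    """Return the encoded bytes of `text` as space-separated two-digit hex values."""
    data = text.encode(encoding)
    return " ".join("{:02x}".format(b) for b in bytearray(data))


# Illustrative inputs: the same two lines with Unix and Windows line endings.
unix_style = "a\nb"
windows_style = "a\r\nb"

print(hex_dump(unix_style))     # 61 0a 62
print(hex_dump(windows_style))  # 61 0d 0a 62

# repr() is a quicker check when only escape sequences matter.
print(repr(windows_style))
```

Running the same dump on both real inputs and comparing the output side by side shows any extra 0d (carriage return) or 0a (line feed) bytes.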

1 Answer


The two input strings are not actually identical, for two reasons:

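1) Triple-quoted strings in Python begin immediately after the opening quotes and end immediately before the closing quotes, so the Python string above includes a leading and a trailing newline. Here-strings in PowerShell start on the line below @"/@' and end on the line above "@/'@, so those newlines are not included. Add an empty line at the top and bottom of the here-string to match: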

$xml = @'

<?xml version='1.0' encoding='UTF-8' standalone='no'?>
<!DOCTYPE OPS_envelope SYSTEM 'ops.dtd'>
<OPS_envelope>
    <header>
        <version>0.9</version>
    </header>
    <body>
        <data_block>
            <dt_assoc>
                <item key="protocol">XCP</item>
                <item key="action">get</item>
                <item key="object">nameserver</item>
                <item key="domain">domainname</item>
                <item key="attributes">
                    <dt_assoc>
                        <item key="name">all</item>
                    </dt_assoc>
                </item>
            </dt_assoc>
        </data_block>
    </body>
</OPS_envelope>

'@

2) Line breaks in PowerShell here-strings default to [Environment]::NewLine, which on Windows is \r\n, whereas Python uses \n, so make sure you normalize them before hashing:

$obj = $obj.Replace([System.Environment]::NewLine,"`n")
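
Both differences can be reproduced in Python alone. The sketch below assumes Python 3 and uses a shortened stand-in for the XML; `md5_hex` and `normalize_newlines` are hypothetical helper names introduced only for illustration.

```python
import hashlib


def md5_hex(text):
    """MD5 of the UTF-8 bytes of `text`, as a lowercase hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def normalize_newlines(text):
    """Convert Windows line endings (CRLF) to Unix line endings (LF)."""
    return text.replace("\r\n", "\n")


key = "12345"

# What the Python triple-quoted literal holds: leading and trailing newline, LF only.
python_style = "\n<a>\n</a>\n"

# What the original here-string holds on Windows: no outer newlines, CRLF inside.
herestring_style = "<a>\r\n</a>"

print(md5_hex(python_style + key) == md5_hex(herestring_style + key))  # False

# Apply both fixes: add the outer newlines and normalize the line endings.
fixed = "\n" + normalize_newlines(herestring_style) + "\n"
print(md5_hex(fixed + key) == md5_hex(python_style + key))  # True
```

Once the two strings are byte-for-byte identical, the hashes agree; the hash function itself is not the source of the mismatch.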
Mathias R. Jessen