2

Can anyone please explain with the easy coding how to reverse this algorithm so that I get back the original text string?

Public Function CreateIntChecksum(ByVal s As String) As Integer
    Dim r As Integer = 0
    For i As Integer = 0 To (s.Length() - 1)
        Dim bchar As Byte = Convert.ToByte(s(i))
        r = bchar + ((r << 5) - r)
    Next
    Return r
End Function
kennytm
  • 510,854
  • 105
  • 1,084
  • 1,005
Tush
  • 153
  • 3
  • 13
  • 5
    Two words: im-possible. You need to look up what a hash or checksum function is and it's purpose. In any case, there are more `String` values than `Integer` values so a reverse mapping is not possible. – CB Bailey Aug 22 '10 at 10:38
  • 8
    @Charles Bailey Two words? – Stephen Aug 22 '10 at 10:39
  • 1
    It was a deliberate Goldwynism: http://en.wikipedia.org/wiki/Samuel_Goldwyn – CB Bailey Aug 22 '10 at 10:46
  • 1
    The title of the following web page **is** two words. Do not settle for inferior two words. http://en.wikipedia.org/wiki/Pigeonhole_principle – Pascal Cuoq Aug 22 '10 at 11:09
  • 5
    @Tush: You already asked a *very* similar question, and you got a *very* similar answer: http://stackoverflow.com/questions/3539013/i-need-crc-reverse-code-for-my-crc64-checksum-coding-closed I'm sorry, but what you're asking for is just NOT possible. It's not because we don't *know* how to do this, or because we wouldn't *want* to tell you. It is because *there is no way to do what you're asking for*. It's like asking "I made orange juice out of an orange, now how do I reverse that?" You can't do that, and nobody else can, either. – Piskvor left the building Aug 22 '10 at 11:12
  • @Piskvor - I love that analogy. Absolutely wonderful. XD. – Stephen Aug 22 '10 at 11:26
  • It actually might be reversible if the input strings are very short or of a known set. – Eiko Aug 22 '10 at 16:27

6 Answers6

4

Although it is impossible to find the original text, it is possible find a preimage easily. The crucial point is

r = bchar + ((r << 5) - r)

which is equivalent to

r = bchar + r*31

Therefore, the hash encodes the string in base-31. To find a preimage, just rewrite the integer in base 31.

For instance, if the result is 3456, we know 3456 = 3 × 312 + 18 × 31 + 15, so one possible original text is "\x03\x12\x0f". Of course, we could rearrange the numbers to give 110 × 31 + 46 ("n.") or 109 × 31 + 77 ("mM") etc, which shows there is no unique preimage.

kennytm
  • 510,854
  • 105
  • 1,084
  • 1,005
2

how to reverse this algorithm so that I get back the original text string?

You can't, by design.
This is a (simple) checksum or hashing function.

Take a look at the size of the information going in and out: the function transforms a string s of arbitrary length into a 32-bit Integer. For each integer value there will be many input strings that will yield that result.


Edit: Apparently you want a shuffling algorithm. Maybe take a look at ROT13. Be advised this is not a (very) safe form of encryption.

H H
  • 263,252
  • 30
  • 330
  • 514
  • I also need a formula based shuffling algorithm in c#. I was searching internet when I found this code. If anyone has such formula based shuffling which has reversing function too so that even if I shuffle a single text hundreds of times, it can be reversed. – Tush Aug 22 '10 at 11:13
2

You can't. What you have here is a Checksum, which is a basic hashing function. The the whole point of a hashing function is that it is irreversible. Hashing functions map a set of inputs (usually infinite) to a smaller set of outputs - so multiple inputs can end up with the same output, and thus this makes reversing a hash impossible (assuming the hash is correctly done). This is why they are used to store passwords - there is no way to read a hash and go "Oh, that is password XYZ".

One way of trying to find out the original value is to use a Rainbow Table. This is merely a massive table of inputs and their equivalent hashed (or in this case checksummed) values. If you have the hashed value of your unknown string you can search for it in the table and get the set of possible inputs. This is not a way to reverse a hash function, which is impossible; it is merely a brute force guessing method. Note also that in this case (assuming the hashing function is not biased) there are an infinite number of strings that match each checksummed value, as a visual basic string can be of arbitrary length. This would make a rainbow table for this very impractical - you could cover the set of probably inputs for a hashing (as most normal users won't enter more than a 10 character code), but nothing stops the user using a 67 character code, or 109, or...

Read the wikipedia articles for Hash Functions, Checksums and Rainbow Tables.

Stephen
  • 6,027
  • 4
  • 37
  • 55
  • +1 Note however, that with rainbow tables, it may be possible to read a hash and say "the string XYZ will hash to this, and so will infinitely many other (but probably dissimilar) strings". RT are basically a solution by brute force. – Piskvor left the building Aug 22 '10 at 10:42
  • This is not a hashing function. I picked it up to find out if Reversing is possible. This is a bit-wise encoding function which outputs an integer through this algorithm. It might be reversible? – Tush Aug 22 '10 at 10:47
  • @Pisk: but that will only get you a colliding string, OK if you want a (equivalent) password but it is very unlikely a rainbow table will give you the original text. – H H Aug 22 '10 at 10:48
  • @Tush: You have a checksum (= simple hash) function. The answer is No. – H H Aug 22 '10 at 10:49
  • How should it be possible to encode an arbitrary length string into 32 bits? – Eiko Aug 22 '10 at 10:49
  • @Eiko if you could bye-bye rar bye-bye zip. nearly every file on the computer would be 32 bits. – Wes Aug 22 '10 at 10:54
  • @Henk Holterman: That's what I was trying to say - you'll get something which could have been the original string, but there isn't a way to find out. Sorry if it wasn't clear enough. (For many intents and porpoises, this is sufficient; plus it's more likely that the RTs will be made from the [a-zA-Z0-9] space, so the colliding string has some propability of being meaningful) – Piskvor left the building Aug 22 '10 at 10:55
  • 64 Bits? Can anyone code for me? – Tush Aug 22 '10 at 11:02
  • @Eiko: Oh, you can *encode* it all right, see the OP's algorithm. The problem lies in the decoding - you'll have many strings (countably infinite, to be precise), of various lengths and content - and *all* fo them will correctly map to the same integer. There's the problem - you can not find out which of the inputs was encoded into the output. – Piskvor left the building Aug 22 '10 at 11:02
  • @Tush - use a [Long](http://msdn.microsoft.com/en-us/library/y595sc15(v=VS.80\).aspx), which is 8 bytes (64 bits) instead of an Int, which is 32 bits? – Stephen Aug 22 '10 at 11:15
  • What is OP's algorithm, please provide sample in c# if possible. I require a reversible formula based shuffle algorithm which is able to shuffle, then reverse the text string. It would be nice of you to help me now. :-) – Tush Aug 22 '10 at 11:16
  • I am learning. Please provide a sample for 64 bits reversible algorithm in c#. – Tush Aug 22 '10 at 11:23
  • @Tush - That's both a rude question (questions of the type "Plz I can has code now?" are considered rude - we are here to **help**, not to write your code for you), and a different question from what you asked here. You should mark an answer for this question (choose the best given answer and click on the outline of the tick beside it), and then open a new question to ask about your shuffling and reversing of text strings in C#. – Stephen Aug 22 '10 at 11:24
  • @Tush: You are the OP (Original Poster) here. – H H Aug 22 '10 at 11:49
  • 1
    @Tush: And you don't seem to get the answers: 64 bits or 128 bits wouldn't solve the irreversibility problem here. – H H Aug 22 '10 at 11:50
  • 1
    @Wes & Piskvor: Maybe I should have said "encode lossless". I think it is fairly obvious that you cannot decode an arbitrary length text from a fixed-length number. ;-) – Eiko Aug 22 '10 at 16:23
  • @Eiko My comment was actually meant to support yours not contradict you. – Wes Aug 22 '10 at 17:09
  • It is all right :-) Nothing bad should matter. – Tush Aug 22 '10 at 20:03
  • Long String Text may or may not be crushed and compressed into hash kind of ascii "checksum" by using sophisticated mathematical formula/algo. To compress megabytes of asci text into a 128 or so bytes. When we are decompressing it, the last character is extracted first, then we just go on decompression using the formula and the sequential keys. This is the terra compression I was thinking about. – Tush Aug 22 '10 at 21:57
1

You simply can't reverse it. It's a checksum which is not reversible.

Darin Dimitrov
  • 1,023,142
  • 271
  • 3,287
  • 2,928
0

You can't. If nothing else, then simply because Integer contains only 32 bits of data and String can have any length.

svick
  • 236,525
  • 50
  • 385
  • 514
  • Do you agree that if I use 64 Bit Integer, will I get back any fixed length of string by reversing? – Tush Aug 22 '10 at 10:49
  • 1
    @Tush no. Unfortunatly not. What is it you want reversible encryption? – Wes Aug 22 '10 at 10:56
  • Yes. I am learning and weak in mathematics. I wish to have fixed string length = fixed reversible encryption algorithm. – Tush Aug 22 '10 at 11:22
  • 2
    @Tush: Hmm, that's quite a different question, IMO. Try asking for that in a new question - something like "I wish to have a reversible encryption algorithm for a fixed string length, which one is good for my situation" and describe your use case (for what you need the encryption, what you need to protect against, etc.). I'd say this should get you some answers. – Piskvor left the building Aug 22 '10 at 11:41
  • 1
    @Piskvor +1 @Tush consider this an upvote to asking a new question. – Wes Aug 22 '10 at 16:58
  • @Wes: The question seems to be already asked, as of 5 hours ago: http://stackoverflow.com/questions/3541378/dont-know-how-to-code-reversible-shuffle-algorhythm-by-using-key-anyone-can-hel – Piskvor left the building Aug 22 '10 at 17:52
0

First of all the code as posted will throw an exception for anything but the shortest of strings.

The following code includes the OP's original code, plus a simple checksum, and a guess at how the checksum might be used.

Private Sub Button2_Click(ByVal sender As System.Object, _
                          ByVal e As System.EventArgs) Handles Button2.Click

    Dim buffer() As Byte

    Dim tstString As String = "Calculate a checksum for a given string"

    Dim chkSumFix As Integer = CreateIntChecksumFixed(tstString) 'get a checksum

    buffer = SendPacket(tstString, chkSumFix) 'create a byte buffer to send
    tstString = decodePacket(buffer)

    'do the same using the OP's original code
    Dim chkSum As Integer = CreateIntChecksum(tstString) 'error

    buffer = SendPacket(tstString, chkSum)
End Sub

'OP
Public Function CreateIntChecksum(ByVal s As String) As Integer
    Dim r As Integer = 0
    For i As Integer = 0 To (s.Length() - 1)
        Dim bchar As Byte = Convert.ToByte(s(i))
        r = bchar + ((r << 5) - r)
    Next
    Return r
End Function

'a very simple checksum
Public Function CreateIntChecksumFixed(ByVal s As String) As Integer
    Dim r As Integer = 0
    For i As Integer = 0 To (s.Length() - 1)
        Dim bchar As Byte = Convert.ToByte(s(i))
        r = (r And &HFFFF) + bchar
    Next
    Return r
End Function

Private Function SendPacket(ByVal aString As String, _
                            ByVal aChecksum As Integer) As Byte()
    'construct a packet to be sent
    'Packet format
    'returns a byte buffer
    'byte(0 -3) = length of original string - use BitConverter.ToInt32 to extract
    'byte(4-n) = original string.  n = (4 + string length) - 1
    'byte(n + 1, n + 2, n + 3, n + 4) = checksum

    Dim length As Integer
    Dim retV As New List(Of Byte)
    retV.AddRange(System.Text.Encoding.ASCII.GetBytes(aString)) 'add string to packet
    length = retV.Count 'get length - use this length in case a different encoding is used
    retV.AddRange(BitConverter.GetBytes(aChecksum)) 'add checksum
    retV.InsertRange(0, BitConverter.GetBytes(length)) 'insert length at start of packet

    Return retV.ToArray
End Function

Private Function decodePacket(ByVal buffer As Byte()) As String
    Dim sLen As Integer = BitConverter.ToInt32(buffer, 0)
    If sLen + 8 <> buffer.Length Then Throw New ArgumentOutOfRangeException

    Dim s As String = System.Text.Encoding.ASCII.GetChars(buffer, 4, sLen)

    Dim chksum As Integer = CreateIntChecksumFixed(s)

    Dim embeddedChecksum As Integer = BitConverter.ToInt32(buffer, sLen + 4)

    If chksum <> embeddedChecksum Then Throw New ArgumentException("Invalid checksum")
    Return s
End Function
dbasnett
  • 11,334
  • 2
  • 25
  • 33
  • And why exactly does a simple statement of fact deserve a down vote? – dbasnett Aug 22 '10 at 14:14
  • This leads to a new thought.. If the string is sufficient short enough, it actually *might* be reversible. :-) Not sure why it should throw an exception - do overflows generate exceptions in vb.net? – Eiko Aug 22 '10 at 16:28
  • As has been pointed out, the code appears to be used to generate a checksum (name of the function, code). I will edit my post with the code and intended purpose(guess). – dbasnett Aug 22 '10 at 17:45
  • There are bugs in the above coding. Multiple Invalid characters are added in the left and right corners of the reversed string thereby making the reversed string corrupted. The string gets reversed though. Please fix the bugs so that the code becomes useful. – Tush Aug 22 '10 at 20:00
  • I guess 64 bit UInt64 integer can hold 8 bytes of integer. So if you can provide the coded function for 64 Bit and not integer. – Tush Aug 22 '10 at 20:05
  • @tush - I am lost? The original string (assumed to be ascii) is not touched. – dbasnett Aug 22 '10 at 20:48
  • Added the code in place to decode the packet and check the checksum. – dbasnett Aug 22 '10 at 21:02
  • Thanks, now the code is working without corruption of output string upon reversing. I must ask if it is possible to encode the input ascii string into only one 64 bit integer, by using a formula/algo and a key/password? That I was asking for in here. Because if that is possible, string would just be crushed into the 64 bit integer. Thanks. – Tush Aug 22 '10 at 21:50