
I want to convert text to base 4 (AGCT) by first converting it to binary (I've done this bit) and then breaking it into 2-bit pairs.

Can someone help me turn this into code using VB.NET syntax?

    if (length of binary String is an odd number)
        add a zero to the front (leftmost position) of the String.
    Create an empty String to add translated digits to.
    While the original String of binary is not empty {
        Translate the first two digits only of the binary String into a base-4 digit,
        and add this digit to the end (rightmost) index of the new String.
        After this, remove the same two digits from the binary string
        and repeat if it is not empty.
    }

In this context:

    Dim Base2Convert As String = ""
    For Each C As Char In Result.Text
        Dim s As String = System.Convert.ToString(AscW(C), 2).PadLeft(8, "0"c)
        Base2Convert &= s
    Next
    Result.Text = Base2Convert 

    Dim Base4Convert As String = ""
    For Each C As Char In Result.Text
        '//<ADD THE STATEMENT ABOVE AS CODE HERE>//
        Base4Convert &= s
    Next
    Result.Text = Base4Convert 
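
The pseudocode above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: `BinaryToBase4` and the module name are hypothetical, and the mapping of bit pairs to letters (position 0 of `digits` for `00`, position 1 for `01`, and so on) is an assumption, since the question does not say which letter stands for which pair.

```vbnet
Imports System
Imports System.Text

Module Base4Sketch
    ' Hypothetical helper: turns a string of "0"/"1" characters into base-4 digits,
    ' two bits at a time. digits(0) stands for 00, digits(1) for 01, and so on.
    Public Function BinaryToBase4(binary As String, digits As String) As String
        ' Pad on the left so the length is even.
        If binary.Length Mod 2 = 1 Then binary = "0" & binary

        Dim result As New StringBuilder()
        For i As Integer = 0 To binary.Length - 2 Step 2
            ' Read one pair of bits as a number from 0 to 3.
            Dim pairValue As Integer = Convert.ToInt32(binary.Substring(i, 2), 2)
            result.Append(digits(pairValue))
        Next
        Return result.ToString()
    End Function

    Sub Main()
        Console.WriteLine(BinaryToBase4("11110", "0123"))    ' prints 132
        Console.WriteLine(BinaryToBase4("01000001", "AGCT")) ' prints GAAG
    End Sub
End Module
```

Instead of removing two digits from the front on every pass, the sketch walks the string by index, which gives the same result without rebuilding the string each time. Because only two bits are converted at once, the length of the binary string does not matter.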
Ashtopher
  • Why the detour via base-2? Just do the conversion into base-4 directly. – Konrad Rudolph Mar 11 '14 at 23:57
  • Would that be easier? I thought it was simpler to convert it via base 2 from the advice of others. – Ashtopher Mar 12 '14 at 00:04
  • It’s literally changing the `2` in the first loop into a `4`, and maybe change the padding. You should try to *understand* the code you’re writing. – Konrad Rudolph Mar 12 '14 at 00:20
  • Sure, that's why I'm asking. To further my understanding. If I change the 2 to a 4, the program crashes. I'm not sure what to change the padding to. – Ashtopher Mar 12 '14 at 00:25
  • Ah, I had forgotten how much `Convert` sucked. Most useless class in the whole .NET framework (and that includes `ArraySegment`). My bad. – Konrad Rudolph Mar 12 '14 at 00:44

2 Answers


.NET's `Convert.ToString` does not support conversion to a non-standard base such as 4, so this will not work:

Dim base4number As String = Convert.ToString(base10number, 4)

From MSDN:

[...] base of the return value [...] must be 2, 8, 10, or 16.

But you can write your own conversion function, or adapt an existing one from the web:

Public Function IntToStringFast(value As Integer, baseChars As Char()) As String
  ' 32 characters cover any non-negative Integer, even in base 2.
  ' Negative values are not handled.
  Dim i As Integer = 32
  Dim buffer(i - 1) As Char
  Dim targetBase As Integer = baseChars.Length

  ' Fill the buffer from the right, least significant digit first.
  Do
    i -= 1
    buffer(i) = baseChars(value Mod targetBase)
    value = value \ targetBase
  Loop While value > 0

  ' Copy only the digits that were actually written.
  Dim result As Char() = New Char(32 - i - 1) {}
  Array.Copy(buffer, i, result, 0, 32 - i)

  Return New String(result)
End Function

This is based on another answer, converted from C# with Developer Fusion plus minor adjustments. Example:

Dim base2number As String = "11110" 'Decimal 30
Dim base10number As Integer = Convert.ToInt32(base2number, 2)
Dim base4number As String = IntToStringFast(base10number, "0123".ToCharArray())
Console.WriteLine(base4number) 'outputs 132

Notice that you don't need base 2 as an intermediate value there; you can convert directly from base 10. If in doubt whether it worked correctly, here is a useful resource:

Victor Zakharov
  • The name `base10number` is completely wrong (and misleading!). `"30"` is a base-10 number. `base10number` is simply the number’s value. – Konrad Rudolph Mar 12 '14 at 00:45
  • @KonradRudolph: What exactly is misleading? 11110 base 2 is decimal 30, or 30 base 10. This is what I wrote in a comment. If you hover over `base10number` in debugger, it will display `30`, which is base 10. So what's wrong there? – Victor Zakharov Mar 12 '14 at 00:50
  • You are confusing value and representation. What you see in the debugger isn't the variable's actual content, it's its *string representation* after being converted into base 10. If anything, the actual internal representation of an integer is base 2; certainly not base 10. – Konrad Rudolph Mar 12 '14 at 00:55
  • @KonradRudolph: Have you ever tried to imagine a base 10 number without its *string representation*? Then why are you putting them separately? – Victor Zakharov Mar 12 '14 at 01:14
  • Thanks Neolisk, finally got it working! However, the custom function only works for small length strings (lengths <4 only). Otherwise it crashes. Any ideas on how I can expand that length ? Sorry to bother you again. – Ashtopher Mar 12 '14 at 03:03
  • @Neolisk “base 10” **is** the representation. Nothing more. The representation, and *only* the representation, is what “base x” refers to. Integers (and most other number types) most certainly don’t use a base-10 representation inside the computer’s memory – since a computer can only store bits, the natural (and actually used) representation for storing and manipulating numbers inside memory is base 2 (more precisely, two’s complement, to map negative numbers). To state that an integer uses representation base 10 is simply wrong. – Konrad Rudolph Mar 12 '14 at 08:03
  • @user3403843: No, it works for any length. I just tried octal (base 8), as well as longer strings. The base, the input number and the output number can all be longer than 4. I believe the result is limited to 32 characters, but that's more than enough for most needs. If not, feel free to adjust the code. Please double-check your code; if it is still an issue, you can adjust your question to include any relevant information, and I'll see if I can help. – Victor Zakharov Mar 12 '14 at 14:23

Converting the number to base 2 first and then to base 4 doesn’t make a lot of sense, since converting directly to base 4 uses the same algorithm anyway. In fact, representing a number in any base requires the same general algorithm:

Public Shared Function Representation(number As Integer, digits As String) As String
    Dim result = ""
    Dim b = digits.Length

    Do
        result = digits(number Mod b) & result
        number \= b
    Loop While number > 0

    Return result
End Function

Now you can verify that Representation(i, decimalDigits) does the same as i.ToString():

Dim decimalDigits = "0123456789"
For i = 0 To 30 Step 3
    Console.WriteLine("{0}, {1}", i.ToString(), Representation(i, decimalDigits))
Next

It’s worth noting that i.ToString() converts i to decimal because that is the base we humans mostly use. But there is nothing special about decimal, and internally i is not a decimal number: its representation in computer memory is binary.
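
As a small illustration of value versus representation, the same `Integer` can be printed in several bases using only built-in calls. The module name and the value 30 are chosen purely for illustration:

```vbnet
Imports System

Module ValueVersusRepresentation
    Sub Main()
        ' One value; only the textual representation differs.
        Dim value As Integer = 30
        Console.WriteLine(value.ToString())             ' prints 30 (decimal)
        Console.WriteLine(Convert.ToString(value, 2))   ' prints 11110 (binary)
        Console.WriteLine(Convert.ToString(value, 16))  ' prints 1e (hexadecimal)
    End Sub
End Module
```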

For conversion to any other base, just pass a different set of digits to the method. In your case, that’d be "ACGT":

Console.WriteLine(Representation(i, "ACGT"))

Hexadecimal also works:

Console.WriteLine(Representation(i, "0123456789ABCDEF"))

And, just to repeat it because it’s such a nice mathematical property: so does any other base with at least two distinct digits.
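
To tie this back to the question, here is a hedged sketch that encodes text one character at a time. `TextToBase4` and the module name are hypothetical, and the sketch assumes every character code is below 256, so that four base-4 digits (the equivalent of 8 bits) are enough per character. Larger codes would produce more than four digits and break the fixed width.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module TextToBase4Sketch
    ' Same general algorithm as Representation above.
    Public Function Representation(number As Integer, digits As String) As String
        Dim result = ""
        Dim b = digits.Length
        Do
            result = digits(number Mod b) & result
            number \= b
        Loop While number > 0
        Return result
    End Function

    ' Hypothetical helper: encodes each character as four base-4 digits.
    ' Assumes character codes below 256; digits(0) is used as the padding digit.
    Public Function TextToBase4(text As String, digits As String) As String
        Dim encoded As New StringBuilder()
        For Each c As Char In text
            encoded.Append(Representation(AscW(c), digits).PadLeft(4, digits(0)))
        Next
        Return encoded.ToString()
    End Function

    Sub Main()
        Console.WriteLine(TextToBase4("A", "ACGT"))   ' prints CAAC
        Console.WriteLine(TextToBase4("Hi", "ACGT"))  ' prints CAGACGGC
    End Sub
End Module
```

Converting per character keeps every intermediate number small, so the length of the input text is not limited by the range of `Integer`. By contrast, `Convert.ToInt32(binaryString, 2)` throws an `OverflowException` once the binary string is longer than 32 bits, which may be what caused the crash on longer inputs mentioned in the comments.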

Konrad Rudolph