2

We have data in which keys are strings that can contain quotes. The problem is that Visual Basic doesn't respect the difference between normal double quotes and slanted quotes. E.g. the statement:

MessageBox.Show("""1""" = "””1””") 

displays True. Note that we are generating extremely large script files from a Python program and running them in a VB scripting environment. We are NOT reading data from files. How can I get VB to respect the difference between the two types of quotes?

Blackwood
  • VB.NET IsNot VBScript - as the text on the tags tells you, they are very different. So start by figuring out what you are really using – Ňɏssa Pøngjǣrdenlarp Mar 27 '18 at 21:13
  • I've removed the VBScript tag. – Zen Skunkworx Mar 27 '18 at 21:14
  • he is using vb.net as there is no MessageBox.Show in vbs – Sorceri Mar 27 '18 at 21:14
  • ` first_key = "new activity ""1"":activity" second_key = "new activity ””1””:activity" if first_key = second_key Then 'Do Something End If` – Zen Skunkworx Mar 27 '18 at 21:21
  • Which of these? `"ʺ"c = ChrW(&H2BA) ' Modifier Letter Double Prime`, `"ˮ"c = ChrW(&H2EE) ' Modifier Letter Double Apostrophe` – djv Mar 27 '18 at 21:24
  • Thanks @djv but how do I use that in the example: MessageBox.Show("""1""" = "””1””") ? – Zen Skunkworx Mar 27 '18 at 21:26
  • I'm asking which of those characters you have. – djv Mar 27 '18 at 21:27
  • We need to support ", ” and “ – Zen Skunkworx Mar 27 '18 at 21:28
  • I see now it's neither. You have `ChrW(&H201D)` – djv Mar 27 '18 at 21:28
  • What do you want to do with them? What do you mean support? To interpret as VB.NET double quotes? – djv Mar 27 '18 at 21:29
  • We need to be able to know that """1""" is not equivalent to "””1””" – Zen Skunkworx Mar 27 '18 at 21:30
  • Related: https://stackoverflow.com/questions/4665510/using-left-double-quotation-marks-in-strings-in-vb – Caramiriel Mar 27 '18 at 21:32
  • `ChrW(&H201D)` is treated as a double quote. Then if you have two of them in a string, they are escaped as one. It is equivalent to `ChrW(&H22)` the real double quote. But if you have access to the "script", you could replace all occurrences of `ChrW(&H201D)` in the code with `ChrW(&H2BA)` before running the script. https://www.compart.com/en/unicode/U+02BA. That character is not interpreted as a double quote by vb.net. – djv Mar 27 '18 at 21:39
  • The Replace function doesn't respect the difference between " and ”. Neither does the InStr function. – Zen Skunkworx Mar 27 '18 at 21:43
  • Not in vb.net, I mean in Python or whatever you create the script with – djv Mar 27 '18 at 21:43
  • Interestingly, VS 2008 just replaces `"””1””"` with `"""1"""` when you type or paste it. VS 2017 keeps `"””1””"` in the source editor, but treats it like `"""1"""` and displays `"""1"""` in the debugger. – GSerg Mar 27 '18 at 21:55
  • The Option Compare statement lets you change this, but that is a pretty big hammer. Consider the String.Compare() overload that lets you pass the StringComparison that you want. – Hans Passant Mar 27 '18 at 21:58
  • Option Compare doesn't seem to have any effect on the equivalence of the 2 strings. – Zen Skunkworx Mar 27 '18 at 21:59
  • Are you using Python to generate VB.NET programs? (You mention that you are not reading data from files, so I take it that you know that if you did then the strings would be seen as different.) – Andrew Morton Mar 27 '18 at 21:59
  • @HansPassant String.Compare Ordinal sees them as being equal when they are literals in the code - it is not even possible to use `"”"` as a literal string. – Andrew Morton Mar 27 '18 at 22:01
  • Yeah, String.Compare doesn't respect the difference. – Zen Skunkworx Mar 27 '18 at 22:03
  • Okay, it is built into the language parser then. Surely an Office/VBA inspired feature :) Use ChrW() where necessary to avoid it being helpful. – Hans Passant Mar 27 '18 at 22:07
  • Can you write that part of the program in C#? – Andrew Morton Mar 28 '18 at 10:02
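
The substitution suggested in the comments would have to run on the Python side, before a key is written into the generated script. Here is a minimal sketch of that idea. The names `SUBSTITUTES` and `neutralize_curly_quotes` are hypothetical. U+02BA for the right quote comes from the comment above, while using U+02EE for the left quote is an assumption of this sketch.

```python
# Map the curly quotes that VB.NET folds into '"' onto look-alike characters
# that, per the comment above, the VB.NET parser does not treat as quotes.
# U+201D -> U+02BA follows the comment; U+201C -> U+02EE is an assumption.
SUBSTITUTES = str.maketrans({"\u201c": "\u02ee", "\u201d": "\u02ba"})


def neutralize_curly_quotes(key: str) -> str:
    """Return `key` with curly double quotes swapped for the substitutes."""
    return key.translate(SUBSTITUTES)


ascii_key = 'new activity "1":activity'
curly_key = "new activity \u201d1\u201d:activity"

# ASCII quotes are left alone; curly quotes are replaced, so the two keys
# stay distinct once they are embedded in VB string literals.
print(neutralize_curly_quotes(ascii_key) == ascii_key)
print(neutralize_curly_quotes(curly_key) != neutralize_curly_quotes(ascii_key))
```

The trade-off is that the keys seen by the VB code contain the substitute characters rather than the original curly quotes, so anything that displays or exports them would need to map them back.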

2 Answers

6

You are fighting the VB.NET language specification, which treats the three different double quote characters as the same character when they are used in code statements.

From String Literals:

A string literal is a sequence of zero or more Unicode characters beginning and ending with an ASCII double-quote character, a Unicode left double-quote character, or a Unicode right double-quote character. Within a string, a sequence of two double-quote characters is an escape sequence representing a double quote in the string.

StringLiteral
    : DoubleQuoteCharacter StringCharacter* DoubleQuoteCharacter
    ;

DoubleQuoteCharacter
    : '"'
    | '<unicode left double-quote 0x201c>'
    | '<unicode right double-quote 0x201D>'
    ;

StringCharacter
    : '<Any character except DoubleQuoteCharacter>'
    | DoubleQuoteCharacter DoubleQuoteCharacter
    ;
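
To make the grammar concrete, here is a small Python sketch (Python being the language the scripts are generated from) that decodes a literal the way the quoted rules describe. The function name `decode_vb_string_literal` is hypothetical. It assumes an escaped pair always decodes to the ASCII quote, which matches the compiler output shown further down in this answer.

```python
# The three characters the grammar accepts as DoubleQuoteCharacter.
DOUBLE_QUOTE_CHARS = {'"', "\u201c", "\u201d"}


def decode_vb_string_literal(literal: str) -> str:
    """Return the string value that a VB.NET string literal denotes."""
    if (len(literal) < 2
            or literal[0] not in DOUBLE_QUOTE_CHARS
            or literal[-1] not in DOUBLE_QUOTE_CHARS):
        raise ValueError("not a string literal")
    body = literal[1:-1]
    out = []
    i = 0
    while i < len(body):
        if body[i] in DOUBLE_QUOTE_CHARS:
            # Inside the literal, a quote character must be followed by another.
            if i + 1 >= len(body) or body[i + 1] not in DOUBLE_QUOTE_CHARS:
                raise ValueError("unescaped quote inside literal")
            out.append('"')  # the pair collapses to one ASCII quote
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


# Both literals from the question decode to the same three characters.
print(decode_vb_string_literal('"""1"""'))
print(decode_vb_string_literal('"\u201d\u201d1\u201d\u201d"'))
```

Under these rules the kind of quote typed in the source is lost before any comparison runs, which is why neither `=` nor `String.Compare` can tell the two literals apart.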

In the above quoted specification, the usage of "ASCII double-quote character" means the inch character or ChrW(34).

Prior to VS2015, you could not even paste """1""" = "””1””" into the code editor without it being automatically converted to """1""" = """1""".

If you need to construct code statements that include the Unicode double quotes, they will need to be constructed using their respective character representations.

Const ucDoubleLeftQuote As Char = ChrW(&H201C) ' "“"c
Const ucDoubleRightQuote As Char = ChrW(&H201D) ' "”"c
Const asciiDoubleQuote As Char = ChrW(34) ' """"c

Dim asciiQuoted As String = """1"""
Dim asciiQuotedv2 As String = asciiDoubleQuote & "1" & asciiDoubleQuote

Dim unicodeQuoted As String = ucDoubleLeftQuote & "1" & ucDoubleLeftQuote

MessageBox.Show((asciiQuoted = asciiQuotedv2).ToString()) ' yields true
MessageBox.Show((asciiQuoted = unicodeQuoted).ToString()) ' yields false
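
Since the scripts are produced by a Python program, the same construction can be emitted by the generator. Below is a minimal sketch, assuming the generator builds each VB string expression from a Python `str`. The helper name `vb_string_expr` is hypothetical, and characters that need other special handling in VB source (line breaks, for instance) are not covered.

```python
CURLY_QUOTES = ("\u201c", "\u201d")


def vb_string_expr(value: str) -> str:
    """Build a VB.NET expression that evaluates to `value`."""
    parts = []  # literals and ChrW() calls, joined with " & " at the end
    run = []    # characters collected for the current plain literal

    def flush():
        if run:
            parts.append('"' + "".join(run) + '"')
            run.clear()

    for ch in value:
        if ch in CURLY_QUOTES:
            # Curly quotes must not appear in a literal; emit ChrW() instead.
            flush()
            parts.append("ChrW(&H{:04X})".format(ord(ch)))
        elif ch == '"':
            run.append('""')  # usual VB escape for an ASCII quote
        else:
            run.append(ch)
    flush()
    return " & ".join(parts) if parts else '""'


print(vb_string_expr('"1"'))            # """1"""
print(vb_string_expr("\u201d1\u201d"))  # ChrW(&H201D) & "1" & ChrW(&H201D)
```

A value that consists of a single curly quote comes out as a bare `ChrW(...)` call, which is a Char rather than a String in VB; a generator that needs a String everywhere may want to wrap such results.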

Edit: To demonstrate the substitution of the ASCII double quote for any Unicode Double quotes in string literals by the VB compiler, please consider the following code.

Module Module1
    Sub Main()
        T1("““ 1 ””") ' unicode quotation marks left and right
        T2(""" 1 """) ' ascii quotation mark
    End Sub
    Sub T1(s As String) ' dummy method to highlight unicode quotation mark
    End Sub
    Sub T2(s As String) ' dummy method to highlight ascii quotation mark
    End Sub
End Module

This code will yield the following IL when viewed in ILDASM.

.method public static void  Main() cil managed
{
  .entrypoint
  .custom instance void [mscorlib]System.STAThreadAttribute::.ctor() = ( 01 00 00 00 ) 
  // Code size       24 (0x18)
  .maxstack  8
  IL_0000:  nop
  IL_0001:  ldstr      "\" 1 \""
  IL_0006:  call       void ConsoleApp1.Module1::T1(string)
  IL_000b:  nop
  IL_000c:  ldstr      "\" 1 \""
  IL_0011:  call       void ConsoleApp1.Module1::T2(string)
  IL_0016:  nop
  IL_0017:  ret
} // end of method Module1::Main

IL_0001: ldstr "\" 1 \"" corresponds to the loading of the string for the call statement: T1("““ 1 ””").

As you can see, this is identical to IL_000c: ldstr "\" 1 \"", which corresponds to the loading of the string for the call statement: T2(""" 1 """).

TnTinMn
  • A bit more "inline" (without string concatenation but with more overhead in the generated code): `$"{ldquo}1{ldquo}"` (borrowing the name for the Const from HTML) or fully inline: `$"{ChrW(&H201C)}1{ChrW(&H201C)}"` – Tom Blodget Mar 27 '18 at 23:46
0

Since a String is essentially a Char array, you can compare if the two arrays are the same. Take a look at this example:

Option Strict On

Imports System
Imports System.Linq
Public Module Module1
    Public Sub Main()
        Dim regular_quotes As String = """1"""
        Dim slanted_quotes As String = "””1””"

        'Using = operator
        Console.WriteLine(regular_quotes = slanted_quotes) 'True

        'Using equals method
        Console.WriteLine(regular_quotes.Equals(slanted_quotes)) 'True

        'Using a character-by-character comparison
        Console.WriteLine(CompareCharArray(regular_quotes, slanted_quotes)) 'True: the compiler has already normalized both literals
    End Sub

    Private Function CompareCharArray(ByVal value1 As String, ByVal value2 As String) As Boolean
        'Return a False value if the Lengths don't match
        If value1.Length <> value2.Length Then
            Return False
        End If

        'Return a False value if the Char at the current index doesn't match
        For index As Integer = 0 To value1.Length - 1
            If Not value1(index).Equals(value2(index)) Then
                Return False
            End If
        Next

        'Return a True value if everything is squared up
        Return True
    End Function
End Module

Fiddle: Live Demo

Update - Apparently, when the Visual Basic .NET compiler processes a string literal, the curly quotes (Unicode U+201C and U+201D) are replaced with normal quotes (ASCII code 34), so even iterating through the two strings and comparing the Char at each index doesn't help. Bruh, I don't know what else to do.

David
  • I don't think `CompareCharArray` is correct here, though, since it doesn't actually compare the contents. – Sören Kuklau Mar 27 '18 at 21:48
  • Yeah, MessageBox.Show(CompareCharArray("new activity ""1"":activity", "new activity ""1"":activity")) returns False but the strings are identical. – Zen Skunkworx Mar 27 '18 at 21:50
  • @SörenKuklau - Thank you for catching that. I've updated the code to simply iterate through each Char manually and compare the values. – David Mar 27 '18 at 21:54
  • MessageBox.Show(CompareCharArray("new activity ""1"":activity", "new activity ""1"":activity")) still returns False – Zen Skunkworx Mar 27 '18 at 21:55
  • In your Function in the If don't you mean If Not ? – Mary Mar 28 '18 at 07:23