1

I would like opinions/suggestions for improvement etc on the following function. It will be wrapped around every value being passed into a CSV file on a classic ASP page. As is below it is handling all situations we are up against, for now....

I would like to know why If Len(txt) = 0 Then fails when it is run in the page that creates the CSV file althouh runs fine when in a regular ASP page. I ma having to use If "" & txt = "" Then to make it work on both pages

function insertCSV(txt)
  'If Len(txt) = 0 Then    
  If "" & txt = "" Then
      Exit Function
  Else
    Dim tmp
    tmp = txt
    if isNumeric(tmp) then
        if left(tmp, 1) = "0" then
            tmp = """" & tmp & """"
        else
            tmp = tmp
        end if
    else
        tmp = Replace(tmp, """", """""")
        if instr(tmp, ",") or instr(tmp, """") then tmp = """" & tmp & """" 
        if instr(tmp, "®") then tmp = replace(tmp, "®", "®")
        if instr(tmp, "™") then tmp = replace(tmp, "™", "™")
        if instr(tmp, "©") then tmp = replace(tmp, "©", "©")
        if instr(tmp, vbcrlf) then tmp = replace(tmp, vbcrlf, "")
        if instr(tmp, "Â") then tmp = replace(tmp, "Â", "")
    end if
    insertCSV = tmp
  End If
end function
Mat41
  • 1,287
  • 4
  • 15
  • 29
  • How many values are you passing into the function. A large range - if so repeated calls to this function could be very expensive time wise. – brettdj Jul 10 '14 at 05:51
  • What exactly is your question here? What is the function supposed to do, and in what way would you like to improve it? – Ansgar Wiechers Jul 10 '14 at 06:48
  • @AnsgarWiechers The function is *attempting* to clean up data which is displaying incorrectly in Excel when the CSV is downloaded. There are tons of topics about correcting `UTF-8` encoding in Excel using the `BOM (Byte Order Mark)` for `UTF-8`. – user692942 Jul 10 '14 at 16:21
  • @Lankymart: Probably. I was asking the *OP* for clarification about *his* actual question, though. – Ansgar Wiechers Jul 11 '14 at 08:03
  • @AnsgarWiechers *Their* function is called `insertCSV()` what do you think it's supposed to do? – user692942 Jul 11 '14 at 08:52
  • 1
    @Lankymart Maybe insert something into a CSV. Maybe insert a CSV into something. Maybe do something entirely different. I have seen *far* too many misnamed functions to rely on that. As a matter of fact, the function doesn't really insert anything, but does a bunch of replacements. Anyway, you're free to jump to conclusions. I, for one, won't. – Ansgar Wiechers Jul 11 '14 at 09:30
  • @Lankymart Although appreciated your input surprises me. The question could not be more obvious obvious. Im sorry you are having difficulty with it, I can think of no way to make it clearer. – Mat41 Jul 11 '14 at 10:27
  • @Mat41 What are you on about?, it's Ansgar Wiechers who is asking for clarification of your question not me, I understand perfectly my response is rhetorical. Just to be clear though your question is not that clear, for one you don't explain in your question that the data in your database is already corrupted. – user692942 Jul 11 '14 at 10:30
  • My mistake appologies, was late last night when I wrote that... – Mat41 Jul 11 '14 at 21:50

2 Answers2

1

Few things:

  1. This section:

    If "" & txt = "" Then
        insertCSV = ""
    Else
    

    If you just want to return an empty string if txt is empty, you can just do this:

    If Len(txt) = 0 Then Exit Function
    
  2. You don't need to use End If for single-line If statements.

  3. This line:

    if isNumeric(tmp) AND left(tmp, 1) <> "0" then tmp = tmp end if
    

    You're assigning the value back to itself? What purpose does this serve?

  4. Don't you want to replace just the symbol © with &copy;? The way you have it written, you're replacing the entire text with &copy; (same goes for your other tests). I would think you'd want to do this instead:

    If InStr(tmp, "©") Then tmp = Replace(tmp, "©", "&copy;")
    

Try making those changes and then post an updated version of your routine and let's see how it looks.

Bond
  • 16,071
  • 6
  • 30
  • 53
  • Thank you, as you can see I have taken almost all you suggestions on. Number 4 was something I had picked up after my post yesterday. Clearly a very stupid mistake as you pointed out. The objective of number three is leading into (but not finishing) is to make sure a number with leading zeros is surrounded by double quotes and is a string but a number without leading zeros is not surrounded by quotes and is interpreted as a number – Mat41 Jul 11 '14 at 00:10
  • I have made and tested the changes to make sure a number with leading zeros is surrounded by double quotes and is a string but a number without leading zeros is not surrounded by quotes and is interpreted as a number – Mat41 Jul 11 '14 at 03:34
1

Update (based on latest comments)

If the database already has values stored like ’ then you need to consider that the web application is inserting the data using the wrong encoding and you are ending up with a mismatch in encoding in the database.


You don't need any of that the issue is Excel interprets the CSV as ASCII when you are sending it as UTF-8. There is a useful trick that fools Excel into using the correct encoding rather then assuming ASCII, I use this technique all the time and works well. It involves writing a BOM (Byte Order Mark) to the stream before displaying your data that tells Excel the data is UTF-8 not ASCII encoded.

'BOM to dup Excel into encoding the CSV correctly instead of using ASCII!!!
Call Response.BinaryWrite(ChrB(239) & ChrB(187) & ChrB(191))

This should be specified before you use Response.Write() to output your text from the insertCSV() function.


Using this kind of code

if instr(tmp, "Â") then

is just wrong when what the output (Â) is trying to do is tell you something is wrong with the encoding.


Useful Links

Community
  • 1
  • 1
user692942
  • 16,398
  • 7
  • 76
  • 175
  • Thank you for your comment. This is very interesting. I will read all about this and take in on-board! FY we not onyl have 'Â' in the DB but we also have ’, • and – - I am using a findAndReplace stored proc to deal with these. I suspect this is something to do with foreign language wrord characters.... – Mat41 Jul 11 '14 at 00:14
  • @Mat41 If you have values like that already in the database it is because the wrong encoding has been used to insert them. – user692942 Jul 11 '14 at 07:23
  • Yes there are lots of things wrong with the array of applications that touch the database. Its very complex and undocumented. Oh and I have only recently come on board. Part of my job is to stop it happening in the paces we know about, then to find the places we dont. Thank you for your detailed and useful post – Mat41 Jul 11 '14 at 10:32
  • That's fair enough, systems more often then not are undocumented and complex. But your approach to *"stop it happening"* is just compounding the problem while the heart of the problem remains. Good luck. – user692942 Jul 11 '14 at 10:34