4

I am writing some html code to an SQL databse using RMySQL (but I guess my problem is rather a general R question than really related to SQL or RMySQL). So I am trying something like this:

con <- RMySQL(...) # some connection    
html <- "<div style='text-align: center; font-family: Arial;'><span style='font-size: 14pt;'>Some text without any tricky symbols.</span></div>"    
    query <- c('INSERT INTO table (htmlfield) VALUES (\"', html, '"') 
    dbSendQuery(con,paste(query, collapse = ""))

Trouble is, R's paste will replace the double quotes in single quotes (i.e. '"') to the escaped sequence \", i.e.:

> paste(query, collapse = "")
[1] "INSERT INTO table (htmlfield) VALUES (\"<div style='text-align: center; font-family: Arial;'><span style='font-size: 14pt;'>Some text without any tricky symbols.</span></div>\""

If I change the single quotes in the vector query to double quotes, and the single quotes in html to doubles, the problem is then on the side of the character string html, since then the double quotes in html get replaced by the escaped sequence.

What's the easiest way to handle a substitution of the escaped characters?

I tried gsub('\\\"','"',html) which did not work as intended and the solutions suggested in the post Ignore escape characters (backslashes) in R strings but I could not make it work.

Thanks for your attention, Philipp

Community
  • 1
  • 1
Philipp
  • 479
  • 5
  • 10
  • I suspect you're confused between the string and it's representation. Start [here](http://stackoverflow.com/questions/4379681/can-r-paste-output) for enlightenment. – hadley Jan 15 '11 at 00:49
  • 3
    Then study [this](http://upload.wikimedia.org/wikipedia/en/b/b9/MagrittePipe.jpg) – hadley Jan 15 '11 at 00:50
  • `paste` does not replace " with \", it's the `print` method which inserts them. You example should work. What is the error you get? – VitoshKa Jan 15 '11 at 11:09
  • 1
    @VitoshKa There is a missing trailing `)` in the `query` string, as I mention in my answer, which, I venture, is why it is not working. There are several things going on here I suspect, a typo leading to a SQL syntax error, and then @Philipp noticing the escaped `"` in the *printed* representation and putting two and two together to come up with five. – Gavin Simpson Jan 15 '11 at 12:02

3 Answers3

5

I see two problems with what you included in your Question. The first looks like a typo. After:

html <- "<div style='text-align: center; font-family: Arial;'><span style='font-size: 14pt;'>Some text without any tricky symbols.</span></div>"   

You have:

query <- c('INSERT INTO table (htmlfield) VALUES (\"', html, '"')
                                                  ^^^^^^^^^^^^^^^

Notice you escape one string but not the other. You don't need to escape them, but it doesn't matter if you do. You also meant '")' for the last string which is, I suspect, the real source of the error you are getting. paste rather than c is more useful here. If I combine these, we get:

query <- paste('INSERT INTO table (htmlfield) VALUES ("', html, '")', sep = "")

that we can use directly:

dbSendQuery(con, query)

The second problem, and one that many people make, is to confuse the printed representation of an object with the object itself. If we print query, we see this:

> query
[1] "INSERT INTO table (htmlfield) VALUES (\"<div style='text-align: center; font-family: Arial;'><span style='font-size: 14pt;'>Some text without any tricky symbols.</span></div>\")"

The printed representation of the string is always enclosed in "" double quotes, and as such the internal " need to be escaped. What you want to look at is the actual string. We can do that with cat or writeLines - I prefer the latter as it adds the "\n" to the end of the string automagically:

> writeLines(query)
INSERT INTO table (htmlfield) VALUES ("<div style='text-align: center; font-family: Arial;'><span style='font-size: 14pt;'>Some text without any tricky symbols.</span></div>")

Notice how the " are now not escaped. That is the SQL that would be executed by the database server. If that is valid SQL for your DB then it will work.

Gavin Simpson
  • 170,508
  • 25
  • 396
  • 453
4

You forgot to escape the backslashes themselves in the strings (I think, it would be the case with most programming languages, not sure if the same goes for R).

gsub("\\\"", "\"", html)
orlp
  • 112,504
  • 36
  • 218
  • 315
  • I think your idea makes perfect sense, R does not... Maybe it has something to do with the way R prints the string. – Philipp Jan 15 '11 at 00:32
  • What do you see if you print this string? `"Replacing \\\" with \"!"` – orlp Jan 15 '11 at 00:33
  • I think this way will not work, as `print` will always escape `"`, and `sub` also behaves something like that (or at least I have never succeeded this way). `cat` and `write` would work thought, but that would be only useful, if you wanted to write the concatenated string to a file. – daroczig Jan 15 '11 at 00:55
  • WTF is this R programming language? Different escape sequences for different contexts? I doubt it. – orlp Jan 15 '11 at 00:57
  • 2
    @nightcracker: `cat` and `write` do nothing with strings only show them, while `print`, `gsub` and others converts and format the input, that is why the quotes are escaped as I think, and are not in the functions cited first. – daroczig Jan 15 '11 at 01:04
  • No, you don't understand what escape sequences are. They are for the compiler/interpreter to understand where a string ends and begins. `""""`, is that a string containing two quotes or two empty strings? That's what the escape sequences are for: `"\"\""`. (Plus other difficult characters like newlines) – orlp Jan 15 '11 at 01:07
  • @nightcracker: as I wrote `cat` and `write` does not escape strings, as these functions are not intended to, while `print` and others do, because they do format the input. See `?cat`: "‘cat’ performs much less conversion than ‘print’." E.g.: `cat` will not be able to deal with situations you described above. – daroczig Jan 15 '11 at 01:20
  • 1
    @daroczig you are confusing the **printed** representation of an object with an object. Consider `print("\\\\\n")` vs. `cat("\\\\\n")`. In both I *have* to escape the backslashes, but `print` shows me the internal representation and `cat` the actual string I wanted. Also, `print` doesn't do anything like you describe, it just returns the object (in this case), it is `cat` et al that are converting from the R representation *to* the required string. – Gavin Simpson Jan 15 '11 at 10:24
  • @Gavin Simpson: I already tried to explain it to him, but didn't succeed :) – orlp Jan 15 '11 at 10:40
  • @Gavin, `print` does not show the internal representation. `nchar("\"") is 1. `print` inserts the escape which is not there. At the same time `paste` does not insert "\" either and thus original example `dbSendQuery(con,paste(query, collapse = "")) ` should work as desired. – VitoshKa Jan 15 '11 at 11:07
  • @Gavin Simpson and @nightcracker: thank you both, I am getting what you meant. What I was up to (but not specifying nicely as you did), that `cat` and `write` do not put quotes around strings while showing data like `print`, and also the escape chars behaves in another way (`writeLines("\\")` vs. `print("\\")`). That is why I thought that the difference in getting back/showing strings between the two functions could be interesting here, as in this question pasting a string was in central. Thanks for clarification, I had not in mind the inner/outer (printed) representation discrepancy. – daroczig Jan 15 '11 at 11:34
  • @Gavin Simpson - what I still not get about you saying "`print` does not do anything" I write is: that the manual (`?cat`) says the following: "‘cat’ performs much less conversion than ‘print’." – daroczig Jan 15 '11 at 11:36
  • @VitoshKa apologies, "internal" was a poor choice of words, I should have said "printed". – Gavin Simpson Jan 15 '11 at 11:53
  • @daroczig Might be poor choice of words on my part. `print` is returning R's printed version of the string (effectively what one types in but with the switch to enclosing in `""` and any changes to the string that requires. `cat` is showing you the string you intended after all the escaping is accounted for. It is a matter of viewpoint, I guess, which is *doing* more than the other, as I like to thing of `cat` etc showing me the converted string, converted from the way I need to enter it in R etc. But you can view that the other way round. – Gavin Simpson Jan 15 '11 at 11:57
  • @Gavin Simpson: thank you again, the diff between the two functions is clear now. – daroczig Jan 15 '11 at 12:11
  • @Gavin, Good point! From now on we should differentiate between printed, internal and __input__ representations:) – VitoshKa Jan 15 '11 at 12:28
3

Try to paste this query:

query <- c('INSERT INTO table (htmlfield) VALUES (\'', html, '\'') 

The only change is with the quotes: \' instead of ".

So in whole:

html <- "<div style='text-align: center; font-family: Arial;'><span style='font-size: 14pt;'>Some text without any tricky symbols.</span></div>"    
query <- c('INSERT INTO table (htmlfield) VALUES (\'', html, '\'')
dbSendQuery(con, paste(query, collapse = ""))

I hope it will work fine!

daroczig
  • 28,004
  • 7
  • 90
  • 124
  • 1
    That is just avoiding the problem - the real problem; one of not differentiating between the printed representation and the actual representation of the string. – Gavin Simpson Jan 15 '11 at 10:42