23

I have an application which uses a Javascript-based rules engine. I need a way to convert regular straight quotes into curly (or smart) quotes. It’d be easy to just do a string.replace for ["], but that would insert only one kind of curly quote.

The best way I could think of was to replace the first quote with a left curly quote and then alternate: every second quote after it also becomes a left curly quote, and the rest become right curly quotes.

Is there a way to accomplish this using Javascript?

tchrist
BlueVoid
  • You might want to play with a word processor a bit and see what rules it uses to determine which quotes to use. From what I've seen they are based on the context of the quote, not pairing. – Nate C-K Feb 04 '10 at 21:36
  • Does this answer your question? [Ideas for converting straight quotes to curly quotes](https://stackoverflow.com/questions/509685/ideas-for-converting-straight-quotes-to-curly-quotes) – Dave Jarvis Jul 10 '22 at 04:42

7 Answers

15

You could replace all quotes that precede a word character with the left quote, and all that follow a word character with a right quote.

str = str.replace(/"(?=\w|$)/g, "“");
str = str.replace(/(?<=\w|^)"/g, "&#8221;"); // IF the language supports look-
                                             // behind. Otherwise, see below.

As pointed out in the comments below, this doesn't take punctuation into account, but easily can:

/(?<=[\w,.?!\)]|^)"/g

[Edit:] For languages that don't support look-behind, like Javascript (which only gained it in ES2018), you have two options, as long as you replace all the front-facing ones first:

str = str.replace(/"/g, "&#8221;"); // Replace the rest with right curly quotes
// or...
str = str.replace(/\b"/g, "&#8221;"); // Replace any quotes after a word
                                      // boundary with right curly quotes

(I've left the original solution above in case this is helpful to someone using a language that does support look-behind)
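As a rough sketch of the look-behind-free strategy described above, the two passes can be wrapped in one function. The name `curlDoubleQuotes` is hypothetical, and this variation drops the `|$` alternative from the first pattern so that a quote at the very end of the string is treated as closing; treat it as one possible reading of the approach, not a complete solution.

```javascript
// Two-pass sketch: opening quotes first, then everything left over.
// `curlDoubleQuotes` is a hypothetical helper name, not part of any library.
function curlDoubleQuotes(str) {
  // Pass 1: a straight quote directly before a word character opens a quotation.
  const opened = str.replace(/"(?=\w)/g, "\u201C");
  // Pass 2: every straight quote that survived pass 1 is treated as closing.
  return opened.replace(/"/g, "\u201D");
}

console.log(curlDoubleQuotes('She said, "Are we there yet?"'));
```

Because the second pass is unconditional, cases such as inch marks or quotes that open before punctuation will still come out as right curly quotes.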

Nicole
  • +1 for actually answering the question. Though users of this should take into account that it's not perfect in every situation - for example, the marks indicating feet and inches. – Anon. Feb 04 '10 at 20:15
  • ... or punctuation that ends a quotation. – Pointy Feb 04 '10 at 20:20
  • Thank you! This is what I was looking for. One note, copying the code exactly gave me an error. the '?<=' portion was changed to '?='. Also, I had to remove the quotation character from the end case for it to match correctly. The code: s = s.replace(/"(?=\w|$)/g, "“"); s = s.replace(/(?=[\w,.?!\-")]|^)"/g, "”"); – BlueVoid Feb 04 '10 at 20:33
  • @BlueVoid - You are correct about the error, I discovered this and was updating my answer as you commented :) Be careful with your code - `?=` is a look-ahead, which matches because it looks ahead and sees the quote, which is in your character class. I would go with the first "alternative" solution in my edited answer -- just replace all of them with **right** curly quotes *after* replacing the **left** curly quotes. – Nicole Feb 04 '10 at 20:37
  • @Renesis Good point. This simplifies things anyway. It's working great. – BlueVoid Feb 04 '10 at 20:44
  • @Renesis, @BlueVoid: That regex seems **far too fragile** for general use, because you’ve to consider sentences quite properly ending with *“logical quoting”.* It only works with dialogue like: *Jimmy said, “Are we there yet, Mom?”* Please check out [The Economist](http://www.economist.com/) for what I’m referring to; [this entry](http://www.economist.com/research/styleGuide/index.cfm?page=805701) from their Style Guide explains what you’ve missed here. I don’t know how you will handle ‘single quotes’, let alone [apostrophes](http://www.economist.com/research/styleGuide/index.cfm?page=841359). – tchrist Nov 29 '10 at 05:28
  • @tchrist There is big difference in dev time between an 80 or 90% solution and a 100% solution. The important part here is not the `\w`, it's the strategy of using one regex for the leading quotes and another for the trailing quotes. It will properly encode *most* cases (the first suggestion after the [Edit] will cover the problem case you brought up), one incorrect case will not break the others, and **it's easily edited to cover even more cases**. For example, you can invert it to trigger based on whitespace on the other side. Alternatively, you can add punctuation to the character class. – Nicole Nov 29 '10 at 06:31
5

You might want to look at what Pandoc does—apparently with the --smart option, it handles quotes properly in all cases (including e.g. ’tis and ’twere).

I recently wrote a Javascript typography prettification engine that does, among other things, quote replacement; I wound up using basically the algorithm suggested by Renesis, but there’s currently a failing test up waiting for a smarter solution.

If you’re interested in cribbing my code (and/or submitting a patch based on work you’ve done), check it out: jsPrettify. jsprettify.prettifyStr does what you’re looking for. If you don’t want to deal with the Closure dependency, there’s an older version that runs on its own—it even works in Rhino.

Steven Dee
  • Plus 1 for Pandoc. I try to use a mature and tested tool whenever I can versus baking my own regex. Hand-built regexes can be overly greedy, or not greedy enough, and they may not be sensitive to word boundaries, commas, etc. Pandoc accounts for most of this and more. – Paulb Feb 27 '16 at 14:21
4
'foo "foo bar" "bar"'.replace(/"([-a-zA-Z0-9 ]+)"/g, function(wholeMatch, m1){
    return "“" + m1 + "”";
});
Luca Matteis
3

The following replaces quotes in pairs, alternating left and right (this specific example, however, leaves orphaned quotes untouched).

str = str.replace(/"([^"]*)"/g, "&#8220;$1&#8221;"); // assign the result; replace() does not modify str in place

Works perfectly, as long as the text you're texturizing isn't already screwed up with improper use of the double quote. In English, quotes are never nested.

tchrist
Jordan
  • There is one legitimate situation in English where this rule breaks down. When you have consecutive paragraphs representing quoted speech *by the same speaker*, one must start each of those paragraphs with the appropriate quote marks (single, double, single+double, double+single, etc), but one omits the closing quote except for the last paragraph by the same speaker. – tchrist Nov 29 '10 at 05:17
0

I don't think something like that in general is easy at all, because you'd have to interpret exactly what each double-quote character in your content means. That said, what I'd do is collect all the text nodes I was interested in, and then go through and keep track of the "on/off" (or "odd/even"; whatever) nature of each double quote instance. Then you can know which replacement entity to use.
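The on/off bookkeeping could look something like the sketch below. It works on a single plain string, not on a collection of DOM text nodes, and the name `toggleQuotes` is made up for illustration; carrying the flag from one text node to the next is left out.

```javascript
// `toggleQuotes` is a hypothetical name. It handles one plain string only.
function toggleQuotes(text) {
  let inside = false; // true while we are between an opening and a closing quote
  return text.replace(/"/g, () => {
    inside = !inside;
    // After flipping: true means this quote just opened, false means it just closed.
    return inside ? "\u201C" : "\u201D";
  });
}

console.log(toggleQuotes('"One" and "two"'));
```

An unbalanced quote simply flips the state, so one stray quote mark inverts every quote after it. That is the main weakness of pure pairing compared with the context-based rules discussed elsewhere in this thread.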

Pointy
0

I didn't find the logic I wanted here, so here's what I ended up going with.

value = value.replace(/(^|\s)(")/g, "$1“"); // replace quotes that start a line or follow spaces
value = value.replace(/"/g, "”"); // replace rest of quotes with the back smart quote

I have a small textarea in which I need to replace straight quotes with curly (smart) quotes. I'm just executing this logic on keyup. I tried to make it behave like Microsoft Word.
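For anyone wiring this up, here is one way the keyup hook might look. The function names `smartenQuotes` and `attachSmartQuotes` are hypothetical, and `textarea` is assumed to be a standard textarea element; restoring the selection is an addition here, since assigning to `value` typically moves the caret.

```javascript
// The same two replacements as above, wrapped in a function.
function smartenQuotes(value) {
  return value
    .replace(/(^|\s)"/g, "$1\u201C") // quotes at the start or after whitespace open
    .replace(/"/g, "\u201D");        // every remaining quote closes
}

// Hypothetical wiring; `textarea` is assumed to be an HTMLTextAreaElement.
function attachSmartQuotes(textarea) {
  textarea.addEventListener("keyup", () => {
    const start = textarea.selectionStart;
    const end = textarea.selectionEnd;
    textarea.value = smartenQuotes(textarea.value);
    // Each replacement swaps one character for one character,
    // so the old selection offsets are still valid.
    textarea.setSelectionRange(start, end);
  });
}
```

Usage would be something like `attachSmartQuotes(document.querySelector("textarea"))`, assuming the page has a single textarea.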

David Lee
0

Posting for posterity.

As suggested by @Steven Dee, I went to Pandoc.

I try to use a mature and tested tool whenever I can versus baking my own regex. Hand-built regexes can be overly greedy, or not greedy enough, and they may not be sensitive to word boundaries, commas, etc. Pandoc accounts for most of this and more.

From the command line (the --smart parameter turns on smart quotes; note that Pandoc 2.0 and later replaced this flag with the smart format extension):

pandoc --smart --standalone -o output.html input.html

I know a command-line script may or may not fit the OP's requirement of using Javascript. (Related: How to execute shell command in Javascript)
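If the Javascript in question runs under Node.js, the command above could be launched roughly like this. This is only a sketch: it assumes `pandoc` is on the PATH and is an older release that still accepts `--smart`, and the helper names `pandocArgs` and `convertWithPandoc` are made up for illustration.

```javascript
const { execFile } = require("child_process");

// Hypothetical helper: builds the same argument list as the command above.
// Assumes a Pandoc release that still accepts --smart.
function pandocArgs(inputPath, outputPath) {
  return ["--smart", "--standalone", "-o", outputPath, inputPath];
}

// Runs pandoc and reports success or failure through a Node-style callback.
function convertWithPandoc(inputPath, outputPath, callback) {
  execFile("pandoc", pandocArgs(inputPath, outputPath), (error) => {
    callback(error || null);
  });
}
```

Passing the arguments as an array to `execFile` avoids going through a shell, so file names containing spaces or quotes need no extra escaping.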

Paulb