
In PHP 5.3, I am trying to replace double quotes in a string like this:

$bar = str_replace('"','\'',$foo);

But some quotes that are saved in the UTF-8 database are not being replaced, although they look perfectly normal:

"Some text"

Are there different character types I have to search for? If so, which are they?

Urs

3 Answers


There are many characters that look like quotation marks; most of them are used infrequently. The three that are used most often are:

"   U+0022 QUOTATION MARK
“   U+201C LEFT DOUBLE QUOTATION MARK
”   U+201D RIGHT DOUBLE QUOTATION MARK
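
As a minimal sketch, assuming the stray characters are the two curly quotes above and the string is UTF-8 encoded: `str_replace` accepts an array of search strings, so all three can be replaced in one call. The curly quotes are written as raw UTF-8 byte sequences so the snippet does not depend on the encoding of the source file, and the `$foo` value is made up for illustration.

```php
<?php
// Sample input (hypothetical): text wrapped in curly quotes plus ASCII quotes.
$foo = "\xE2\x80\x9CSome text\xE2\x80\x9D and \"more\"";

// Search strings: the three common double quotes, as UTF-8 bytes.
$doubleQuotes = array(
    '"',              // U+0022 QUOTATION MARK
    "\xE2\x80\x9C",   // U+201C LEFT DOUBLE QUOTATION MARK
    "\xE2\x80\x9D",   // U+201D RIGHT DOUBLE QUOTATION MARK
);

// Replace each of them with an ASCII apostrophe.
$bar = str_replace($doubleQuotes, "'", $foo);

echo $bar, "\n";
```

The `array()` syntax and `\x` escapes are used because they also work on PHP 5.3.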

Some rarer ones are FULLWIDTH QUOTATION MARK, the DITTO MARK, the DOUBLE PRIME, the DOUBLE PRIME QUOTATION MARK, and so on. The Unicode.org "confusables" tool finds 15 characters similar to ".

Why don't you copy and paste the offending character here so we can identify it? Alternatively, you could use the HEX function to get the hexadecimal encoding of the character, which is another way of identifying it.
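
If it is easier to inspect the string on the PHP side, `bin2hex` prints the raw bytes of a string as hexadecimal, which can then be matched against a UTF-8 table. A small sketch, assuming the value comes back from the database as UTF-8 (the sample string is invented):

```php
<?php
// Hypothetical value read from the database: a curly-quoted word.
$foo = "\xE2\x80\x9CHi\xE2\x80\x9D";

// Dump the bytes as hex. An ASCII quote would appear as "22";
// e2809c and e2809d are the UTF-8 encodings of U+201C and U+201D.
$hex = bin2hex($foo);

echo $hex, "\n"; // e2809c4869e2809d
```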

Update: The unicode.org confusables utility seems to be down, but the data is available as a text file. The current list of characters that are "confusable" with the double quote is:

1CD3 ;  0027 0027 ; MA  #* ( ᳓ → '' ) VEDIC SIGN NIHSHVASA → APOSTROPHE, APOSTROPHE # →″→→"→
0022 ;  0027 0027 ; MA  #* ( " → '' ) QUOTATION MARK → APOSTROPHE, APOSTROPHE   # 
FF02 ;  0027 0027 ; MA  #* ( ＂ → '' ) FULLWIDTH QUOTATION MARK → APOSTROPHE, APOSTROPHE # →”→→"→
201C ;  0027 0027 ; MA  #* ( “ → '' ) LEFT DOUBLE QUOTATION MARK → APOSTROPHE, APOSTROPHE   # →"→
201D ;  0027 0027 ; MA  #* ( ” → '' ) RIGHT DOUBLE QUOTATION MARK → APOSTROPHE, APOSTROPHE  # →"→
201F ;  0027 0027 ; MA  #* ( ‟ → '' ) DOUBLE HIGH-REVERSED-9 QUOTATION MARK → APOSTROPHE, APOSTROPHE    # →“→→"→
2033 ;  0027 0027 ; MA  #* ( ″ → '' ) DOUBLE PRIME → APOSTROPHE, APOSTROPHE # →"→
2036 ;  0027 0027 ; MA  #* ( ‶ → '' ) REVERSED DOUBLE PRIME → APOSTROPHE, APOSTROPHE    # →‵‵→
3003 ;  0027 0027 ; MA  #* ( 〃 → '' ) DITTO MARK → APOSTROPHE, APOSTROPHE   # →″→→"→
05F4 ;  0027 0027 ; MA  #* ( ‎״‎ → '' ) HEBREW PUNCTUATION GERSHAYIM → APOSTROPHE, APOSTROPHE   # →"→
02DD ;  0027 0027 ; MA  #* ( ˝ → '' ) DOUBLE ACUTE ACCENT → APOSTROPHE, APOSTROPHE  # →"→
02BA ;  0027 0027 ; MA  # ( ʺ → '' ) MODIFIER LETTER DOUBLE PRIME → APOSTROPHE, APOSTROPHE  # →"→
02F6 ;  0027 0027 ; MA  #* ( ˶ → '' ) MODIFIER LETTER MIDDLE DOUBLE ACUTE ACCENT → APOSTROPHE, APOSTROPHE   # →˝→→"→
02EE ;  0027 0027 ; MA  # ( ˮ → '' ) MODIFIER LETTER DOUBLE APOSTROPHE → APOSTROPHE, APOSTROPHE # →″→→"→
05F2 ;  0027 0027 ; MA  # ( ‎ײ‎ → '' ) HEBREW LIGATURE YIDDISH DOUBLE YOD → APOSTROPHE, APOSTROPHE  # →‎יי‎→
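
To cover more of this list in one pass, a regular expression with the `u` (UTF-8) modifier can match code points directly. The sketch below picks a subset of the entries above. Which ones to include is a judgment call (replacing the Hebrew characters, for example, would damage real Hebrew or Yiddish text), and the function name `normalize_double_quotes` is made up for illustration.

```php
<?php
// Hypothetical helper: map common double-quote lookalikes to a replacement.
function normalize_double_quotes($text, $replacement = "'")
{
    // Code points chosen from the confusables list; extend or trim as needed.
    $pattern = '/[\x{0022}\x{FF02}\x{201C}\x{201D}\x{201F}\x{2033}\x{2036}\x{3003}]/u';
    return preg_replace($pattern, $replacement, $text);
}

// Fullwidth quotes around a word, written as UTF-8 bytes (U+FF02 = EF BC 82).
echo normalize_double_quotes("\xEF\xBC\x82text\xEF\xBC\x82"), "\n"; // 'text'
```

With the `u` modifier, `preg_replace` returns `NULL` when the subject is not valid UTF-8, so it is worth checking the return value before using it.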
Joni
  • Thanks @Joni, this sounds great! We're looking into it. – Urs Sep 11 '13 at 15:01
  • We've solved it with `$bar = rawurlencode(utf8_decode($foo));` now, but your "confusables" link is great – Urs Sep 12 '13 at 13:24

I was looking for the double low quote character, but it is not listed in the other answers. I eventually found it, so I'm sharing it here to save others some time:

„ A nice quotation ”

„ = Double low quote / `&bdquo;` / `&#8222;` / `&#x201E;` / U+201E

” = Right Double Quotation / `&rdquo;` / `&#8221;` / `&#x201D;` / U+201D
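
If the text arrives with these characters written as HTML entities, PHP's `html_entity_decode` turns the named, decimal, and hexadecimal forms into the same UTF-8 character. A short sketch, assuming UTF-8 output is wanted:

```php
<?php
// Named, decimal, and hexadecimal entity forms of U+201E.
$forms = array('&bdquo;', '&#8222;', '&#x201E;');

$decoded = array();
foreach ($forms as $entity) {
    $char = html_entity_decode($entity, ENT_QUOTES, 'UTF-8');
    $decoded[] = $char;
    // Each form decodes to the UTF-8 bytes e2 80 9e.
    echo $entity, ' => ', bin2hex($char), "\n";
}
```

After decoding, the character can be passed to `str_replace` like any other UTF-8 string.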

szegheo
  • This is not used in English, but is the standard opening quote in German and some other languages which were under German influence at some point during the formative years of local typographic conventions. If you want to cover quotation marks in multiple languages, you also need to cover guillemets, etc. Here's a popular reference: https://jakubmarian.com/map-of-quotation-marks-in-european-languages/ – tripleee Feb 04 '21 at 14:29

I was able to insert a quotation mark using the "numerical HTML encoding of the Unicode character":

http://www.utf8-chartable.de/unicode-utf8-table.pl?unicodeinhtml=dec&htmlent=1

The Unicode code point notation did not work for me:

"   U+0022 QUOTATION MARK

Alternatively, this worked for me:

"   `&#34;`  QUOTATION MARK
S.Doe_Dude