0

I want to detect the following pattern inside a string:

a double quote, then a # or : character, then the rest of the word, ending in a double quote.

When the pattern matches, I'd like to remove the double quotes from the match.

Here is an example:

"#sql/inline"

to

#sql/inline

or

":username"

to

:username

but "test" would stay as "test"

Looks like this does what I'm looking for assuming there are no \ characters inside the word

(clojure.string/replace example-string #"(\")(#|:)(.*?)(\")" "$2$3")
Wiktor Stribiżew
arcanine
  • Something like `"(:|#).*?"` would perform the match, but I have no idea how to say "just replace the quotes" – arcanine Jun 16 '19 at 11:44

5 Answers

6

A regex for that can be

\"([#:][^\"]*)\"

Replace with $1.

Clojure command:

(clojure.string/replace example-string #"\"([#:][^\"]*)\"" "$1")

Regex details

  • \" - a double quotation mark
  • ([#:][^\"]*) - Capturing group #1:
    • [#:] - a # or : char
    • [^\"]* - 0 or more chars other than double quotation marks
  • \" - a double quotation mark.
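To see the pattern behave on the examples from the question, here is a minimal sketch. The function name `unquote-special` and the sample strings are made up for illustration; only the regex and the `$1` replacement come from the answer.

```clojure
(require '[clojure.string :as cs])

;; Hypothetical helper name; wraps the replace call shown above.
(defn unquote-special [s]
  (cs/replace s #"\"([#:][^\"]*)\"" "$1"))

(unquote-special "\"#sql/inline\"")   ;; => "#sql/inline"
(unquote-special "\":username\"")     ;; => ":username"
(unquote-special "\"test\"")          ;; => "\"test\"" (unchanged)
```

Because the body is matched with `[^\"]*`, a match can never run past the next double quote.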
Wiktor Stribiżew
2

Or, if there might be unnecessary spaces just inside the double quotes, this expression would remove those as well:

"\s*([#:].+?)\s*"

and our desired data is in this capturing group: ([#:].+?).

Our code might look like:

(clojure.string/replace example-string #"\"\s*([#:].+?)\s*\"" "$1")
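A small sketch of the whitespace handling, assuming the padding sits directly inside the quotes. The name `unquote-trimmed` and the inputs are illustrative only.

```clojure
(require '[clojure.string :as cs])

;; Hypothetical helper name; same regex as in the answer above.
(defn unquote-trimmed [s]
  (cs/replace s #"\"\s*([#:].+?)\s*\"" "$1"))

(unquote-trimmed "\"  :username \"")  ;; => ":username"
(unquote-trimmed "\" test \"")        ;; => "\" test \"" (unchanged)
```

Note that `.+?` requires at least one character after the # or :, and `.` may also match a double quote, so this variant is less strict than one built on `[^\"]*`.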
Emma
1

Looks like this does what I'm looking for assuming there are no \ characters inside the word

(clojure.string/replace example-string #"(\")(#|:)(.*?)(\")" "$2$3")

arcanine
  • 1
    If you want to capture the separate parts, you might use 2 capturing groups as `(\")` is not really necessary to capture. Then in the replacement you could use `$1$2` – The fourth bird Jun 16 '19 at 19:07
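The two-group variant suggested in the comment could look like the sketch below. The function name `strip-marker-quotes` is an assumption for illustration; the quotes are matched but not captured, so the replacement only needs `$1$2`.

```clojure
(require '[clojure.string :as cs])

;; Group 1 is the # or : marker, group 2 is the rest of the word.
(defn strip-marker-quotes [s]
  (cs/replace s #"\"(#|:)(.*?)\"" "$1$2"))

(strip-marker-quotes "\"#sql/inline\" and \"test\"")
;; => "#sql/inline and \"test\""
```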
1

There are several good regex answers already, but you don't need a regex to do this in Clojure:

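(require '[clojure.string :as cs])

(defn remove-quote-wrapper [s]
  ;; s is expected to be the whole quoted word, e.g. "\"#sql/inline\""
  (if (and (or (cs/starts-with? s "\"#")
               (cs/starts-with? s "\":"))
           (cs/ends-with? s "\""))
    ;; drop the first and last character (the wrapping quotes)
    (subs s 1 (dec (count s)))
    s))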

If you're concerned with performance, this approach is ~4x faster than the clojure.string/replace with regex.
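The timing claim can be checked roughly with `time`. The sketch below is only a crude comparison under stated assumptions: the sample strings, the iteration count, and the name `remove-quote-wrapper-regex` are made up, and single runs are noisy because of JIT warm-up and garbage collection. A benchmarking library such as criterium would give more trustworthy numbers.

```clojure
(require '[clojure.string :as cs])

;; Copy of the function from this answer, so the snippet runs on its own.
(defn remove-quote-wrapper [s]
  (if (and (or (cs/starts-with? s "\"#")
               (cs/starts-with? s "\":"))
           (cs/ends-with? s "\""))
    (subs s 1 (dec (count s)))
    s))

;; Regex variant from the earlier answers, for comparison (hypothetical name).
(defn remove-quote-wrapper-regex [s]
  (cs/replace s #"\"([#:][^\"]*)\"" "$1"))

;; Illustrative inputs: each one is a single, fully quoted word.
(def samples ["\"#sql/inline\"" "\":username\"" "\"test\""])

;; Prints one "Elapsed time" line per function; results vary by machine.
(doseq [f [remove-quote-wrapper remove-quote-wrapper-regex]]
  (time (dotimes [_ 100000] (run! f samples))))
```

The two functions agree only when the input is exactly one quoted word; the regex variant also rewrites quoted words embedded in a longer string.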

Taylor Wood
0

One problem with the proposed solutions is that they do not correctly recognize the quoted parts in the text.

Let us call the quoted parts starting with # or : "special" and the rest "non-special".

As an example, in the text "a"#b"c", "#b" is recognized as a special part and "a#bc" is produced, whereas "a" and "c" should be recognized as non-special parts and the text should be left unchanged.

Another problem is that the escaping of " and \ inside quoted parts is not handled.

One possible solution that takes account of these issues is the following:

(defn remove-quotes [s]
  (clojure.string/replace s
    #"\"([#:]?)(?:([^\"\\]+)|\\([\"\\]))*\""
    #(if (empty? (second %)) (first %) (apply str (rest %)))))
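One caveat, as far as I can tell: in Java regexes a capturing group inside a repeated group keeps only the text of its last iteration, so when the body consists of several segments (plain text mixed with escapes) the function above does not rebuild it in full. A sketch that captures the whole body in a single group and unescapes it afterwards is shown below. The name `remove-quotes-unescaping` is hypothetical, and the assumption that `\"` and `\\` should be unescaped in the output is mine, not confirmed by the answer.

```clojure
(require '[clojure.string :as cs])

(defn remove-quotes-unescaping [s]
  (cs/replace s
    ;; group 1: optional # or : marker
    ;; group 2: the whole body, escape sequences included
    #"\"([#:]?)((?:[^\"\\]|\\[\"\\])*)\""
    (fn [[whole marker body]]
      (if (empty? marker)
        whole                                    ; non-special part: keep as is
        ;; special part: drop the quotes and unescape \" and \\ (assumption)
        (str marker (cs/replace body #"\\([\"\\])" "$1"))))))

(remove-quotes-unescaping "\"a\"#b\"c\"")   ;; unchanged: "a" and "c" are non-special
(remove-quotes-unescaping "\"#sql/inline\"") ;; => "#sql/inline"
```

Matching every quoted part, special or not, is what keeps the scan aligned with the real quote boundaries.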

EDIT:

After reading Taylor Wood's answer, which treats only a limited case, I decided to add a no-regex solution (which does not handle escaping):

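(defn remove-quotes [s]
  (loop [processed "" remaining s]
    (let [i (clojure.string/index-of remaining \u0022)
          ;; j is the closing quote that pairs with the opening quote at i
          j (when i (clojure.string/index-of remaining \u0022 (inc i)))]
      (if j
        (recur
          (str processed
               (subs remaining 0 i)
               (apply subs remaining
                      (if (#{\# \:} (get remaining (inc i)))
                        [(inc i) j]      ; special part: drop both quotes
                        [i (inc j)])))   ; non-special part: keep the quotes
          (subs remaining (inc j)))
        ;; no quote left, or an unmatched opening quote: keep the rest as is
        (str processed remaining)))))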

\u0022 is just \". The latter messes up the code's appearance in Stack Overflow.

peter pun