2

In my bash script, I am trying to replace all apostrophes(') with two apostrophes('') in a csv file in order to commit it to a postgres database. As you know all single quotes in a postgres query need to be escaped with an apostrophe.

I can't use double quotations around my variables in my query because they also have double quotes in them too. So there is no easy way out apart from doing a blanket replace using sed. I have experimented with the following ways but to no avail:

sed "s/\'/\'\'/g" test.txt   #Does not work

sed "s/'/''/g" test.txt      #Does not work  

sed s/\'/"\'\'"/g test.txt   #Does not work  

Does anyone have ideas on how I can get this to work?

hek2mgl
  • 152,036
  • 28
  • 249
  • 266
  • 4
    I can't see anything wrong with the most simple option `sed "s/'/''/g"`. – Tom Fenech Feb 18 '16 at 15:25
  • `As you know all single quotes in a postgres query need to be escaped with an apostrophe.` - a very odd thing to assume everyone reading your question knows! Post concise, testable but truly representative sample input and expected output if you'd like help. – Ed Morton Feb 18 '16 at 17:47
  • Try `sed "s/\x27/\x27\x27/g"` or `sed "s/'/''/g"`, you don't need to escape the single quotes within double quotes. –  Feb 19 '16 at 05:40

3 Answers3

1

This will work for you:

sed "s/'/''/g"  <<< "test'test"

Since you are using double quotes to enclose the command, the apostrophe doesn't need to get escaped.

Btw, if you want to use single quotes to enclose the command itself, you may use the octal representation of the apostrophe \047 in the command:

sed 's#\047#\047\x47#g' <<< "test'test"
hek2mgl
  • 152,036
  • 28
  • 249
  • 266
  • I admit that I asking myself why `sed "s/\'/\'/g"` doesn't work. – hek2mgl Feb 18 '16 at 15:32
  • IMHO, the above command should work. It works also with BSD sed. It looks like a bug for me. Good catch – hek2mgl Feb 18 '16 at 15:47
  • The regexes /\`/ and /\'/ are not interpreted as a backquote and apostrophe, as might be expected. **Instead, they are used to represent the beginning-of-line and end-of-line (respectively)**, to conform with similar regexes in the GNU versions of Emacs and awk. As a consequence, there is no clear way to indicate an apostrophe, since a bare apostrophe (') has special meaning to the Unix shell and the quoted apostrophe (\') is interpreted as the EOL. – 123 Feb 18 '16 at 16:00
  • A double-quote apostrophe (\\') was interpreted as a backslash to sed and a quote mark to the shell--again, not providing the expected results. This syntax changed in the next version of gsed. – 123 Feb 18 '16 at 16:00
  • @123 Interesting and yes, the problem disappears if I pass `--posix`. Where do you know that from? – hek2mgl Feb 18 '16 at 16:04
  • @123, ok but the section you quoted applies to GNU sed v2.0.5. The are explicitly saying that `This syntax changed in the next version of gsed`. I would suppose that it has been changed already. I mean with have GNU sed 4 nowadays. – hek2mgl Feb 18 '16 at 16:11
  • Yeah but it clearly hasnt, you can clearly see it still works this way – 123 Feb 18 '16 at 16:12
  • Thanks for the input. It works fine off the command line for me too but it does not manipulate the text from a file. I have all the correct read write permissions on the file. I have also used the switch -i to write back to the same file. – user3236182 Feb 18 '16 at 16:38
  • If you need double quotes inside the double quotes, simply backslash-escape them. – tripleee Feb 18 '16 at 16:47
  • ... or use their hexadecimal representation, as I've shown. It's `\x22` – hek2mgl Feb 18 '16 at 16:52
  • @hek2gml - Thank you for digging that out of the docs. Any alternative ideas of how I can make this replacement? Hex does not also work. – user3236182 Feb 18 '16 at 17:29
  • @123 Hexadecimal does not also work. I have javascript in my file which I need to commit to database, I have no power of changing that. – user3236182 Feb 18 '16 at 17:30
  • Don't use hex `\x27`, use octal `\047` instead. Try `gawk 'BEGIN{print "\x27foo!\x27"}'` vs `gawk 'BEGIN{print "\047foo!\047"}'` and see http://awk.freeshell.org/PrintASingleQuote for details on why the hex version outputs the truncated `oo!'` while the octal version outputs the desired `'foo!'`. – Ed Morton Feb 18 '16 at 17:33
  • @user3236182 `I have javascript in my file which I need to commit to database, I have no power of changing that.` - I can't begin to guess what that statement means in respect to your question or this answer or any of the preceding comments. I think you're assuming we all work in the cubicle next to you - you'd be better off assuming we don't know anything except what you tell us and edit your question to reflect that new assumption. – Ed Morton Feb 18 '16 at 17:50
  • 1
    @EdMorton Thanks for pointing out the advantage of the *octal* representation (again :) and thanks for the awk.freeshell.org resource. I've changed that in my answer. – hek2mgl Feb 18 '16 at 18:44
0

This might work for you (GNU sed):

sed 's/'\''/&&/g' file

The second ' ends the quoting of the first part of the substitute command. The third ' is in the shell and is quoted thus \'. The remainder of the substitution command is then quoted. Using the && as shorthand for entire match in the LHS of the substitutin command doubled up.

N.B. It is always preferable to use single quotes to quote sed commands as it removes the chance of the shell accidentally interpreting characters in the command.

potong
  • 55,640
  • 6
  • 51
  • 83
0

Since you indicated:

I can't use double quotations

you might wish to consider using $'...' if your shell allows it (sh, ksh, bash and others do):

sed $'s/\'/\'\'/g'
peak
  • 105,803
  • 17
  • 152
  • 177