
I would like to use sed to find a line break that is followed by a specific character on a line of its own, and join the lines so that only the character and a single space remain:

Example:

<link rel="colorSchemeMapping
"
href="marinedrugs-790193-original.fld/colorschememapping.xml">

Should be:

<link rel="colorSchemeMapping" href="marinedrugs-790193-original.fld/colorschememapping.xml">

I'm aware of the usual one-liner that replaces every line break:

':a;N;$!ba;s/\n/ /g'

but I want the match to require the double quote next to the line break.

Wiktor Stribiżew
Milos Cuculovic

2 Answers


I suggest replacing the sequence newline, ", newline with " followed by a space, and every other newline with a space. Either of these does it:

sed -i -E ':a;N;$!ba;s/\n(")\n|\n/\1 /g' file
sed -i ':a;N;$!ba;s/\n"\n/" /g; s/\n/ /g' file

or

sed -e ':a;N;$!ba' -e 's/\n"\n/" /g' -e 's/\n/ /g' file > newfile

LINE ENDING NOTE: If your endings are CRLF, you need to replace \n with \r\n in the above patterns.
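
As a rough illustration of the CRLF case, here is a minimal sketch. It assumes GNU sed (which understands \r in patterns) and uses a made-up three-line sample rather than the real file. The extra s/\r$// is my addition: the final line still ends in a carriage return after the other substitutions, so it is stripped separately.

```shell
# Assumption: GNU sed, which accepts \r in regular expressions.
# The sample input is hypothetical: three lines with CRLF endings.
result=$(printf '<link rel="a\r\n"\r\nhref="b.xml">\r\n' |
  sed -e ':a;N;$!ba' \
      -e 's/\r\n"\r\n/" /g' \
      -e 's/\r\n/ /g' \
      -e 's/\r$//')   # drop the carriage return left at the very end
printf '%s\n' "$result"
```

If the trailing carriage return is harmless for your use case, the last expression can be dropped.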

Note that -E enables POSIX ERE syntax (to avoid using too many backslashes in the pattern). The regex means:

  • \n(")\n - a newline, then " is captured into Group 1 and then a newline
  • | - or
  • \n - a newline.

The replacement is Group 1 value (" if it was matched) and a space.
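
To see both branches of the alternation at work, here is a small sketch on a made-up four-line input (assuming GNU sed, as in the commands above). The first line break is followed by a lone ", so Group 1 is set; the remaining line break is plain, so Group 1 is empty and only a space is inserted.

```shell
# Hypothetical input: 'a', then '"' on its own line, then 'b' and 'c'.
# Assumption: GNU sed (for -E and the ;-separated label syntax).
result=$(printf 'a\n"\nb\nc\n' | sed -E ':a;N;$!ba;s/\n(")\n|\n/\1 /g')
printf '%s\n' "$result"
```

The quote ends up attached to a, while b and c are simply joined with a space.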

See the online sed demo:

s='<link rel="colorSchemeMapping
"
href="marinedrugs-790193-original.fld/colorschememapping.xml">'
sed -E ':a;N;$!ba;s/\n(")\n|\n/\1 /g' <<< "$s"
# => <link rel="colorSchemeMapping" href="marinedrugs-790193-original.fld/colorschememapping.xml"> 
Wiktor Stribiżew

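Since you're using GNU sed anyway, you can use -z, which separates input records on NUL bytes instead of newlines, so a normal text file is read as a single record: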

$ sed -z 's/\n"\n/" /g' file
<link rel="colorSchemeMapping" href="marinedrugs-790193-original.fld/colorschememapping.xml">

If you find yourself using constructs other than s, g, and p (with -n) in sed then you're using the wrong tool and should instead be using awk or similar. All other sed constructs became obsolete 40+ years ago when awk was invented.
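
For comparison, one way the same substitution might look in awk. This is only a sketch and not necessarily how the answer's author would write it: it reads the whole input into a variable (the name buf is arbitrary) and relies only on POSIX awk features. The sample input is made up.

```shell
# Hypothetical sample: a split <link> tag followed by an unrelated line.
result=$(printf '<link rel="a\n"\nhref="b.xml">\n<p>next</p>\n' |
  awk '
    { buf = buf $0 "\n" }          # accumulate every line, restoring its newline
    END {
      gsub(/\n"\n/, "\" ", buf)    # newline + quote + newline -> quote + space
      printf "%s", buf             # buf already ends with a newline
    }')
printf '%s\n' "$result"
```

Like the sed -z command, this leaves all other line breaks untouched.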

Ed Morton