How to use sed to delete a string with wildcards

Question

File1:

<a>hello</b> <c>foo</d>
<a>world</b> <c>bar</d>

The above is an example of the kind of file this should work on. How can one use sed to remove every string of the form <c>*</d>?

What do you mean by "remove all strings"? Do you mean remove that whole line or just that block of text? — Adam Batkin, Oct 20 '09 at 07:02
All strings beginning with <c> and ending with </d>. The command below worked perfectly. Anyone using the command obviously also needs to add the file name at the end of the command. — user191960, Oct 20 '09 at 07:07
Note that parsing XML-like strings with regex may cause issues: https://stackoverflow.com/a/1732454/384617 — David Pärsson, Jul 08 '20 at 10:16

score 4 · Accepted Answer · answered Oct 20 '09 at 07:03

The following line will remove all text from <c> to </d> inclusive:

sed -e 's/<c>.*<\/d>//'

The part between the first two slashes of s/...// is a regular expression, not a wildcard in the sense the shell uses, so anything you can write in a regular expression can go there.
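One consequence of this being a regular expression is that .* is greedy. The sketch below shows the difference on a line containing two <c>...</d> pairs. The function names strip_greedy and strip_each are made up for illustration, and the bounded pattern assumes the text between <c> and </d> never contains a < character.

```shell
# strip_greedy and strip_each are hypothetical helper names.

# Greedy: .* runs from the first <c> to the LAST </d> on the line.
strip_greedy() {
  sed -e 's/<c>.*<\/d>//'
}

# Bounded: [^<]* cannot cross another tag, so each <c>...</d> pair is
# matched on its own; the g flag repeats the substitution along the line.
# Assumption: the content between <c> and </d> contains no "<".
strip_each() {
  sed -e 's/<c>[^<]*<\/d>//g'
}

line='<c>foo</d>KEEP<c>bar</d>'
printf '%s\n' "$line" | strip_greedy   # prints an empty line
printf '%s\n' "$line" | strip_each     # prints: KEEP
```

For the two-line File1 from the question, both variants give the same result, since each line holds only one <c>...</d> pair.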

Adam Batkin

Works perfectly! Users of this command should remember to add the input and output files at the end to redirect sed: sed -e 's/<c>.*<\/d>//' In > Out. – user191960 Oct 20 '09 at 07:12
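As a rough illustration of the redirection described in the comment above, the sketch below builds a copy of File1 in a temporary directory and writes the result to a second file. The names in.txt and out.txt are placeholders, not anything from the original thread.

```shell
# in.txt and out.txt are placeholder names inside a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '<a>hello</b> <c>foo</d>' '<a>world</b> <c>bar</d>' > "$tmp/in.txt"

# sed reads in.txt and the shell redirects its output to out.txt;
# in.txt itself is left unchanged.
sed -e 's/<c>.*<\/d>//' "$tmp/in.txt" > "$tmp/out.txt"

cat "$tmp/out.txt"
```

The output file should be a different file from the input: with `sed ... file > file` the shell truncates the file before sed reads it.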

score 0 · Answer 2 · edited Feb 20 '13 at 18:26

Great Swiss-Army knife!

I modified it to pull header info out of emails for an archiving script. The script renames the IMAP emails using both date and sender info (otherwise IMAP just numbers them 1, 2, 3, etc.). Here are the two modifications:

for i in $mailarray; do date -d "$(less -f "$i" | grep -im 1 "Date: " | sed -e 's_^.*[Dd]ate: __')" +%F_%T%Z; done

for i in $mailarray; do less -f "$i" | grep -iEm 1 "From: " | sed -e 's_^.*[Ff]rom:.*<\|^.*[Ff]rom: *__' | sed -e 's_@.*$__'; done

They saved a great deal of extraneous coding. Thank you.

score 0 · Answer 3 · answered Oct 20 '09 at 08:40

If all your data looks like the example:

# gawk 'BEGIN{FS=" <c>"}{print $1}' file
<a>hello</b>
<a>world</b>
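The field-separator approach above keeps only what comes before " <c>", so any text after </d> on the same line is dropped too. The sketch below contrasts it with a gsub-based variant that removes only the <c>...</d> block. The function names are hypothetical, plain awk is used in place of gawk, and the gsub pattern assumes the block content contains no < character.

```shell
# strip_fs and strip_gsub are hypothetical helper names.

# Field-separator approach: everything from " <c>" onward is discarded.
strip_fs() {
  awk 'BEGIN { FS = " <c>" } { print $1 }'
}

# gsub approach: delete each <c>...</d> block (plus one leading space,
# if present) and keep the rest of the line.
# Assumption: the block content contains no "<".
strip_gsub() {
  awk '{ gsub(/ ?<c>[^<]*<\/d>/, ""); print }'
}

line='<a>hello</b> <c>foo</d> tail'
printf '%s\n' "$line" | strip_fs     # prints: <a>hello</b>
printf '%s\n' "$line" | strip_gsub   # prints: <a>hello</b> tail
```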

ghostdog74