regex to exclude string and delete line

Question

I have the following lines in an XML file

<User id="10338" directoryId="1" sometext txt text test/>
<User id="10359" directoryId="100" some more text text text/>
<User id="103599" directoryId="100" some more text text text/>
<User id="10438" directoryId="1" sometext txt text test/>

I am trying to remove every line that starts with <User id=" but keep the ones that have directoryId="1".

My current sed command is:

sed -i '' '/<User id="/d' file.xml

I have looked at "A regular expression to exclude a word/string" and a few other Stack Overflow posts, but I have not been able to get this to work. Can someone please help me write the regex? I essentially need to delete any lines that start with <User id=" except the ones where directoryId="1".

[Don't Parse XML/HTML With Regex.](https://stackoverflow.com/a/1732454/3776858) I suggest to use an XML/HTML parser (xmlstarlet, xmllint ...). — Cyrus, Mar 01 '21 at 23:54

score 0 · Accepted Answer · answered Mar 02 '21 at 00:20


You can use

sed -i '' -e '/directoryId="1"/b' -e '/<User id="/d' file.xml

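With this sed command,

/directoryId="1"/b branches to the end of the script for lines containing directoryId="1", so they are printed unchanged and never reach the next command, and
/<User id="/d deletes the remaining lines that contain <User id=".

Because the first pattern includes the closing quote, directoryId="100" does not match it.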

See an online demo.
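As a quick way to check the behaviour without touching a real file, here is a minimal sketch that runs the same two expressions on a temporary copy of the question's sample lines. The temporary file and the extra <Group .../> line are assumptions added for illustration only; they are not part of the original question.

```shell
# Hypothetical sample file: the four lines from the question plus one
# made-up <Group> line to show that non-User lines are left alone.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<User id="10338" directoryId="1" sometext txt text test/>
<User id="10359" directoryId="100" some more text text text/>
<User id="103599" directoryId="100" some more text text text/>
<User id="10438" directoryId="1" sometext txt text test/>
<Group id="7" directoryId="100"/>
EOF

# Same expressions as in the answer, but without -i, so the file is not
# modified and the filtered text goes to standard output instead.
result=$(sed -e '/directoryId="1"/b' -e '/<User id="/d' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

This should print the two directoryId="1" users and the <Group> line. Note that -i '' (with a separate empty argument) is the BSD/macOS form; GNU sed expects plain -i with no separate argument.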


Wiktor Stribiżew
