19

I have the following file

titi
tata
toto
tata

If I execute

sed -i "/tat/d" file.txt

It will remove all the lines containing tat. The file then contains:

titi
toto

but I want to remove only the first line in the file that contains tat:

titi
toto
tata

How can I do that?

pillravi
MOHAMED
  • So you want to remove all first occurrences of lines that appear at least twice? – fedorqui May 16 '14 at 13:50
  • @fedorqui only the first occurrence of the line, not all occurrences of the line – MOHAMED May 16 '14 at 13:51
  • But just for "tata" or for all of them? Meaning, you want to remove the first occurrence of "tata" and that's all or you want to remove all the first occurrences of all lines appearing at least twice? – fedorqui May 16 '14 at 13:52

6 Answers

22

You could use the two-address form 0,/regexp/ (a GNU sed extension):

sed '0,/tat/{/tat/d;}' inputfile

This would delete the first occurrence of the pattern.

Quoting from info sed:

 A line number of `0' can be used in an address specification like
 `0,/REGEXP/' so that `sed' will try to match REGEXP in the first
 input line too.  In other words, `0,/REGEXP/' is similar to
 `1,/REGEXP/', except that if ADDR2 matches the very first line of
 input the `0,/REGEXP/' form will consider it to end the range,
 whereas the `1,/REGEXP/' form will match the beginning of its
 range and hence make the range span up to the _second_ occurrence
 of the regular expression.
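
The difference the manual describes can be seen with a small experiment. This is only a sketch, assuming GNU sed; the sample input is made up for illustration and puts a matching line first.

```shell
# Hypothetical sample input whose very first line already matches.
input=$(printf 'tata\ntiti\ntata\ntoto\n')

# 0,/tat/ lets the range end on line 1, so only that line is deleted.
with_zero=$(printf '%s\n' "$input" | sed '0,/tat/{/tat/d;}')

# 1,/tat/ starts the range on line 1 and looks for its end from line 2 on,
# so the range reaches line 3 and both matching lines are deleted.
with_one=$(printf '%s\n' "$input" | sed '1,/tat/{/tat/d;}')

printf 'with 0,/tat/:\n%s\n' "$with_zero"
printf 'with 1,/tat/:\n%s\n' "$with_one"
```

With the 0 form the second tata survives; with the 1 form it is deleted as well.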
devnull
5

If you can use awk, then this makes it:

$ awk '/tata/ && !f{f=1; next} 1' file
titi
toto
tata

To save your result in the current file, do

awk '...' file > tmp_file && mv tmp_file file
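
A slightly more careful variant of the same idea, as a sketch: it lets mktemp choose the temporary file name and runs on a scratch copy. The directory and file names here are assumptions made for the example.

```shell
# Scratch copy of the sample file so the example is self-contained.
workdir=$(mktemp -d)
file="$workdir/file.txt"
printf 'titi\ntata\ntoto\ntata\n' > "$file"

# Write the filtered output to a temporary file in the same directory,
# and replace the original only if awk succeeded.
tmp=$(mktemp "$workdir/tmp.XXXXXX")
awk '/tata/ && !f{f=1; next} 1' "$file" > "$tmp" && mv "$tmp" "$file"

cat "$file"
```

Note that mktemp creates the file with restrictive permissions, so the replaced file may not keep the mode of the original.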

Explanation

Let's set a flag the first time tata is matched and skip that line. From then on, no further lines are skipped.

  • /tata/ matches lines that contain the string tata.
  • {f=1; next} sets flag f as 1 and then skips the line.
  • !f{} if the flag f is set, skip this block.
  • 1, as a True value, performs the default awk action: {print $0}.
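
The pieces listed above can also be spelled out in long form. This is just an illustrative rewrite of the one-liner; the variable name seen is a renamed f, chosen here for readability.

```shell
result=$(printf 'titi\ntata\ntoto\ntata\n' | awk '
  /tata/ && !seen {   # line contains tata and the flag is not set yet
    seen = 1          # remember that the first match has been consumed
    next              # skip this line without printing it
  }
  { print }           # every other line is printed unchanged
')
printf '%s\n' "$result"
```

The explicit { print } block plays the role of the trailing 1 in the short version.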

Another approach, by Tom Fenech

awk '!/tata/ || f++' file

|| stands for OR, so this condition is true, and hence prints the line, whenever any of these happens:

  • tata is not found in the line.
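  • f++ is true. This is the tricky part: f++ is a post-increment, so it returns the value f had before being incremented. An unset f counts as 0, so the first f++ returns 0 (False) and the line is not printed. On every later match f is already 1 or more, so the expression is True and the line is printed.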
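
To see both effects in isolation, here is a small sketch with made-up inputs: the first command shows what f++ returns on successive matches, the second shows that f is only touched on matching lines because || short-circuits.

```shell
# Values returned by f++ on the first, second and third matching line.
yields=$(printf 'tata\ntata\ntata\n' |
  awk '/tata/ { old = f++; print old }' | paste -s -d ' ' -)
printf 'f++ returns: %s\n' "$yields"

# f++ is only evaluated when !/tata/ is false, so f ends up counting
# matching lines, not all lines. The empty action { } suppresses printing.
count=$(printf 'titi\ntata\ntoto\ntata\n' |
  awk '(!/tata/ || f++) { } END { print f + 0 }')
printf 'f after 4 lines with 2 matches: %s\n' "$count"
```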
Community
fedorqui
  • or for the ultimate in readability: `awk '!/tata/ || f++' file` – Tom Fenech May 16 '14 at 13:59
  • Oh that's very bright, @TomFenech! I would consider editing my answer to include this approach. – fedorqui May 16 '14 at 14:02
  • It's hardly the most self-documenting of approaches but by all means, go ahead :) – Tom Fenech May 16 '14 at 14:06
  • @TomFenech `awk '!/tata/ || f++'` This one is right up my alley. I love those short marvels, so I'm adding it to my favorites list. I did not know that it skips the second test when the first is true, so that `f` is not incremented if there is no `tata`. If you want to remove the first two hits of a word, use `awk '!/tata/ || f++>=2'` – Jotne May 16 '14 at 15:57
  • Thanks for the attribution fedorqui, and good explanation :) @Jotne I'm glad you liked it. As in most languages `||` is short-circuiting so the second part isn't evaluated when the first is true. – Tom Fenech May 16 '14 at 16:01
  • @TomFenech I see that `awk '!/tata/ || f++'` does the same as `awk '!/tata/ || f--'`, and if you want to delete only the second hit, use `awk -v f=1 '!/tata/ || f--'`, for the third hit `awk -v f=2 '!/tata/ || f--'`, etc. – Jotne May 16 '14 at 16:07
  • Or better `awk -v f=2 '!/tata/ || --f'` for second etc – Jotne May 16 '14 at 16:14
4

Here's the general way to do it:

$ cat file
     1  titi
     2  tata
     3  toto
     4  tata
     5  foo
     6  tata
     7  bar
$
$ awk '/tat/{ if (++f == 1) next} 1' file
     1  titi
     3  toto
     4  tata
     5  foo
     6  tata
     7  bar
$
$ awk '/tat/{ if (++f == 2) next} 1' file
     1  titi
     2  tata
     3  toto
     5  foo
     6  tata
     7  bar
$
$ awk '/tat/{ if (++f ~ /^(1|2)$/) next} 1' file
     1  titi
     3  toto
     5  foo
     6  tata
     7  bar

Note that with the above approach you can skip whatever occurrence(s) of an RE you like (1st, 2nd, 1st and 2nd, whatever) and you only specify the RE once (as opposed to having to duplicate it for some alternative solutions).
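
As a sketch of how this generalizes, the occurrence number and the RE can be passed in as awk variables. The function name delete_nth and the variable names are hypothetical, introduced only for this example; note that awk processes backslash escapes in -v assignments, so an RE containing backslashes would need extra care.

```shell
# delete_nth N REGEX: hypothetical helper that reads stdin and drops only
# the Nth line matching REGEX, printing everything else unchanged.
delete_nth() {
  awk -v n="$1" -v re="$2" '($0 ~ re) && (++count == n) { next } 1'
}

sample=$(printf 'titi\ntata\ntoto\ntata\nfoo\ntata\nbar\n')
second_removed=$(printf '%s\n' "$sample" | delete_nth 2 tat)
printf '%s\n' "$second_removed"
```

Here the second matching line is removed and the first and third are kept.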

Clear, simple, obvious, easily maintainable, extensible, etc....

Ed Morton
2

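Here is one way of doing it with sed. It reads the whole file into the pattern space and then removes the first line that starts with tat; because the match needs a preceding newline, it does not apply when that line is the first line of the file: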

sed ':a;$!{N;ba};s/\ntat[^\n]*//' file
titi
toto
tata
jaypal singh
1

This might work for you (GNU sed):

sed '/pattern/{x;//!d;x}' file

Print all lines other than those containing the pattern as normal. Otherwise if the line contains the pattern and hold space does not (the first occurrence), delete that line.
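
A short trace may make the hold-space trick easier to follow. This sketch assumes GNU sed, uses tat as the pattern, and uses made-up input lines so the matches can be told apart.

```shell
# On a matching line: x swaps pattern space and hold space, so the pattern
# space now holds the previous matching line (empty for the first match).
# //!d reuses the last regex (/tat/): if the swapped-in text does not match,
# this is the first occurrence and the cycle ends without printing.
# Otherwise the second x swaps the current line back and it is printed.
result=$(printf 'titi\ntata one\ntoto\ntata two\ntata three\n' |
  sed '/tat/{x;//!d;x}')
printf '%s\n' "$result"
```

Only "tata one" is dropped; the later matches are printed as usual.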

potong
-1

You may find the first matching line number with grep and pass it to sed for deletion.

sed "$( (grep -nm1 tat file.txt || echo 1000000000:) | cut -f 1 -d:) d" file.txt

grep -n combined with cut finds the line number to be deleted. grep -m1 ensures at most one line number is found. echo handles the case when there is no match so as not to return an empty result. sed "[line number] d" deletes the line.
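
The same steps can be written out one at a time. This is a sketch, not the answer's exact command: it swaps the large sentinel line number for an explicit empty check, runs on a scratch copy with assumed names, and relies on grep -m, which GNU and BSD grep support but POSIX does not require.

```shell
# Scratch copy of the sample file so the example is self-contained.
workdir=$(mktemp -d)
file="$workdir/file.txt"
printf 'titi\ntata\ntoto\ntata\n' > "$file"

# Line number of the first matching line, or empty if nothing matches.
lineno=$(grep -n -m1 'tat' "$file" | cut -d: -f1)

if [ -n "$lineno" ]; then
  result=$(sed "${lineno}d" "$file")   # delete exactly that line
else
  result=$(cat "$file")                # no match: leave the content as is
fi
printf '%s\n' "$result"
```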

Vytenis Bivainis
  • Please roll your two answers into one answer, separated by a title or bold heading illustrating the two ways to do it, and the pros/cons of each. Code-only answers are low quality and subject to deletion; there should be at least one or two sentences explaining what question you are addressing, what the code does in layman's terms, and optionally other pitfalls. See: https://meta.stackoverflow.com/questions/345719/low-quality-posts-and-code-only-answers Address these and I'll remove the comment and downvote. – Eric Leschinski Sep 09 '17 at 20:53