8

I want to remove a pattern with sed, but only at its second occurrence on each line.

What's in the file.csv:

a,Name(null)abc.csv,c,d,Name(null)abc.csv,f
a,Name(null)acb.csv,c,d,Name(null)acb.csv,f
a,Name(null)cba.csv,c,d,Name(null)cba.csv,f

Output wanted:

a,Name(null)abc.csv,c,d,Name,f
a,Name(null)acb.csv,c,d,Name,f
a,Name(null)cba.csv,c,d,Name,f

This is what I tried:

sed -r 's/(\(null)\).*csv//' file.csv

The problem here is that the regex is too greedy, and I cannot make it stop. I also tried this, to skip the first occurrence of "null":

sed -r '0,/null/! s/(\(null)\).*csv//' file.csv

I also tried this, but the greedy regex is still the problem:

sed -r 's/(\(null)\).*csv//2' file.csv

I've read that ? can make the regex "lazy", but I cannot make it work:

sed -r 's/(\(null)\).*?csv//' file.csv
jww
BeGreen
  • If you may have 3 or more `(null)`s and you still want to only remove the 2nd occurrence, I think it would be easier to do with perl, using `.*?` instead of `.*`. – Wiktor Stribiżew Sep 15 '17 at 11:55

3 Answers

18

sed provides an easy way to specify which match to replace: add the occurrence number as a flag after the final delimiter.

$ sed 's/(null)[^.]*\.csv//2' ip.csv
a,Name(null)abc.csv,c,d,Name,f
a,Name(null)acb.csv,c,d,Name,f
a,Name(null)cba.csv,c,d,Name,f

$ # or [^,] if there are no , within fields
$ sed 's/(null)[^,]*//2' ip.csv
a,Name(null)abc.csv,c,d,Name,f
a,Name(null)acb.csv,c,d,Name,f
a,Name(null)cba.csv,c,d,Name,f

Also, there is no need to escape ( and ) when not using extended regular expressions.
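To see why the number flag alone was not enough, here is a small side-by-side sketch on one sample line. The variable names (line, greedy, bounded) are only illustrative.

```shell
line='a,Name(null)abc.csv,c,d,Name(null)abc.csv,f'

# Greedy: .* runs up to the LAST "csv", so the first match already
# swallows both occurrences and the 2 flag finds no second match.
greedy=$(printf '%s\n' "$line" | sed 's/(null).*csv//2')

# Negated class: [^,]* cannot cross a comma, so each field is matched
# separately and the 2 flag removes only the second match.
bounded=$(printf '%s\n' "$line" | sed 's/(null)[^,]*//2')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

The greedy version prints the line unchanged, while the bounded version removes `(null)abc.csv` from the fifth field only.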

Sundeep
  • 1
    I had tried that, if you look closer at my post. The problem was the greedy regex. I had to replace `.*` with `[^,]*`, as in your example. Thank you. – BeGreen Sep 15 '17 at 12:41
  • 2
    well I didn't notice that you had tried `//1` (later edited to `//2`) ... so you were only put off by greedy issue... easy to solve in this case as there are workarounds with `[^,]` or `[^.]`... for generic case you might need proper csv parsers available in perl/python/etc – Sundeep Sep 15 '17 at 12:49
  • 1
    You are right, I could have done this with pyexcel, which I use in my script. Didn't think about that! – BeGreen Sep 15 '17 at 12:53
  • 1
    ahhh, this is exactly what I needed as well, Thanks! – Radamand Nov 02 '20 at 05:27
0

A more robust awk solution:

Extended sample file input.csv:

12,Name(null)randomstuff.csv,2,3,Name(null)randomstuff.csv, false,Name(null)randomstuff.csv
12,Name(null)AotherRandomStuff.csv,2,3,Name(null)AotherRandomStuff.csv, false,Name(null)randomstuff.csv
12,Name(null)alphaNumRandom.csv,2,3,Name(null)alphaNumRandom.csv, false,Name(null)randomstuff.csv

The job:

awk -F, '{ c=0; for(i=1;i<=NF;i++) if($i~/\(null\)/ && c++==1) sub(/\(null\).*/,"",$i) }1' OFS=',' input.csv

The output:

12,Name(null)randomstuff.csv,2,3,Name, false,Name(null)randomstuff.csv
12,Name(null)AotherRandomStuff.csv,2,3,Name, false,Name(null)randomstuff.csv
12,Name(null)alphaNumRandom.csv,2,3,Name, false,Name(null)randomstuff.csv
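The same loop can be generalized so the occurrence number is a parameter rather than hard-coded. This is a minimal sketch, assuming fields never contain embedded commas; the function name remove_nth_null and the variable n are hypothetical.

```shell
# Hypothetical helper: strip "(null)..." from the n-th field that contains it.
# $1 is the occurrence number; input is read from stdin.
remove_nth_null() {
  awk -F, -v OFS=, -v n="$1" '{
    c = 0                                  # per-line counter of matching fields
    for (i = 1; i <= NF; i++)
      if ($i ~ /\(null\)/ && ++c == n)     # only the n-th matching field
        sub(/\(null\).*/, "", $i)
  } 1'
}

# Usage: remove the third occurrence instead of the second
printf '%s\n' '12,Name(null)a.csv,2,3,Name(null)b.csv, false,Name(null)c.csv' | remove_nth_null 3
```

If a line has fewer than n matching fields, it is printed unchanged.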
RomanPerekhrest
-3

Execute:

awk '{sub(/.null.....csv,f/,",f")}1' file

And the output should be:

a,Name(null)abc.csv,c,d,Name,f
a,Name(null)acb.csv,c,d,Name,f
a,Name(null)cba.csv,c,d,Name,f
noamt
Claes Wikner