delete text with delimiter in unix

Question

I have a text file in the format below. I need to remove the text between the first and second semicolons (the delimiter), but keep the second semicolon.

$ cat test.txt
abc;def;ghi;jkl
mno;pqr;stu,xxx

My expected output

abc;ghi;jkl
mno;stu,xxx

I tried using sed 's/^\([^;][^;]*\);.*$/\1/', but it removes everything after the first semicolon. I also tried cut -d ';' -f2, but this only gives the 2nd field as output.

Please do add your efforts in form of code in your question, which is highly encouraged on SO, thank you. — RavinderSingh13, Jun 03 '21 at 07:19
@RavinderSingh13 I tried using sed 's/^\([^;][^;]*\);.*$/\1/', but it removes everything after the first semicolon. I also tried cut -d ';' -f2, but this only gives the 2nd field as output. — Meenakshisundaram Ramanathan, Jun 03 '21 at 07:22
Sure, please add them to your question (comments are not meant for posting code), thank you. — RavinderSingh13, Jun 03 '21 at 07:24
See: [Delete specific columns from csv file maintaining same structure on output](https://stackoverflow.com/q/47813118/3776858) — Cyrus, Jun 03 '21 at 07:38

RavinderSingh13 · Answer 1 · 2021-06-03T12:16:28.740

To fix the OP's attempt with sed, you could try the following. The first capture group matches everything up to the first ;. The text from the first ; through the second ; is matched but not captured, and the rest of the line goes into the second capture group. The substitution then writes back the first group, a single ;, and the second group.

sed -E 's/^([^;]*);[^;]*;(.*)/\1;\2/' Input_file

Or, as per Ed's comment, try the following, which removes the first ; together with the second field and leaves the rest of the line untouched:

sed -E 's/^([^;]*);[^;]*/\1/' Input_file
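To see the two capture groups at work, here is a small sketch that feeds the question's sample lines through the first command. The function name `drop_second_field` is made up for illustration, and `-E` is assumed to be available (it is in GNU and BSD sed).

```shell
# Hypothetical helper: drop the text between the 1st and 2nd semicolon.
drop_second_field() {
  # \1 = everything before the 1st ';', \2 = everything after the 2nd ';'
  sed -E 's/^([^;]*);[^;]*;(.*)/\1;\2/'
}

# Sample lines taken from the question.
printf '%s\n' 'abc;def;ghi;jkl' 'mno;pqr;stu,xxx' | drop_second_field
```

A line with fewer than two semicolons does not match this pattern and passes through unchanged, whereas the shorter second command would turn abc;def into abc.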

David C. Rankin · Answer 2 · 2021-06-03T09:03:26.180


You can do it directly by removing the 2nd match of "zero or more non-semicolon characters followed by a semicolon", e.g.

sed 's/[^;]*;//2' test.txt

Example Use/Output

$ sed 's/[^;]*;//2' test.txt
abc;ghi;jkl
mno;stu,xxx

A thanks to @EdMorton for improvements here as well.

If you did want to use awk, you could simply replace the 2nd field with nothing as well, e.g.

awk -F';' '{sub(/;[^;]*/,"")}1' test.txt

(same output)

With a thanks to @EdMorton for the improvement to the original.

Or, as Cyrus suggests, with cut, deleting field 2, e.g.

cut -d';' -f-1,3- test.txt

(same output)

edited Jun 03 '21 at 09:03

answered Jun 03 '21 at 07:26

David C. Rankin


`cut -d';' -f2- test.txt` does not work as we expect for this question. It removes the first column, not the second. It essentially says "keep from column 2 and on". – billpcs Jun 03 '21 at 07:56
I had one more hair-brained `awk` scheme. What about looping over all fields and just not outputting the 2nd? A little longer, but avoids all potential issues? – David C. Rankin Jun 03 '21 at 09:04
That'd be fine too. Or with GNU awk you could use `gensub()` just like you use `sed`. – Ed Morton Jun 03 '21 at 09:07
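The looping idea from the comments above could look like the following sketch. It is not code from the thread: the function name `skip_second_field` is hypothetical, and it uses only plain awk features (no `gensub()`).

```shell
# Hypothetical helper: print every ';'-separated field except the 2nd.
skip_second_field() {
  awk -F';' '{
    out = $1                      # always keep the first field
    for (i = 3; i <= NF; i++)     # start at 3, so field 2 is never appended
      out = out ";" $i
    print out
  }'
}

# Sample lines taken from the question.
printf '%s\n' 'abc;def;ghi;jkl' 'mno;pqr;stu,xxx' | skip_second_field
```

Because the line is rebuilt field by field, empty fields elsewhere on the line are preserved: a;;c;;e becomes a;c;;e.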

score 3 · Answer 3 · answered Jun 03 '21 at 07:26


You may use this sed:

sed 's/;[^;]*//' file

abc;ghi;jkl
mno;stu,xxx

answered Jun 03 '21 at 07:26

anubhava


That's another creative way to get the 2nd occurrence `:)` – David C. Rankin Jun 03 '21 at 07:28

billpcs · Accepted Answer · 2021-06-03T07:47:17.237


Using cut

cut -d";" -f2 --complement file

-d sets the delimiter, i.e. ";" in your case
-f selects fields, i.e. keeps the fields listed
--complement inverts the selection, i.e. removes the fields listed instead

So:

$ cat test.txt
abc;def;ghi;jkl
mno;pqr;stu;xxx

$ cut -d";" -f2 --complement test.txt
abc;ghi;jkl
mno;stu;xxx

edited Jun 03 '21 at 07:47

answered Jun 03 '21 at 07:34

billpcs


You should mention that this will only work with GNU `cut`, not a POSIX `cut`. – Ed Morton Jun 03 '21 at 09:12

If the `--complement` option seems a bit unwieldy remember the option can be shortened to the first unambiguous possibility i.e `--comp` or even `--co`. – potong Jun 03 '21 at 12:02
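Since `--complement` is a GNU extension, a portable alternative is to list the fields to keep. The sketch below uses the field list `1,3-` (field 1, then field 3 through the end of the line), and the function name `cut_without_second` is hypothetical.

```shell
# Hypothetical helper: same result as -f2 --complement, without GNU options.
cut_without_second() {
  # Keep field 1 and fields 3 onward; field 2 is simply never selected.
  cut -d';' -f1,3-
}

# Sample lines taken from this answer's test.txt.
printf '%s\n' 'abc;def;ghi;jkl' 'mno;pqr;stu;xxx' | cut_without_second
```

This prints abc;ghi;jkl and mno;stu;xxx, the same output as the `--complement` form.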

RARE Kpop Manifesto · Answer 5 · 2021-06-05T14:01:31.313


super lazy awk solution

gawk/mawk/mawk2 'sub(/;[^;]+/,"")'

a more verbose solution but makes it clearer what it's doing

g/mawk 'BEGIN {FS=";+"; OFS=";"} ($2="")||($0=$0)&&($1=$1)'

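$2="" clears the 2nd field. Because that assignment yields the empty string, which is false, the logical or (||) is needed to go on to the remaining expressions.

$0=$0 re-splits the record and $1=$1 rebuilds it with OFS, which cleans up the extra ;. Since the pattern as a whole is then true, awk also prints the line.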
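A minimal sketch of the same idea written as an ordinary action block, so the line is printed unconditionally instead of depending on the truth value of the pattern. The function name `blank_second_field` is hypothetical and the plain `awk` command is an assumption (the answer names gawk and mawk).

```shell
# Hypothetical helper: same steps as the one-liner, spelled out.
blank_second_field() {
  awk 'BEGIN { FS = ";+"; OFS = ";" }
  {
    $2 = ""     # empty the 2nd field; the record now contains ";;"
    $0 = $0     # re-split: FS=";+" treats ";;" as a single separator
    $1 = $1     # rebuild the record with single ";" separators
    print
  }'
}

printf '%s\n' 'abc;def;ghi;jkl' | blank_second_field   # prints abc;ghi;jkl
printf '%s\n' 'a;b;;d;e' | blank_second_field          # prints a;d;e
```

The second call shows a side effect of FS=";+": a run of semicolons counts as one separator, so a field that was already empty is dropped along with the 2nd field.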

edited Jun 05 '21 at 14:01

answered Jun 05 '21 at 13:53

RARE Kpop Manifesto
