1

I am trying to filter a file, let's say:

76ers23 Philadelphia 76ers announced today that
76ers24 Lakers announced today 
76ers25 blazers  plays today 
76ers26 celics announced today that
76ers27 Bonston has Day off
76ers28 Philadelphia 76ers announced today that
76ers29 the blazzers announced today that
76ers30 76ers Training day
76ers31 Philadelphia 76ers has a day off  today 
76ers32 Philadelphia 76ers  humiliate Lakers 
76ers33 celics announced today that

I want to remove all the entries containing the term 76ers from the second column so as to obtain:

 76ers24    Lakers announced today 
 76ers25    blazers  plays today 
 76ers26    celics announced today that
 76ers27    Bonston has Day off
 76ers29    the blazzers announced today that
 76ers33    celics announced today that

My issue is that grep -v "76ers" returns nothing, because every line contains 76ers in the first column.

I am looking to apply grep (or another command) to the second column only.

I found this complicated way, which does pretty much what I want, but I get an _ at the beginning of the second column.

cat file|awk '{print $1}' >file1
cat file|awk '{$1="";print $0}'|tr -s ' ' | tr ' ' '_' >file2
paste file1 file2 |grep -v "_76ers"

I'm not a bash expert, so I guess there is an easier way to do this. Thank you in advance!

KGee
  • @Barmar Indeed. So that's why I tried to "find a trick" to hide the first column and use grep -v on the entire text. :-( – KGee Dec 23 '20 at 21:10
  • Awk, like basically every file oriented utility (except `tr`!) accepts zero or more file name arguments; it doesn't need `cat` to feed it standard input. See also [useless use of `cat`.](https://stackoverflow.com/questions/11710552/useless-use-of-cat) – tripleee Dec 23 '20 at 21:14
  • In your example, adding a space in front of the pattern would already do the trick: `grep -v ' 76ers'`. – Socowi Dec 23 '20 at 21:18
  • @Socowi That's right too, but sometimes the space doesn't work properly... – KGee Dec 23 '20 at 21:32

6 Answers

5

Use a regular expression that skips over the first column.

grep -v '^[^ ]* .*76ers' file

[^ ]* matches everything up to the first space.
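
To see the pattern at work, here is a small sketch run against a few of the lines from the question. The temporary file and the variable names sample and result are only there for illustration, and it assumes the columns are separated by a single space, as in the sample data.

```shell
# Build a small sample file (a subset of the data from the question).
sample=$(mktemp)
cat > "$sample" <<'EOF'
76ers23 Philadelphia 76ers announced today that
76ers24 Lakers announced today
76ers30 76ers Training day
76ers33 celics announced today that
EOF

# '^[^ ]* ' consumes the first column and the space after it,
# so '.*76ers' can only match somewhere in the remaining columns.
result=$(grep -v '^[^ ]* .*76ers' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Only the 76ers24 and 76ers33 lines should be printed, since those are the ones with no 76ers after the first column.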

Barmar
  • First of all, thank you for the response. I tried the command you suggested but it returns null :-( – KGee Dec 23 '20 at 21:33
  • Try it now. It was matching an empty string at the beginning, but now I forced it to match up to the space. – Barmar Dec 23 '20 at 21:36
  • Yes, you are absolutely right... To be fair, you posted the solution first, so I have to accept yours over @Scottie H's response. P.S. You make it look so simple, and I was struggling for many hours without success :D !!! Thank you again – KGee Dec 23 '20 at 21:44
2

using awk:

awk '{ found=0;for(i=2;i<=NF;i++) { if (match($i,"76ers")) { found=1 } } if (found==0) { print $0 } }' file

Loop from the second space-separated field through the last field and use match to check whether that field contains 76ers. If it does, set a found flag. Print the line only if found is still 0 after all of its fields have been checked.
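
The same loop can be laid out over several lines, which may be easier to follow. This is only a sketch of the idea: it swaps match for index (a plain substring test) and adds a break once a hit is found. Both are changes of mine rather than part of the one-liner above, and the sample file and variable names are made up for the demonstration.

```shell
# A few lines from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
76ers27 Bonston has Day off
76ers28 Philadelphia 76ers announced today that
76ers29 the blazzers announced today that
76ers32 Philadelphia 76ers  humiliate Lakers
EOF

result=$(awk '
{
    found = 0
    # Start at field 2 so the first column is never inspected.
    for (i = 2; i <= NF; i++) {
        if (index($i, "76ers") > 0) {   # substring test, no regex involved
            found = 1
            break                       # one hit is enough for this line
        }
    }
    if (!found) print
}' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

On this input, the 76ers27 and 76ers29 lines should be the only ones printed.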

Raman Sailopal
1

You can create an Extended Regular Expression to ignore the first column. Not knowing exactly what your "flavor" of OS is, I'll give you two different formats.

grep -E is the same as egrep
[[:digit:]] is the same as [0-9]
[[:space:]] is the same as [ ] (a space), and also matches tabs and other whitespace

First option: Look for 76ers with white space after it:
grep -Ev '76ers[[:space:]]' <file>

Second option: Look for 76ers, followed by one or more digits, then any characters, then a second 76ers:
grep -Ev '76ers[[:digit:]][[:digit:]]*.*76ers' <filename>
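
As a quick check, both options can be run side by side on a few lines from the question. This is just a sketch; the temporary file and the variable names first and second are made up for the demonstration.

```shell
# Three lines from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
76ers25 blazers  plays today
76ers28 Philadelphia 76ers announced today that
76ers30 76ers Training day
EOF

# First option: drop lines where "76ers" is directly followed by whitespace.
first=$(grep -Ev '76ers[[:space:]]' "$sample")

# Second option: drop lines with "76ers", digits, then a second "76ers" later on.
second=$(grep -Ev '76ers[[:digit:]][[:digit:]]*.*76ers' "$sample")

printf 'first option:  %s\n' "$first"
printf 'second option: %s\n' "$second"

rm -f "$sample"
```

Both options should keep only the 76ers25 line here. Note that the first option relies on a whitespace character following 76ers, so it would not catch a line that ends in 76ers with nothing after it.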

Scottie H
  • Both commands work great!!!! I also tried it after a command, like command|grep -Ev '76ers[[:space:]]'. Thanks a lot!!! – KGee Dec 23 '20 at 21:36
1

With GNU grep, requiring that the match is "whole word" with the -w/--word-regexp option:

grep -vw '76ers' infile

From the manual:

-w
--word-regexp
Select only those lines containing matches that form whole words. The test is that the matching substring must either be at the beginning of the line, or preceded by a non-word constituent character. Similarly, it must be either at the end of the line or followed by a non-word constituent character. Word constituent characters are letters, digits, and the underscore. This option has no effect if -x is also specified.
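
A small sketch of what the whole-word rule means for this data. The last sample line (76ers40, with "76ersfans") is hypothetical and not from the question; it is there to show a case where -w does not treat 76ers as a match.

```shell
# Three lines from the question plus one hypothetical line (76ers40).
sample=$(mktemp)
cat > "$sample" <<'EOF'
76ers24 Lakers announced today
76ers30 76ers Training day
76ers31 Philadelphia 76ers has a day off  today
76ers40 the 76ersfans cheered
EOF

# "76ers24" is not a whole-word match because a digit follows "76ers",
# so the first column never triggers the exclusion.
result=$(grep -vw '76ers' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

The 76ers24 line is kept, as wanted. The hypothetical 76ers40 line is kept as well, because "76ersfans" is not a whole word either, so this approach fits only when 76ers in the later columns always stands alone.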

Benjamin W.
  • Yes indeed. In this case it works fine, as "76ers" was not followed by any other word. :-) – KGee Dec 23 '20 at 22:09
1

Here is an alternative approach using awk. Similar to Barmar's idea, ensure that the first column does not match the ERE.

$ awk -v ere='76ers' '$0~ere && $1!~ere' file

This will print all the records/lines which match the regular expression ere ($0~ere) but only if the first column does not match that regular expression $1!~ere.
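
On the question's sample every first column contains 76ers, so a variant that tests only the text after the first column may be closer to the stated goal of removing lines. What follows is a minimal sketch of that idea, not the command above: it assumes whitespace-separated columns, and the variable name rest is made up for illustration.

```shell
# Four lines from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
76ers23 Philadelphia 76ers announced today that
76ers26 celics announced today that
76ers30 76ers Training day
76ers33 celics announced today that
EOF

result=$(awk -v ere='76ers' '
{
    rest = $0
    # Strip leading whitespace, the first column, and the separator after it.
    sub(/^[ \t]*[^ \t]+[ \t]*/, "", rest)
}
# Print the line only if the remaining columns do not match the ERE.
rest !~ ere' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Here the 76ers26 and 76ers33 lines should be printed, since nothing after their first column matches.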

kvantour
  • Thanks a lot for the response. It looks simpler than the other solutions, and clearer and more logical for an amateur like me. Unfortunately, I tried your command but it prints "null". – KGee Dec 24 '20 at 10:01
  • Sorry for my previous comment. Actually your command was fine; I just had to add an extra space in front of 76ers, like `awk -v ere=' 76ers' '$0~ere && $1!~ere' file` – KGee Dec 24 '20 at 10:03
0
$ grep -v ' .*76ers' file
76ers24 Lakers announced today
76ers25 blazers  plays today
76ers26 celics announced today that
76ers27 Bonston has Day off
76ers29 the blazzers announced today that
76ers33 celics announced today that
Ed Morton