As in the example below, I want to keep only the text before the first 'John'.

However, the pattern I applied seems to match from the last John rather than the first, so I have to call sed twice.

How can I do this correctly?

PATTERN="I am John, you are John also"
OUTPUT=$( echo "$PATTERN" | sed -r "s/(^.*)([ \t]*John[ ,\t]*)(.*$)/\1/" )
echo "$OUTPUT"
OUTPUT=$( echo "$OUTPUT" | sed -r "s/(^.*)([ \t]*John[ ,\t]*)(.*$)/\1/" )
echo "$OUTPUT"

I expect to call sed only once, because repeating the call becomes a problem if "John" appears several times.

The procedure above produces the output below: the first call trims everything from the final John onward, and the second call then trims from the first John.

I am John, you are

I am

I want to run sed once and get

I am

Peter Mortensen
chang jc
  • Try [`awk -F'[[:blank:]]*John[[:blank:]]*' '{ print $1 }'`](http://rextester.com/GWJ61030) – Wiktor Stribiżew May 11 '18 at 07:25
  • Don't use `ALL_UPPERCASE` variables in the shell. Those tend to be used for system stuff (e.g. `HOME`) or have a special meaning to the shell itself (e.g. `PATH`, `RANDOM`). – melpomene May 11 '18 at 07:31
  • @chang jc, please always mention the expected output in your post too. – RavinderSingh13 May 11 '18 at 07:38
  • @chang, I'd suggest reading up on greediness: `.*` will try to match as much as possible and then backtrack from the end if needed. https://stackoverflow.com/questions/5319840/greedy-vs-reluctant-vs-possessive-quantifiers might help. – Sundeep May 11 '18 at 07:54
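
The greediness point can be checked directly. The following is only a minimal sketch, assuming a sed that accepts `-E` (GNU and BSD sed do). The variable names `s`, `greedy`, and `first` are illustrative and do not come from the question.

```shell
s="I am John, you are John also"

# A leading greedy .* consumes as much as it can, so the capture group
# ends just before the LAST John (and keeps the space in front of it).
greedy=$(printf '%s\n' "$s" | sed -E 's/^(.*)[[:blank:]]*John.*$/\1/')

# Without a leading .*, sed starts at the leftmost possible match,
# which is the whitespace before the FIRST John, and deletes to end of line.
first=$(printf '%s\n' "$s" | sed -E 's/[[:space:]]*John.*$//')

# Brackets make any trailing space visible.
printf '[%s]\n' "$greedy" "$first"
```

The second form needs only one sed call however many times "John" occurs, because nothing before the match has to be captured.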

3 Answers

The following sed command may help.

echo "I am John, you are John also" | sed 's/ John.*//'

Or with variables:

pattern="I am John, you are John also"
output=$(echo "$pattern" | sed 's/ John.*//')
RavinderSingh13

Another way of doing it is to use grep in Perl-compatible regex mode (`-P`, a GNU grep option):

echo "I am John, you are John also" | grep -oP '^(?:(?!John).)*';
I am 
#there will be a space at the end
echo "I am John, you are John also" | grep -oP '^(?:(?!John).)*(?=\s)';
I am
#there is no space at the end

Regex explanations:

^(?:(?!John).)*

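This matches every character from the beginning of the line up to, but not including, the first John: the negative lookahead (?!John) is checked before each character is consumed, so the match stops as soon as "John" would start.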
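
To see how the lookahead behaves on different inputs, here is a small sketch. It assumes GNU grep built with PCRE support for `-P`. The variable names and the second test string are made up for illustration.

```shell
s="I am John, you are John also"

# Text before the first John; the trailing (?=\s) makes the match
# back off so that it ends just before a whitespace character.
before=$(printf '%s\n' "$s" | grep -oP '^(?:(?!John).)*(?=\s)')

# On a line that never contains John, the plain pattern matches the whole line.
no_john=$(printf '%s\n' "no such name here" | grep -oP '^(?:(?!John).)*')

# Caveat: with (?=\s) the same line is cut before its last whitespace,
# because the match must be followed by a whitespace character.
truncated=$(printf '%s\n' "no such name here" | grep -oP '^(?:(?!John).)*(?=\s)')

printf '[%s]\n' "$before" "$no_john" "$truncated"
```

If lines without "John" can occur in the input, the variant without `(?=\s)` followed by a separate trim of trailing whitespace may be the safer choice.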

Allan

Awk solution:

s="I am John, you are John also and there is another John there"
awk '{ sub(/[[:space:]]+John.*/, "") }1' <<<"$s"

The output:

I am
RomanPerekhrest
  • I paid attention to OP's input that has only a single space :) – anubhava May 11 '18 at 09:48
  • @anubhava, we professionals should aim for general solutions, to head off the frequent comments like *"This won't work if ..."* or *"This won't work in case of ..."*. The difference is that posters need a solution for themselves, but we provide solutions for the community. That's all – RomanPerekhrest May 11 '18 at 09:53
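
The difference discussed in these comments (a single literal space versus any run of whitespace) can be illustrated with a short sketch. It uses sed rather than awk purely for brevity. The input string with a tab before "John" is a made-up test case, not the OP's data.

```shell
# Hypothetical input: a space, a tab, and another space precede the first John.
s=$(printf 'I am \t John, you are John also')

# A single literal space removes only the one space right before John,
# so the earlier space and the tab are left at the end of the result.
one_space=$(printf '%s\n' "$s" | sed 's/ John.*//')

# A whitespace class with + removes the whole run of blanks before John.
any_space=$(printf '%s\n' "$s" | sed -E 's/[[:space:]]+John.*//')

printf '[%s]\n' "$one_space" "$any_space"
```

For the OP's exact input both forms give the same result. They differ only when tabs or repeated spaces come before the name.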