Given the string: lolololololol

grep does not report overlapping matches when I try to count the occurrences of lol.

echo "lolololololol" | grep -o 'lol' | wc -l

The above snippet returns 3, when the answer I would like is 6, since lol occurs 6 times if overlapping occurrences are counted. For this case I know I could simply grep for lo, but this example is meant to represent the general question of how to get grep, or some other search tool, to find overlapping matches.

Other examples

echo "1 2 3 4 5" | grep -E '(^| )[0-9]( |$)'
echo "10101" | grep '101'

The best solution can be found at https://unix.stackexchange.com/questions/276159/grep-that-works-with-overlapping-patterns
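
For reference, a minimal sketch of one way to count overlapping occurrences of a fixed string in plain bash, without grep. It is not necessarily the approach from the linked answer; the function name `count_overlaps` is made up for illustration, and it handles literal strings only, not regular expressions.

```shell
#!/bin/bash
# count_overlaps NEEDLE HAYSTACK (hypothetical helper, literal strings only)
# Slides a window of NEEDLE's length over HAYSTACK one character at a time,
# so a match may start inside the previous match.
count_overlaps() {
    local needle=$1 haystack=$2
    local n=${#needle} count=0 i
    if (( n == 0 )); then
        # An empty needle is treated as having no occurrences.
        echo 0
        return 0
    fi
    for (( i = 0; i + n <= ${#haystack}; i++ )); do
        if [[ ${haystack:i:n} == "$needle" ]]; then
            count=$(( count + 1 ))
        fi
    done
    echo "$count"
}

count_overlaps lol lolololololol   # prints 6
count_overlaps 101 10101           # prints 2
```

Where GNU grep is built with PCRE support, a lookahead such as `grep -oP 'l(?=ol)'` piped to `wc -l` should give the same count for the first example, because each match consumes only the first character and the rest is checked without being consumed.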

Barak Binyamin
  • Good that you have shown your efforts in your question. Could you please do add sample of input and expected output in your question and let us know then for better understanding of question. – RavinderSingh13 Nov 03 '20 at 19:27
  • I don't understand why `[^ ]+` is insufficient. Can you provide the current output versus desired output? – MonkeyZeus Nov 03 '20 at 19:27
  • This part `[^ ]` matches any char except a space, is that intended? – The fourth bird Nov 03 '20 at 19:28
  • Looks like the question is oversimplified and the real problem is quite different. Please explain what real-life issue you are having. At least, please provide the expected output for the sample input given. – Wiktor Stribiżew Nov 03 '20 at 19:28
  • After you added the expected output, "*I don't understand why `[^ ]` is insufficient*" x 2. – Wiktor Stribiżew Nov 03 '20 at 19:32
  • @WiktorStribiżew I think I get it now. It has to do with subsequences like you showed on https://stackoverflow.com/q/64579766/2191572 – MonkeyZeus Nov 03 '20 at 19:41
  • You're using the wrong tool. `$: while IFS="$IFS'" read -ra hits; do (( ${#hits[@]} )) && echo "${hits[@]} (${#hits[@]})"; done < txt` – Paul Hodges Nov 03 '20 at 20:41
  • If you have found a [good solution](https://unix.stackexchange.com/questions/276159/grep-that-works-with-overlapping-patterns) either mark this question as a duplicate of that other question, or reproduce that good solution as an answer here. – Lenna Nov 04 '20 at 23:53
  • you need reputation to mark a question as a duplicate – Barak Binyamin Nov 05 '20 at 05:50

1 Answer

It looks like you expect to capture the space from the end of one match into the start of the next match:

.*?(?=( [^\r\n ]+ ))

In grep:

#!/bin/bash

string=" 1 -1 1 
 a bc d 
 alpha beta delta "

echo "$string" | grep -Po '.*?(?=( [^\r\n ]+ ))'

When run on https://www.jdoodle.com/test-bash-shell-script-online/ the output doesn't visibly show the spaces, but I believe they are there.
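
One way to check for spaces that a terminal or online runner does not render visibly is to pipe the output through `sed -n l`, which POSIX specifies as writing each line in a visually unambiguous form with `$` marking the end of the line. A small sketch, with `show_whitespace` as a made-up helper name and a sample input rather than the grep output above:

```shell
#!/bin/bash
# show_whitespace (hypothetical helper): mark each line end with '$'
# so that trailing spaces become visible.
show_whitespace() {
    sed -n l
}

printf ' 1 \n-1\n' | show_whitespace
# prints:
#  1 $
# -1$
```

The same filter can be appended to the grep pipeline to see exactly which characters each match contains.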

MonkeyZeus