Description
First off, regex isn't the ideal tool for this, but I'm sure you have your reasons for using it.
((\b[a-z]{1,}\b).*?)(\b\2\b)(.*)$
Replace with: \1and\4

Summary
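This regex finds two identical words on a line and replaces the second one with and. The sample output below appears to assume the case-insensitive (i) and multiline (m) flags, applied globally: without i, Green and green would not count as the same word, and without m the $ would not match at the end of each line.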
Example
Live Demo
https://regex101.com/r/yG3yM6/2
Sample text
Green shirt green hat
Green shirt greenish hat
You are an artistically gifted musically gifted individual
Sample Matches (after replacement)
Green shirt and hat
Green shirt greenish hat
You are an artistically gifted musically and individual
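The replacement above can be reproduced outside regex101. The following is a minimal Python sketch, not part of the original answer: the function name replace_second_occurrence is hypothetical, and the IGNORECASE and MULTILINE flags are assumptions chosen to match the sample output.

```python
import re

# Pattern copied from the answer. The flags are an assumption:
# IGNORECASE lets "Green" match "green", MULTILINE lets $ match at
# the end of every line rather than only at the end of the string.
PATTERN = re.compile(
    r"((\b[a-z]{1,}\b).*?)(\b\2\b)(.*)$",
    re.IGNORECASE | re.MULTILINE,
)


def replace_second_occurrence(text: str) -> str:
    """Replace the second occurrence of a repeated word with 'and'."""
    # \1 is everything up to the repeated word, \4 is the rest of the line.
    return PATTERN.sub(r"\1and\4", text)


samples = [
    "Green shirt green hat",
    "Green shirt greenish hat",
    "You are an artistically gifted musically gifted individual",
]

for sample in samples:
    print(replace_second_occurrence(sample))
```

Because group 4 consumes the rest of the line, only one repeated word is replaced per line.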
Explanation
NODE          EXPLANATION
----------------------------------------------------------------------
(             group and capture to \1:
----------------------------------------------------------------------
(             group and capture to \2:
----------------------------------------------------------------------
\b            the boundary between a word char (\w)
              and something that is not a word char
----------------------------------------------------------------------
[a-z]{1,}     any character of: 'a' to 'z' (at least
              1 times (matching the most amount
              possible))
----------------------------------------------------------------------
\b            the boundary between a word char (\w)
              and something that is not a word char
----------------------------------------------------------------------
)             end of \2
----------------------------------------------------------------------
.*?           any character except \n (0 or more times
              (matching the least amount possible))
----------------------------------------------------------------------
)             end of \1
----------------------------------------------------------------------
(             group and capture to \3:
----------------------------------------------------------------------
\b            the boundary between a word char (\w)
              and something that is not a word char
----------------------------------------------------------------------
\2            what was matched by capture \2
----------------------------------------------------------------------
\b            the boundary between a word char (\w)
              and something that is not a word char
----------------------------------------------------------------------
)             end of \3
----------------------------------------------------------------------
(             group and capture to \4:
----------------------------------------------------------------------
.*            any character except \n (0 or more times
              (matching the most amount possible))
----------------------------------------------------------------------
)             end of \4
----------------------------------------------------------------------
$             before an optional \n, and the end of a
              "line"
----------------------------------------------------------------------
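To see how the four capture groups in the breakdown divide up a line, the sketch below prints them for the first sample. It is an illustration only, written in Python with the case-insensitive flag assumed.

```python
import re

# Same pattern as in the answer; IGNORECASE is an assumption so that
# "Green" and "green" are treated as the same word.
pattern = re.compile(r"((\b[a-z]{1,}\b).*?)(\b\2\b)(.*)$", re.IGNORECASE)

match = pattern.search("Green shirt green hat")
groups = match.groups() if match else ()

# Group 1: text up to the repeat, group 2: the first word,
# group 3: the repeated word, group 4: the rest of the line.
for number, value in enumerate(groups, start=1):
    print(number, repr(value))
```

Group 3 is the only part dropped by the replacement \1and\4, which is why the repeated word disappears while the text on either side is kept.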
Extra credit
Although not addressed in the OP, if the words in question use characters outside a-z, you could replace [a-z] with [a-z]|[^\x00-\x7F], which also matches any non-ASCII character. The \b\2\b then needs to become (?<=\s|^)\2(?=\s|$) to ensure correct matching, because \b is defined in terms of \w, which in many engines covers only ASCII word characters unless a Unicode mode is enabled.
((\b(?:[a-z]|[^\x00-\x7F]){1,}\b).*?)((?<=\s|^)\2(?=\s|$))(.*)$
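A rough Python adaptation of this extended pattern follows; treat it as a sketch under stated assumptions rather than a drop-in translation. Python's re module rejects (?<=\s|^) because the two lookbehind alternatives have different widths, so the sketch rewrites it as (?:(?<=\s)|^). The function name replace_second_word and the flags are assumptions as well. Note that Python's \b is already Unicode-aware for str patterns, so the lookaround form is kept here mainly to mirror the answer.

```python
import re

# Extended pattern from the answer, with one change for Python's re:
# (?<=\s|^) is written as (?:(?<=\s)|^) because re only accepts
# fixed-width lookbehind. Flags are an assumption.
UNICODE_PATTERN = re.compile(
    r"((\b(?:[a-z]|[^\x00-\x7F]){1,}\b).*?)"
    r"((?:(?<=\s)|^)\2(?=\s|$))(.*)$",
    re.IGNORECASE | re.MULTILINE,
)


def replace_second_word(text: str) -> str:
    """Replace the second occurrence of a repeated word with 'and'."""
    return UNICODE_PATTERN.sub(r"\1and\4", text)


# "caf\u00e9" is "cafe" with an acute accent on the final letter.
print(replace_second_word("caf\u00e9 noir caf\u00e9 au lait"))
print(replace_second_word("caf\u00e9 noir caf\u00e9s"))
```

The second call leaves its input unchanged, because the lookahead requires whitespace or the end of the line right after the repeated word.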

Live Demo
https://regex101.com/r/wD8yF5/2