As far as I can tell, you're looking for a ' where either the next or previous character is not a letter.
Here is the regex I came up with to do this, wrapped in some test code:
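String str = "bob can't do 'well'";
// Split on runs of delimiters; a ' between two letters is kept as part of the word.
String[] splits = str.split("(?:(?<=^|[^a-zA-Z])'|'(?=[^a-zA-Z]|$)|[^a-zA-Z'])+");
System.out.println(Arrays.toString(splits)); // needs java.util.Arrays; prints [bob, can't, do, well]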
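One thing to watch for: if the input starts with a delimiter (for example a leading quote), split returns an empty first element, because Java keeps the empty substring before a positive-width match at the start of the string. A minimal sketch of one way to handle that follows; the class name ApostropheSplit and the helper words are hypothetical names of mine, not anything from the question.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class ApostropheSplit {
    // Same regex as above, compiled once so it can be reused.
    private static final Pattern DELIMS =
        Pattern.compile("(?:(?<=^|[^a-zA-Z])'|'(?=[^a-zA-Z]|$)|[^a-zA-Z'])+");

    // Hypothetical helper: splits the input into words and drops empty tokens,
    // such as the one produced when the input starts with a delimiter.
    static String[] words(String input) {
        return Arrays.stream(DELIMS.split(input))
                     .filter(s -> !s.isEmpty())
                     .toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(words("bob can't do 'well'"))); // [bob, can't, do, well]
        System.out.println(Arrays.toString(words("'well' done")));         // [well, done]
    }
}
```

Whether you want to drop the empty token or keep it depends on what you do with the result; filtering is just the simplest option.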
Explanation:
(?<=^|[^a-zA-Z])' - matches a ' where the previous character is not a letter, or we're at the start of the string.
'(?=[^a-zA-Z]|$) - matches a ' where the next character is not a letter, or we're at the end of the string.
[^a-zA-Z'] - matches any character that is neither a letter nor a '.
(?:...)+ - one or more of any of the above (the ?: is just to make it a non-capturing group).
See the java.util.regex.Pattern documentation for more on regex lookaround ((?<=...) and (?=...)).
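To see what each alternative contributes, you can run them one at a time against the test string and print where they match. This is only an illustrative sketch; LookaroundDemo and matchStarts are hypothetical names I made up for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaroundDemo {
    // Returns the start index of every match of regex in input.
    static List<Integer> matchStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String str = "bob can't do 'well'";
        // Only the opening quote before "well" has a non-letter before it.
        System.out.println(matchStarts("(?<=^|[^a-zA-Z])'", str)); // [13]
        // Only the closing quote after "well" has no letter after it.
        System.out.println(matchStarts("'(?=[^a-zA-Z]|$)", str));  // [18]
        // The three spaces; the ' in "can't" (index 7) is matched by none of the three.
        System.out.println(matchStarts("[^a-zA-Z']", str));        // [3, 9, 12]
    }
}
```

The ' in can't never shows up in any of the lists, which is why it survives the split.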
Simplification:
The regex can be simplified to the below by using negative lookaround:
"(?:(?<![a-zA-Z])'|'(?![a-zA-Z])|[^a-zA-Z'])+"
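The simplification works because a negative lookaround already covers the string boundaries: (?<![a-zA-Z]) succeeds at the start of the string (there is no previous character, so certainly no letter), and (?![a-zA-Z]) succeeds at the end for the same reason. That makes the explicit ^ and $ unnecessary. Below is a rough sanity check that both versions split a few sample inputs the same way; the class name, the sameSplit helper and the sample strings are my own assumptions, and passing on a handful of inputs is not a proof of equivalence.

```java
import java.util.Arrays;

public class SimplifiedCheck {
    static final String ORIGINAL =
        "(?:(?<=^|[^a-zA-Z])'|'(?=[^a-zA-Z]|$)|[^a-zA-Z'])+";
    static final String SIMPLIFIED =
        "(?:(?<![a-zA-Z])'|'(?![a-zA-Z])|[^a-zA-Z'])+";

    // True if both regexes split the input into the same tokens.
    static boolean sameSplit(String input) {
        return Arrays.equals(input.split(ORIGINAL), input.split(SIMPLIFIED));
    }

    public static void main(String[] args) {
        // Hypothetical sample inputs, including quotes at the string boundaries.
        String[] samples = {
            "bob can't do 'well'", "'quoted' start", "rock 'n' roll", "it's 5 o'clock"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + Arrays.toString(s.split(SIMPLIFIED))
                + " (same as original: " + sameSplit(s) + ")");
        }
    }
}
```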