It seems that you want to remove only the slashes that sit at the start or end of a word. Such a slash must
- have whitespace before it, or
- have whitespace after it, or
- be placed at the start of the string, or
- be placed at the end of the string.
This approach has one potential flaw: it also removes the last slash of a URL, so http://www.some.address/ would become http://www.some.address.
If this is what you are looking for, you can try look-around mechanisms:
replaceAll("(?<=\\s|^)/|/(?=\\s|$)", "")
which will change
Bodies of 5 /Irish/ immigrants /'murdered and killed by cholera'
while building a railroad/ in 1832 to http://www.bbc.com/news/
into
Bodies of 5 Irish immigrants 'murdered and killed by cholera'
while building a railroad in 1832 to http://www.bbc.com/news
As you can see, it also removed the last slash of the URL.
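To make the behaviour of the look-around version easier to check, here is a minimal sketch that wraps it in a method. The class name WordEdgeSlashes and the method name strip are made up for this illustration; only the regex comes from the answer.

```java
public class WordEdgeSlashes {

    // Removes a slash that is preceded by whitespace or the start of the input,
    // or followed by whitespace or the end of the input.
    // (Hypothetical helper; the regex is the one shown above.)
    static String strip(String text) {
        return text.replaceAll("(?<=\\s|^)/|/(?=\\s|$)", "");
    }

    public static void main(String[] args) {
        // Slashes at word edges are removed.
        System.out.println(strip("5 /Irish/ immigrants"));      // 5 Irish immigrants
        // A slash inside a word is left alone.
        System.out.println(strip("and/or"));                     // and/or
        // The flaw: the trailing slash of a URL is removed too.
        System.out.println(strip("http://www.bbc.com/news/"));  // http://www.bbc.com/news
    }
}
```

Note that ^ and $ here mean start and end of the whole input, because the MULTILINE flag is not set.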
A way around the problem of removing the last / in a URL is to make the regex match the URL first and replace it with itself. This prevents the slashes inside that URL from being matched (tested) again for having whitespace or start-of-string before them, or whitespace or end-of-string after them.
I mean a regex of the form
(matchesURL)|matchesSlashesAtStartOfWord|matchesSlashesAtEndOfWord
With such a regex, a / matched by (matchesURL) cannot be matched again by matchesSlashesAtStartOfWord|matchesSlashesAtEndOfWord.
So you can use something like
replaceAll("(https?://[^/]+(/[^/]+)*/?)|(?<=\\s|^)/|/(?=\\s|$)", "$1")
which will first match URLs, put them into group 1, and replace them with the content of group 1 ($1). Since the other alternatives of the regex, (?<=\\s|^)/|/(?=\\s|$), can't place anything in group 1, $1 will be empty for them, so such a / is replaced with nothing (that is, removed).
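To see which alternative fills group 1, the following sketch walks through the matches one by one with Matcher.find(). The class name GroupOneProbe, the method describeMatches and the sample input are assumptions made for this illustration; the regex is the one from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupOneProbe {

    // Same regex as above: the URL alternative is the only one inside group 1.
    static final Pattern PATTERN =
            Pattern.compile("(https?://[^/]+(/[^/]+)*/?)|(?<=\\s|^)/|/(?=\\s|$)");

    // Returns one "wholeMatch -> group1" entry per match (hypothetical helper).
    static List<String> describeMatches(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = PATTERN.matcher(input);
        while (m.find()) {
            // group(1) is null when one of the plain-slash alternatives matched.
            result.add(m.group() + " -> " + m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : describeMatches("/a/ http://x.y/z/")) {
            System.out.println(line);
        }
    }
}
```

For the slash matches group(1) is null, and Java's replaceAll appends nothing for a group that did not take part in the match, which is why "$1" works as "keep URLs, drop the slashes".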
DEMO
String data = "Bodies of 5 /Irish/ immigrants /'murdered and killed by cholera' \r\nwhile building a railroad/ in 1832 to http://www.bbc.com/news/";
System.out.println(data);
System.out.println();
System.out.println(data.replaceAll("(https?://[^/]+(/[^/]+)*/?)|(?<=\\s|^)/|/(?=\\s|$)", "$1"));
Output
Bodies of 5 /Irish/ immigrants /'murdered and killed by cholera'
while building a railroad/ in 1832 to http://www.bbc.com/news/

Bodies of 5 Irish immigrants 'murdered and killed by cholera'
while building a railroad in 1832 to http://www.bbc.com/news/
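One thing to watch out for: in the URL part, [^/]+ also matches whitespace, so when text follows the URL the "URL" match can run on past it and protect slashes that should have been removed. In the demo the URL is the last thing in the string, so this does not show. Below is a sketch of a possible tighter variant that also excludes whitespace from the URL segments. The class and method names are made up, and the stricter pattern is only my assumption about what counts as a URL here, not a full URL matcher.

```java
public class UrlSafeSlashes {

    // The regex from the demo, unchanged.
    static String original(String text) {
        return text.replaceAll("(https?://[^/]+(/[^/]+)*/?)|(?<=\\s|^)/|/(?=\\s|$)", "$1");
    }

    // Assumed variant: URL segments may contain neither '/' nor whitespace,
    // so the URL match stops at the first whitespace character.
    static String stricter(String text) {
        return text.replaceAll("(https?://[^/\\s]+(/[^/\\s]+)*/?)|(?<=\\s|^)/|/(?=\\s|$)", "$1");
    }

    public static void main(String[] args) {
        String data = "see http://www.bbc.com/news/ for /more/ details";
        // The URL alternative swallows the rest of the string, so nothing changes.
        System.out.println(original(data)); // see http://www.bbc.com/news/ for /more/ details
        // The URL match ends at the space, so the later slashes are removed.
        System.out.println(stricter(data)); // see http://www.bbc.com/news/ for more details
    }
}
```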