We have a WordPress site, and we have discovered that at some point a bad plugin or user error added double slashes after the site URL (for example, http://example.site//category1/ or http://example.site/category1//category2/, etc.).
The following query seems to work, but it looks like it doesn't return quite enough results:
SELECT id, post_content
FROM `wp_posts`
WHERE post_content REGEXP '(href="[^"]*[^:]\/\/[^"]*)'
  AND post_status IN ('draft', 'publish')
ORDER BY id ASC;
Is there a better way to do this? I don't want it to match the double slash that comes after the http:, hence the negated [^:] in front of the slashes.
Edit: for clarification, I want to find all posts (the body of a WordPress post/page) that contain a hard-coded URL with double slashes, without matching the double slashes after the http:.
Regexp should match on the following:
http://example.site//category1/
or http://example.site/category1//category2/
or even http://example.site/category1/category2//
or example.site/category1//category2/
But should not match on the following:
http://example.site/category1/
or http://example.site/category1/category2/
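To check the pattern against the examples above outside the database, here is a rough sketch in Python. It assumes Python's re module treats this simple pattern the same way MySQL's REGEXP does. The has_double_slash helper and the wrapping of each URL in an anchor tag are made up for illustration and are not part of the real data.

```python
import re

# Same pattern as in the query, without the SQL-level backslash escapes.
# Assumption: this behaves the same here as it does in MySQL REGEXP.
PATTERN = re.compile(r'href="[^"]*[^:]//[^"]*')


def has_double_slash(html):
    """Return True if html has an href with '//' not directly preceded by ':'."""
    return PATTERN.search(html) is not None


def as_link(url):
    """Wrap a URL in a minimal anchor tag (hypothetical post content)."""
    return '<a href="%s">link</a>' % url


should_match = [
    "http://example.site//category1/",
    "http://example.site/category1//category2/",
    "http://example.site/category1/category2//",
    "example.site/category1//category2/",
]
should_not_match = [
    "http://example.site/category1/",
    "http://example.site/category1/category2/",
]

if __name__ == "__main__":
    for url in should_match + should_not_match:
        print(url, has_double_slash(as_link(url)))
```

In this sketch the pattern only looks inside double-quoted href attributes, so a URL that appears as plain text, or inside a single-quoted attribute, is not found.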