I have a large WordPress site (more than 10,000 posts) and I would like to remove all outbound links, together with their anchor text, from its posts.

I was able to write a script that does the job with a regex search and replace on each post. But because I have so many posts, the script is virtually useless: it runs into memory and execution-time limits on a shared server.

What's the best way to run a regex search and replace on the database while consuming the least amount of memory? Can I do a regex search and replace directly in MySQL?

Can you also confirm that this regex will match all links except those containing "mysite.com" (that is, everything except internal links):

(<a.*>)(?!mysite\.com)(.*)(<\/a>)
Sulli

1 Answer

If I were faced with the same situation, I would write a script that processes the data in batches. Batching keeps the memory use and execution time of each step bounded, and if you are running replication, it helps avoid replication lag downstream.

I'd recommend doing the work in batches of 200 posts (use LIMIT and OFFSET on the read queries), sleeping for 2 to 5 seconds after each batch, and then processing the next one.
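As a rough illustration of the batching loop, here is a minimal Python sketch. It assumes the default WordPress table and column names (wp_posts, ID, post_content) and uses an in-memory SQLite database as a stand-in so it can be run anywhere. The function name process_in_batches and the transform callback are hypothetical; with a MySQL driver the placeholder style would typically be %s rather than ?.

```python
import sqlite3
import time


def process_in_batches(conn, transform, batch_size=200, pause=2.0):
    """Apply transform() to every post, reading batch_size rows at a time.

    conn is a DB-API connection (SQLite here as a stand-in for MySQL).
    Returns the number of posts whose content actually changed.
    """
    offset = 0
    changed = 0
    while True:
        # A stable ORDER BY keeps LIMIT/OFFSET pages from overlapping.
        rows = conn.execute(
            "SELECT ID, post_content FROM wp_posts ORDER BY ID LIMIT ? OFFSET ?",
            (batch_size, offset),
        ).fetchall()
        if not rows:
            break
        for post_id, content in rows:
            new_content = transform(content)
            if new_content != content:
                conn.execute(
                    "UPDATE wp_posts SET post_content = ? WHERE ID = ?",
                    (new_content, post_id),
                )
                changed += 1
        conn.commit()       # one commit per batch
        offset += batch_size
        time.sleep(pause)   # give the server (and any replicas) a breather
    return changed


if __name__ == "__main__":
    # Hypothetical demo data: 450 posts, so the loop runs three batches.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_content TEXT)")
    demo.executemany(
        "INSERT INTO wp_posts (post_content) VALUES (?)",
        [("hello foo",)] * 450,
    )
    demo.commit()
    print(process_in_batches(demo, lambda s: s.replace("foo", "bar"), pause=0))
```

Only one batch of rows is held in memory at a time, which is the point of the approach on a shared host. Because rows are updated rather than deleted, advancing the offset by the batch size does not skip any posts.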

I won't speak to the regex you provided, as a wrong answer could leave your links broken. I'd also suggest writing a small test script with some sample links to test the regex against; once you have it locked down, add it to the main script.
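A small test script along those lines might look like the following Python sketch. It is not the regex from the question: it uses a different, assumed pattern that captures each link's href and then decides in code whether the host is internal, which tends to be easier to test than a single lookahead. The names INTERNAL_HOST, LINK_RE, is_internal and strip_outbound_links are hypothetical, and regex-based HTML handling will miss unusual markup (unquoted hrefs, nested tags), so treat it as a starting point only.

```python
import re
from urllib.parse import urlparse

INTERNAL_HOST = "mysite.com"  # assumption: the site's own domain

# Match an <a ...href="...">...</a> element and capture the href value.
LINK_RE = re.compile(
    r"""<a\b[^>]*?\bhref\s*=\s*(["'])([^"'>]*)\1[^>]*>.*?</a>""",
    re.IGNORECASE | re.DOTALL,
)


def is_internal(href):
    """True for links on INTERNAL_HOST, its subdomains, or with no host at all."""
    host = urlparse(href).hostname
    if host is None:
        # Relative URLs (and schemes like mailto:) have no hostname; keep them.
        return True
    return host == INTERNAL_HOST or host.endswith("." + INTERNAL_HOST)


def strip_outbound_links(html):
    """Remove outbound links together with their anchor text."""
    def replace(match):
        return match.group(0) if is_internal(match.group(2)) else ""
    return LINK_RE.sub(replace, html)


if __name__ == "__main__":
    samples = [
        'Read <a href="http://example.com/x">this</a> now.',
        'See <a href="http://www.mysite.com/about">about us</a>.',
        '<a href="/contact">Contact</a>',
    ]
    for sample in samples:
        print(repr(sample), "->", repr(strip_outbound_links(sample)))
```

Checking the parsed hostname rather than searching for the string "mysite.com" avoids treating a domain such as notmysite.com as internal.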

Mike Purcell