0

We are in the process to update an existing WordPress website and imported posts from one site to another. On the old website we used a WYSIWYG editer and all the posts have styles embedded in them

Example:

<p style="margin: 6px 0px; color: rgb(20, 24, 35); font-family: helvetica, arial, 'lucida grande', sans-serif; font-size: 14px; line-height: 19.3199996948242px; background-color: rgb(255, 255, 255);">
    <span style="font-family: helvetica, arial, 'lucida grande', sans-serif; font-size: 14px; letter-spacing: 0.4pt; line-height: 19.3199996948242px;">Content</span>
</p>

Expected output:

<p>
    <span>Content</span>
</p>

Now I'm looking for a way to remove the style tag from the html elements. Self I was thinking about an MySQL query but have trouble to figuring out how to preg_replace in it.

S.Visser
  • 4,645
  • 1
  • 22
  • 43
  • Do you want to remove the entire style attribute it just parts of it? Could you give an example of the end result you would like to achieve? – maxpelic Mar 22 '19 at 13:42
  • Updated the question with and expected output. – S.Visser Mar 22 '19 at 13:50
  • Also can you provide the SQL you are currently using? – maxpelic Mar 22 '19 at 13:53
  • Well the query should be something like this `UPDATE wp_posts SET post_content = replace(post_content, 'HERE REGEX', '');` . I just need to find the correct way to rexeg the post_content – S.Visser Mar 22 '19 at 14:08
  • With regard to your example, are you expecting it to replace the inline styles with class attributes, or have no such attributes, and do it all with complicated CSS selectors, perhaps involving :nth-child? – Loren Rosen Mar 22 '19 at 19:28

4 Answers4

3

If you want to perform the migration on database level, I would recommend to add a new column to your post table so you can first create a copy of your old data and then compare the results of a migration script. After adding the new column, you could use REGEXP_REPLACE as part of an UPDATE to copy the data into the new column. You may have to change the regular expression syntax so it catches all characters you encounter in the "style" section but this could be a start:

UPDATE 
  post_table 
SET 
  stripped_text = regexp_replace(original_text, 
  ' style="[[:alnum:][:digit:][:space:];,:\(\)\'\-\.]+"', '') 

Examine the data and see if the results are OK, repeat until the result is good enough. Rename your original post column ("original_text" in my sample) to something like "orignal_backup" and your new column ("stripped_text" in my example) to the original post column's name.

More information regarding character classes in MySQL REGEX can be found here.

SaschaM78
  • 4,376
  • 4
  • 33
  • 42
  • This would probably my best bet, need to update to 8.0+ first. Will keep the question open till then. – S.Visser Mar 22 '19 at 15:27
1

It is important the lazy quantifier: style="[\d\D]*?"

In MySQL you will need to add an extra backslash: style="[\\d\\D]*?"

EDIT: MYSQL QUERY:

update table 
   set column = REGEXP_REPLACE(column,'style="[\\d\\D]*?"','')
ekochergin
  • 4,109
  • 2
  • 12
  • 19
Sal
  • 1,307
  • 1
  • 8
  • 16
0

Actually its easy. You can set default style and after getting one from mysql update variable:

<?php

$NewStyle=' style="background-color: red;"';

//your query to get style

$NewStyle=' style="'.$row['style'].'"';// if some query conditions - example: background-color: blue 

<div '.$NewStyle.'></div>

?>

Edit: I actually did not read correctly and skipped that it was about WP. As it can still help one day i will leave this answer open.

Ingus
  • 1,026
  • 12
  • 34
0

You are in WordPress, so can I assume you have some PHP skills available? If so, you could use something like simpleXML to scan all pages and have a recursive loop go through all children tags of lets say or . Or, you could load a Javascript/jQuery script that removes all style attributes onload. You could have a MySQL query do it, but I would think this would be more risky. With simpleXML and the Javascript approach, you cam limit the scope of the removal. With a find and replace in MySQL it would be unlimited both in the DOM of a page and pages.

I am not good with Regexp, but maybe if this is what you want look here How to do a regular expression replace in MySQL?

Russell
  • 16
  • 3
  • Thanks for your answer but there is no need to alter the results after they come from the database, that will add unnecessary load to the page. Best thing for me is just to replace it directly in the database. – S.Visser Mar 22 '19 at 14:20
  • You misunderstand. Or rather, I wasn't clear. Once the pages are loading without the styles, you can copy that source to the new site. Or better yet, build a crawler to do it for you. – Russell Mar 22 '19 at 14:26
  • Ah that would not work because I need values that are not shown in the frontend, so the import tool from WordPress is the only way. An option is to do the replace in the import XML but that seems quite a brute-force method. – S.Visser Mar 22 '19 at 14:32
  • SimpleXML would work there... Then again a decent text editor with a regex find and replace might too. – Russell Mar 22 '19 at 14:36