I need to scrub some data from a log that contains XML, and I need to know how to make preg_replace replace every occurrence of a regex match.

The XML looks something like this:

<contactData>                 
<id>29194</id>                 
<firstName>Michael</firstName>                 
<lastName>Smith</lastName>                 
<address1>1600 Pennsylvania Ave</address1>                 
<address2></address2>                 
<city>Washington</city>                 
<state>DC</state>                 
<postalCode>20500</postalCode>                 
<country>US</country>                 
<phone>3012013021</phone>                 
<email>michael@potus.gov</email>                 
</contactData>             
<contactData>                 
<id>29195</id>                 
<firstName>Shelly</firstName>                 
<lastName>McPherson</lastName>                 
<address1>2411 Georgia Ave</address1>                 
<address2></address2>                 
<city>Silver Spring</city>                 
<state>MD</state>                 
<postalCode>20902-5412</postalCode>                 
<country>US</country>                 
<phone>3012031302</phone>                 
<email>shelly@example.com</email>
</contactData>

When I run this on the XML:

$regex = $replace = array();
$regex[] = '/(<contactData>)(.*)(<email>)(.*)(<\/email>)/is';
$regex[] = '/(<contactData>)(.*)(<phone>)(.*)(<\/phone>)/is';
$replace[] = '$1$2$3xxxxxxxxxxxxxxxx$5';
$replace[] = '$1$2$3xxxxxxxxxxxxxxxx$5';
$text = preg_replace($regex, $replace, $text);

I get this (only the last "contactData" block is scrubbed):

<contactData>                 
<id>29194</id>                 
<firstName>Michael</firstName>                 
<lastName>Smith</lastName>                 
<address1>1600 Pennsylvania Ave</address1>                 
<address2></address2>                 
<city>Washington</city>                 
<state>DC</state>                 
<postalCode>20500</postalCode>                 
<country>US</country>                 
<phone>3012013021</phone>                 
<email>michael@potus.gov</email>                 
</contactData>             
<contactData>                 
<id>29195</id>                 
<firstName>Shelly</firstName>                 
<lastName>McPherson</lastName>                 
<address1>2411 Georgia Ave</address1>                 
<address2></address2>                 
<city>Silver Spring</city>                 
<state>MD</state>                 
<postalCode>20902-5412</postalCode>                 
<country>US</country>                 
<phone>xxxxxxxxxxxxxxxx</phone>                 
<email>xxxxxxxxxxxxxxxx</email>
</contactData>

How do I get it to replace the email and phone in the other "contactData" element as well?

Halfstop

1 Answer

It's more correct to use an XML parser for this. For example, SimpleXML:

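// Your XML does not include a root element, so one is added here.
// If the real XML has one, remove the '<root>' wrapper from the next line.
$xml = simplexml_load_string('<root>' . $text . '</root>');
// Note: unset() removes the email and phone elements entirely.
for ($i = 0; $i < count($xml->contactData); $i++) {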
    unset($xml->contactData[$i]->email);
    unset($xml->contactData[$i]->phone);
}

echo $xml->saveXML();
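
If the log has to stay as it is apart from the scrubbed values, the regex approach can probably be made to work too. In the original patterns the greedy `(.*)`, combined with the `s` modifier, lets the first match run from the first `<contactData>` all the way to the last `<email>` (or `<phone>`), so preg_replace only ever finds one match. Below is a minimal sketch that matches one element at a time with a lazy quantifier. The sample `$text` and the `$scrubbed` variable are made up for illustration, and it assumes the tags carry no attributes and are never nested.

```php
<?php
// Hypothetical sample input; the real text would come from the log.
$text = <<<XML
<contactData>
<phone>3012013021</phone>
<email>michael@potus.gov</email>
</contactData>
<contactData>
<phone>3012031302</phone>
<email>shelly@example.com</email>
</contactData>
XML;

// Match a single element per pattern. The lazy .*? stops at the first
// closing tag, so every <email> and <phone> becomes its own match.
$regex = array(
    '/(<email>).*?(<\/email>)/is',
    '/(<phone>).*?(<\/phone>)/is',
);

// ${1} and ${2} keep the group references unambiguous next to the mask.
$replace = '${1}xxxxxxxxxxxxxxxx${2}';

// With an array of patterns and a single replacement string,
// the same replacement is used for every pattern.
$scrubbed = preg_replace($regex, $replace, $text);

echo $scrubbed, "\n";
```

preg_replace already replaces every match by default (its limit argument defaults to -1), so no extra flag is needed once each pattern matches a single element.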
splash58
You're correct. I avoided doing that because I wanted to keep the XML request as untouched as possible, but it's silly to use regex for this; it's fraught with potential problems. – Halfstop May 25 '16 at 17:01