
I have an XML file that's read using PHP's file_get_contents so that other changes can be made to it.

I need to find and remove every <BATCHALLOCATIONS.LIST>...</BATCHALLOCATIONS.LIST> node in the file (not just the opening and closing tag lines, but everything between them as well).

Since the file is already loaded using file_get_contents, I'd like to do this without having to load the file again using SimpleXML, an XML parser, or any other method (like DOM).

The node does not have a specific parent and appears randomly.

The XML file is exported from a business accounting software package.

Any idea on how to achieve this? Maybe using a Regular Expression to do a search and replace or something like that?

I've been trying to do this using a regular expression and preg_replace, but just can't get things to work.

Here's just a portion of the file. The original runs to 10K+ lines.

This should have worked, but it doesn't:

preg_replace('/^\<BATCHALLOCATIONS.LIST\>(.*?)\<\BATCHALLOCATIONS.LIST\>$/ism','', $newXML);

I'm trying to do this without using any HTML/XML parser.

Norman

1 Answer

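There's probably a better way to do it, but this will work. It uses plain string functions and loops so that every occurrence of the node is removed, not only the first one:

// get your file as a string
$yourXML = file_get_contents($file);

$open  = '<BATCHALLOCATIONS.LIST>';
$close = '</BATCHALLOCATIONS.LIST>';
$newXML = $yourXML;

// remove one node per pass until no opening tag is left
while (($posStart = stripos($newXML, $open)) !== false) {
    $posEnd = stripos($newXML, $close, $posStart);
    if ($posEnd === false) break; // unmatched opening tag: leave the rest untouched
    $newXML = substr($newXML, 0, $posStart) . substr($newXML, $posEnd + strlen($close));
}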
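
The regular expression from the question can also be made to work. In that pattern, `\<\BATCH...` is read as `\B` (a non-word-boundary assertion) followed by `ATCH...`, not as an escaped slash, so the closing tag never matches. The `^` and `$` anchors with the `m` flag additionally require the opening tag to start a line and the closing tag to end one, which fails as soon as the tags are indented or share a line with other markup. Below is a minimal sketch of a corrected call. The function name and the sample element names (VOUCHER, AMOUNT, NAME) are made up for illustration, and it assumes a BATCHALLOCATIONS.LIST node never contains another node of the same name.

```php
<?php
// Hypothetical helper; the name is not from the question or the answer.
function removeBatchAllocations(string $xml): string
{
    // "~" is the delimiter, so the "/" in the closing tag needs no escaping.
    // s: "." also matches newlines; i: case-insensitive.
    // ".*?" is lazy, so each match ends at the first closing tag after its opening tag.
    // The surrounding [ \t]* and \R? also remove the node's indentation and one line break.
    $pattern = '~[ \t]*<BATCHALLOCATIONS\.LIST>.*?</BATCHALLOCATIONS\.LIST>[ \t]*\R?~is';

    $result = preg_replace($pattern, '', $xml);

    // preg_replace() returns null on error (for example when a PCRE limit is hit);
    // in that case the input is returned unchanged.
    return $result ?? $xml;
}

// Made-up sample data, only to show the call.
$sample = "<VOUCHER>\n"
        . "  <BATCHALLOCATIONS.LIST>\n"
        . "    <AMOUNT>10</AMOUNT>\n"
        . "  </BATCHALLOCATIONS.LIST>\n"
        . "  <NAME>A</NAME>\n"
        . "</VOUCHER>\n";

echo removeBatchAllocations($sample);
```

Note that preg_replace returns the modified string rather than changing its argument, so the result has to be assigned (for example `$newXML = removeBatchAllocations($newXML);`). On very large nodes the lazy match may run into `pcre.backtrack_limit`, which is why the sketch checks for a null result.
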
Mr Glass