1

I have to replace some 'file names' from within an XML with the full path for those files. All the files are in the same directory which simplifies things somewhat. I was trying to use BeautifulSoup4 but there was a bug that kept breaking crashing python so I'm trying to do the same thing with regex's.

The copasiML variable contains an XML as a string.

My code:

    copasiML=IA.read_copasiML_as_string(copasi_file)
    data_file_names=re.findall('<Parameter name="File Name" type="file" value="(.*)"/>',copasiML)
    for i in data_file_names:
        copasiML2=re.sub('<Parameter name="File Name" type="file" value="'+i+'"/>','<Parameter name="File Name" type="file" value="'+os.path.join(os.getcwd()+i)+'"/>',copasiML)
        os.remove(copasi_file)
        with open(copasi_file,'w') as f:
            f.write(str(copasiML2))

As it stands, my code runs but doesn't actually do anything. Would anybody happen to know how to fix my code?

Many thanks

CiaranWelsh
  • 7,014
  • 10
  • 53
  • 106
  • I'd recommend [`lxml`](http://lxml.de/) library or, at least, [`xml.etree.ElementTree`](https://docs.python.org/2/library/xml.etree.elementtree.html#module-xml.etree.ElementTree) from standard python – har07 Jun 26 '15 at 14:28

1 Answers1

0

Every time you try to parse HTML/XML with Regular Expressions, you make baby Jesus cry.

I BeautifulSoup doesn't work for you, I suggest XPath as illustrated in python lxml - modify attributes.

Fun (but instructive) readings:

Community
  • 1
  • 1
fferri
  • 18,285
  • 5
  • 46
  • 95