Background: I am creating websites in Webflow and then exporting them to be integrated with a PHP backend. Webflow's default file structure differs from the one our backend uses, so I'm using Python and BeautifulSoup to fix some tedious things before actually 'integrating' the exported code.

The first thing I'm trying to solve is changing all image srcs to 'images/xxx' rather than '../images/xxx', which I was able to do like this:

img['src'] = img['src'].replace('../images/', 'images/')
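
For context, the one-liner above presumably sits inside a loop over the parsed document. A minimal sketch of that loop follows; the sample markup and file names are made up for illustration, and only images that actually carry a src attribute are touched:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a Webflow export.
html = (
    '<div>'
    '<img src="../images/logo.png">'
    '<img src="images/hero.jpg">'
    '<img alt="no src attribute">'
    '</div>'
)
soup = BeautifulSoup(html, "html.parser")

# src=True skips <img> tags without a src attribute, avoiding a KeyError.
for img in soup.find_all("img", src=True):
    if img["src"].startswith("../images/"):
        img["src"] = img["src"].replace("../images/", "images/", 1)

srcs = [img.get("src") for img in soup.find_all("img")]
print(srcs)
```

The startswith check is a design choice rather than a requirement: it keeps the rewrite from touching paths that merely contain '../images/' somewhere in the middle.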

Now I'd like to find all the links and replace their hrefs with the structure we use on the backend that looks like this:

<a href="<?=$website_info->url?>/page"></a>

I've been able to find all the links with BeautifulSoup without any issues, and I'm trying to replace their hrefs like this:

links = soup.find_all('a', href=True)
for link in links:
    link['href'] = '<?=$website_info->url?>/page'
    print(link)

but that results in output like this, with every < and > replaced with &lt; and &gt;, respectively:

<a class="inner-page-nav-link w-nav-link" href="&lt;?=$website_info-&gt;url?&gt;/page">Page Name</a>

Does anyone know how I could replace the link hrefs without the < and > characters being escaped like this?
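
One route that may be worth trying is BeautifulSoup's output formatter. The escaping happens at serialization time, not when the attribute is assigned, and passing formatter=None to decode() asks BeautifulSoup to leave strings unmodified on output. The sketch below assumes the html.parser backend and a single sample link; note that formatter=None applies to the whole document, so any other &, < or > in text or attributes would also be written out unescaped, which may produce invalid HTML.

```python
from bs4 import BeautifulSoup

# Sample link modelled on the one in the question; the original href is made up.
html = '<a class="inner-page-nav-link w-nav-link" href="/page">Page Name</a>'
soup = BeautifulSoup(html, "html.parser")

for link in soup.find_all("a", href=True):
    link["href"] = "<?=$website_info->url?>/page"

# Default serialization escapes < and > inside attribute values.
escaped = str(soup)

# formatter=None disables entity substitution for the whole document.
raw = soup.decode(formatter=None)

print(escaped)
print(raw)
```

Other attributes on the tag (classes, custom attributes) are left as they are, since only the href value is reassigned.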

Adam Bowker
  • Does this help: [Replace text without escaping in BeautifulSoup](https://stackoverflow.com/a/30692601/4985733) – Martin Evans Feb 03 '21 at 19:56
  • @MartinEvans Thank you! I saw that, and tried to go that route but ended up having some trouble because all the links also have other attributes (classes and other custom attributes). – Adam Bowker Feb 05 '21 at 02:55
  • Could you update the question to include a bigger example to recreate that? – Martin Evans Feb 05 '21 at 07:37
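
If the other attributes on the links make the linked approach awkward, another option is to let BeautifulSoup escape everything as usual and then undo the escaping only inside PHP short echo tags in the serialized string. This is a rough sketch using only the standard library; the function name restore_php_tags is hypothetical, and it assumes the PHP snippets contain no double quotes or ampersands that would break the surrounding attribute once unescaped.

```python
import html
import re

# Matches an escaped PHP short echo tag: &lt;?= ... ?&gt; (non-greedy).
ESCAPED_PHP_TAG = re.compile(r"&lt;\?=.*?\?&gt;", re.DOTALL)

def restore_php_tags(serialized: str) -> str:
    """Unescape HTML entities only inside escaped <?= ... ?> spans."""
    return ESCAPED_PHP_TAG.sub(lambda m: html.unescape(m.group(0)), serialized)

# The escaped output from the question, as produced by str(soup).
escaped = (
    '<a class="inner-page-nav-link w-nav-link" '
    'href="&lt;?=$website_info-&gt;url?&gt;/page">Page Name &amp; more</a>'
)
restored = restore_php_tags(escaped)
print(restored)
```

Because the substitution only touches the matched spans, entities elsewhere in the document (such as the &amp; in the link text above) stay escaped.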

0 Answers