
I'm trying to migrate a database from Textpattern CMS to something more generic. Some articles contain Textpattern-specific tags that pull in images, and I want to turn these into generic HTML image tags. At the moment, they look like this in the SQL file:

<txp:upm_image image_id="4" form="dose" />

I want to turn these into something more like this:

<img src="4.jpg" class="dose" />

I've had some luck with TextWrangler doing some regex stuff, but I'm stumped. Any ideas on how to find & replace all of these image paths?

EDIT: For future reference, here's what I ended up doing in PHP to output it:

$body = $post['Body_html'];
$pattern = '/txp:upm_image image_id="([0-9]+)" form="([^"]*)"/i';
$replacement = 'img src="/images/$1.jpg" class="$2"';
$body = preg_replace($pattern, $replacement, $body);
// outputs <img src="/images/59.jpg" class="dose" />
jpea

2 Answers


I wouldn't use grep; sed is what you want:

$ echo '<txp:upm_image image_id="4" form="dose" />' | sed -e 's/^.*image_id="\([[:digit:]]*\)".*form="\([[:alpha:]]*\)".*/<img src="\1.jpg" class="\2" \/>/' 
<img src="4.jpg" class="dose" /> 
$

If your class names contain digits as well as letters, use [[:alnum:]] instead of [[:alpha:]].

(Works on macOS/Darwin.)

  • This solution works if the string consists of one txp tag. It fails if there are multiple txp tags with stuff between them. – ridgerunner Mar 25 '11 at 17:45
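
To handle the multiple-tag case raised in the comment above, one option is to drop the leading and trailing `.*` and match only the tag itself, with the `g` flag so every tag on a line is rewritten. This is a minimal sketch, assuming a sed that accepts `-E` for extended regular expressions (both BSD and recent GNU sed do); the function name `convert_txp_images` is hypothetical.

```shell
# Hypothetical helper: rewrite every txp image tag found on each input line.
# Assumes the attributes appear as image_id first, then form, as in the question.
convert_txp_images() {
  sed -E 's/<txp:upm_image[[:space:]]+image_id="([0-9]+)"[[:space:]]+form="([^"]*)"[[:space:]]*\/>/<img src="\1.jpg" class="\2" \/>/g'
}

# Two tags on one line, with other markup between them.
printf '%s\n' '<p><txp:upm_image image_id="4" form="dose" /> text <txp:upm_image image_id="59" form="wide" /></p>' | convert_txp_images
# prints: <p><img src="4.jpg" class="dose" /> text <img src="59.jpg" class="wide" /></p>
```

Because the pattern is anchored on the tag name rather than on the whole line, text before, between, and after the tags passes through unchanged.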

Not sure which tool you are using, but try this regex solution. Search for this:

<txp:upm_image\s+image_id="(\d+)"\s+form="([^"]*)"\s*\/>

And replace with this:

<img src="$1.jpg" class="$2" />

Note that this only works for txp tags having the same form as your example. It will fail if there are txp tags having extra attributes, or if they are in a different order.
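
If the two attributes might appear in either order, one way around that limitation is to run two substitutions, one per ordering. The following is only a sketch, assuming a sed with `-E` support; the function name `convert_txp_images_any_order` is hypothetical, and tags carrying extra attributes are still left untouched.

```shell
# Hypothetical helper: handle image_id/form in either order.
# Tags with additional attributes are NOT matched and pass through unchanged.
convert_txp_images_any_order() {
  sed -E \
    -e 's/<txp:upm_image[[:space:]]+image_id="([0-9]+)"[[:space:]]+form="([^"]*)"[[:space:]]*\/>/<img src="\1.jpg" class="\2" \/>/g' \
    -e 's/<txp:upm_image[[:space:]]+form="([^"]*)"[[:space:]]+image_id="([0-9]+)"[[:space:]]*\/>/<img src="\2.jpg" class="\1" \/>/g'
}

# The second tag has its attributes reversed.
printf '%s\n' '<txp:upm_image image_id="4" form="dose" /> <txp:upm_image form="wide" image_id="59" />' | convert_txp_images_any_order
```

Both tags in the usage line should come out in the same `<img src="N.jpg" class="..." />` shape; note the swapped backreferences in the second expression.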

ridgerunner