
I'm having trouble using python-docx to replace a string while keeping its style (even with this super useful post). The twist is that my text contains periods, which are being split off into separate runs.

I just started with python-docx, and after reading the documentation my understanding is that replacement works on entire paragraphs. I've tried working at the run level instead, but the runs seem to break right at the periods [.], at any numeric text, and at text with a different format. What I'm trying to achieve is to find {{%cellbg key.value }} and replace it (as in the example below) while still maintaining the style.

In my Word template:

sometext.here{{%cellbg key.value }}some.other.text.i.don't.care.about

My python code:

import re
from docx import Document
doc = Document(filename)
for p in doc.paragraphs:
    if re.search('(.*?){{%cellbg (.*?) }}(.*?)', paragraph.text):
        cellbg_og = re.search(r'\{\{%cellbg (.*?)\}\}(.*?)', paragraph.text).group(0)
        cellbg_tag = re.search(r'\{\{\%(.*?)\s\}\}', cellbg_og).group(1)
        replace_cellbg = '{% ' + cellbg_tag + ' %}'
        paragraph.text = paragraph.text.replace(cellbg_og, replace_cellbg)
# doc.save(filename)
doc.save('test.docx')
return 1

Ideally, when I implement this, I'd like the following result:

Original template: sometext.here{{%cellbg key.value }}some.other.text.i.don't.care.about

Expected output: sometext.here{% cellbg key.value %}some.other.text.i.don't.care.about

What I am currently getting: sometext.here{% cellbg key.value %}some.other.text.i.don't.care.about (the text is correct, but the original formatting is lost)

What am I doing wrong? Any help would be super appreciated!

ineedatea

1 Answer

You could use a single pattern with a capture group.

(Looking at the code, I think the loop should be for paragraph in doc.paragraphs: since the body refers to paragraph, not p.)

{{%(cellbg\s+.+?)\s*}}

The pattern matches:

  • {{% Match literally
  • ( Capture group 1 (referred to by \1 in the example code)
    • cellbg\s+.+? Match cellbg, 1+ whitespace chars, then 1+ of any char, as few as possible (non-greedy)
  • ) Close group 1
  • \s* Match optional whitespace chars
  • }} Match literally

In the replacement, use the capture group with custom spacing on the left and right, between single curly braces:

{% \1 %}

An example using the example string:

import re

regex = r"{{%(cellbg\s+.+?)\s*}}"
s = "sometext.here{{%cellbg key.value }}some.other.text.i.don't.care.about"
result = re.sub(regex, r"{% \1 %}", s)

if result:
    print(result)

Output

sometext.here{% cellbg key.value %}some.other.text.i.don't.care.about
The fourth bird
  • Thanks so much for this! I realized I had another issue as the text I posted actually needs to be in a table. I know I'm supposed to add: for table in tpl.tables: for row in table.rows: for cell in row.cells: However, now it does not work for me... – ineedatea Mar 26 '21 at 16:40
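
One possible way to handle both the lost styling and the table case is sketched here. It is not part of the answer above. Assigning to paragraph.text in python-docx replaces the paragraph's content with a single run, which is why character formatting disappears. The sketch edits run.text in place instead: it joins the run texts, finds each match, writes the replacement into the first run the match touches, and trims the matched pieces out of the following runs. The names TAG, replace_across_runs and convert_document are hypothetical helpers made up for this illustration. The replacement takes the formatting of the run where the match starts, and nested tables are not visited.

```python
import re
from types import SimpleNamespace

# Pattern and replacement taken from the answer above.
TAG = re.compile(r"{{%(cellbg\s+.+?)\s*}}")


def replace_across_runs(runs, pattern=TAG, repl=r"{% \1 %}"):
    """Rewrite matches that may span several runs, editing run.text in place.

    `runs` is any sequence of objects with a read/write `.text` attribute,
    such as the list returned by a python-docx paragraph's `.runs`.
    """
    full = "".join(run.text for run in runs)
    # Work right to left so offsets of earlier matches stay valid.
    for match in reversed(list(pattern.finditer(full))):
        replacement = match.expand(repl)
        offset = 0
        for run in runs:
            text = run.text
            start, end = offset, offset + len(text)
            offset = end
            if end <= match.start() or start >= match.end():
                continue  # this run does not overlap the match
            lo = max(match.start() - start, 0)
            hi = min(match.end() - start, len(text))
            # First overlapping run receives the replacement, the rest are trimmed.
            run.text = text[:lo] + replacement + text[hi:]
            replacement = ""
    return runs


def convert_document(src_path, dst_path):
    """Apply the replacement to body paragraphs and to top-level table cells."""
    from docx import Document  # third-party package: python-docx

    doc = Document(src_path)
    for paragraph in doc.paragraphs:
        replace_across_runs(paragraph.runs)
    for table in doc.tables:
        for row in table.rows:
            for cell in row.cells:
                for paragraph in cell.paragraphs:
                    replace_across_runs(paragraph.runs)
    doc.save(dst_path)


if __name__ == "__main__":
    # Stand-in runs that mimic Word splitting the text at the periods.
    parts = ["sometext", ".", "here{{%cellbg key", ".", "value }}some", ".other"]
    demo = [SimpleNamespace(text=t) for t in parts]
    replace_across_runs(demo)
    print("".join(run.text for run in demo))
```

With a real file this would be called as convert_document('template.docx', 'test.docx'), where both file names are placeholders. Cells that are merged can be returned more than once by row.cells; that is harmless here because the rewritten tag no longer matches the pattern.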