I'm working on a small project in which I read a Word document from Python, extract each paragraph from the .docx, and then write a new Word document where those paragraphs are separated into neat boxes stacked on top of one another. Attached is an example of the desired output (in a Word document, of course). Also attached are a snippet of the Word document I'm working with, the code I'm currently using, and what the code currently outputs when run. This is a bit of a time-sensitive project, but thank you to anybody who helps!

EDIT: I've gotten close to my desired output. Now I just want to remove the indents from each paragraph so that there is no indent once they're in the box. I'd also like to center all of the text within the 1x1 tables. Additionally, I'd like to make the borders thicker and move the boxes a little closer together. Using '\r' as the separator paragraph makes the space slightly larger.

from docx import Document
from docx.shared import Inches

doc = Document('test.docx')   # source document
new_doc = Document()          # output document

# Wide page margins keep the column of boxes narrow.
for section in new_doc.sections:
    section.left_margin = Inches(3)
    section.right_margin = Inches(3)

# One 1x1 table ("box") per source paragraph.
for para in doc.paragraphs:
    table = new_doc.add_table(rows=1, cols=1)
    table.style = 'Table Grid'
    table.rows[0].cells[0].text = para.text
    # Separator paragraph between consecutive boxes.
    new_doc.add_paragraph('\r')

new_doc.save('details.docx')

Example of desired output: [image]

Word doc I'm working with: [image]

Current output to document: [image]

Jimmy Wede
  • Looks like it's working. Did you have a question about it? – scanny Jul 10 '20 at 16:36
  • @scanny Not so much. I'm getting an output, which I suppose is correct for my code, but it is not the output I want. As shown in the "Desired Output" image, I want a box around each individual paragraph, and I don't want the same paragraphs repeated at all. Instead, I want to extract one paragraph at a time and give each paragraph its own graphical box in the output, in descending order. – Jimmy Wede Jul 10 '20 at 18:30
  • `box = ("Item")` is equivalent to `box = "Item"`. `[Item for Item in box]` then would resolve to `["I", "t", "e", "m"]`, that is `for Item in box:` iterates over the letters of the str "Item". That's why you're getting four copies. That's probably a good place to start. If you meant box to be a tuple, you need `box = ("Item",)`. Look up trailing comma in single-item tuple if that syntax is unfamiliar to you, e.g. https://stackoverflow.com/questions/7992559/what-is-the-syntax-rule-for-having-trailing-commas-in-tuple-definitions – scanny Jul 10 '20 at 21:53
  • @scanny So I've updated the code and here is a new output. I don't want the paragraphs to repeat themselves at all; I only want them separated into small (approximately 1-inch) squares. I seem to be pretty close. I just need to get rid of that first set of paragraphs, and I need squares around the paragraphs, only much smaller, with the text still filling them up. – Jimmy Wede Jul 11 '20 at 03:23
  • @scanny I've fixed one of my problems, now I just need the text to fit into that vertical column of boxes. Edited in is the revised code along with the current output. – Jimmy Wede Jul 12 '20 at 23:07
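
The single-item tuple point from scanny's comment above can be checked directly. This is only a small illustration: `box` is the name used in that comment, and the other variable names are made up here.

```python
# Parentheses alone do not make a tuple: this is just the str "Item".
box = ("Item")
letters = [item for item in box]      # iterates over the characters

# The trailing comma is what makes a one-element tuple.
box_tuple = ("Item",)
items = [item for item in box_tuple]  # iterates over the single element

print(type(box).__name__, letters)      # str ['I', 't', 'e', 'm']
print(type(box_tuple).__name__, items)  # tuple ['Item']
```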
