4

I am getting the following output: [[], [], ['Audi'], ['audi'], ['AuDi']]
But I want: ['Audi', 'audi', 'AuDi']
My code is:

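import re  # needed for re.findall and re.IGNORECASE

from docx import Document

document = Document(r'C:\Users\aliassample02.docx')
list1 = []
for para in document.paragraphs:
    # findall returns a list of matches (empty if the paragraph has none)
    results = re.findall(r'audi', para.text, re.IGNORECASE)
    list1.append(results)
print(list1)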

5 Answers

4

Use list.extend instead of append, so that each match is added to list1 individually rather than as a nested list:

list1 = []
for para in document.paragraphs:
    results = re.findall(r'audi', para.text, re.IGNORECASE)
    list1.extend(results)

Or you can flatten the values with a list comprehension:

list1 = [x for para in document.paragraphs 
           for x in re.findall(r'audi', para.text, re.IGNORECASE)]
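
To see why append nests the results while extend does not, here is a minimal sketch that runs without a .docx file. The strings in paragraph_texts are hypothetical stand-ins for the para.text values, chosen so that the nested result matches the output in the question:

```python
import re

# Hypothetical stand-ins for para.text; the real values come from the .docx file.
paragraph_texts = ["", "No match here", "An Audi", "an audi", "AuDi again"]

nested = []
flat = []
for text in paragraph_texts:
    matches = re.findall(r"audi", text, re.IGNORECASE)
    nested.append(matches)  # adds the whole list of matches as a single element
    flat.extend(matches)    # adds each match as its own element

print(nested)  # [[], [], ['Audi'], ['audi'], ['AuDi']]
print(flat)    # ['Audi', 'audi', 'AuDi']
```

Paragraphs without a match contribute an empty list, which is where the leading [] entries in the original output come from.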

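EDIT: To search for several words, loop over them as well:

list2 = ["audi", "bmw"]  # words to search for, as in the comment below
list1 = []
for para in document.paragraphs:
    for x in list2:
        results = re.findall(x, para.text, re.IGNORECASE)
        list1.extend(results)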
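
As a possible alternative to the nested loop, the words can be combined into one pattern. This is only a sketch: find_keywords and sample_texts are hypothetical names, and the sample strings stand in for para.text. Note that matches come back in the order they appear in each text, not grouped by word as in the loop above.

```python
import re

def find_keywords(texts, keywords):
    """Return every case-insensitive match of any keyword, in text order."""
    if not keywords:
        return []  # an empty pattern would match the empty string everywhere
    # re.escape keeps characters such as "." or "+" in a keyword literal.
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    found = []
    for text in texts:
        found.extend(pattern.findall(text))
    return found

# Hypothetical sample data standing in for the paragraph texts.
sample_texts = ["Audi and BMW", "no cars", "bmw, then audi"]
print(find_keywords(sample_texts, ["audi", "bmw"]))
# ['Audi', 'BMW', 'bmw', 'audi']
```

With a real document, the first argument could be (para.text for para in document.paragraphs).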
jezrael
  • @kumaranuj - Can you be more specific? – jezrael Jun 08 '20 at 08:15
  • One more thing: what if "audi" is one item of an iterable? I want to search for many items, like list2 = ["audi","bmw"]. Then I need to apply: for i in list2: results = re.findall(r'i', para.text, re.IGNORECASE) –  Jun 08 '20 at 08:35
  • @kumaranuj - What should be output? both lists together? – jezrael Jun 08 '20 at 08:36
  • I have one docx file and I want to delete or replace particular words in it, which I will pass in as a list, e.g. list1 = ["aaa","bbb"]. Wherever these two elements occur in the docx, they should be deleted or replaced, irrespective of uppercase/lowercase. For deletion I thought of replacing the word with an empty string "", and for replacement I am using "replacingWord". –  Jun 08 '20 at 08:55
  • @kumaranuj - I think the best option is to create a new question. – jezrael Jun 08 '20 at 08:57
  • OK. I will post it again after 90 minutes. Thanks :) –  Jun 08 '20 at 08:59
  • https://stackoverflow.com/questions/62259624/how-to-replace-any-word-in-ppt-using-python-respective-of-uppercase-lowercase @jezrael can you please check. –  Jun 08 '20 at 10:06
  • @kumaranuj - I checked it, and it is not easy to test because there is no test data; it is missing a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) – jezrael Jun 08 '20 at 10:19
2

You can flatten the list after collecting all the matches:

list1 = [item for sublist in list1 for item in sublist]
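
If you prefer the standard library over a nested comprehension, itertools.chain.from_iterable does the same one-level flattening. A small sketch, using the nested output from the question as sample data:

```python
from itertools import chain

# The nested result from the question, used here as sample input.
nested = [[], [], ["Audi"], ["audi"], ["AuDi"]]

# chain.from_iterable yields the items of each sublist in order;
# empty sublists simply contribute nothing.
flat = list(chain.from_iterable(nested))
print(flat)  # ['Audi', 'audi', 'AuDi']
```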
0

This worked for me:

list1 = []
for para in document.paragraphs:
    results = re.findall(r'audi', para.text, re.IGNORECASE)
    list1.extend(results)
0
list1 = [item for sublist in list1 for item in sublist]

This list comprehension also works for me.

0
list1 = [x for para in document.paragraphs 
           for x in re.findall(r'audi', para.text, re.IGNORECASE)]

This is the best solution I have found for my question.