I have several .ppt (MS PowerPoint Slides) files to work on. Each of them contains dozens of slides that I want to remove.

For example, for file a.ppt, I need to remove the slides [2, 6, 12, 25] etc.

The python-pptx library does not provide a direct way to delete slides, so I am thinking that copying the required slides to a new file might be a possible solution.

However, I could not find an example and am not sure how to proceed.

Martin Evans
Mark K

2 Answers

old solution answered May 29 '16 at 2:23

What I used to make it happen was a keyboard simulation tool (not really a programming language, but it still needs scripting).

It recognizes the PPT slides I want to remove and multi-selects them, so I can delete them in one go.

Hope it helps someone who has a similar need.

updated solution, inspired by Michael Berk's answer, in Mar. 2020

Here is a working example that takes a PPTX of 5 slides and keeps only the 3rd and 5th ones.

from pptx import Presentation

def dropSlides(slidesToKeep, prs):
    """Remove every slide whose 1-based position is not in slidesToKeep."""
    # _sldIdLst is a private python-pptx attribute, so it may change between versions.
    sldIdLst = prs.slides._sldIdLst

    # Iterate over a snapshot so that removing entries does not disturb the loop.
    for position, sldId in enumerate(list(sldIdLst), start=1):
        if position not in slidesToKeep:
            prs.part.drop_rel(sldId.rId)  # drop the relationship to the slide part
            sldIdLst.remove(sldId)        # remove the slide from the slide list

    return prs

newPrs = dropSlides([3,5], Presentation("C:\\Slides 1-5.pptx"))

newPrs.save('c:\\Slides_only_3rd_and_5th_kept.pptx')
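The question lists the slides to remove, while `dropSlides` expects the slides to keep. A minimal sketch of the conversion, in plain Python with no python-pptx calls, is below. The helper name `slides_to_keep` and the 8-slide deck are hypothetical and used only for illustration.

```python
def slides_to_keep(total_slides, slides_to_remove):
    """Return the 1-based slide numbers left after removing slides_to_remove."""
    unwanted = set(slides_to_remove)

    # Refuse slide numbers that do not exist in the deck.
    out_of_range = sorted(n for n in unwanted if not 1 <= n <= total_slides)
    if out_of_range:
        raise ValueError(f"slide numbers out of range: {out_of_range}")

    return [n for n in range(1, total_slides + 1) if n not in unwanted]


# A hypothetical 8-slide deck from which slides 2 and 6 should be removed.
print(slides_to_keep(8, [2, 6]))  # [1, 3, 4, 5, 7, 8]
```

The result can be passed as the first argument of `dropSlides`, with `len(prs.slides)` as the total number of slides.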
Mark K

Here is a function that subsets .pptx presentations (not sure if there is a difference for .ppt files). Source

def dropSlides(self, slidesToKeep, prs):
    """Return the presentation reduced to the requested slide subset (modified in place).

    Param:
        - slidesToKeep: 1-based indexes of slides to keep, from csv (int list)
        - prs: presentation (pptx.presentation)

    Return:
        - presentation with new slide subset

    """

    # get slides to delete
    indexesToRemove = [x for x in range(1, len(prs.slides._sldIdLst)+1) if x not in slidesToKeep]

    # subset report
    for i, slide in enumerate(prs.slides):
        # create slide dict
        id_dict = {slide.id: [i, slide.rId] for i, slide in enumerate(prs.slides._sldIdLst)}

        # remove the slide if its 1-based index is marked for removal
        if i+1 in indexesToRemove:
            # get slide id
            slide_id = slide.slide_id

            # remove slide
            prs.part.drop_rel(id_dict[slide_id][1])
            del prs.slides._sldIdLst[id_dict[slide_id][0]]

    return prs
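The function above rebuilds `id_dict` on every pass because each deletion shifts the positions of the slides that follow it. The same concern can be illustrated on a plain list, which is only a stand-in for the slide list and not python-pptx API: deleting from the highest position down leaves the lower positions valid. The helper name `delete_positions` and the sample data are hypothetical.

```python
def delete_positions(items, positions):
    """Delete the given 1-based positions from items in place and return it."""
    # Working from the end means each deletion leaves the lower positions unchanged.
    for position in sorted(set(positions), reverse=True):
        del items[position - 1]
    return items


deck = ["s1", "s2", "s3", "s4", "s5"]
print(delete_positions(deck, [1, 2, 4]))  # ['s3', 's5']
```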
Michael Berk
  • Berk, thank you! Can you please give an example of the usage? I used "dropSlides("C:\\sample.pptx", [2,3,5], "C:\\sample (new).pptx")" but it seems that's not the way. – Mark K Mar 04 '20 at 00:47
  • Sure. ```self.dropSlides([1,2,3], Presentation("C:\\sample.pptx"))```. Here, ```self``` is not a parameter you pass but a reference to the class instance. Also, ```prs``` needs to be of type Presentation, not the path. – Michael Berk Mar 04 '20 at 17:57
  • thanks! but it gives me "NameError: name 'self' is not defined "... – Mark K Mar 05 '20 at 00:36
  • ```self``` is only necessary if your code is in a class. If it's not, simply remove all references to ```self``` i.e. delete it from the arguments in the function declaration and run ```dropSlides([1,2,3], Presentation("C:\\sample.pptx"))```. – Michael Berk Mar 05 '20 at 16:19
  • when I tried "dropSlides([1,2,3], Presentation("C:\\sample.pptx"))". It shows "NameError: name 'Presentation' is not defined" – Mark K Mar 06 '20 at 01:01
  • You need to import `Presentation` from python-pptx (`from pptx import Presentation`) - [source](https://python-pptx.readthedocs.io/en/latest/api/presentation.html) – Michael Berk Mar 10 '20 at 19:08
  • when I tried "dropSlides([1,2,3], Presentation("C:\\sample.pptx"))". It shows "TypeError: dropSlides() missing 1 required positional argument: 'prs'". – Mark K Mar 11 '20 at 00:57
  • If your code is in a class, use `self`. If not, remove all instances of `self`, including in the function declaration, which becomes `def dropSlides(slidesToKeep, prs):`. Here's a [link](https://stackoverflow.com/questions/2709821/what-is-the-purpose-of-the-word-self-in-python) that will help. – Michael Berk Mar 16 '20 at 20:32
  • thank you for the follow-up. I tried again just now and there are no more error messages after changing the first line to "def dropSlides(slidesToKeep, prs):". But where is the new file (after slide removal) saved to? – Mark K Mar 17 '20 at 06:06
  • It is returned by the function. It will be stored in the var `newPrs` if you run `newPrs = dropSlides(slidesToKeep, prs)`. Getting a better understanding of how python functions work would be very useful here. – Michael Berk Mar 18 '20 at 18:49
  • thank you for the reply. Somebody may need a workable solution instantly and may not have the luxury of waiting weeks for a remedy. anyway, your attention on this question is highly appreciated. – Mark K Mar 19 '20 at 01:20
  • Somehow, if I `prs.save` I get no issues, but if I `return prs` I get `UserWarning: Duplicate name` and need to repair the file when opening it. – Henrique Brisola Apr 17 '21 at 03:09
  • You `return prs` then `prs.save` in a different function? – Michael Berk Apr 19 '21 at 20:02