Is there a way to identify/detect shapes on a PDF or Illustrator file programmatically?

Question

I'm trying to find out whether a library or tool exists today that can detect shapes programmatically in a PDF or Illustrator file. I have looked at libraries such as Shapely and Pyclipper, but I haven't been able to feed a PDF or AI file to either of them as input.

I know this is possible with image processing libraries, but converting the file to an image would discard its layers and properties. Are there any alternatives?

You could do PDF -> image -> process, then translate the image coordinates back to PDF coordinates. The default PDF unit is 1/72 inch, so you just need to know the image DPI to convert between the two coordinate systems. You might also need the PDF page rotation to know which way the positive axes point, though whether that matters depends on exactly what you are doing with the PDF. – Ryan, Feb 12 '21 at 19:50
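The coordinate translation described in this comment can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name `pixel_to_pdf_point` and its parameters are hypothetical, and it assumes an unrotated page whose visible area starts at the PDF origin, with the image origin at the top-left (y down) and the PDF origin at the bottom-left (y up).

```python
def pixel_to_pdf_point(px, py, dpi, page_height_pt):
    """Convert an image pixel coordinate to PDF points (1 point = 1/72 inch).

    Assumptions (not from the original post): the page is unrotated, the
    rendered image covers the whole page, the image origin is top-left
    with y growing downward, and the PDF origin is bottom-left with y
    growing upward.
    """
    x_pt = px * 72.0 / dpi                    # pixels -> inches -> points
    y_pt = page_height_pt - py * 72.0 / dpi   # flip the vertical axis
    return x_pt, y_pt


# Example: a US Letter page (612 x 792 points) rendered at 300 DPI.
x, y = pixel_to_pdf_point(300, 300, dpi=300, page_height_pt=792.0)
print(x, y)
```

A rotated page would need an extra step that swaps or negates the axes according to the page rotation before this conversion is applied.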
I have considered image processing, but I don't think it would do the job. For example, if I have 2 overlapping shapes on different layers, I want to identify them as separate pieces. When the file is converted to an image, I think all layers would merge into a single layer, and those 2 overlapping pieces would be detected as one piece instead of 2 separate shapes. I don't know yet how to tackle this so that I can identify pieces that overlap or sit on top of one another. Basically, I want to be able to identify individual pieces. – lorelayb, Feb 14 '21 at 23:31
If by "layers" you mean PDF Optional Content Groups (OCGs), also known as Layers, then turning those on and off for different conversions to image should be easy. If you just mean overlapping path operators, then you could detect the overlaps and separate the paths out using a PDF library SDK that offers low-level editing of the PDF graphics stream, so you can again generate different images. – Ryan, Feb 15 '21 at 16:17
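The "detect overlaps" step for path operators might look something like the sketch below. It is only an illustration under stated assumptions: it takes for granted that each path's bounding box has already been read from the page with some PDF library (that part is not shown), it represents each box as a hypothetical `(x0, y0, x1, y1)` tuple, and it uses bounding boxes as a coarse stand-in for the real path geometry. An exact test on the outlines would need a geometry library such as Shapely.

```python
from itertools import combinations


def rects_overlap(a, b):
    """Return True if two axis-aligned boxes (x0, y0, x1, y1) share area.

    Boxes that only touch along an edge or corner are not counted.
    """
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def overlapping_pairs(rects):
    """Return index pairs (i, j), i < j, of boxes that overlap."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(rects), 2)
        if rects_overlap(a, b)
    ]


# Hypothetical bounding boxes for three paths, in PDF points.
boxes = [(0, 0, 10, 10), (5, 5, 15, 15), (20, 20, 30, 30)]
print(overlapping_pairs(boxes))
```

Paths flagged this way could then be rendered one at a time, so that each overlapping piece ends up in its own image.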


0 Answers