I am working on a web application that takes MS Office documents (Word, Excel, PowerPoint) as input and generates PDF documents. The API/library I am currently using can create accessible PDFs, but I am looking for an API/library that can scan the input document (Word, Excel, PowerPoint) for accessibility compliance.
If the input document itself lacks the semantic metadata needed for accessibility, the resulting PDF will not be accessible either.
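
To illustrate the kind of check I mean, here is a minimal sketch in Python that flags images without alt text in a Word file. It uses only the standard library and assumes the .docx is a ZIP archive whose main body is in `word/document.xml`, with alt text stored in the `descr` attribute of each `wp:docPr` element. The function name `find_images_missing_alt_text` is my own, and this covers only one rule (it ignores headers, footers, heading structure, tables, and so on), so it is not a real compliance scan.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the <wp:docPr> elements that describe drawings in a .docx.
WP_NS = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"


def find_images_missing_alt_text(docx_file):
    """Return the names of drawings in the main document body with no alt text.

    docx_file may be a filesystem path or a binary file-like object.
    (Hypothetical helper for illustration; not part of any library.)
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    missing = []
    for doc_pr in root.iter(f"{{{WP_NS}}}docPr"):
        # The 'descr' attribute carries the alt text; blank counts as missing.
        if not (doc_pr.get("descr") or "").strip():
            missing.append(doc_pr.get("name", "(unnamed)"))
    return missing
```

Calling `find_images_missing_alt_text("report.docx")` (where `report.docx` is a placeholder path) would return a list of drawing names that need a description before conversion. What I am after is a library that does this kind of scan across all the relevant accessibility rules, and for Excel and PowerPoint as well.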