
I would like some suggestions on how I can achieve this. There is an existing discussion on this topic, but it is six years old, and I am hoping there are SaaS solutions available today or an easy way to do it.

I would like to run a program on tax returns in PDF format that would remove or redact sensitive information such as name, address, SSN, and other PII, and generate a public copy of the tax return in PDF that is safe to share with others.

The source of the PDF can be a scanner or tax software. Is there an easy way to accomplish this?

Thanks, Dan

Dan A.
  • It would suffice for my needs if I could identify a list of all the IRS forms used in the tax returns. – Dan A. Sep 27 '17 at 02:06
  • This is a question better suited for Software Recs, but as a side note: there is something rather contradictory about sending sensitive-data documents to a third-party service in order to redact the sensitive information. – Patrick Gallot Sep 27 '17 at 15:13
  • I am providing taxcaddy at findtaxpro.com for storing sensitive info and want to share the "hygiened" PDF for estimates in the findtaxpro marketplace. So not contradictory. Do you mean software requisition? – Dan A. Sep 28 '17 at 04:56
  • *"Do you mean software requisition?"* - Software Recommendations, I presume: https://softwarerecs.stackexchange.com/ - stack overflow is meant for very specific problems *using a given API/library/service*, not for *finding* one. – mkl Sep 29 '17 at 10:39
  • How do you expect the software to recognize the sensitive information? In particular scanned pages might have additional or missing margins, they might even be scanned upside-down or something similar, and they usually are scanned a bit rotated. – mkl Oct 01 '17 at 21:00
  • I just want the form name. – Dan A. Oct 05 '17 at 22:44
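
For the narrower goal stated in the comments (listing the form names that appear in a return), here is a minimal sketch. It assumes the page text has already been extracted by some OCR or PDF-text tool, which is not shown here. The function name `find_form_names` and the regular expression are illustrative assumptions, not part of any IRS or library API, and the pattern will miss or over-match titles that are worded differently.

```python
import re

# Assumption: form titles appear as "Form 1040", "Form W-2", "Schedule C", etc.
# The pattern is case-sensitive and may produce false positives such as "Form A".
FORM_PATTERN = re.compile(r"\b(Form|Schedule)\s+([0-9A-Z][0-9A-Z-]*)\b")


def find_form_names(page_texts):
    """Return the distinct form/schedule names found, in first-seen order.

    page_texts: iterable of strings, one per page of already-extracted text.
    """
    found = []
    for text in page_texts:
        for kind, ident in FORM_PATTERN.findall(text):
            name = f"{kind} {ident}"
            if name not in found:
                found.append(name)
    return found


if __name__ == "__main__":
    # Hypothetical page text, for illustration only.
    pages = [
        "Form 1040 U.S. Individual Income Tax Return",
        "Schedule C Profit or Loss From Business",
        "Form W-2 Wage and Tax Statement",
    ]
    print(find_form_names(pages))
```

As noted in the comments, scanned pages may be rotated or upside-down, so the quality of the extracted text will largely decide how well a pattern like this works.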

1 Answer


There is a SaaS-based image storage and manipulation service called Cloudinary (cloudinary.com), which has an add-on that may help to redact text; see: https://cloudinary.com/documentation/ocr_text_detection_and_extraction_addon
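
Whichever OCR service is used, the detection step can be sketched independently of it. The following is a minimal sketch, assuming the OCR output is available as plain text; `mask_ssns` and the pattern are hypothetical names for illustration and are not part of the Cloudinary add-on. It only covers SSNs written with dashes or spaces, and it masks the extracted text, not the PDF itself. A real redaction would also have to remove or cover the matching regions in the page image or content stream.

```python
import re

# Assumption: SSNs appear as 123-45-6789 or 123 45 6789.
# Unseparated nine-digit numbers and EINs (12-3456789) are deliberately not matched.
SSN_PATTERN = re.compile(r"\b\d{3}[- ]\d{2}[- ]\d{4}\b")


def mask_ssns(text, mask="XXX-XX-XXXX"):
    """Replace every SSN-like token in the extracted text with a fixed mask."""
    return SSN_PATTERN.sub(mask, text)


if __name__ == "__main__":
    # Made-up sample text, for illustration only.
    print(mask_ssns("Your social security number 123-45-6789"))
```

Names and addresses have no fixed pattern, so they would need a different approach, for example masking the known field positions on each form.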

How are these files being presented? For example, are the PDF files viewed on the web, or via an application as images?

[I am not affiliated with Cloudinary]

Mark Redman