Maybe this question seems a bit strange, but it has a very practical use case.
Assume that we have arbitrarily selected a section of a PDF file to generate a checksum, such as the selected (highlighted) text in the following screenshot:
We then generate a checksum from the selected text using a hash function. We deliver (rather than send) the whole PDF file together with this checksum to a receiver, who does NOT know which section of the PDF file was selected and hashed. The receiver wants to verify the checksum, so they need to know exactly which section was selected and hashed. We therefore need a way for the receiver to find the exact position of the selected and hashed text.
Since a hash function is not reversible, the question is:
How can the receiver find exactly which text in the PDF file was selected and hashed?
For example, is it feasible to determine the exact location of the selected and hashed text in the PDF file? (This is very sensitive, since even a single wrong character or space makes the checksum verification fail.)
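To illustrate the sensitivity, here is a minimal Python sketch of one idea I am considering: normalizing the selected text before hashing, so that differences in whitespace or ligatures introduced by text extraction do not change the checksum. The function names `normalize` and `checksum` are my own, and the choice of SHA-256 and NFKC normalization is an assumption, not a requirement. It does not solve the problem of locating the text.

```python
import hashlib
import unicodedata


def normalize(text: str) -> str:
    """Reduce extraction noise before hashing (assumed normalization rules)."""
    # NFKC folds compatibility characters, e.g. the "fi" ligature or a
    # non-breaking space, into their plain equivalents.
    text = unicodedata.normalize("NFKC", text)
    # Collapse every run of whitespace (spaces, tabs, newlines) to one space
    # and drop leading/trailing whitespace.
    return " ".join(text.split())


def checksum(text: str) -> str:
    """SHA-256 hex digest of the normalized text."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
```

With this, `checksum("Owner:  Jane\u00a0Doe\n")` and `checksum("Owner: Jane Doe")` give the same value, while any change to the actual characters still gives a different one.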
Is there a reliable approach for this challenge?
Note 1: If the question is not clear enough, please let me know and I will explain it in more detail.
Important: Please note that because of limited storage space, we can only store the checksum value plus a small amount of data that indicates the position of the selected text; we cannot store the entire selected text.
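As a rough sketch of what "checksum plus limited position data" could look like, the Python below stores only a page index, a character offset, a length, and the digest. All names (`Anchor`, `make_anchor`, `verify`) are hypothetical, and it assumes that both sides can extract exactly the same text for each page, which I am not sure is guaranteed for PDF files.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Anchor:
    """Compact record to store instead of the selected text (hypothetical)."""
    page: int     # zero-based page index
    start: int    # character offset within the extracted page text
    length: int   # number of characters selected
    digest: str   # SHA-256 hex digest of the selected text


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_anchor(pages: List[str], page: int, start: int, length: int) -> Anchor:
    """Issuer side: hash pages[page][start:start+length] and record where it is."""
    selected = pages[page][start:start + length]
    return Anchor(page, start, length, _sha256(selected))


def verify(pages: List[str], anchor: Anchor) -> bool:
    """Verifier side: re-extract the same span and compare digests."""
    if not 0 <= anchor.page < len(pages):
        return False
    text = pages[anchor.page]
    end = anchor.start + anchor.length
    # Reject anchors that point outside the extracted page text.
    if anchor.start < 0 or anchor.length <= 0 or end > len(text):
        return False
    return _sha256(text[anchor.start:end]) == anchor.digest
```

For example, with `pages = ["Certificate of Completion", "Owner: Jane Doe, ID 12345"]`, calling `make_anchor(pages, 1, 7, 8)` records the span containing "Jane Doe", and `verify(pages, anchor)` succeeds only if that span is unchanged. The stored record is a few integers plus the digest, regardless of how long the selected text is.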
Use case: we intend to have a verifier check the integrity of selected texts in the document. The checksum, along with information that points to the hashed text, will be stored in the blockchain. Because storing data in the blockchain is costly, we cannot store the entire selected and hashed text there; instead we store only some information that points to its exact position. The verifier has access to the entire document but does not know which section of it was hashed, and needs to know this in order to verify the checksum.
For example, assume a prover has a paper certificate and needs to prove that he is its owner. He scans the certificate (digitizing it to any suitable format). The issuer of the certificate has selected some sensitive parts of it (e.g. owner info) and hashed each selected section separately to generate checksums. When the prover (owner) delivers the certificate to a verifier, the verifier needs to check all the checksums. At this step, he needs to know which parts of the certificate have been hashed. So we need to attach data to the checksums that lets the verifier find the hashed sections.
Please also note that the selected text is not recorded; it is only selected in order to generate the checksum. However, the verifier needs to know the content of this text to verify the checksum. The problem is that, because of the limits on storing data in the blockchain, we cannot store the entire hashed text; we can only store some information that points to its exact position.
Update: This question is related to (FREE Tool for watching coordinates in PDF), where a tool makes it possible to find the exact (x,y) coordinates of a selected text. I am not yet sure whether this tool can be used for my question.