I want to read a PDF file as a string.
I'm using File.ReadAllText(path), but the result ends at the first stream of binary data.
I think it interprets some part of the stream as the end of the file and stops.

Any idea how to solve this?

Amirali Amirifar
Carlo_Mava
  • You can extract text from a PDF using tools such as [IText7](https://www.nuget.org/packages/iText7/) – Jimi May 24 '22 at 11:20
  • Does this answer your question? [Base64 Encode a PDF in C#?](https://stackoverflow.com/questions/475421/base64-encode-a-pdf-in-c) – Troy Turley May 24 '22 at 13:34

1 Answer

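You cannot reliably read a PDF file as a string, because a PDF file contains binary data (for example, in its stream objects) in addition to characters. File.ReadAllText decodes the file's bytes using a text encoding, and arbitrary binary data does not survive that decoding intact. Either read the file into a byte array, or parse it, switching between reading text and reading binary data whenever you encounter a stream object in the PDF file.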

Some languages like PHP treat strings and byte arrays as interchangeable. That is not the case in C#.
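
As a minimal sketch of the byte-array approach, the following reads the file with File.ReadAllBytes, which applies no text decoding. The class name `PdfBytes` and its methods are hypothetical helpers made up for illustration; only `File.ReadAllBytes`, `Encoding.ASCII.GetBytes`, and `Convert.ToBase64String` are standard .NET calls. The signature check assumes the file begins with the usual `%PDF-` header, and the Base64 helper is only one possible way to get a lossless string if a string is truly required.

```csharp
using System;
using System.IO;
using System.Text;

public static class PdfBytes
{
    // Reads the whole file as raw bytes; no text decoding is applied,
    // so binary stream data is preserved exactly.
    public static byte[] Load(string path) => File.ReadAllBytes(path);

    // Checks whether the data starts with the "%PDF-" signature.
    // Assumption: the header sits at offset 0, as it normally does.
    public static bool LooksLikePdf(byte[] data)
    {
        byte[] magic = Encoding.ASCII.GetBytes("%PDF-");
        if (data.Length < magic.Length) return false;
        for (int i = 0; i < magic.Length; i++)
        {
            if (data[i] != magic[i]) return false;
        }
        return true;
    }

    // If a string is required (e.g. to embed the file in JSON),
    // Base64 encodes every byte losslessly.
    public static string ToBase64(byte[] data) => Convert.ToBase64String(data);
}
```

A call such as `byte[] data = PdfBytes.Load(path);` returns the complete file, including any bytes that a text decoder would mangle. `Convert.FromBase64String` reverses `ToBase64` when the original bytes are needed again. Extracting the readable text from the PDF is a separate task that needs a PDF library.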