PDF upload encoding issue

Question

I'll get straight to the point: how can I upload PDF files from a C# backend to an HTTP web service in a multipart/form-data request without the contents being mangled to the point that the file becomes unreadable? The web service documentation only states that text files should be text/plain and image files should be binary; PDF files are only mentioned as "also supported", with no indication of what format or encoding they should be in.

The code I'm using to create the request:

HttpWebRequest request;
string boundary = "---------------------------" + DateTime.Now.Ticks.ToString("x");
request.ContentType = "multipart/form-data; boundary=" + boundary;

using (StreamWriter sw = new StreamWriter(request.GetRequestStream())) {
    // opening boundary and part headers
    sw.WriteLine("--" + boundary);
    sw.WriteLine("Content-Disposition: form-data; name=\"files\"; filename=\"" + Path.GetFileName(filePath) + "\"");
    sw.WriteLine(filePath.EndsWith(".pdf") ? "Content-Type: application/pdf" : "Content-Type: text/plain");
    sw.WriteLine(); // blank line separates the headers from the content

    if (filePath.EndsWith(".pdf")) {
        // write PDF content into the request stream
    }
    else sw.WriteLine(File.ReadAllText(filePath));

    // closing boundary
    sw.Write("--" + boundary);
    sw.Write("--");
    sw.Flush();
}

For simple text files, this code works just fine. However, I have trouble uploading a PDF file.

Writing the file into the request body using StreamWriter.WriteLine with either File.ReadAllText or Encoding.UTF8.GetString(File.ReadAllBytes) results in the uploaded file being unreadable, because .NET replaces every byte sequence that is not valid UTF-8 with a placeholder character shown as a square (which somehow also increased the file size by over 100 kB). UTF-7 and ANSI give the same result, but UTF-8 produces the closest match to the original file's contents.
Writing the file into the request body as binary data using either BinaryWriter or Stream.Write results in the web service rejecting the request outright as invalid POST data. Adding Content-Transfer-Encoding: binary (which the documentation indicates is necessary for application/http, hence why I tried it) also causes rejection.

What alternative options are available? How can I encode a PDF without .NET silently replacing the invalid bytes with placeholder characters? Note that I have no control over what kind of content the web service accepts; if I did, I'd already have moved on to base64.

Have you tried using `HttpClient` with `MultipartFormDataContent`? — Selvin, Oct 09 '19 at 14:43
https://stackoverflow.com/questions/2934295/c-sharp-save-a-file-from-a-http-request — David Tansey, Oct 09 '19 at 15:01
@DavidTansey Stream.CopyTo() does not work either; the web service rejects it as invalid POST data. — amitakartok, Oct 09 '19 at 15:10

score 0 · Answer 1 · answered Oct 10 '19 at 14:08

Problem solved, my bad. The multipart form header and the binary data were both correct but ended up in the wrong order, because I didn't Flush() the StreamWriter before writing the binary data into the request stream with Stream.CopyTo(). The header was still sitting in the writer's buffer when the file bytes went out.
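
The corrected order can be sketched roughly as follows. This is a minimal sketch and not the exact code from the post: the MultipartSketch class and its WriteFilePart helper are hypothetical names, the form field name "files" is taken from the question, and forcing CRLF line endings is an assumption about what the service expects.

```csharp
using System;
using System.IO;
using System.Text;

static class MultipartSketch
{
    // Hypothetical helper (not from the original post): writes one file part
    // followed by the closing boundary into `body`.
    public static void WriteFilePart(Stream body, string boundary, string fileName,
                                     string contentType, Stream fileContent)
    {
        // leaveOpen: true so that disposing the writer does not close `body`.
        using (var sw = new StreamWriter(body, new UTF8Encoding(false), 1024, leaveOpen: true))
        {
            sw.NewLine = "\r\n"; // assumption: the service wants CRLF on every platform

            sw.WriteLine("--" + boundary);
            sw.WriteLine("Content-Disposition: form-data; name=\"files\"; filename=\"" + fileName + "\"");
            sw.WriteLine("Content-Type: " + contentType);
            sw.WriteLine();
            sw.Flush(); // push the buffered headers out BEFORE writing to the stream directly

            fileContent.CopyTo(body); // raw bytes, no text decoding involved

            sw.WriteLine(); // line break that ends the part's content
            sw.WriteLine("--" + boundary + "--");
            sw.Flush();
        }
    }
}
```

With the variables from the question, a call would look something like MultipartSketch.WriteFilePart(request.GetRequestStream(), boundary, Path.GetFileName(filePath), "application/pdf", File.OpenRead(filePath)), with both streams wrapped in using blocks.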

Moral of the story: if you're writing into the same Stream with more than one Writer at the same time, always Flush() before doing anything with the next Writer.
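
To see the reordering in isolation, here is a small in-memory demonstration. It is only an illustrative sketch: FlushOrderDemo and Run are made-up names, and a MemoryStream stands in for the request stream.

```csharp
using System;
using System.IO;
using System.Text;

static class FlushOrderDemo
{
    // Writes "HEADER" through a StreamWriter and "BODY" directly to the stream,
    // optionally flushing in between, and returns what the stream received.
    public static string Run(bool flushBeforeDirectWrite)
    {
        var stream = new MemoryStream();
        var writer = new StreamWriter(stream, Encoding.ASCII);

        writer.Write("HEADER");             // stays in the writer's buffer for now
        if (flushBeforeDirectWrite)
            writer.Flush();                 // buffered text reaches the stream here

        byte[] body = Encoding.ASCII.GetBytes("BODY");
        stream.Write(body, 0, body.Length); // bypasses the writer entirely

        writer.Flush();                     // any leftover buffered text lands after BODY
        return Encoding.ASCII.GetString(stream.ToArray());
    }

    static void Main()
    {
        Console.WriteLine(Run(flushBeforeDirectWrite: false)); // BODYHEADER
        Console.WriteLine(Run(flushBeforeDirectWrite: true));  // HEADERBODY
    }
}
```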
