55

I'm POSTing a file to a WCF REST service through a HTML form, with enctype set to multipart/form-data and a single component: <input type="file" name="data">. The resulting stream being read by the server contains the following:

------WebKitFormBoundary
Content-Disposition: form-data; name="data"; filename="DSCF0001.JPG"
Content-Type: image/jpeg

<file bytes>
------WebKitFormBoundary--

The problem is that I'm not sure how to extract the file bytes from the stream. I need to do this in order to write the file to disk.

rafale
  • possible duplicate of [WCF service to accept a post encoded multipart/form-data](http://stackoverflow.com/questions/1354749/wcf-service-to-accept-a-post-encoded-multipart-form-data) – Darin Dimitrov Sep 18 '11 at 07:26
  • @Darin: I'm not sure. My service already accepts multipart/form-data from POSTs, but reading the incoming stream and extracting the file bytes is what I'd like to do. – rafale Sep 18 '11 at 07:38
  • I am still facing issue in image upload using form-data http://stackoverflow.com/questions/39853604/getting-length-issue-of-stream-data-while-uploading-image-using-wcf-services-in – Dhruvit Modi Oct 04 '16 at 13:36

10 Answers

52

Sorry for joining the party late, but there is a way to do this with a public Microsoft API.

Here's what you need:

  1. System.Net.Http.dll
    • Included in .NET 4.5
    • For .NET 4 get it via NuGet
  2. System.Net.Http.Formatting.dll

Note: The NuGet packages come with more assemblies, but at the time of writing you only need the above.

Once you have the assemblies referenced, the code can look like this (using .NET 4.5 for convenience):

public static async Task ParseFiles(
    Stream data, string contentType, Action<string, Stream> fileProcessor)
{
    var streamContent = new StreamContent(data);
    streamContent.Headers.ContentType = MediaTypeHeaderValue.Parse(contentType);

    var provider = await streamContent.ReadAsMultipartAsync();

    foreach (var httpContent in provider.Contents)
    {
        var fileName = httpContent.Headers.ContentDisposition.FileName;
        if (string.IsNullOrWhiteSpace(fileName))
        {
            continue;
        }

        using (Stream fileContents = await httpContent.ReadAsStreamAsync())
        {
            fileProcessor(fileName, fileContents);
        }
    }
}

As for usage, say you have the following WCF REST method:

[OperationContract]
[WebInvoke(Method = WebRequestMethods.Http.Post, UriTemplate = "/Upload")]
void Upload(Stream data);

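You could implement it like so:

public void Upload(Stream data)
{
    // Block until parsing completes; otherwise the method returns (and the
    // request stream may be closed) before the files have been processed.
    MultipartParser.ParseFiles(
           data, 
           WebOperationContext.Current.IncomingRequest.ContentType, 
           MyProcessMethod).Wait();
}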
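
The answer leaves MyProcessMethod undefined. As a minimal sketch of what such a callback could look like, assuming you simply want each file written to a folder: the class name UploadStorage, the method name SaveToDisk and the target folder below are all hypothetical. The file name from the Content-Disposition header can arrive wrapped in quotes and may contain a client-side path, so both are stripped before use.

```csharp
using System;
using System.IO;

public static class UploadStorage
{
    // Hypothetical target folder; adjust for your service.
    private static readonly string UploadRoot =
        Path.Combine(Path.GetTempPath(), "uploads");

    // Matches Action<string, Stream>, so it can be passed as the fileProcessor.
    public static void SaveToDisk(string fileName, Stream contents)
    {
        // The header value may be quoted, e.g. "\"DSCF0001.JPG\"".
        string unquoted = fileName.Trim('"');

        // Keep only the last path segment so a client-supplied directory
        // cannot redirect the write outside UploadRoot.
        string safeName = Path.GetFileName(unquoted);
        if (string.IsNullOrEmpty(safeName))
        {
            throw new ArgumentException("No usable file name.", "fileName");
        }

        Directory.CreateDirectory(UploadRoot);
        using (FileStream output = File.Create(Path.Combine(UploadRoot, safeName)))
        {
            // Stream-to-stream copy; the whole file is never held in memory.
            contents.CopyTo(output);
        }
    }
}
```

With this in place, UploadStorage.SaveToDisk can be passed wherever MyProcessMethod appears above.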
Ohad Schneider
  • Will this method create a dependency with IIS? – Devela Feb 12 '14 at 03:06
  • It seems that you need to make the operation an async task too? Is there a way to consume the async method without having the whole chain be async? – Gerard ONeill Sep 26 '14 at 21:01
  • 3
    @GerardONeill you can always do Task.Wait() when you wish to go synchronous. For example: `ParseFiles(data, contentType, fileProcessor).Wait()`. Not sure why you'd want to do it though... – Ohad Schneider Sep 27 '14 at 09:21
  • This way I can plug this functionality into existing code without having to change the entire chain of methods. In my particular spot, I'm just learning about all of this (starting with parsing multipart data), and it helps learning if I don't have to learn how to do 12 things before I get one thing to work. – Gerard ONeill Sep 29 '14 at 11:17
  • I forgot to say Thanks ;). I was just letting you know the issue with the await operator, especially for a newbie. – Gerard ONeill Sep 29 '14 at 16:46
  • Hey, we've all been there :) When you're ready to tackle it, I recommend: https://www.google.com/search?q=site:blogs.msdn.com+%22Asynchrony+in+C%23%22 – Ohad Schneider Sep 29 '14 at 19:18
35

You may take a look at the following blog post which illustrates a technique that could be used to parse multipart/form-data on the server using the Multipart Parser:

public void Upload(Stream stream)
{
    MultipartParser parser = new MultipartParser(stream);
    if (parser.Success)
    {
        // Save the file
        SaveFile(parser.Filename, parser.ContentType, parser.FileContents);
    }
}

Another possibility is to enable aspnet compatibility and use HttpContext.Current.Request but that's not a very WCFish way.

Darin Dimitrov
  • 1
    What's the matter with using aspnet compatibility? – rafale Sep 18 '11 at 18:56
  • 3
    @rafale, the matter is that if tomorrow you decide to host your WCF service in something other than IIS (like a Windows Service, or whatever) you will have to rewrite it. Other than that, nothing wrong. – Darin Dimitrov Sep 18 '11 at 18:58
  • I forgot to mention that my WCF service is being hosted outside of an IIS as a Managed Windows Service. I'm guessing this means aspnet compatibility isn't available to me either way, correct? – rafale Sep 18 '11 at 19:11
  • @rafale, absolutely, no aspnet compatibility if you host in a Windows Service. So you will have to parse this `multipart/form-data` stream and I am afraid there is nothing built-in .NET that might help you. A third party parser is probably your best bet. – Darin Dimitrov Sep 18 '11 at 19:14
  • This is a little off topic, but is the `Content-Type: image/jpeg` property in the header detected by the browser submitting the form? – rafale Sep 18 '11 at 19:39
  • @rafale, I don't understand your question. It's browser sending this request to the server. What detection do you mean? It's up to the server to parse the HTTP request. – Darin Dimitrov Sep 18 '11 at 19:40
  • The server receives `Content-Type: image/jpeg` as part of the stream (see original post), and I was just wondering where that comes from. – rafale Sep 18 '11 at 19:43
  • @rafale, it's the browser that formats the request like this because the user selected a jpeg file to upload. – Darin Dimitrov Sep 18 '11 at 19:44
  • What would I need to put in place of SaveFile()? I'm confused there. – guiomie Feb 06 '13 at 16:06
  • @guiomie, that would depend on what you want to do with the file and where you want to save it. – Darin Dimitrov Feb 06 '13 at 16:10
  • 2
    File.WriteAllBytes(filepath, parser.FileContents) is what I wanted to do (save the file on the disk) – guiomie Feb 06 '13 at 16:27
  • 1
    Multipart parser is LGPL. The one by @Lorenzo Polidori, besides being more complete, is also MIT licensed (far more permissive). – Ohad Schneider Dec 25 '13 at 14:22
  • That multi-parser is glorious! Thanks! – nickvans Aug 05 '14 at 23:19
  • Is it not possible to reference System.Web stuff from ASP.NET, hand over the inputStream to its components and retrieve parsed fields? – HaLeiVi Jul 13 '17 at 04:47
28

I've had some issues with parsers that are based on string parsing, particularly with large files: I found they would run out of memory and fail to parse binary data.

To cope with these issues I've open sourced my own attempt at a C# multipart/form-data parser here

Features:

  • Handles very large files well. (Data is streamed in and streamed out while reading)
  • Can handle multiple file uploads and automatically detects if a section is a file or not.
  • Returns files as a stream not as a byte[] (good for large files).
  • Full documentation for the library including a MSDN-style generated website.
  • Full unit tests.

Restrictions:

  • Doesn't handle non-multipart data.
  • Code is more complicated than Lorenzo's

Just use the MultipartFormDataParser class like so:

Stream data = GetTheStream();

// Boundary is auto-detected but can also be specified.
var parser = new MultipartFormDataParser(data, Encoding.UTF8);

// The stream is parsed, if it failed it will throw an exception. Now we can use
// your data!

// The key of these maps corresponds to the name field in your
// form
string username = parser.Parameters["username"].Data;
string password = parser.Parameters["password"].Data;

// Single file access:
var file = parser.Files.First();
string filename = file.FileName;
Stream fileData = file.Data; // not "data": that name is already declared above

// Multi-file access
foreach(var f in parser.Files)
{
    // Do stuff with each file.
}

In the context of a WCF service you could use it like this:

public ResponseClass MyMethod(Stream multipartData)
{
    // First we need to get the boundary from the header, this is sent
    // with the HTTP request. We can do that in WCF using the WebOperationContext:
    var type = WebOperationContext.Current.IncomingRequest.Headers["Content-Type"];

    // Now we want to strip the boundary out of the Content-Type, currently the string
    // looks like: "multipart/form-data; boundary=---------------------124123qase124"
    var boundary = type.Substring(type.IndexOf('=')+1);

    // Now that we've got the boundary we can parse our multipart and use it as normal
    var parser = new MultipartFormDataParser(multipartData, boundary, Encoding.UTF8);

    ...
}

Or like this (slightly slower but more code friendly):

public ResponseClass MyMethod(Stream multipartData)
{
    var parser = new MultipartFormDataParser(multipartData, Encoding.UTF8);
}
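
Taking everything after the first '=' in the Content-Type assumes that boundary is the only parameter and that it is unquoted. As a hedged alternative sketch (not part of the library above), the header can be parsed with MediaTypeHeaderValue from System.Net.Http, which also copes with extra parameters such as charset and with quoted values. The helper name ContentTypeHelper.GetBoundary is made up for illustration.

```csharp
using System;
using System.Linq;
using System.Net.Http.Headers;

public static class ContentTypeHelper
{
    // Returns the boundary parameter of a multipart Content-Type header value.
    public static string GetBoundary(string contentType)
    {
        MediaTypeHeaderValue mediaType = MediaTypeHeaderValue.Parse(contentType);

        // Parameter names are case-insensitive.
        NameValueHeaderValue boundary = mediaType.Parameters.FirstOrDefault(
            p => string.Equals(p.Name, "boundary", StringComparison.OrdinalIgnoreCase));

        if (boundary == null || string.IsNullOrEmpty(boundary.Value))
        {
            throw new FormatException("Content-Type has no boundary parameter.");
        }

        // A quoted-string value keeps its surrounding quotes; remove them.
        return boundary.Value.Trim('"');
    }
}
```

The result can then be passed as the boundary argument in the first WCF example.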

Documentation is also available: when you clone the repository, simply navigate to HttpMultipartParserDocumentation/Help/index.html

Jake Woods
  • 2
    Finally it works. I really appreciate your great work, thanks a lot @Jake Woods. – Frank Myat Thu Feb 17 '14 at 14:36
  • 5
    Worth noting, that this parser is also available via Nuget: [HttpMultipartParser](https://www.nuget.org/packages/HttpMultipartParser/). (in case someone, like me, just did ctrl+f looking for nuget :)) – Bartek Eborn Mar 10 '16 at 09:36
15

I open-sourced a C# Http form parser here.

This is slightly more flexible than the other one mentioned which is on CodePlex, since you can use it for both Multipart and non-Multipart form-data, and also it gives you other form parameters formatted in a Dictionary object.

This can be used as follows:

non-multipart

public void Login(Stream stream)
{
    string username = null;
    string password = null;

    HttpContentParser parser = new HttpContentParser(stream);
    if (parser.Success)
    {
        username = HttpUtility.UrlDecode(parser.Parameters["username"]);
        password = HttpUtility.UrlDecode(parser.Parameters["password"]);
    }
}

multipart

public void Upload(Stream stream)
{
    HttpMultipartParser parser = new HttpMultipartParser(stream, "image");

    if (parser.Success)
    {
        string user = HttpUtility.UrlDecode(parser.Parameters["user"]);
        string title = HttpUtility.UrlDecode(parser.Parameters["title"]);

        // Save the file somewhere
        File.WriteAllBytes(FILE_PATH + title + FILE_EXT, parser.FileContents);
    }
}
Matthew James Davis
Lorenzo Polidori
3

Another way would be to use the .NET parser for HttpRequest. To do that you need a bit of reflection and a simple class for the worker request.

First create a class that derives from HttpWorkerRequest (for simplicity you can use SimpleWorkerRequest):

public class MyWorkerRequest : SimpleWorkerRequest
{
    private readonly string _size;
    private readonly Stream _data;
    private string _contentType;

    public MyWorkerRequest(Stream data, string size, string contentType)
        : base("/app", @"c:\", "aa", "", null)
    {
        _size = size ?? data.Length.ToString(CultureInfo.InvariantCulture);
        _data = data;
        _contentType = contentType;
    }

    public override string GetKnownRequestHeader(int index)
    {
        switch (index)
        {
            case (int)HttpRequestHeader.ContentLength:
                return _size;
            case (int)HttpRequestHeader.ContentType:
                return _contentType;
        }
        return base.GetKnownRequestHeader(index);
    }

    public override int ReadEntityBody(byte[] buffer, int offset, int size)
    {
        return _data.Read(buffer, offset, size);
    }

    public override int ReadEntityBody(byte[] buffer, int size)
    {
        return ReadEntityBody(buffer, 0, size);
    }
}

Then, wherever you have your message stream, create an instance of this class. This is how I do it in a WCF service:

[WebInvoke(Method = "POST",
               ResponseFormat = WebMessageFormat.Json,
               BodyStyle = WebMessageBodyStyle.Bare)]
    public string Upload(Stream data)
    {
        HttpWorkerRequest workerRequest =
            new MyWorkerRequest(data,
                                WebOperationContext.Current.IncomingRequest.ContentLength.
                                    ToString(CultureInfo.InvariantCulture),
                                WebOperationContext.Current.IncomingRequest.ContentType
                );

And then create the HttpRequest using Activator and the non-public constructor:

var r = (HttpRequest)Activator.CreateInstance(
            typeof(HttpRequest),
            BindingFlags.Instance | BindingFlags.NonPublic,
            null,
            new object[]
                {
                    workerRequest,
                    new HttpContext(workerRequest)
                },
            null);

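// The parser spools large uploads to disk, which needs HttpRuntime's code
// generation directory. These private fields are implementation details, so
// every lookup is checked; a failed lookup means large files may not parse.
var runtimeField = typeof (HttpRuntime).GetField("_theRuntime", BindingFlags.Static | BindingFlags.NonPublic);
if (runtimeField == null)
{
    return null;
}

var runtime = (HttpRuntime) runtimeField.GetValue(null);
if (runtime == null)
{
    return null;
}

var codeGenDirField = typeof(HttpRuntime).GetField("_codegenDir", BindingFlags.Instance | BindingFlags.NonPublic);
if (codeGenDirField == null)
{
    return null;
}

codeGenDirField.SetValue(runtime, @"C:\MultipartTemp");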

After that, r.Files will contain the files from your stream.

Ohad Schneider
Lukasz S
  • As long as a .NET update doesn't stop it from working, this is by far the best solution (ASP.NET is probably more robust and reliable than existing open source implementations, as good as they may be). – Ohad Schneider Dec 26 '13 at 15:29
  • 1
    I've edited your answer with some necessary code (it may crash for certain files otherwise) – Ohad Schneider Dec 29 '13 at 20:10
  • could you say a bit more about what kind of files can cause it to crash? and returns that you've put there does it mean that we cannot create request or maybe just r.Files will throw an exception ? – Lukasz S Jan 07 '14 at 12:14
  • The parser has an optimization for large files, where it stores them on disk. I've populated the necessary fields for this optimization to work. You can control this threshold via the `requestLengthDiskThreshold` configuration element in your app.config. Another config element you might want to change is `maxrequestlength`. The return statements may as well be exceptions: they mean our reflection failed and we may very well fail on big files. – Ohad Schneider Jan 07 '14 at 14:24
2

The guy who solved this posted it as LGPL and you're not allowed to modify it. I didn't even click on it when I saw that. Here's my version. This needs to be tested. There are probably bugs. Please post any updates. No warranty. You can modify this all you want, call it your own, print it out on a piece of paper and use it for kennel scrap, ... don't care.

using System.Collections.Generic;
using System.Collections.Specialized;
using System.IO;
using System.Net;
using System.Text;
using System.Web;

namespace DigitalBoundaryGroup
{
    class HttpNameValueCollection
    {
        public class File
        {
            private string _fileName;
            public string FileName { get { return _fileName ?? (_fileName = ""); } set { _fileName = value; } }

            private string _fileData;
            public string FileData { get { return _fileData ?? (_fileData = ""); } set { _fileData = value; } }

            private string _contentType;
            public string ContentType { get { return _contentType ?? (_contentType = ""); } set { _contentType = value; } }
        }

        private NameValueCollection _post;
        private Dictionary<string, File> _files;
        private readonly HttpListenerContext _ctx;

        public NameValueCollection Post { get { return _post ?? (_post = new NameValueCollection()); } set { _post = value; } }
        public NameValueCollection Get { get { return _ctx.Request.QueryString; } }
        public Dictionary<string, File> Files { get { return _files ?? (_files = new Dictionary<string, File>()); } set { _files = value; } }

        private void PopulatePostMultiPart(string post_string)
        {
            var boundary_index = _ctx.Request.ContentType.IndexOf("boundary=") + 9;
            var boundary = _ctx.Request.ContentType.Substring(boundary_index, _ctx.Request.ContentType.Length - boundary_index);

            var upper_bound = post_string.Length - 4;

            if (post_string.Substring(2, boundary.Length) != boundary)
                throw (new InvalidDataException());

            var current_string = new StringBuilder();

            for (var x = 4 + boundary.Length; x < upper_bound; ++x)
            {
                if (post_string.Substring(x, boundary.Length) == boundary)
                {
                    x += boundary.Length + 1;

                    var post_variable_string = current_string.Remove(current_string.Length - 4, 4).ToString();

                    var end_of_header = post_variable_string.IndexOf("\r\n\r\n");

                    if (end_of_header == -1) throw (new InvalidDataException());

                    var filename_index = post_variable_string.IndexOf("filename=\"", 0, end_of_header);
                    var filename_starts = filename_index + 10;
                    var content_type_starts = post_variable_string.IndexOf("Content-Type: ", 0, end_of_header) + 14;
                    var name_starts = post_variable_string.IndexOf("name=\"") + 6;
                    var data_starts = end_of_header + 4;

                    if (filename_index != -1)
                    {
                        var filename = post_variable_string.Substring(filename_starts, post_variable_string.IndexOf("\"", filename_starts) - filename_starts);
                        var content_type = post_variable_string.Substring(content_type_starts, post_variable_string.IndexOf("\r\n", content_type_starts) - content_type_starts);
                        var file_data = post_variable_string.Substring(data_starts, post_variable_string.Length - data_starts);
                        var name = post_variable_string.Substring(name_starts, post_variable_string.IndexOf("\"", name_starts) - name_starts);
                        Files.Add(name, new File() { FileName = filename, ContentType = content_type, FileData = file_data });
                    }
                    else
                    {
                        var name = post_variable_string.Substring(name_starts, post_variable_string.IndexOf("\"", name_starts) - name_starts);
                        var value = post_variable_string.Substring(data_starts, post_variable_string.Length - data_starts);
                        Post.Add(name, value);
                    }

                    current_string.Clear();
                    continue;
                }

                current_string.Append(post_string[x]);
            }
        }

        private void PopulatePost()
        {
            if (_ctx.Request.HttpMethod != "POST" || _ctx.Request.ContentType == null) return;

            var post_string = new StreamReader(_ctx.Request.InputStream, _ctx.Request.ContentEncoding).ReadToEnd();

            if (_ctx.Request.ContentType.StartsWith("multipart/form-data"))
                PopulatePostMultiPart(post_string);
            else
                Post = HttpUtility.ParseQueryString(post_string);

        }

        public HttpNameValueCollection(ref HttpListenerContext ctx)
        {
            _ctx = ctx;
            PopulatePost();
        }


    }
}
Bluebaron
1

I have implemented a MultipartReader NuGet package for ASP.NET 4 for reading multipart form data. It is based on Multipart Form Data Parser, but it supports more than one file.

Václav Dajbych
1

How about some Regex?

I wrote this for a text file, but I believe it could work for you.

(If your text file contains lines that start exactly like the "matched" ones below, simply adapt the Regex.)

    private static List<string> fileUploadRequestParser(Stream stream)
    {
        //-----------------------------111111111111111
        //Content-Disposition: form-data; name="file"; filename="data.txt"
        //Content-Type: text/plain
        //...
        //...
        //-----------------------------111111111111111
        //Content-Disposition: form-data; name="submit"
        //Submit
        //-----------------------------111111111111111--

        List<String> lstLines = new List<string>();
        TextReader textReader = new StreamReader(stream);
        string sLine = textReader.ReadLine();
        Regex regex = new Regex("(^-+)|(^content-)|(^$)|(^submit)", RegexOptions.IgnoreCase | RegexOptions.Compiled | RegexOptions.Singleline);

        while (sLine != null)
        {
            if (!regex.Match(sLine).Success)
            {
                lstLines.Add(sLine);
            }
            sLine = textReader.ReadLine();
        }

        return lstLines;
    }
mork
0

I have dealt with large file uploads (several GB) in WCF, where storing the data in memory is not an option. My solution is to write the message stream to a temp file and then seek through that file to find where the binary data begins and ends.
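
The answer gives no code, so the following is only a rough sketch of the idea under strong assumptions: the body contains exactly one part, and the closing delimiter is followed by a final CRLF (as browsers send it). The names SinglePartSpooler and ExtractSingleFile and the 64 KB header limit are invented for illustration.

```csharp
using System;
using System.IO;

public static class SinglePartSpooler
{
    private const int MaxHeaderBytes = 64 * 1024; // assumed upper bound for part headers

    public static void ExtractSingleFile(Stream body, string destinationPath)
    {
        string tempPath = Path.GetTempFileName();
        try
        {
            using (var temp = new FileStream(tempPath, FileMode.Open, FileAccess.ReadWrite))
            {
                body.CopyTo(temp); // spool to disk with a fixed-size buffer
                temp.Position = 0;

                long firstLineEnd = -1; // offset of the first CRLF = delimiter length
                long dataStart = -1;    // offset just past the blank line
                int matched = 0;        // how much of CR LF CR LF has matched so far
                int b;
                while (temp.Position < MaxHeaderBytes && (b = temp.ReadByte()) != -1)
                {
                    bool expected = b == (matched % 2 == 0 ? '\r' : '\n');
                    matched = expected ? matched + 1 : (b == '\r' ? 1 : 0);
                    if (matched == 2 && firstLineEnd < 0) firstLineEnd = temp.Position - 2;
                    if (matched == 4) { dataStart = temp.Position; break; }
                }
                if (dataStart < 0) throw new InvalidDataException("Part header not found.");

                // Trailer is CRLF + delimiter + "--" + CRLF.
                long dataEnd = temp.Length - (firstLineEnd + 6);
                if (dataEnd < dataStart) throw new InvalidDataException("Body too short.");

                temp.Seek(dataStart, SeekOrigin.Begin);
                using (FileStream output = File.Create(destinationPath))
                {
                    var buffer = new byte[81920];
                    long remaining = dataEnd - dataStart;
                    while (remaining > 0)
                    {
                        int read = temp.Read(buffer, 0, (int)Math.Min(buffer.Length, remaining));
                        if (read == 0) throw new EndOfStreamException();
                        output.Write(buffer, 0, read);
                        remaining -= read;
                    }
                }
            }
        }
        finally
        {
            File.Delete(tempPath);
        }
    }
}
```

A real implementation would search for the delimiter itself rather than infer the end from the file length, which is what makes multiple parts possible.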

Yang Zhang
0

In my case (as for the OP) I always have just one part returned. So here is a quick and dirty way to handle the simplest case.

/*
 * If the byte array has a paragraph of text at the beginning, remove it, and also 
 * remove a number of bytes at the end equal to the length of the first line.
 * Otherwise, just return the rawContent.
 */
public static byte[] RemoveMultipartWrapper(byte[] rawContent)
{
    var i1 = 0;
    var i2 = 0;
    for (var i = 1; i + i1 + 6 < rawContent.Length && i < 1000; i++)
    {
        if (rawContent[i] == 13 && rawContent[i + 1] == 10)  // search for CRLF
        {
            if (i1 == 0) i1 = i;  // first line break
            else if (i == i2 + 2)  // empty line
                return rawContent.Skip(i + 2).Take(rawContent.Length - i1 - i - 8).ToArray();
            i2 = i;
        }
    }
    return rawContent;   // no wrapper found
}
John Henckel