I am trying to send a file over HTTP using MultipartEntityBuilder. We pass the filename as a String argument to addBinaryBody, as shown below. The problem is that the filename contains special characters, for example:

"gültig/Kapitel/00/00/SOPs/SOP/sop123.pdf"

But when it is sent over HTTP, it arrives as

"g?ltig/Kapitel/00/00/SOPs/SOP/sop003986.pdf"

I tried URLDecoder and new String(bytes, StandardCharsets.UTF_8), but neither works. Please suggest a solution.

Expected result:

The special characters should be sent as "gültig" instead of "g?ltig".

MultipartEntityBuilder builder = MultipartEntityBuilder.create();
builder.addTextBody("index", docbase_name.toLowerCase() + "_content_index");
builder.addBinaryBody("file", fileContent, ContentType.MULTIPART_FORM_DATA, filename);
HttpEntity multipart = builder.build();
HttpPost request = new HttpPost(
        "http://" + utility.getIp()
            + ":" + utility.getPort() + "/fscrawler/_upload");
request.setEntity(multipart);
Nick
sadhiya usama
  • It's impressive how even the number at the end changes due to encoding problems ;-) – Joachim Sauer Jan 22 '20 at 07:07
  • Oh yeah! I gave a different example in the second line :) but the issue is that I am getting "g?ltig" instead of "gültig". – sadhiya usama Jan 22 '20 at 08:42
  • I'm pretty sure you can solve the issue by specifying which encoding to use to send your data. How exactly to do that depends on the API you use, and I don't know this one by heart. Trying to construct your own `String` is almost certainly the wrong approach here. – Joachim Sauer Jan 22 '20 at 08:45
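
To illustrate the comment above, here is a small stdlib-only sketch of why re-wrapping the `String` cannot help. It assumes (the thread does not confirm this) that the part headers end up being encoded with a charset such as US-ASCII that has no mapping for "ü". The class and method names are hypothetical and exist only for this demonstration.

```java
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {
    // Encodes to US-ASCII and back. String.getBytes(Charset) substitutes the
    // charset's replacement byte ('?' for US-ASCII) for unmappable characters.
    static String viaAscii(String s) {
        return new String(s.getBytes(StandardCharsets.US_ASCII), StandardCharsets.US_ASCII);
    }

    // Encodes to UTF-8 and decodes as UTF-8 again: a no-op round trip that
    // returns an equal String and changes nothing about how it is sent later.
    static String rewrap(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "g\u00fcltig"; // "gültig", escaped so the source file encoding does not matter
        System.out.println(viaAscii(name));            // the umlaut is lost at encoding time
        System.out.println(rewrap(name).equals(name)); // the re-wrapped String is unchanged
    }
}
```

A Java `String` carries no charset of its own, so the loss happens only at the point where the characters are turned into bytes for the wire. That is the step where the encoding has to be chosen.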

1 Answer

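The filename should be passed to addBinaryBody as an RFC 2047 encoded word. You can use MimeUtility from the JavaMail API to do the encoding (see, for example, the similar question "How to put and get UTF-8 string from HTTP header with jersey?"). The receiving side then has to decode the name, for instance with MimeUtility.decodeText: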

builder.addBinaryBody("file", fileContent, ContentType.MULTIPART_FORM_DATA, MimeUtility.encodeText(filename));

The setEntity call then becomes:

        request.setEntity(
                MultipartEntityBuilder.create()
                        .addTextBody("index", docbase_name.toLowerCase() + "_content_index")
                        .addBinaryBody("file",
                                fileContent,
                                ContentType.MULTIPART_FORM_DATA,
                                MimeUtility.encodeText(filename))
                        .build());
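
To make the RFC 2047 encoding mentioned above less abstract, the following stdlib-only sketch builds and parses a single "B" (Base64) encoded word by hand. It is not MimeUtility's implementation and may not produce the same string as `MimeUtility.encodeText` (which can also choose the "Q" form and splits long input). The class `EncodedWordSketch` and its methods are hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class EncodedWordSketch {
    private static final String PREFIX = "=?UTF-8?B?";
    private static final String SUFFIX = "?=";

    // Wraps the text as one RFC 2047 "B" encoded word: =?charset?B?base64?=
    // Simplification: ignores the 75-character limit per encoded word.
    static String encodeWord(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return PREFIX + Base64.getEncoder().encodeToString(utf8) + SUFFIX;
    }

    // Reverses encodeWord; anything that is not a UTF-8 "B" word is returned as-is.
    static String decodeWord(String word) {
        if (word.length() < PREFIX.length() + SUFFIX.length()
                || !word.startsWith(PREFIX) || !word.endsWith(SUFFIX)) {
            return word;
        }
        String b64 = word.substring(PREFIX.length(), word.length() - SUFFIX.length());
        return new String(Base64.getDecoder().decode(b64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String filename = "g\u00fcltig/file.txt"; // "gültig/file.txt"
        String encoded = encodeWord(filename);
        System.out.println(encoded);             // pure ASCII, safe in a header
        System.out.println(decodeWord(encoded)); // original name restored
    }
}
```

The point of the scheme is that the header value on the wire contains only ASCII characters, so no charset used for the multipart headers can damage it. The cost is that the receiver must know to decode it.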

Complete test case:

package com.github.vtitov.test;

import com.google.common.base.Charsets;
import com.google.common.io.CharStreams;
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.mime.MultipartEntityBuilder;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;
import org.glassfish.jersey.logging.LoggingFeature;
import org.glassfish.jersey.media.multipart.FormDataBodyPart;
import org.glassfish.jersey.media.multipart.FormDataMultiPart;
import org.glassfish.jersey.server.ResourceConfig;
import org.glassfish.jersey.test.JerseyTest;
import org.glassfish.jersey.test.TestProperties;
import org.junit.Test;

import javax.mail.internet.MimeUtility;
import javax.ws.rs.Consumes;
import javax.ws.rs.GET;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.core.Application;

import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedList;
import java.util.List;
import java.util.UUID;
import java.util.logging.Formatter;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.notNullValue;

public class RestTest extends JerseyTest {
    private static final Logger log = Logger.getLogger(RestTest.class.getName());

    @Path("fscrawler")
    public static class FscrawlerResource {
        @POST
        @Consumes("multipart/form-data")
        @Path("_upload")
        public String postToString(final FormDataMultiPart multiPart) throws Exception {
            List<String> fileNames = new LinkedList<>();
            try {
                for(FormDataBodyPart f:multiPart.getFields().get("file")) {
                    fileNames.add(MimeUtility.decodeText(f.getContentDisposition().getFileName()));
                }
            } catch (Exception e) {
                log.log(Level.SEVERE, "server error: ", e);
                throw e;
            }
            return String.join(",", fileNames);
        }
    }

    @Override
    protected Application configure() {
        forceSet(TestProperties.CONTAINER_PORT, "0");
        set(TestProperties.RECORD_LOG_LEVEL, Level.FINE.intValue());
        return new ResourceConfig(FscrawlerResource.class)
                .register(LoggingFeature.class)
                .register(org.glassfish.jersey.media.multipart.MultiPartFeature.class)
                ;
    }

    @Test
    public void multipart() throws IOException {
        String baseUri = target().getUri().toString();
        String docbase_name = UUID.randomUUID().toString();
        byte[] fileContent = UUID.randomUUID().toString().getBytes(StandardCharsets.UTF_8);
        String  filename = "gültig/file.txt";

        HttpPost request = new HttpPost(baseUri + "fscrawler/_upload");
        request.setEntity(
                MultipartEntityBuilder.create()
                        .addTextBody("index", docbase_name.toLowerCase() + "_content_index")
                        .addBinaryBody("file",
                                fileContent,
                                ContentType.MULTIPART_FORM_DATA,
                                MimeUtility.encodeText(filename))
                        .build());

        log.info("executing request " + request.getRequestLine());
        try(CloseableHttpClient httpclient = HttpClients.createDefault();
            CloseableHttpResponse response = httpclient.execute(request)
        ) {
            log.info(String.format("response: %s", response.toString()));
            HttpEntity resEntity = response.getEntity();
            assertThat(resEntity, notNullValue());
            if (resEntity != null) {
                log.info("Response content length: " + resEntity.getContentLength());
                String resContent = CharStreams.toString(new InputStreamReader(resEntity.getContent(), Charsets.UTF_8));
                log.info(String.format("response content: %s", resContent));
                assertThat("filename matches", resContent, equalTo(filename));
            }
            EntityUtils.consume(resEntity);
        } catch (IOException e) {
            dumpServerLogRecords();
            throw e;
        }
        dumpServerLogRecords();
    }

    void dumpServerLogRecords() {
        log.info(String.format("total server log records: %s", getLoggedRecords().size()));
        Formatter sf = new SimpleFormatter();
        getLoggedRecords().forEach(r -> {
            log.info(String.format("server log record\n%s", sf.format(r)));
        });

    }
}

You can enable logging to see requests, responses and processing:

mvn test \
  -Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.SimpleLog \
  -Dorg.apache.commons.logging.simplelog.showdatetime=true \
  -Dorg.apache.commons.logging.simplelog.log.org.apache.http=DEBUG \
  -Dorg.apache.commons.logging.simplelog.log.org.apache.http.wire=DEBUG
y_ug
  • I cannot use MimeUtility encoding because the filename we put into fscrawler is used by a different server, and we cannot decode it there. I tried `new String(filename.getBytes(), StandardCharsets.UTF_8)`, but this doesn't work either. – sadhiya usama Jan 23 '20 at 09:42
  • I understand that you are using `fscrawler`. It's not clear, though, whether you can patch your `fscrawler` installation. Vanilla `fscrawler` uses `ISO_8859_1` ([see here](https://github.com/dadoonet/fscrawler/blob/fscrawler-2.6/rest/src/main/java/fr/pilato/elasticsearch/crawler/fs/rest/UploadApi.java#L87)). You could submit a patch on GitHub or patch your installation yourself: add `MimeUtility.decode` or `MimeUtility.decodeText` to the filename processing. – y_ug Jan 23 '20 at 15:53