Use case: I want to get the contents of all the documents in my database and store them in one zip file.

I used ml-javaclient-util to write the content to a zip file. The logic of my server-side transform module is as follows:

  1. For each document, I extracted only the required fields using node XPath.
  2. When I ran it, it created a zip file, but on opening it I found one file per document, each named after the document URI and holding the extracted content. Instead of merging the contents of all the documents into one file, it creates a new file for every URI.

How can I override that behavior? I want the extracted values from all the documents to end up in a single file, and that file to be zipped.

I tried adding a for loop, but it does not help, because the transform function itself receives the URIs (i.e. context.uri) one at a time.

Any help is appreciated.

Thanks

Private
  • Based on "Rather than merging all the file contents into one it is creating new file for every URI", I think you need to clarify your use case. Is it that you want one zip containing one file, which has all of the matching documents concatenated together? Or do you want one zip containing a file per matching document? – rjrudin Feb 16 '18 at 13:07
  • Yes, I want one zip containing one file, which has the contents of all the matching documents concatenated together. – Private Feb 17 '18 at 05:48

1 Answer

You can use ExportToWriterListener. It takes all the documents retrieved by a QueryBatcher and writes their contents to a single Writer, for example a FileWriter on one output file.

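import com.marklogic.client.DatabaseClient;
import com.marklogic.client.DatabaseClientFactory;
import com.marklogic.client.datamovement.DataMovementManager;
import com.marklogic.client.datamovement.ExportToWriterListener;
import com.marklogic.client.datamovement.QueryBatcher;
import com.marklogic.client.document.ServerTransform;
import com.marklogic.client.query.StructuredQueryBuilder;
import com.marklogic.client.query.StructuredQueryDefinition;
import java.io.File;
import java.io.FileWriter;

// Substitute your own host, port and credentials
DatabaseClient client = DatabaseClientFactory.newClient("localhost", 8012,
    new DatabaseClientFactory.DigestAuthContext("admin", "admin"));
DataMovementManager moveMgr = client.newDataMovementManager();
ServerTransform transform = new ServerTransform("transformName");
File outputFile = new File("output.txt"); // pass in your file here
String collection = "customers";
StructuredQueryDefinition query = new StructuredQueryBuilder().collection(collection); // Substitute your query here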
try (FileWriter writer = new FileWriter(outputFile)) {
  ExportToWriterListener exportListener = new ExportToWriterListener(writer)
    .withRecordSuffix("\n")
    .withTransform(transform) // pass in your Server Transform here
    .onGenerateOutput(
      record -> {
        String contents = record.getContentAs(String.class); 
        return contents; // return the content as it is which is the server transformed documents' content
      }
    );

  QueryBatcher queryJob =
    moveMgr.newQueryBatcher(query)
      .withThreadCount(5)
      .withBatchSize(10)
      .onUrisReady(exportListener)
      .onQueryFailure( throwable -> throwable.printStackTrace() );
  moveMgr.startJob( queryJob );
  queryJob.awaitCompletion();
  moveMgr.stopJob(queryJob);
}

Then you can create a zip out of the file.
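For that last step, here is a minimal sketch using only the JDK's java.util.zip classes. It assumes the export above has already written output.txt; the class name ZipSingleFile, the method zipSingleFile and the archive name output.zip are made up for illustration and are not part of DMSDK or ml-javaclient-util.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipSingleFile {

    // Writes `source` into a new zip archive at `zipTarget` as a single entry
    // named after the source file. An existing archive at that path is replaced.
    public static void zipSingleFile(Path source, Path zipTarget) throws IOException {
        try (OutputStream out = Files.newOutputStream(zipTarget);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry(source.getFileName().toString()));
            Files.copy(source, zip); // streams the file bytes into the current entry
            zip.closeEntry();
        }
    }

    public static void main(String[] args) throws IOException {
        // "output.txt" matches the file written by the export job above;
        // "output.zip" is an assumed name for the archive.
        Path source = Paths.get("output.txt");
        if (!Files.exists(source)) {
            System.err.println("Nothing to zip: " + source + " does not exist");
            return;
        }
        zipSingleFile(source, Paths.get("output.zip"));
    }
}
```

Run this only after queryJob.awaitCompletion() has returned and the FileWriter has been closed, so that the archive contains the complete export.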

  • I don't think this addresses the use case of writing documents to a zip file. My experience with DMSDK is that ExportListener should be used instead with a Consumer that adds each document to a zip. That's what is supported in ml-javaclient-util - https://github.com/marklogic-community/ml-javaclient-util/blob/master/src/main/java/com/marklogic/client/ext/datamovement/job/ExportToZipJob.java – rjrudin Feb 16 '18 at 13:09
  • You are right. This doesn't address the use case of writing documents to a zip file. He wanted all the contents extracted from the server to go into a single file and then wanted to create a zip out of that file alone and he expected that to happen when he used ml-javaclient-util job to convert to zip. I have created the above code which should satisfy the customer's use case. – Vivek Siddharthan Feb 16 '18 at 19:30
  • Thanks @Vivek. For the conversion to zip, can I use the `WriteToZipConsumer` class in your code above? – Private Feb 17 '18 at 06:15
  • No, WriteToZipConsumer takes in a DocumentRecord. Once the output file has been created, zipping it is plain Java code. See https://stackoverflow.com/questions/1091788/how-to-create-a-zip-file-in-java for how to convert the output file to a zip file. – Vivek Siddharthan Feb 17 '18 at 23:44