Questions tagged [mlcp]

MarkLogic Content Pump (mlcp) is an open-source, Java-based command-line tool. mlcp provides the fastest way to import, export, and copy data to or from MarkLogic databases. It is designed for integration and automation in existing workflows and scripts.

https://developer.marklogic.com/products/mlcp

User Guide

https://docs.marklogic.com/guide/mlcp

Features

Content Pump can:

  • Bulk load billions of local files
  • Split and load large, aggregate XML files or delimited text
  • Bulk load billions of triples or quads from RDF files
  • Archive and restore database contents across environments
  • Copy subsets of data between databases
  • Load documents from HDFS, including Hadoop SequenceFiles

Data sources and destinations

Content Pump supports moving data between a MarkLogic database and any of the following:

  • Local filesystem
  • HDFS
  • MarkLogic archive
  • Another MarkLogic database

Formats

Content Pump supports:

  • XML, JSON, text, binary files
  • RDF encoded in RDF/XML, Turtle, RDF/JSON, N3, N-Triples, N-Quads, or TriG serialization formats
  • Compressed files and archives (ZIP, GZIP)
  • MarkLogic archive, which includes both content and metadata (e.g., permissions and properties)
  • Delimited text (e.g., CSV) (import only)
  • Temporal Documents
  • Hadoop SequenceFiles

Getting Started with MLCP

You may find this free online training course helpful.

To get started moving data with mlcp, download and unpack the binaries. If you are interested in hacking on mlcp or looking at its internals, you can also download the Apache 2.0 licensed source.

To create your first import script, make sure you have an XDBC server attached to your database (for example, one running on port 8006). Then, from the command line, run an mlcp import command, substituting your particulars.
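
A first import along these lines might look like the following sketch. It assumes `mlcp.sh` is on your PATH; the host, credentials, and input directory are placeholders rather than values from the original text, and only the port (8006) comes from the example above. The script prints the command for review and runs it only when `RUN=1` is set.

```shell
#!/bin/sh
# Minimal sketch of a first mlcp import (placeholder values, adjust to your setup).
MLCP_HOST="localhost"        # assumption: MarkLogic runs on the local machine
MLCP_PORT="8006"             # port of the XDBC server attached to your database
MLCP_USER="admin"            # placeholder user name
MLCP_PASS="admin"            # placeholder password
INPUT_DIR="/path/to/data"    # hypothetical directory holding the files to load

# Build the argument list once so it can be printed and executed unchanged.
set -- mlcp.sh import \
  -host "$MLCP_HOST" \
  -port "$MLCP_PORT" \
  -username "$MLCP_USER" \
  -password "$MLCP_PASS" \
  -input_file_path "$INPUT_DIR" \
  -mode local

# Show the command; execute it only when RUN=1 is set in the environment.
echo "$*"
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```

Run it as `RUN=1 sh first-import.sh` once the printed command looks right; the script name is arbitrary.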

156 questions
6 votes, 1 answer

Invalid Value Operator '<'(less than)sign when passed as -query_filter in MLCP

I am using MLCP (MarkLogic Content Pump) to copy content from one database to another. I'm using the -query_filter option, and its value is a cts:query in XML serialized format: a set of cts:element-range-query wrapped in a cts:and-query :…
5 votes, 1 answer

marklogic mlcp custom transform split aggregate document to multiple files

I have a JSON "aggregate" file that I want to split up and ingest as multiple documents into MarkLogic using mlcp. I want to transform the content during ingestion using JavaScript. My JSON file looks something like this: { "type":…
jennifer
5 votes, 1 answer

mlcp transform csv file into OBI sources

I have the following challenge. We have CSV files that we want to load into a MarkLogic database using mlcp. We also want to transform the loaded rows into OBI sources during the load, so we built a transform function for that. Now I am struggling…
Hugo Koopmans
4 votes, 1 answer

mlcp is not loading document when executed through gradle

Hi, I am trying to load an XML document into MarkLogic using the mlcpTask class from Gradle. I am currently using: MarkLogic version 10.0-1, Gradle 6.5, Java 14.0.1. Build.gradle file as below: plugins { id "com.marklogic.ml-gradle" version…
4 votes, 1 answer

MLCP Export Selected Documents using document selector

I want to export selected documents from MarkLogic using MLCP based on an XPath match. mlcp export -host localhost -port 8061 -username admin -password admin -mode local -output_file_path shiv -database shiv -output_type archive -document_selector…
DevNinja
4 votes, 1 answer

Bulk loading files into MarkLogic using MLCP fails

I am trying to use MLCP to bulk load files into MarkLogic. Command line and error are below. I followed the instructions in one of the tutorials. I don't know why it's doing anything with Hadoop if my mode is local. Any ideas what I'm doing…
Jeff
4 votes, 2 answers

Loading data with mlcp - namespace issue

I'm trying to load RSS data from WordPress into a MarkLogic database. The data is in the following form:
Seong
4 votes, 2 answers

Loading .owl files in marklogic

Is it possible to load .owl files using mlcp? I tried with -input_file_type rdf but it gives an error, as below: bin/mlcp.sh import -host localhost -port 9010 -username uname -password pwd -mode local -input_file_path /home/user/semantics/data…
Manisha
4 votes, 1 answer

MarkLogic content pump mlcp document URI issue

I am trying to use the MarkLogic Content Pump in ML 7. I downloaded mlcp from the site and am trying to load one XML file. From the MarkLogic documentation: The following example loads files from the local filesystem directory /space/bill/data: mlcp.sh import -host…
Hugo Koopmans
3 votes, 1 answer

Exception while copying data using MLCP

I am trying to copy 1 million documents from one database to another using MLCP, but I am getting the following exception. 19/08/30 11:48:08 ERROR contentpump.DatabaseContentReader: RuntimeException reading /integration/test/88398921012548…
DevNinja
3 votes, 2 answers

MarkLogic - Incremental load using MLCP

MarkLogic version: 9.0-6.2. We are trying to use mlcp to load daily changes of customer data into data-hub-STAGING and then use a harmonize flow to bring changes into data-hub-FINAL. As I understand, the 'collector.sjs' is used to return the uris…
Bhanu
3 votes, 1 answer

How to remove a column from a csv file while loading a file?

I want to remove a particular column from the CSV file and load the file into the database using mlcp. My CSV file contains: URI,EmpId,Name,age,gender,salary 1/Niranjan,1,Niranjan,35,M,1000 2/Deepan,2,Deepan,25,M,2000 3/Mehul,3,Mehul,28,M,3000 I want to…
3 votes, 1 answer

Gradle Project convert to Maven

I have started to learn Gradle and I want to know how to convert a Gradle project to a Maven project. I took a Gradle project from the link below: https://github.com/rjrudin/ml-camel-mlcp I was able to generate a jar but a POM.xml is not generated…
Vikram
3 votes, 1 answer

MLCP Import java.lang.UnsatisfiedLinkError

I am trying to import data into MarkLogic Server with MLCP. The data is in XML and inside an archive (zip) file. MLCP is ending with java.lang.UnsatisfiedLinkError. I have tried with MLCP 8.0.6 and MLCP 8.0.7 but the error is the same in both…
Dheeresh
3 votes, 2 answers

MarkLogic: Does mlcp need an XDBC server?

Does mlcp necessarily need an XDBC server, or does it work with an HTTP server as well?
Yash