I have tried every option I could find in the documentation. How can I get only the text as output, with no images at all?
https://github.com/coolwanglu/pdf2htmlEX/wiki/Command-Line-Options.
I'm not sure what you are trying to achieve, as the question subject and details appear contradictory, but there are options to split out the graphics and text into separate files:
--embed <string>
--embed-css <0|1> (Default: 1)
--embed-font <0|1> (Default: 1)
--embed-image <0|1> (Default: 1)
--embed-javascript <0|1> (Default: 1)
--embed-outline <0|1> (Default: 1)
    Specify which elements should be embedded into the output HTML
    file.
    If switched off, separated files will be generated along with
    the HTML file for the corresponding elements.
    --embed accepts a string as argument. Each letter of the string
    must be one of `cCfFiIjJoO`, which corresponds to one of the
    --embed-*** switches. Lower case letters for 0 and upper case
    letters for 1. For example, `--embed cFIJo` means to embed
    everything but CSS files and outlines.

--split-pages <0|1> (Default: 0)
    If turned on, the content of each page is stored in a separated
    file.
    This switch is useful if you want pages to be loaded separately
    & dynamically -- a supporting server might be necessary.
    Also see --page-filename.
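
To make the --embed shorthand described above concrete, here is a small Python sketch that expands such a string into the individual --embed-* switches. It is only an illustration of the rule quoted from the documentation, not part of pdf2htmlEX; the names EMBED_SWITCHES and expand_embed are made up for this example.

```python
# Illustrative only: expands an --embed string into the --embed-* switches
# it stands for, following the rule quoted from the documentation above.
# EMBED_SWITCHES and expand_embed are hypothetical names, not pdf2htmlEX APIs.
EMBED_SWITCHES = {
    "c": "--embed-css",
    "f": "--embed-font",
    "i": "--embed-image",
    "j": "--embed-javascript",
    "o": "--embed-outline",
}


def expand_embed(spec):
    """Return {switch: 0 or 1} for an --embed string such as 'cFIJo'."""
    result = {}
    for letter in spec:
        switch = EMBED_SWITCHES.get(letter.lower())
        if switch is None:
            raise ValueError(f"unexpected letter {letter!r} in --embed string")
        # Upper case means 1 (embed), lower case means 0 (separate file).
        result[switch] = 1 if letter.isupper() else 0
    return result


print(expand_embed("cFIJo"))
```

For `cFIJo` this reports 0 for CSS and outlines and 1 for fonts, images and JavaScript, matching the documentation's own example.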
So if you use the --split-pages 1 and --embed-image 0 options, then you have one HTML page per PDF page, which does not include embedded images.
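
As a rough sketch of how that invocation could be assembled from a script, the Python below builds the argument list for the two options mentioned above. The file name input.pdf and the helper build_command are placeholders I made up for illustration; the sketch only prints the command, and you could pass the list to subprocess.run once pdf2htmlEX is installed.

```python
# Sketch only: assembles the pdf2htmlEX command line discussed above.
# build_command and "input.pdf" are hypothetical; nothing is executed here.
def build_command(pdf_path, split_pages=True, embed_image=False):
    """Return the argument list for a pdf2htmlEX run with the given switches."""
    return [
        "pdf2htmlEX",
        "--split-pages", "1" if split_pages else "0",
        "--embed-image", "1" if embed_image else "0",
        pdf_path,
    ]


cmd = build_command("input.pdf")
print(" ".join(cmd))
```

Keeping the switches in a list rather than one string avoids quoting problems if the PDF path contains spaces.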
If this isn't what you want then please include additional information in your question.