1

I have some high-quality JPG files. They are documents, I mean, not photos or pictures, mainly text.

Is there any way to convert them into a PDF, considering that they are documents and have to be slightly transformed, rotated, aligned, cropped, maybe solarized, and joined?

When you scan a document, it's not perfectly straight and may be in some perspective. I've seen some software doing this (the app CamScanner, for example).

Is there any way to do it in the Linux console?

Thank you

FlamingMoe
  • Can you clarify what you want in your question? ImageMagick's convert changes JPG to PDF, as well as doing the transformations you tell it to. Other resources specify what type of samples are used, then recommend certain transformations, e.g. [other SO question](http://stackoverflow.com/questions/9608279/cleaning-scanned-grayscale-images-with-imagemagick) and [whiteboard conversion](http://www.reddit.com/r/commandline/comments/1weqnn/cli_oneliner_script_to_clean_up_and_beautify/). An existing ImageMagick wrapper is [textcleaner](http://www.fmwconcepts.com/imagemagick/textcleaner/index.php) – Appleman1234 Jun 28 '14 at 05:38
  • The question is a bit confusing - you're asking about using a Linux console but you want to manage it on your mobile device? Is the inference here that you want to do this on the mobile device itself using a CLI on the phone, or are you planning to download the pictures to a traditional desktop and manipulate them there? – Avery Payne Jul 02 '14 at 18:58
  • Maybe you're looking for something like this? http://www.exactcode.de/site/open_source/exactimage/hocr2pdf/ – Demelziraptor Jul 03 '14 at 03:22

7 Answers

2

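There are a lot of command-line tools for modifying pictures, so I guess that is not really the problem. Is converting the result to PDF the problem?

Without having researched the parameters in detail, here are the commands to transform a JPEG into a PDF (input.jpg, page.tiff and page.pdf are placeholder names; tiff2pdf takes its input as a file argument, so the TIFF is written to disk first):

jpegtopnm input.jpg | pnmtotiff > page.tiff
tiff2pdf -o page.pdf page.tiff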

UnixShadow
2

Have a look at the ScanTailor project. It is a very good tool for preparing all kinds of scanned or photographed documents that consist mainly of text (as yours do) for OCR software (in open source you would choose tesseract-ocr, optionally in combination with gImageReader). However, it only supports batch processing, which is nonetheless very powerful. If you still need a CLI, you would have to modify the source code yourself; you can find it on GitHub.

https://github.com/scantailor/scantailor/

If you happen to understand German, you can find a brief introduction here: http://www.heise.de/open/artikel/Toolbox-Scan-Tailor-bringt-gescannte-Dokumente-in-Form-1787142.html

Scolytus
thomas.mc.work
2

Install the imagemagick package (on Ubuntu: sudo apt-get install imagemagick)

and then run

convert *.jpg pictures.pdf
Michał G
1

Why not ImageMagick? It is more or less the standard for scripted image processing. I don't think that you will find an alternative.

Stefan Weiser
  • I meant not to answer "use imagemagick" ... I guess there's no better conversion tool for jpg to pdf ... but this task is NOT only conversion, but other things before conversion – FlamingMoe Jun 27 '14 at 20:50
  • My answer is that you will never find any other tool (I'm pretty sure) if you want to do image processing too (unless you want to write one ;-)). This is what I meant, not "use imagemagick"! – Stefan Weiser Jun 28 '14 at 13:49
1

When you scan a document, it's not perfectly straight and may be in some perspective. I've seen some software doing this (the app CamScanner, for example).

But even CamScanner needs the support of a human. Without human interaction it is very hard to get the perspective correction and similar adjustments right.

If you want to do such things, you'll maybe need to implement it yourself. You could start looking at OpenCV examples. Here's a nice one: Automatic perspective correction for quadrilateral objects.

OpenCV does not support PDF creation. So once you have prepared the image and obtained the necessary parameters (clipping, perspective, scaling), you can use other tools or libraries such as ImageMagick to create a PDF from your image data.
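Once the four corners of the page are known (detected with OpenCV or picked by hand), ImageMagick's -distort Perspective operator can apply the correction and write the result. The following is only a minimal sketch under that assumption: the function names perspective_points and flatten_page are hypothetical, and the corner coordinates have to come from your own detection step.

```shell
#!/bin/sh
# Build the control-point string for ImageMagick's -distort Perspective.
# Arguments: the page corners in the source image, in the order
# top-left, top-right, bottom-right, bottom-left (x y pairs),
# followed by the width and height of the flattened output page.
perspective_points() {
    w=$(( $9 - 1 ))
    h=$(( ${10} - 1 ))
    printf '%s,%s 0,0 %s,%s %s,0 %s,%s %s,%s %s,%s 0,%s\n' \
        "$1" "$2" \
        "$3" "$4" "$w" \
        "$5" "$6" "$w" "$h" \
        "$7" "$8" "$h"
}

# Hypothetical helper: warp input image $1 into a flat page and write it to $2.
# The remaining ten arguments are the same ones perspective_points expects.
flatten_page() {
    in=$1; out=$2; shift 2
    pts=$(perspective_points "$@")
    w=$9; h=${10}
    # The distorted image keeps the input canvas size, so crop to the page.
    convert "$in" -distort Perspective "$pts" \
        -crop "${w}x${h}+0+0" +repage "$out"
}
```

A call could look like `flatten_page photo.jpg page.pdf 10 20 390 15 400 580 5 590 400 600`, where the first eight numbers are made-up corner coordinates and 400 600 is the desired page size in pixels.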

Scolytus
1

Is there any way to convert them into a PDF, considering that they are documents and have to be slightly transformed, rotated, aligned, cropped, maybe solarized, and joined?

The convert command offers various options, which are listed on its man page (linked HERE). You can use them to transform, rotate, align, and crop the image file easily from the command line.
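As a rough illustration of how these options combine, here is a sketch that straightens, trims and cleans each page and then joins the pages into one PDF. The function names clean_scan and scans_to_pdf are hypothetical, and the 40% deskew threshold and 10% fuzz factor are assumed starting values that will probably need tuning for your scans.

```shell
#!/bin/sh
# Hypothetical helper: straighten, trim and clean one scanned page.
# $1 = input image, $2 = output file (format is chosen by its extension).
clean_scan() {
    convert "$1" \
        -deskew 40% \
        -fuzz 10% -trim +repage \
        -colorspace Gray -normalize \
        "$2"
}

# Clean every page, then join the results into a single PDF.
# $1 = output PDF, remaining arguments = input images in page order.
scans_to_pdf() {
    out=$1; shift
    tmpdir=$(mktemp -d) || return 1
    n=0
    for img in "$@"; do
        n=$((n + 1))
        # Zero-padded names keep the pages in order when globbed below.
        clean_scan "$img" "$tmpdir/$(printf '%04d' "$n").png" || {
            rm -rf "$tmpdir"
            return 1
        }
    done
    convert "$tmpdir"/*.png "$out"
    status=$?
    rm -rf "$tmpdir"
    return $status
}
```

For example, `scans_to_pdf document.pdf page1.jpg page2.jpg page3.jpg` would produce a three-page PDF. Perspective correction is not covered by these options alone; -deskew only handles small rotations.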

neo
0

I suggest you use PDFTK; see the linked question "Adding an image to a pdf with pdftk". It is pretty simple to use, too, and it keeps getting more powerful. With PDFTK you can determine where on your PDF you want to place the image and resize it accordingly using its stamp feature. Hope this helps.

Community
Daniel