5

Just what the title says.

Strictly speaking, what I define as the "text" bounding box of a grayscale image is a set of 4 coordinates (x, y, x+width, y+height) describing a rectangular area of that image that contains the maximum number of non-white pixels and, at the same time, the fewest possible white pixels (without changing the maximum number of non-white pixels). I put "text" in quotation marks because images do not actually contain text; they contain only coloured pixels.
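To make that definition concrete, here is a minimal Python sketch that computes such a box for an image held as a list of rows of grayscale values. The function name `bounding_box`, the list-of-lists representation, and the assumption that white is exactly 255 are all illustrative choices, not anything taken from ImageMagick, whose -trim may decide what counts as border differently.

```python
def bounding_box(pixels, white=255):
    """Return (x, y, x + width, y + height) of the tightest box that
    contains every non-white pixel, or None if the image is all white.

    `pixels` is assumed to be a non-empty list of equally long rows of
    grayscale values; `white` is the value treated as background.
    """
    # Rows that contain at least one non-white pixel.
    rows = [y for y, row in enumerate(pixels) if any(v != white for v in row)]
    if not rows:
        return None
    # Columns that contain at least one non-white pixel.
    cols = [x for x in range(len(pixels[0]))
            if any(row[x] != white for row in pixels)]
    # The right and bottom edges are exclusive, so that
    # width = x1 - x0 and height = y1 - y0.
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


sample = [
    [255, 255, 255, 255, 255],
    [255, 255,   0, 255, 255],
    [255, 128,   0,   0, 255],
    [255, 255, 255, 255, 255],
]
print(bounding_box(sample))
```

Shrinking this box in any direction would drop a non-white pixel, and growing it would only add white ones, which is the property described above.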

Having installed ImageMagick on Ubuntu, I type the following command in the terminal: $ convert input.png -trim output.png , and I get:

input.png

output.png

Open the two images in new browser tabs and you will see how they differ, and also what I mean by the "text" bounding box. output.png has exactly the width and height that I am looking for. I do not know how to get the x and y coordinates.

The answer provided here (1) for PDF pages does not meet my criteria, since the "text" bounding box that gs gives me has big white margins (and, as far as I can understand, what gs defines as the "text" bounding box of a PDF is something different from my definition of the "text" bounding box of a picture).

liaguridio
  • Show us the code you have so far. – John Zwinck Sep 27 '15 at 07:27
  • I have no code for getting the coordinates of the text bounding box as I have defined it in my post. I know of a command that crops a picture to its text bounding box as I have defined it: $ convert input.png -trim output.png . The problem is that I do not know how to get the coordinates of the text bounding box. – liaguridio Sep 27 '15 at 07:33
  • Are you trying to create a program for this? If so, you need to try and write some code. If not, you should post this question instead on SuperUser and ask for help on how to use existing programs. Either way it's off-topic as currently written. – John Zwinck Sep 27 '15 at 07:36

2 Answers

2

I don't understand all the words in your description, and I think a diagram would help, but if you just want to know what -trim would do as your sample code implies:

identify -format "%@" image.png
200x100+10+20

So, for your image, you get

identify -format "%@" paper.png
406x620+38+68

which means that your box is 38 pixels to the right of the top left corner and 68 pixels down from the top left corner, and it is 406 pixels wide and 620 pixels tall.
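If you need the box in the (x, y, x+width, y+height) form used in the question, the geometry string can be converted with a few lines of Python. This is only a sketch: `parse_geometry` is a hypothetical helper name, and it assumes the string always has the WIDTHxHEIGHT+X+Y shape shown above.

```python
import re


def parse_geometry(geometry):
    """Convert 'WIDTHxHEIGHT+X+Y' into (x, y, x + width, y + height).

    Assumes the layout printed by the %@ format above; offsets may carry
    either a '+' or a '-' sign.
    """
    match = re.fullmatch(r"(\d+)x(\d+)([+-]\d+)([+-]\d+)", geometry.strip())
    if match is None:
        raise ValueError(f"unexpected geometry string: {geometry!r}")
    width, height, x, y = (int(group) for group in match.groups())
    return (x, y, x + width, y + height)


print(parse_geometry("406x620+38+68"))  # (38, 68, 444, 688)
```

The last two numbers are the left and top offsets plus the width and height.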

And if I draw in that rectangle in red, I get:

convert paper.png -stroke red -fill none -draw "rectangle 38,68 444,688" result.png


An alternative way of getting the same result but using convert in place of identify is:

convert -format %@ paper.png info:
406x620+38+68
Mark Setchell
  • Thanks a lot. The output of the command you provided gives me the coordinates I was looking for in the grayscale pictures I have tried so far. – liaguridio Sep 27 '15 at 09:40
0

Images don't have a 'text bounding box', because obviously there is no text.

The images in the PDF file may themselves contain white pixels, if they are scanned from books then they almost certainly will. These pixels count towards the bounding box of the image, because they are white not transparent and will obscure anything drawn beneath them.

It's also rather nonsensical to define a 'text bounding box' as 'an area in that picture that has no white margins and only text'. If it's in an image then there is no text, only image samples which define pixels. That's a picture of text, not actually text. To differentiate between areas of an image containing text and areas containing non-text you will need OCR software; nothing else is going to do this, because only OCR software is capable of detecting the difference between text and non-text.

KenS
  • Thanks for the criticism of my topic :). You are right. I have edited it to make some of my concepts clearer. – liaguridio Sep 27 '15 at 09:21