I would like to automatically rotate an image based on its text content so that the text is displayed properly (upright). I would prefer a solution in either JavaScript or PHP.

Bad Example

[image]

Proper Example

[image]

For instance, GIMP and Photoshop do this when importing a picture like this one:

[image]

Q

How can I accurately auto-rotate images with JS/PHP so that the text shows up properly (upright, if you will)?

--NOTE--

I do not want to rotate based on the "EXIF orientation" data, but rather on the orientation of the text in the image. Apparently the EXIF data only tracks the orientation in which the picture was taken with respect to the ground.

karns
  • There is a javascript library to do this for you: http://stackoverflow.com/a/20600801/722617 – Bryan Zwicker May 21 '15 at 16:43
  • If I am not mistaken, the link you provided doesn't offer the solution I am looking for in JS, and the PHP link is dead. I added a note to explain what I am looking for... – karns May 21 '15 at 17:09

3 Answers

1

One possible solution I thought of is to use OCR to detect characters in the image and test it in all 4 orientations (the original orientation plus three successive 90-degree rotations). Whichever orientation yields the most recognized characters is most likely the proper orientation for the text.

One could use the following library for PHP: https://github.com/thiagoalessio/tesseract-ocr-for-php. Combined with imagerotate(), it lets you pick the best orientation for the image based on the number of characters the OCR returns.

In Theory

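require_once '/path/to/TesseractOCR/TesseractOCR.php';

$filename = 'path/to/some/image.jpg';
$photo = imagecreatefromjpeg($filename); // assumes the input is a JPEG
$results = array();

for ($i = 0; $i < 4; $i++) {
    // imagerotate() rotates counter-clockwise by the given angle
    $new = imagerotate($photo, $i * 90, 0);
    // save the rotated copy to a temporary file (location is an arbitrary choice)
    $new_path = sys_get_temp_dir() . "/rotated_$i.jpg";
    imagejpeg($new, $new_path);
    // recognize() as in the original sketch; check the method name for your version of the wrapper
    $tesseract = new TesseractOCR($new_path);
    $results[$i] = strlen($tesseract->recognize());
}

/* The highest count marks the best orientation for the image with respect to the text in it */
echo "Original Orientation: " . $results[0] . PHP_EOL;
echo "Rotated 90 degrees: " . $results[1] . PHP_EOL;
echo "Rotated 180 degrees: " . $results[2] . PHP_EOL;
echo "Rotated 270 degrees: " . $results[3] . PHP_EOL;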

Pros - Utilizes existing libraries (Tesseract with PHP wrapper, imagerotate php function)

Cons - Computationally intensive. Each image must be rotated 3 times and run through OCR 4 times
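
The final step, picking the orientation with the highest character count, could look roughly like this in JavaScript (the other language the question allows). This is only a sketch: `bestRotation` is a hypothetical helper name, and it assumes you already have the four OCR character counts in rotation order (0, 90, 180, 270 degrees).

```javascript
// Hypothetical helper: charCounts[i] is the number of characters the OCR
// recognized after rotating the image by i * 90 degrees.
// Returns the rotation angle (in degrees) with the highest count.
// On a tie, the smaller rotation wins, so an already-upright image is left alone.
function bestRotation(charCounts) {
  let bestIndex = 0;
  for (let i = 1; i < charCounts.length; i++) {
    if (charCounts[i] > charCounts[bestIndex]) {
      bestIndex = i;
    }
  }
  return bestIndex * 90;
}

// Example with made-up counts: the 180-degree rotation produced the most text.
console.log(bestRotation([12, 3, 140, 5])); // 180
```

A raw character count is a crude score, since OCR on a wrongly oriented image can still return a lot of garbage characters; counting only dictionary words or alphanumeric characters may separate the orientations more clearly.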

karns
0

The solution you are asking for is not quite the same as your example. At first I thought you wanted a smart function that detected the text, but it is simpler than that with your example.

You need to look at the EXIF data. Fortunately, I have done this multiple times. To correct an image that has been rotated, you can use the following function, which I wrote for correcting images that are taken on tablets/phones but displayed on computers. The input must be the filename of a JPG image.

function fix_image($filename, $max_dimension = 0)
{
    // read the EXIF orientation tag; a missing tag is treated as "upright" (1)
    $exif = exif_read_data($filename, 0, true);
    $o = $exif['IFD0']['Orientation'] ?? 1;

    // correct the image rotation (imagerotate() rotates counter-clockwise)
    $s = imagecreatefromjpeg($filename);
    if ($o == 3) { $s = imagerotate($s, 180, 0); }
    if ($o == 6) { $s = imagerotate($s, 270, 0); }
    if ($o == 8) { $s = imagerotate($s, 90, 0); }

    // export the image, rotated properly
    imagejpeg($s, $filename, 100);
    imagedestroy($s);

    if ($max_dimension > 0) {
        $i = getimagesize($filename);

        // the saved file is now upright, so its dimensions tell portrait from landscape
        if ($i[0] > $i[1]) {
            // landscape: make the width $max_dimension, adjust the height accordingly
            $new_width = $max_dimension;
            $new_height = (int) round($i[1] * ($max_dimension / $i[0]));
        } else {
            // portrait: make the height $max_dimension, adjust the width accordingly
            $new_height = $max_dimension;
            $new_width = (int) round($i[0] * ($max_dimension / $i[1]));
        }

        // reopen image for resizing
        $s = imagecreatefromjpeg($filename);
        $n = imagecreatetruecolor($new_width, $new_height);
        imagecopyresampled($n, $s, 0, 0, 0, 0, $new_width, $new_height, $i[0], $i[1]);
        imagejpeg($n, $filename, 100);

        imagedestroy($n);
        imagedestroy($s);
    }
}
  • Please take note of my "NOTE" section in the question. I don't want the EXIF orientation, per se... – karns May 21 '15 at 19:41
  • Then you will need something much stronger than PHP and JavaScript, I'm afraid. But if all else fails, this is a great function for auto-rotating images. – Jack Thomson May 21 '15 at 20:01
  • I don't think so, I've got some ideas in my head that would likely work, but would be too intensive... – karns May 22 '15 at 00:02
0

If the EXIF rotation is not acceptable, you will likely need to do some image processing. This will never be 100% accurate. I am not sure the Tesseract solution proposed by karns will work very well, though, since Tesseract needs a fair amount of training and you might always encounter fonts you have not trained. Additionally, a comment on the question "how to detect orientation of a scanned document?" suggests that Tesseract auto-rotates the image for text detection, so you might get similar results on all of the rotated images.

An alternative is to use OpenCV via a PHP wrapper, e.g. https://github.com/mgdm/OpenCV-for-PHP (I have not used the wrapper myself). You can then compute a line histogram; for example pictures, see the accepted answer on "word segmentation using opencv". This way you can determine whether the text in the picture runs horizontally or vertically. Afterwards (and after correcting any vertically oriented picture) you could try to determine whether or not the text is upside down. Google for "detect upside down text": one of the results, for example, suggests counting the dots in the upper and lower parts of a line. Again, this will never be 100% accurate.
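
To illustrate the line-histogram idea without any OpenCV dependency, here is a minimal JavaScript sketch. It assumes the image has already been binarized into a 2D array where `image[y][x]` is 1 for ink and 0 for background; the function names (`projectionProfiles`, `variance`, `textLineDirection`) are made up for this example. The heuristic is that text lines separated by blank gaps make the profile taken across the lines very spiky, while the profile taken along the lines stays comparatively flat.

```javascript
// Fraction of ink pixels in each row and in each column of a binary image.
// Using fractions rather than raw sums keeps non-square images comparable.
function projectionProfiles(image) {
  const height = image.length;
  const width = image[0].length;
  const rows = image.map(row => row.reduce((a, b) => a + b, 0) / width);
  const cols = image[0].map((_, x) =>
    image.reduce((a, row) => a + row[x], 0) / height
  );
  return { rows, cols };
}

// Population variance of a list of numbers.
function variance(values) {
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  return values.reduce((a, v) => a + (v - mean) ** 2, 0) / values.length;
}

// Heuristic guess: if the row profile varies more than the column profile,
// the text lines run horizontally; otherwise they run vertically.
function textLineDirection(image) {
  const { rows, cols } = projectionProfiles(image);
  return variance(rows) >= variance(cols) ? "horizontal" : "vertical";
}

// Toy example: two solid "text lines" separated by blank rows.
const sample = [
  [1, 1, 1, 1, 1, 1],
  [0, 0, 0, 0, 0, 0],
  [1, 1, 1, 1, 1, 1],
  [0, 0, 0, 0, 0, 0],
];
console.log(textLineDirection(sample)); // horizontal
```

This only separates horizontal from vertical line direction. It cannot tell upright text from upside-down text, which still needs a second check such as the dot-counting idea mentioned above.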

ikkjo