I am looking for a way to generate thumbnail screenshots for a list of websites and save them to disk.

I found that the following commands do this, and I have a working shell script on Ubuntu that generates the images fine (using the CutyCapt and ImageMagick packages):

cutycapt --url=http://www.yahoo.com --out=yahoo.png
convert yahoo.png -thumbnail 150x180^ -gravity NorthWest -extent 150x180 yahoothumb.jpg

However, it runs sequentially and takes a lot of time. I thought of writing a PHP or Python script and hosting it as a web page in Apache. A separate program would then issue multiple requests at once to increase image-generation throughput.

I tried PHP first.

<?php echo exec('cutycapt --url=http://www.google.com --out=/var/www/google.png --javascript=on');?>

CutyCapt fails with the following error: CutyCapt: Can not connect to X Server. I am running PHP/Apache under the same user identity that I use to run my regular shell script.

By the way, I am a C# developer, so I am less familiar with PHP, Linux, and shell scripting. I can deal with config files for PHP and Apache, though :)

I have tried using .NET to launch IE in memory, but it is cumbersome, doesn't produce the best results, and requires a single-threaded apartment (STA), so throughput would be very low.

cdpnet

2 Answers

I have used the Xvfb and CutyCapt combination detailed on this page in production for a few years without a problem. In fact, I found the combination so reliable, I wrote a Ruby wrapper library (capit) to simplify using the combination from within some new Ruby applications I'm working on.
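
As a rough sketch of that combination (not taken from the linked page): `xvfb-run` starts a temporary virtual X server for a single command, which is one way around the "Can not connect to X Server" error when CutyCapt is launched without a display. The function name `capture`, the `DRY_RUN` switch, and the 1024x768x24 screen geometry are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: run CutyCapt under a throwaway virtual X server so that it
# works without a real display (for example when started from Apache).

# capture URL OUTFILE
# Set DRY_RUN=1 to print the command instead of executing it.
capture() {
    url=$1
    out=$2
    # -a (--auto-servernum) makes xvfb-run pick a free display number,
    # so several captures can run at the same time without colliding.
    set -- xvfb-run -a --server-args="-screen 0, 1024x768x24" \
        cutycapt --url="$url" --out="$out"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Example (commented out so that sourcing this file has no side effects):
# capture http://www.yahoo.com yahoo.png
```

The same `xvfb-run -a ... cutycapt ...` command line could be passed to PHP's `exec()` in place of the bare `cutycapt` call from the question.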

Also, though I haven't used it in production, I've been able to replicate similar results with the rendering functionality of the PhantomJS library as well.

EDIT:

You may want to check out this article for an example of how to run CutyCapt as a service of sorts.

ezkl

Instead of relying on an external program to do your image manipulations, try using PHP's built-in GD image library. In my experience, it's fast and extensive, covering most common image analysis and manipulation tasks.

Here's a thumbnailer script I wrote in PHP years ago that, in addition to providing a lot of options, integrates some trickery to automatically load and add borders to the images it creates. Sorry the code is kind of single-purpose, but the techniques it employs are repeatable. Here is the script in action.

Edit:

I was unaware of what CutyCapt did. This question now seems related to what you're trying to do; converting a web page to a PDF as a go-between format may get you at least one step closer to the goal.

amphetamachine
  • I think this may be helpful. But grabbing a screenshot of a URL is not available out of the box, and even though it can be achieved, it seems to be Windows only. I will look into it more. – cdpnet Jun 22 '11 at 18:48
  • Since it's Windows, I am not sure whether it can be parallelized. My C# program needed IE to run in single-threaded mode. I also want to add that I have taken screenshots in multiple ways: two on Windows and this one on Linux. The CutyCapt approach has given me the best image output. – cdpnet Jun 22 '11 at 18:55
  • @cdpnet - Parallel processing is not supported by PHP. This is overcome via the browser's ability to create more than one HTTP request at once, and the server software's ability to thread multiple requests. PHP is sort of a single-shot thing. – amphetamachine Jun 22 '11 at 19:26
  • That's exactly what I want and understand. I do not intend to write any special code in PHP or any other server script for parallel processing. Just want to make multiple requests at a time. – cdpnet Jun 22 '11 at 20:22
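
For comparison, the "several captures at a time" idea from the comments can also be sketched without Apache or PHP, using plain shell background jobs. This is a different technique from the multiple-HTTP-request approach discussed above, not an implementation of it. The function names (`stem`, `shoot`, `shoot_all`), the default of 4 concurrent jobs, and the output file naming are assumptions; the `cutycapt` and `convert` options are the ones from the question, and `xvfb-run -a` supplies a virtual display for each capture.

```shell
#!/bin/sh
# Sketch: capture and thumbnail several sites at once with background jobs.

# stem URL: derive a file-name stem from the host part of the URL.
stem() {
    printf '%s\n' "$1" | sed -e 's|^[a-z]*://||' -e 's|/.*$||'
}

# shoot URL: capture one page, then cut a 150x180 thumbnail from it.
shoot() {
    s=$(stem "$1")
    xvfb-run -a cutycapt --url="$1" --out="$s.png" &&
    convert "$s.png" -thumbnail 150x180^ -gravity NorthWest \
        -extent 150x180 "$s-thumb.jpg"
}

# shoot_all URLFILE [JOBS]: read one URL per line and process them in
# batches of JOBS concurrent captures (default 4).
shoot_all() {
    jobs_max=${2:-4}
    running=0
    while IFS= read -r url; do
        [ -n "$url" ] || continue      # skip empty lines
        shoot "$url" &
        running=$((running + 1))
        if [ "$running" -ge "$jobs_max" ]; then
            wait                       # let the current batch finish
            running=0
        fi
    done < "$1"
    wait                               # wait for the final partial batch
}

# Example (commented out so that sourcing this file has no side effects):
# shoot_all urls.txt 4
```

Batching is the simplest form of concurrency here: a slow page holds up its whole batch, so a job pool that starts a new capture as soon as one finishes would likely give better throughput.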