Alternatives for generating a video feed from screenshots

Question

I'm working on a toy remote administration project. For now, I'm able to capture screenshots and control the mouse using the Robot class. The screenshots are BufferedImage instances.

First of all, my requirements:
- Only a server and a client.
- Performance is important, since the client might be an Android app.

I've thought about opening two socket connections, one for mouse and system commands and a second one for the video feed.

How could I convert the screenshots to a video stream? Should I convert them to a known video format or would it be ok to just send a series of serialized images?

Compression is another problem. According to my preliminary tests, sending the screen captures in full resolution would result in a low frame rate. I think I need at least 24 fps to perceive movement, so I have to both downscale and compress. I could convert the BufferedImages to JPEG and set the compression rate, but I don't want to store the files on disk; they should live in RAM only. Another possibility would be to serialize instances (each representing an uncompressed screenshot) to a GZIPOutputStream. What is the correct approach for this?

To summarize:

- If you recommend the "series of images" approach, how would you serialize them to the socket OutputStream?
- If your proposal is to convert to a known video format, which classes or libraries are available?

Thanks in advance.

UPDATE: my tests, with client and server on the same machine:
- Full-screen serialized BufferedImages (only dimension, type and int[]), without compression: 1.9 fps.
- Full-screen images through GZip streams: 2.6 fps.
- Downscaled images (640 width) and GZip streams: 6.56 fps.
- Full-screen images and RLE encoding: 4.14 fps.
- Downscaled images and RLE encoding: 7.29 fps.

If you go with your idea of JPEG-compressed images, there is no need to create files at all. ImageIO works with streams, so you can "save" your image directly into a socket and retrieve it on the other end directly as an image as well (with a little wiring code around). — Durandal, Feb 14 '12 at 16:27
@Durandal Do you think it will save more space than GZip streams? — Mister Smith, Feb 14 '12 at 16:33
With JPEG you have the choice (compression ratio): if you set the quality low enough, it will compress more than BufferedImage + GZIP. The question is how much quality you will need and how fast it will be at that quality (CPU and required network throughput). Unless you have clear performance and quality goals, you will have a hard time deciding what to use. — Durandal, Feb 14 '12 at 16:57

score 6 · Accepted Answer · answered Feb 14 '12 at 11:03

If it's just screen captures, I would not compress them using a video compression scheme; most likely you don't want lossy compression (blurred details in small text are the most common defect). To get a workable "remote desktop" feel, remember the previously sent screenshot and send only the difference needed to get to the next one. If nothing (or very little) changes between frames, this is very efficient. It will, however, not work well in certain situations, like playing a video or a game, or scrolling a lot in a document.

Compressing the difference between two BufferedImages can be done with more or less elaborate methods. A very simple, yet reasonably effective, method is to subtract one image from the other (resulting in zeros everywhere they are identical) and compress the result with simple RLE (run-length encoding).
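A minimal sketch of the subtract-then-RLE idea, assuming both frames are available as same-length int[] arrays of 0xRRGGBB pixels. The class and method names (FrameDelta, subtract, add, rleEncode, rleDecode) and the 4-byte run layout are made up for illustration, not taken from any library. Each channel is subtracted separately, modulo 256, so one channel cannot borrow from its neighbour.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

/** Hypothetical helper: per-channel frame delta plus a naive run-length encoding. */
class FrameDelta {

    /** curr - prev, channel by channel (mod 256). Identical pixels become 0. */
    static int[] subtract(int[] curr, int[] prev) {
        int[] delta = new int[curr.length];
        for (int i = 0; i < curr.length; i++) {
            int r = (((curr[i] >> 16) & 0xFF) - ((prev[i] >> 16) & 0xFF)) & 0xFF;
            int g = (((curr[i] >> 8) & 0xFF) - ((prev[i] >> 8) & 0xFF)) & 0xFF;
            int b = ((curr[i] & 0xFF) - (prev[i] & 0xFF)) & 0xFF;
            delta[i] = (r << 16) | (g << 8) | b;
        }
        return delta;
    }

    /** Inverse of subtract: rebuilds the current frame from prev and delta. */
    static int[] add(int[] prev, int[] delta) {
        int[] curr = new int[prev.length];
        for (int i = 0; i < prev.length; i++) {
            int r = (((prev[i] >> 16) & 0xFF) + ((delta[i] >> 16) & 0xFF)) & 0xFF;
            int g = (((prev[i] >> 8) & 0xFF) + ((delta[i] >> 8) & 0xFF)) & 0xFF;
            int b = ((prev[i] & 0xFF) + (delta[i] & 0xFF)) & 0xFF;
            curr[i] = (r << 16) | (g << 8) | b;
        }
        return curr;
    }

    /** Encodes each run of equal pixels as 4 bytes: count (1..255), R, G, B. */
    static byte[] rleEncode(int[] pixels) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < pixels.length) {
            int value = pixels[i];
            int run = 1;
            while (i + run < pixels.length && run < 255 && pixels[i + run] == value) {
                run++;
            }
            out.write(run);          // write(int) stores only the low 8 bits
            out.write(value >> 16);
            out.write(value >> 8);
            out.write(value);
            i += run;
        }
        return out.toByteArray();
    }

    /** Expands the 4-byte groups produced by rleEncode back into pixels. */
    static int[] rleDecode(byte[] data, int pixelCount) {
        int[] pixels = new int[pixelCount];
        int pos = 0;
        for (int i = 0; i < data.length; i += 4) {
            int run = data[i] & 0xFF;
            int value = ((data[i + 1] & 0xFF) << 16) | ((data[i + 2] & 0xFF) << 8) | (data[i + 3] & 0xFF);
            Arrays.fill(pixels, pos, pos + run, value);
            pos += run;
        }
        return pixels;
    }

    public static void main(String[] args) {
        int[] prev = {0x102030, 0x102030, 0x102030, 0x102030};
        int[] curr = {0x102030, 0x102030, 0x0520FF, 0x102030};
        byte[] packed = rleEncode(subtract(curr, prev));
        int[] restored = add(prev, rleDecode(packed, curr.length));
        System.out.println(packed.length + " bytes, lossless: " + Arrays.equals(curr, restored));
    }
}
```

The alpha byte is ignored here, which seems reasonable for screenshots. A mostly static desktop produces long runs of zero pixels, which is the case where this kind of RLE pays off.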

Reducing the color precision can further reduce the amount of data. Depending on the use case, you could omit the least significant N bits of each color channel; most GUI applications do not look much different if you reduce colors from 24 bits to 15 bits.
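As a rough illustration of the 24-bit to 15-bit reduction, here is a sketch that keeps the top 5 bits of each channel. The class name ColorReducer and both method names are invented for this example, and the bit replication on the way back is one possible choice, not something the answer prescribes.

```java
/** Hypothetical helper: drops the 3 least significant bits of each color channel. */
class ColorReducer {

    /** Packs 0xRRGGBB into 15 bits, laid out as RRRRRGGGGGBBBBB. */
    static short toRgb555(int rgb) {
        int r = (rgb >> 19) & 0x1F; // top 5 bits of red
        int g = (rgb >> 11) & 0x1F; // top 5 bits of green
        int b = (rgb >> 3) & 0x1F;  // top 5 bits of blue
        return (short) ((r << 10) | (g << 5) | b);
    }

    /** Expands back to 0xRRGGBB, copying the high bits into the missing low bits. */
    static int toRgb888(short packed) {
        int r = (packed >> 10) & 0x1F;
        int g = (packed >> 5) & 0x1F;
        int b = packed & 0x1F;
        r = (r << 3) | (r >> 2); // replication keeps pure white at 0xFF
        g = (g << 3) | (g >> 2);
        b = (b << 3) | (b >> 2);
        return (r << 16) | (g << 8) | b;
    }

    public static void main(String[] args) {
        int original = 0x123456;
        int roundTrip = toRgb888(toRgb555(original));
        System.out.println(Integer.toHexString(original) + " -> " + Integer.toHexString(roundTrip));
    }
}
```

Each pixel then fits in two bytes instead of three or four, at the cost of an error of at most 7 per channel.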

Durandal


I thought that video formats will provide Key Frame Compression, which is basically what you are proposing, and other benefits. – Mister Smith Feb 14 '12 at 11:20
All the popular video compression schemes do indeed do that. The (deciding) difference here is that video compressors are optimized for *video*, and all of them apply *lossy* compression. They don't deal too well with the kind of graphics commonly displayed on a desktop (text); that may or may not be an issue for the use case. – Durandal Feb 14 '12 at 11:35
I'll try your subtraction technique, it seems very compression-friendly. – Mister Smith Feb 14 '12 at 13:32
Just tested: the enhanced version achieves 2.53 fps with full screen images (the conventional version allowed 1.9 fps), so 30% better thanks to the GZip compression. However, the reconstructed images have some colors changed. When used with downscaled images there's no performance gain; it even has a slightly lower framerate. It makes me think that the scaling is the main factor in achieving high framerates. – Mister Smith Feb 14 '12 at 14:41
You will not achieve your desired framerates with GZIP, at least not with a single-threaded implementation. That's why I suggested RLE compression (it requires a lot less CPU power). Also, depending on *how* you implement creating the delta, you can lose a lot of time processing the images. If you have altered colors, there's a glitch in your implementation (probably not handling overflows correctly). Generally, if you need to process every pixel of an image, avoid the get/setRGB() stuff and get access to the underlying DataBuffer, and from there to the array backing the buffer. – Durandal Feb 14 '12 at 16:23
Will try RLE later. About the glitch, I tested the subtraction/addition code and it is not the cause. I guess it is the getRGB as you suggested, but I don't know why. – Mister Smith Feb 14 '12 at 16:38
If you process the pixels as *int* values, the separate color channels bleed into each other. Each color channel needs to be subtracted independently to avoid that. And there is no need to store the result back into the image; it can be forwarded directly to the stream. There is no RLE compressing stream implementation in the JRE, but it's fairly easy to roll your own (provided you have some experience with the Stream API and juggling raw byte data). – Durandal Feb 14 '12 at 17:06
Implemented my own RLE streams (with some trouble in InputStream), but they are ok when used with array streams. Disabled GZip compression and achieved about twice the framerate in full screen mode, and 0.7 extra fps with downscaled images. Using both GZip and RLE yields almost the same results. I'm sending a "probe" non-subtracted image in each round of 10. – Mister Smith Feb 15 '12 at 12:47
About the colors glitch, I only noticed it when saving a received sample image to a JPG file. Yesterday I implemented my own visor in swing, which doesn't use JPEG at all, and every frame is displayed ok. Thanks for your advice. – Mister Smith Feb 15 '12 at 12:50
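The advice in the comments above to skip get/setRGB() and work on the array behind the DataBuffer could look roughly like this. It is a sketch that assumes the image is int-backed (TYPE_INT_RGB or TYPE_INT_ARGB); the class PixelAccess and its two methods are hypothetical names, and whether Robot screenshots already have that type should be checked rather than assumed.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferInt;

/** Hypothetical helper: direct access to the int[] that backs a BufferedImage. */
class PixelAccess {

    /** Returns the live backing array (no copy); writes to it change the image. */
    static int[] backingArray(BufferedImage img) {
        int type = img.getType();
        if (type != BufferedImage.TYPE_INT_RGB && type != BufferedImage.TYPE_INT_ARGB) {
            throw new IllegalArgumentException("expected an int-backed image, got type " + type);
        }
        return ((DataBufferInt) img.getRaster().getDataBuffer()).getData();
    }

    /** Redraws any image into TYPE_INT_RGB so that backingArray can be used on it. */
    static BufferedImage toIntRgb(BufferedImage src) {
        if (src.getType() == BufferedImage.TYPE_INT_RGB) {
            return src;
        }
        BufferedImage copy = new BufferedImage(src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = copy.createGraphics();
        g.drawImage(src, 0, 0, null);
        g.dispose();
        return copy;
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        int[] pixels = backingArray(img);
        // Pixel (x, y) sits at index y * width + x. This holds for images created
        // directly, not for views obtained through getSubimage().
        pixels[1 * img.getWidth() + 2] = 0x00FF00;
        System.out.println(Integer.toHexString(img.getRGB(2, 1)));
    }
}
```

With the array in hand, a delta or RLE pass becomes a plain loop over int values with no per-pixel method calls.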

score 5 · Answer 2 · edited Sep 09 '13 at 05:08

- Break the screen up into a grid of squares (or strips).
- Only send a grid square if it's different from the previous one.

// server start
sendScreenMetaToClient(); // width, height, how many grid squares
...

// server loop
BufferedImage[] prevScrnGrid = null;
while (isRunning) {
    BufferedImage scrn = captureScreen();
    BufferedImage[] scrnGrid = screenToGrid(scrn);
    if (prevScrnGrid == null) {
        // first pass: nothing sent yet, so every square counts as changed
        prevScrnGrid = new BufferedImage[scrnGrid.length];
    }
    for (int i = 0; i < scrnGrid.length; i++) {
        if (!isSameImage(scrnGrid[i], prevScrnGrid[i])) {
            prevScrnGrid[i] = scrnGrid[i];
            // tell the client that grid square i follows, then send its bytes
            sendGridSquareToClient(i, scrnGrid[i]);
        }
    }
}
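The loop above leaves screenToGrid and isSameImage undefined. One possible way to fill them in is sketched below as a single method that compares two frames tile by tile. The name TileDiff.changedTiles and the row-major tile numbering are assumptions made for this example.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: finds which grid squares differ between two frames. */
class TileDiff {

    /** Returns row-major indices of the tiles whose pixels differ (frames must be the same size). */
    static List<Integer> changedTiles(BufferedImage prev, BufferedImage curr, int tileSize) {
        int w = curr.getWidth();
        int h = curr.getHeight();
        int cols = (w + tileSize - 1) / tileSize; // round up so edge tiles are included
        int rows = (h + tileSize - 1) / tileSize;
        List<Integer> changed = new ArrayList<>();
        for (int ty = 0; ty < rows; ty++) {
            for (int tx = 0; tx < cols; tx++) {
                int x = tx * tileSize;
                int y = ty * tileSize;
                int tw = Math.min(tileSize, w - x); // edge tiles may be narrower
                int th = Math.min(tileSize, h - y); // or shorter
                // Bulk getRGB keeps the sketch short; comparing the backing
                // arrays directly would likely be faster.
                int[] a = prev.getRGB(x, y, tw, th, null, 0, tw);
                int[] b = curr.getRGB(x, y, tw, th, null, 0, tw);
                if (!Arrays.equals(a, b)) {
                    changed.add(ty * cols + tx);
                }
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        BufferedImage prev = new BufferedImage(100, 50, BufferedImage.TYPE_INT_RGB);
        BufferedImage curr = new BufferedImage(100, 50, BufferedImage.TYPE_INT_RGB);
        curr.setRGB(70, 40, 0xFFFFFF);
        System.out.println("changed tiles: " + changedTiles(prev, curr, 32));
    }
}
```

The tile size is a trade-off: small tiles send less unchanged data but add more per-tile overhead on the wire.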

Don't send serialized Java objects; just send the image data.

ByteArrayOutputStream imgBytes = new ByteArrayOutputStream();
ImageIO.write( bufferedImage, "jpg", imgBytes );
imgBytes.flush();
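Building on the snippet above, here is a sketch of how the JPEG quality could be set explicitly and how the resulting bytes might be framed on a socket stream. The class JpegFrames, its method names and the length-prefix framing (a 4-byte size followed by the JPEG bytes) are assumptions for illustration. The ImageIO calls are standard, and the image is expected to have no alpha channel (for example TYPE_INT_RGB).

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import javax.imageio.stream.MemoryCacheImageOutputStream;

/** Hypothetical helper: in-memory JPEG encoding and length-prefixed frames. */
class JpegFrames {

    /** Encodes img as JPEG in RAM; quality ranges from 0.0 (smallest) to 1.0 (best). */
    static byte[] encode(BufferedImage img, float quality) throws IOException {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpg").next();
        ImageWriteParam param = writer.getDefaultWriteParam();
        param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
        param.setCompressionQuality(quality);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // MemoryCacheImageOutputStream buffers in memory rather than in a temp file.
        try (ImageOutputStream ios = new MemoryCacheImageOutputStream(bytes)) {
            writer.setOutput(ios);
            writer.write(null, new IIOImage(img, null, null), param);
        } finally {
            writer.dispose();
        }
        return bytes.toByteArray();
    }

    /** Writes one frame as a 4-byte length followed by the JPEG bytes. */
    static void writeFrame(DataOutputStream out, byte[] jpeg) throws IOException {
        out.writeInt(jpeg.length);
        out.write(jpeg);
        out.flush();
    }

    /** Reads one frame written by writeFrame and decodes it. */
    static BufferedImage readFrame(DataInputStream in) throws IOException {
        byte[] jpeg = new byte[in.readInt()];
        in.readFully(jpeg);
        return ImageIO.read(new ByteArrayInputStream(jpeg));
    }

    public static void main(String[] args) throws IOException {
        BufferedImage shot = new BufferedImage(640, 480, BufferedImage.TYPE_INT_RGB);
        System.out.println("frame size in bytes: " + encode(shot, 0.5f).length);
    }
}
```

On a real connection the DataOutputStream would wrap socket.getOutputStream(). The length prefix lets the receiver know where one frame ends and the next begins without parsing the JPEG data.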

+1 for providing a snippet on how to write jpeg without creating file on disk. — Mister Smith, Feb 17 '12 at 08:28

score 3 · Answer 3 · answered Feb 10 '12 at 10:39

Firstly, I might suggest only capturing a small part of the screen, rather than downscaling and potentially losing information, perhaps with something like a sliding window which can be moved around by pushing the edges with a cursor. This is really just a small design suggestion though.

As for compression, I would think that a series of images would not compress separately as well as with a decent video compression scheme, especially as frames are likely to remain consistent between captures in this scenario.

One option would be to use Xuggle, which is capable of capturing the desktop via Robot in a number of video formats afaiu, but I can't tell if you can stream and decode with this.

To capture JPEGs and convert them, you can also use this.

Streaming these videos seems to be a little more complicated, though.

Also, it seems that the abandoned Java Media Framework supports this functionality.

My knowledge in this area is not fantastic tbh, so sorry if I have wasted your time, but it looks like some more useful information on the feasibility of using Xuggle as a screensharer has been compiled here. This also appears to link to their own notes on existing approaches.

If it doesn't need to be pure Java, I reckon this would all be much easier just by interfacing with a native screen capture tool...

Maybe it would be easiest just to send video as a series of jpegs after all! You could always implement your own compression scheme if you were feeling a little crazy...

Tried sending serialized holder objects containing BufferedImage data at full screen, over GZip streams. Even running client and server in the same machine, the framerate was very low (about 1 fps). Very likely I'll have to go for video. — Mister Smith, Feb 10 '12 at 13:11

score 2 · Answer 4 · edited May 23 '17 at 11:53

I think you described a good solution in your question. Convert the images to jpeg, but don't write them as files to disk. If you want it to be a known video format, use M-JPEG. M-JPEG is a stream of jpeg frames in a standard format. Many digital cameras, especially older ones, save videos in this format.

You can get some information about how to play an M-JPEG stream from this question's answers: Android and MJPEG

If network bandwidth is a problem, then you'll want to use an inter-frame compression system such as MPEG-2, h.264, or similar. That requires a lot more processing than M-JPEG but is far more efficient.

score 0 · Answer 5 · answered Feb 16 '12 at 21:26

If you're trying to get 24fps video then there's no reason not to use modern video codecs. Why try and recreate that wheel?

Xuggler works fine for encoding h264 video and sounds like it would serve your needs nicely.

blahdiblah

