There is a faster way introduced on iOS; try this post.
In any case, retrieving data from the GPU is always slow, but there are many cases where you can break the operation down into smaller pieces or even move it to a background thread. What works depends on the project: if you need the frames in real time, as when recording a screenshot video, there is not much you can do (other than dropping the video FPS or resolution). On the other hand, if you only need an occasional complex screenshot and want your scene to continue uninterrupted, there are a few options.
For a screenshot, you could create another framebuffer and continue rendering to the new one (so the scene is uninterrupted); then, at the end of each drawn frame, bind the original buffer and call glReadPixels, but copy only a portion of the data. For instance, copy a quarter of the buffer each frame: you get the whole image after 4 frames, and the cost of the read per frame is reduced to a quarter. The same idea can be used for video (copy half of the buffer each frame, dropping the video FPS to half).
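A minimal sketch of the slicing arithmetic, assuming a hypothetical helper `chunk_rows` (not part of any GL API) that decides which horizontal band of the framebuffer to copy on a given frame; the band it computes would then be fed to a real `glReadPixels` call:

```c
#include <assert.h>

/* Hypothetical helper: for frame number `frame`, with the image split
 * into `chunks` horizontal slices, compute which band of a framebuffer
 * `height` pixels tall to copy this frame.  The result would be used as
 * glReadPixels(0, *y0, width, *rows, GL_RGBA, GL_UNSIGNED_BYTE, dst + offset)
 * on the bound (original) framebuffer. */
static void chunk_rows(int frame, int chunks, int height,
                       int *y0, int *rows)
{
    int base = height / chunks;   /* rows per slice                  */
    int idx  = frame % chunks;    /* which slice this frame handles  */
    *y0 = idx * base;
    /* the last slice absorbs the remainder when height % chunks != 0 */
    *rows = (idx == chunks - 1) ? height - *y0 : base;
}
```

With `chunks = 4` the full image is assembled over 4 frames; with `chunks = 2` you get the half-buffer-per-frame video variant described above.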
Another way is to create a custom framebuffer with POT (power-of-two) dimensions and attach a texture to it, so that you draw to the texture instead of directly to the screen buffer. You can then create another thread with a shared context that generates screenshots from this texture, while the main thread draws it to the screen. This lets you control how much time the application spends making a screenshot, but note that using this approach for something like video is very unlikely to look nice.
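A small sketch of the POT sizing step, assuming a hypothetical helper `next_pot` (a name of my choosing): the rounded-up dimensions are what you would pass to `glTexImage2D` when allocating the texture that `glFramebufferTexture2D` attaches to the custom framebuffer, while drawing still targets only the original width and height via `glViewport`:

```c
#include <assert.h>

/* Hypothetical helper: round a texture dimension up to the next power
 * of two, as required for renderable textures on older GLES hardware.
 * E.g. a 480x320 screen would get a 512x512 texture, with the scene
 * rendered into its lower-left 480x320 region. */
static unsigned next_pot(unsigned v)
{
    unsigned p = 1;
    while (p < v)
        p <<= 1;
    return p;
}
```

The screenshot thread then reads from the shared texture (only the used sub-region), so the main thread never blocks on the readback itself.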