13

I'm trying to get dimensions from a picture directly from the web using this code:

string image = @"http://www.hephaestusproject.com/.../csharp3.png";
byte[] imageData = new WebClient().DownloadData(image);
MemoryStream imgStream = new MemoryStream(imageData);
Image img = Image.FromStream(imgStream);

int wSize = img.Width;
int hSize = img.Height;

It works but the performance is terrible because I need to download many images just to get their dimensions.

Is there a more efficient way to do the same thing?

ΩmegaMan
  • 29,542
  • 12
  • 100
  • 122
Thadeu Fernandes
  • 505
  • 1
  • 6
  • 23
  • 2
    no. not really. you need the image to know its size. you MIGHT be able to open PART of the stream, depending on the file type, and parse the meta data directly.... but I doubt that's going to fit what you need; and generally, once the web request is sent, you're going to get the whole image in one go. – hometoast May 05 '15 at 13:36

2 Answers2

14

In case this is helpful to those coming along later, it does seem that this is indeed possible. A brief review of the JPG, PNG and GIF image formats shows that they all generally have a header at the beginning of the file that contains the image dimensions.

Reddit uses an algorithm to download successive 1024-byte chunks to determine image dimensions without downloading an entire image. The code is in Python but it is in the _fetch_image_size method here: https://github.com/reddit/reddit/blob/35c82a0a0b24441986bdb4ad02f3c8bb0a05de57/r2/r2/lib/media.py#L634

It uses a separate parser in their ImageFile class and successively attempts to parse the image and retrieve the size as more bytes are downloaded. I've loosely translated this to C# in the code below, heavily leveraging the image-parsing code at https://stackoverflow.com/a/112711/3838199.

There may be some cases where retrieving the entire file is necessary but I suspect that applies to a relatively small subset of JPEG images (perhaps progressive images). In my casual testing it seems most image sizes are retrieved via the first 1024-byte retrieval; indeed this chunk size could probably be smaller.

using System;
using System.Collections.Generic;
using System.Drawing; // note: add reference to System.Drawing assembly
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

namespace Utilities
{
    // largely credited to https://stackoverflow.com/a/112711/3838199 for the image-specific code
    public static class ImageUtilities
    {
        private const string ErrorMessage = "Could not read image data";
        private const int ChunkSize = 1024;

        private static readonly HttpClient Client = new HttpClient();
        private static readonly Dictionary<byte[], Func<BinaryReader, Size>> ImageFormatDecoders = new Dictionary<byte[], Func<BinaryReader, Size>>()
        {
            { new byte[]{ 0x42, 0x4D }, DecodeBitmap},
            { new byte[]{ 0x47, 0x49, 0x46, 0x38, 0x37, 0x61 }, DecodeGif },
            { new byte[]{ 0x47, 0x49, 0x46, 0x38, 0x39, 0x61 }, DecodeGif },
            { new byte[]{ 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }, DecodePng },
            { new byte[]{ 0xff, 0xd8 }, DecodeJfif },
        };

        /// <summary>
        /// Retrieve the dimensions of an online image, downloading as little as possible
        /// </summary>
        public static async Task<Size> GetWebDimensions(Uri uri)
        {
            var moreBytes = true;
            var currentStart = 0;
            byte[] allBytes = { };

            while (moreBytes)
            {
                try
                {
                    var newBytes = await GetSomeBytes(uri, currentStart, currentStart + ChunkSize - 1).ConfigureAwait(false);
                    if (newBytes is null || newBytes.Length < ChunkSize)
                        moreBytes = false;
                    
                    if(new bytes != null)
                        allBytes = Combine(allBytes, newBytes);

                    return GetDimensions(new BinaryReader(new MemoryStream(allBytes)));
                }
                catch
                {
                    currentStart += ChunkSize;
                }
            }

            return new Size(0, 0);
        }

        private static async Task<byte[]?> GetSomeBytes(Uri uri, int startRange, int endRange)
        {
            var request = new HttpRequestMessage { RequestUri = uri };
            request.Headers.Range = new RangeHeaderValue(startRange, endRange);
            try
            {
                var response = await Client.SendAsync(request).ConfigureAwait(false);
                return await response.Content.ReadAsByteArrayAsync().ConfigureAwait(false);
            }
            catch 
            {

            }
            return new byte[] { };
        }

        /// <summary>
        /// Gets the dimensions of an image.
        /// </summary>
        /// <returns>The dimensions of the specified image.</returns>
        /// <exception cref="ArgumentException">The image was of an unrecognized format.</exception>    
        public static Size GetDimensions(BinaryReader binaryReader)
        {
            int maxMagicBytesLength = ImageFormatDecoders.Keys.OrderByDescending(x => x.Length).First().Length;

            byte[] magicBytes = new byte[maxMagicBytesLength];

            for (int i = 0; i < maxMagicBytesLength; i += 1)
            {
                magicBytes[i] = binaryReader.ReadByte();

                foreach (var kvPair in ImageFormatDecoders)
                {
                    if (magicBytes.StartsWith(kvPair.Key))
                    {
                        return kvPair.Value(binaryReader);
                    }
                }
            }

            throw new ArgumentException(ErrorMessage, nameof(binaryReader));
        }

        // from https://stackoverflow.com/a/415839/3838199
        private static byte[] Combine(byte[] first, byte[] second)
        {
            byte[] ret = new byte[first.Length + second.Length];
            Buffer.BlockCopy(first, 0, ret, 0, first.Length);
            Buffer.BlockCopy(second, 0, ret, first.Length, second.Length);
            return ret;
        }

        private static bool StartsWith(this byte[] thisBytes, byte[] thatBytes)
        {
            for (int i = 0; i < thatBytes.Length; i += 1)
            {
                if (thisBytes[i] != thatBytes[i])
                {
                    return false;
                }
            }
            return true;
        }

        private static short ReadLittleEndianInt16(this BinaryReader binaryReader)
        {
            byte[] bytes = new byte[sizeof(short)];
            for (int i = 0; i < sizeof(short); i += 1)
            {
                bytes[sizeof(short) - 1 - i] = binaryReader.ReadByte();
            }
            return BitConverter.ToInt16(bytes, 0);
        }

        private static int ReadLittleEndianInt32(this BinaryReader binaryReader)
        {
            byte[] bytes = new byte[sizeof(int)];
            for (int i = 0; i < sizeof(int); i += 1)
            {
                bytes[sizeof(int) - 1 - i] = binaryReader.ReadByte();
            }
            return BitConverter.ToInt32(bytes, 0);
        }

        private static Size DecodeBitmap(BinaryReader binaryReader)
        {
            binaryReader.ReadBytes(16);
            int width = binaryReader.ReadInt32();
            int height = binaryReader.ReadInt32();
            return new Size(width, height);
        }

        private static Size DecodeGif(BinaryReader binaryReader)
        {
            int width = binaryReader.ReadInt16();
            int height = binaryReader.ReadInt16();
            return new Size(width, height);
        }

        private static Size DecodePng(BinaryReader binaryReader)
        {
            binaryReader.ReadBytes(8);
            int width = binaryReader.ReadLittleEndianInt32();
            int height = binaryReader.ReadLittleEndianInt32();
            return new Size(width, height);
        }

        private static Size DecodeJfif(BinaryReader binaryReader)
        {
            while (binaryReader.ReadByte() == 0xff)
            {
                byte marker = binaryReader.ReadByte();
                short chunkLength = binaryReader.ReadLittleEndianInt16();

                if (marker == 0xc0 || marker == 0xc1 || marker == 0xc2)
                {
                    binaryReader.ReadByte();

                    int height = binaryReader.ReadLittleEndianInt16();
                    int width = binaryReader.ReadLittleEndianInt16();
                    return new Size(width, height);
                }

                binaryReader.ReadBytes(chunkLength - 2);
            }

            throw new ArgumentException(ErrorMessage);
        }
    }
}

Tests:

using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Utilities;
using System.Drawing;

namespace Utilities.Tests
{
    [TestClass]
    public class ImageUtilitiesTests
    {
        [TestMethod]
        public void GetPngDimensionsTest()
        {
            string url = "https://www.google.com/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png";
            Uri uri = new Uri(url);
            var actual = ImageUtilities.GetWebDimensions(uri);
            Assert.AreEqual(new Size(272, 92), actual);
        }

        [TestMethod]
        public void GetJpgDimensionsTest()
        {
            string url = "https://upload.wikimedia.org/wikipedia/commons/e/e0/JPEG_example_JPG_RIP_050.jpg";
            Uri uri = new Uri(url);
            var actual = ImageUtilities.GetWebDimensions(uri);
            Assert.AreEqual(new Size(313, 234), actual);
        }

        [TestMethod]
        public void GetGifDimensionsTest()
        {
            string url = "https://upload.wikimedia.org/wikipedia/commons/a/a0/Sunflower_as_gif_websafe.gif";
            Uri uri = new Uri(url);
            var actual = ImageUtilities.GetWebDimensions(uri);
            Assert.AreEqual(new Size(250, 297), actual);
        }
    }
}
Brandon Minnick
  • 13,342
  • 15
  • 65
  • 123
karfus
  • 869
  • 1
  • 13
  • 20
  • Thanks so much for the above piece of code. It was very useful. However, in my tests, the code above seems to return 0,0 as width & height for all the png & GIF images. Maybe there are additional markers that need to be included? If you updated this code or know about additional markers for png and GIF , can you post those? Thanks – Steve Johnson Dec 28 '16 at 10:00
  • 1
    As I mentioned I'm leveraging the image parsing code from http://stackoverflow.com/a/112711/3838199 and perhaps it doesn't work in all instances. However I've added my unit tests and though they are admittedly far from exhaustive they show particular images where this routine works perfectly. Also note that this code is currently in use in a large production application scraping the size of images on over 800 million web sites. – karfus Jan 01 '17 at 16:08
5

In a word, no.

In a few more words, you would have to rely on there being a resource containing details of the dimensions of images on a given server. In 99.99% of cases that will simply not exist.

Hossein Narimani Rad
  • 31,361
  • 18
  • 86
  • 116
ZombieSheep
  • 29,603
  • 12
  • 67
  • 114