0

I have a database storing binary JPEG images with two different file signatures (FFD8FFE0 and FFD8FFE1), and I would like to convert them to Base64 so I can use them in another application (Power BI). The data is stored in a column of type IMAGE; however, I only receive the data in a CSV file, import it into my tool as a string, and work with it from there.

For the file signature FFD8FFE0, I have no problem converting using the below code (from another Stack post - thank you):

    public static string ToBase64(String sBinary)
    {
        int noChars = sBinary.Length;

        byte[] bytes = new byte[noChars / 2];

        for (int i = 0; i < noChars; i += 2)
        {
            bytes[i / 2] = Convert.ToByte(sBinary.Substring(i, 2), 16);
        }
        return Convert.ToBase64String(bytes);
    }

However, images with the file signature FFD8FFE1 do not convert properly. I get an output, but it does not display as an image.

Any advice? Is this because of the different file signature, or because of the size of the strings (they are noticeably larger)?

EDIT: Thank you to everyone who assisted. As mentioned in the comments, the real issue was the data I was trying to convert: it was being truncated in the CSV. So for anyone who ever comes across this post, pull the data directly from SQL and not from a text file, as there is a good chance the data will be truncated.

rak11
  • 4
    You receive binary from the database, convert it to string, unconvert it from string and then encode in base64? Why not receive binary directly and pass it to ToBase64String? – GSerg Sep 23 '21 at 16:16
  • @GSerg Open to suggestions on how that would look. Is it simply skipping out on the Byte Array section and just using ToBase64String()? – rak11 Sep 23 '21 at 16:19
  • 2
    @rak11 it would look like just `Convert.ToBase64String(bytes);`. If you write a query that loads data from a `varbinary(max)` field, the result will be `byte[]`. If it's not, why not? Does your own code convert that `byte[]` to a string? Is the database field text instead of binary? – Panagiotis Kanavos Sep 23 '21 at 16:25
  • To clarify, the field in the database is of type IMAGE. I get the data in a CSV file and import it as a long string (ie: 0xFFD8...). – rak11 Sep 23 '21 at 16:32
  • 1
    `image` is an obsolete type, equivalent to `varbinary(max)` and is loaded as `byte[]`. If you use [DbDataReader.GetBytes](https://learn.microsoft.com/en-us/dotnet/api/system.data.common.dbdatareader.getbytes?view=net-5.0) you get a `byte[]` directly. If you use `GetValue` you get `byte[]` wrapped in an `object`. What do CSV files have to do with *images*? – Panagiotis Kanavos Sep 23 '21 at 16:39
  • `I get the data in a CSV file` - at which point it doesn't matter what the database type is, does it? :) Have you considered that the data in the database/csv may be corrupted? – GSerg Sep 23 '21 at 16:40
  • 1
    In fact, corruption is almost certain, unless the export code was carefully written. `0xFFD8...` is how SSMS displays binary data, not the actual data. It's not the full data either, it's truncated because SSMS can't very well display 2GB of data (the maximum size). SSMS isn't an export tool – Panagiotis Kanavos Sep 23 '21 at 16:45
  • 1
    Why go through the CSV file *at all*? If you can write C# code why not write a query to load the data? – Panagiotis Kanavos Sep 23 '21 at 16:45
  • The data is from the backend of an enterprise application. I am under the impression that it is accurate in the database, but no way to really verify. I can say that the data in the CSV matches the data that is in the backend. But all fair advice thank you, I will see if I can pull directly from the database. – rak11 Sep 23 '21 at 16:47
  • What DB type are you using, how are you reading from the DB. – Danny Varod Sep 23 '21 at 16:48
  • @PanagiotisKanavos Thank you for the advice. I will try directly pulling from the database and not using SSMS, as you mentioned, it is likely truncating. I will update my post after my tests. – rak11 Sep 23 '21 at 17:08
  • 1
    Even if the method looks a bit weird, the string to byte/base64 conversion looks ok and if one works, the other should work too... `FFD8FFE1` is a JPEG which includes EXIF data (as opposed to `FFD8FFE0`): are you sure the software you want to show the image on actually supports that format? You may want to write the bytes to a file (`File.WriteAllBytes(@"C:\test.jpg", bytes);`, after filling the byte array, for example) and see if you can open them with an image viewer. – Jcl Sep 23 '21 at 17:20

3 Answers

0

It sounds like you exported the binary data from an admin tool's query results. Such tools will almost always truncate the displayed binary results to conserve memory.

It's better and easier to read the data directly from the database using ADO.NET or a micro-ORM like Dapper to reduce the boilerplate code.

Using Dapper you could write something as simple as:

var sql="select image from MyTable where Category=@category";

using var connection=new SqlConnection(connectionString);

var images=connection.Query<byte[]>(sql,new {category="Cats"});

Then convert the results to Base64 with:

var cats64=images.Select(bytes=>Convert.ToBase64String(bytes));

Dapper will handle opening and closing the connection, so we don't even have to do that.

If you want to retrieve more fields, you can define a class to accept the results. Dapper will map the result columns to properties by name. Once you have a class, you can easily add a method to return the Base64 string:

class MyData
{
    public string Name{get;set;}
    public byte[] Image {get;set;}

    public string ToBase64()=>Convert.ToBase64String(Image);
}


....

var images=connection.Query<MyData>("select Name,Image From ....",...);

Using plain old ADO.NET needs a few more lines:

var sql="select image from MyTable where Category=@category";
using var connection=new SqlConnection(connectionString);
using var cmd=new SqlCommand(sql,connection);
cmd.Parameters.Add("@category",SqlDbType.NVarChar,20).Value="Cats";
using var reader=cmd.ExecuteReader();

while(reader.Read())
{
    var image=(byte[])reader["image"];
    ...
}

With ADO.NET, though, it's also possible to read the data as a stream instead of loading everything into memory. This is very helpful when the image is large, because we avoid caching the entire blob in memory. We could write the stream directly to a file without first loading it into memory:

while(reader.Read())
{
    using (var stream=reader.GetStream(0))
    using (var file=File.Create(somePath))
    {
        stream.CopyTo(file);
    }
}

There's no overload of Convert.ToBase64String that works with streams. It's possible to use the CryptoStream class with a Base64 transform to encode an input stream into Base64, as this SO answer shows. Adapting it to this case:

while(reader.Read())
{
    using (var stream       = reader.GetStream(0))
    using (var base64Stream = new CryptoStream( stream, new ToBase64Transform(), CryptoStreamMode.Read ) )
    using (var outputFile   = File.Create(somePath) )
    {
        // Reading from base64Stream yields the Base64 text of the blob as ASCII bytes
        await base64Stream.CopyToAsync( outputFile ).ConfigureAwait(false);
    }
}
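
To try the CryptoStream approach without a database, here is a minimal sketch that runs the same transform over an in-memory stream. The class and method names (`Base64Streaming`, `EncodeTo`) are hypothetical and only for illustration; a `MemoryStream` stands in for the stream that `reader.GetStream(0)` would return.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

public static class Base64Streaming
{
    // Hypothetical helper: copies the Base64 encoding of `input` into `output`.
    // Disposing the CryptoStream also closes `input`; `output` is left open.
    public static void EncodeTo(Stream input, Stream output)
    {
        using (var transform = new ToBase64Transform())
        using (var base64Stream = new CryptoStream(input, transform, CryptoStreamMode.Read))
        {
            base64Stream.CopyTo(output);
        }
    }

    // Convenience wrapper for small inputs: returns the Base64 text as a string.
    public static string EncodeToString(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            EncodeTo(new MemoryStream(data), output);
            return Encoding.ASCII.GetString(output.ToArray());
        }
    }
}
```

For the four signature bytes `FF D8 FF E1`, `Base64Streaming.EncodeToString(new byte[] { 0xFF, 0xD8, 0xFF, 0xE1 })` should give the same text as `Convert.ToBase64String` on the same array.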
Panagiotis Kanavos
-1

If you are looking to convert to a Base64 JPEG data URL:

public static string ToBase64JpegUrl (byte[] bytes) =>
    $"data:image/jpeg;base64,{Convert.ToBase64String(bytes)}";
trinalbadger587
  • I did some [research](https://en.wikipedia.org/wiki/List_of_file_signatures) on the formats the OP provided and it appears that the images are JPEGs. – Woody1193 Sep 23 '21 at 16:57
-3

You are mapping the results from the database incorrectly; this field should be byte[], not string.

I assume you are receiving a hexadecimal representation of the bytes, which you could convert to bytes then convert to base64 as you attempted (Convert.ToBase64String(bytes)).
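
As a rough sketch of that conversion, assuming the CSV value looks like `0xFFD8...` (as described in the question's comments), the hex text can be parsed after stripping the prefix, and the result checked for the JPEG start and end markers to catch truncated exports. The names `HexImage`, `ParseHex` and `LooksLikeCompleteJpeg` are hypothetical, and the end-marker check is only a heuristic, since some files carry extra bytes after the final `FF D9`.

```csharp
using System;

public static class HexImage
{
    // Parses a hex dump such as "0xFFD8FFE1..." into bytes.
    // An odd number of digits usually means the text was cut off.
    public static byte[] ParseHex(string hex)
    {
        if (hex.StartsWith("0x", StringComparison.OrdinalIgnoreCase))
            hex = hex.Substring(2);

        if (hex.Length % 2 != 0)
            throw new FormatException("Odd number of hex digits; the value may be truncated.");

        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }

    // Heuristic: a complete JPEG starts with FF D8 and normally ends with FF D9.
    public static bool LooksLikeCompleteJpeg(byte[] b)
    {
        return b.Length >= 4
            && b[0] == 0xFF && b[1] == 0xD8
            && b[b.Length - 2] == 0xFF && b[b.Length - 1] == 0xD9;
    }
}
```

If `LooksLikeCompleteJpeg` returns false for a value, the exported string was most likely truncated, and `Convert.ToBase64String` will faithfully encode an incomplete image.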

Try using EF Code First to read from the table, and define the image property as byte[].

Danny Varod
  • There's absolutely no reason to use an ORM just to read some fields. As for the data, there's no guarantee the strings are valid. SSMS's result grid for example truncates binary data – Panagiotis Kanavos Sep 23 '21 at 16:47
  • @PanagiotisKanavos No reason, other than making it simple - from the code in the question it is clear that the reading from the DB is incorrect. – Danny Varod Sep 23 '21 at 16:48
  • It's not making it simple at all, it's making it 100 times harder. Instead of a single query returning some values, you add entities and mapping that aren't even used. A simple `connection.QueryAsync – Panagiotis Kanavos Sep 23 '21 at 16:49
  • It won't be the same amount of lines at all. You'll have to create a DbContext at least, and an entity, none of which are useful. Unless you used `AsNoTracking()` you'd end up tracking a lot of binary data for no reason as well. The downvote is because this is a *very* complicated way to do a simple query with significant overhead. As for `DB First`, this increases the time needed to just write the code a *lot*. This is possibly the most complicated way to read an image – Panagiotis Kanavos Sep 23 '21 at 16:56
  • @PanagiotisKanavos I meant code-first, fixed. Entity = 2 lines, context = 3 lines, query in using = 2 lines (not counting {}s). – Danny Varod Sep 23 '21 at 16:58