I'm trying to make a piece of code run faster. The code already uses async/await, but it's still slow.
So I changed my foreach to use the new IAsyncEnumerable. However, I gained no performance from this, and the code still appears to run sequentially, which surprised me. I thought await foreach would run each iteration on its own thread.
Here's my attempt at speeding up the code:
var bag = new ConcurrentBag<IronPdf.PdfDocument>(); // probably don't need a ConcurrentBag
var foos = _dbContext.Foos;
await foreach (var fooPdf in GetImagePdfs(foos))
{
    bag.Add(fooPdf);
}

private async IAsyncEnumerable<IronPdf.PdfDocument> GetImagePdfs(IEnumerable<Foo> foos)
{
    foreach (var foo in foos)
    {
        var imagePdf = await GetImagePdf(foo);
        yield return imagePdf;
    }
}

private async Task<IronPdf.PdfDocument> GetImagePdf(Foo foo)
{
    using var imageStream = await _httpService.DownloadAsync(foo.Id);
    var imagePdf = await _pdfService.ImageToPdfAsync(imageStream);
    return imagePdf;
}

using IronPdf;

public class PdfService
{
    // this method is quite slow
    public async Task<PdfDocument> ImageToPdfAsync(Stream imageStream)
    {
        var imageDataURL = Util.ImageToDataUri(Image.FromStream(imageStream));
        var html = $@"<img style=""max-width: 100%; max-height: 70%;"" src=""{imageDataURL}"">";
        using var renderer = new HtmlToPdf(new PdfPrintOptions()
        {
            PaperSize = PdfPrintOptions.PdfPaperSize.A4,
        });
        return await renderer.RenderHtmlAsPdfAsync(html);
    }
}

I also gave Parallel.ForEach a try:
Parallel.ForEach(foos, async foo =>
{
    var imagePdf = await GetImagePdf(foo);
    bag.Add(imagePdf);
});

However, I keep reading that I shouldn't use async with it, so I'm not sure what to do. Also, the IronPdf library crashes when I do it that way.
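
To show what I mean by the await foreach version running sequentially, here is a minimal sketch with IronPdf and the HTTP download swapped out for a Task.Delay stand-in. The names Repro, SlowAsync, GetAll, and CollectAsync, as well as the 200 ms delay, are made up for illustration and are not part of my real code. It assumes C# 8 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

public static class Repro
{
    // Hypothetical stand-in for GetImagePdf: simulates async work with a delay.
    public static async Task<int> SlowAsync(int id, int delayMs)
    {
        await Task.Delay(delayMs);
        return id;
    }

    // Same shape as GetImagePdfs: each item is awaited before the next one starts.
    public static async IAsyncEnumerable<int> GetAll(IEnumerable<int> ids, int delayMs)
    {
        foreach (var id in ids)
        {
            yield return await SlowAsync(id, delayMs);
        }
    }

    // Drains the stream, logs when each item arrives, and returns them in arrival order.
    public static async Task<List<int>> CollectAsync(IEnumerable<int> ids, int delayMs)
    {
        var results = new List<int>();
        var sw = Stopwatch.StartNew();
        await foreach (var item in GetAll(ids, delayMs))
        {
            Console.WriteLine($"item {item} arrived after {sw.ElapsedMilliseconds} ms");
            results.Add(item);
        }
        return results;
    }

    public static async Task Main()
    {
        await CollectAsync(new[] { 1, 2, 3, 4, 5 }, 200);
    }
}
```

In this sketch the items arrive one after another, roughly 200 ms apart, rather than all at about the same time. That matches what I see with the real code.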