Let me go through your code to explain a couple of things.
First get rid of your `Try` and `Catch` and avoid ever using them in the future. Sounds weird, I know. But everything in code is technically "try this" because every line of code can fail. The only reason to ever use the actual `Try` command is if you have a valid `Catch` block that actually does something useful. Logging is one thing. Showing an error message is another, but since you're in VS that's covered already.
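For contrast, here is a minimal sketch of a `Catch` that earns its keep: it logs something the debugger wouldn't tell you later in production and then rethrows. The file name and the module name are made up for illustration.

```vb
Imports System
Imports System.IO

Module TryCatchSketch
    Sub Main()
        ' Hypothetical file name, used only to trigger the failure path.
        Dim path = "missing-file.html"

        Try
            Dim html = File.ReadAllText(path)
            Console.WriteLine(html.Length)
        Catch ex As FileNotFoundException
            ' The Catch does real work: it records which file was missing...
            Console.Error.WriteLine("Could not read " & path & ": " & ex.Message)
            ' ...and then rethrows so the failure is not silently swallowed.
            Throw
        End Try
    End Sub
End Module
```

If the `Catch` body would be empty, or would only hide the exception, the `Try` is better left out.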
Next are these two lines:
    Dim htmlText As String = wc.DownloadString(AppDomain.CurrentDomain.BaseDirectory + "\SCRA_Resources\SCRA.html")
    Dim htmlarraylist = HTMLWorker.ParseToList(New StringReader(htmlText), Nothing)
The right part of the first line is "get some HTML from a very specific location" and the left part is "and put that into a variable as a string that's totally unaware of the original specific location". Read this a couple of times if it doesn't make sense because it should explain why the second line can't find the images.
Your image links are all relative, but relative to what? I know you want it to be your specific folder but you didn't actually specify that in any way. HTML has (or maybe had, I haven't done this in a decade probably) a way to do this via the `base` tag but I don't know if iText supports that. So instead you need to tell iText "when I say relative, I mean relative to this folder".
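To make the "relative to what?" problem concrete, here is a small sketch using only `System.IO`. It assumes a hypothetical image reference `images\logo.png` and reuses the `SCRA_Resources` folder from your code. When nothing supplies a base, .NET resolves a relative path against the process's current directory, which is usually not the folder the HTML came from.

```vb
Imports System
Imports System.IO

Module RelativePathSketch
    Sub Main()
        ' Hypothetical relative reference, as it might appear in an <img src="..."> attribute.
        Dim src = Path.Combine("images", "logo.png")

        ' With no base given, the path is resolved against the current directory.
        Console.WriteLine(Path.GetFullPath(src))

        ' With an explicit base, it is resolved against the folder you actually meant.
        Dim baseFolder = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "SCRA_Resources")
        Console.WriteLine(Path.GetFullPath(Path.Combine(baseFolder, src)))
    End Sub
End Module
```

The two printed paths will generally differ, and only the second one points into your resources folder.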
Before continuing, it is important to understand that you are using a very old, officially obsoleted and no longer supported helper class that lacks many features and will eventually cause you a lot of grief. The `HTMLWorker` class was replaced with the `XMLWorker` class many years ago. Although the `HTMLWorker` class sounds like something that's more appropriate, think of the `XMLWorker` as "XHTML" instead of "XML".
Okay, so if you're stuck using `HTMLWorker`, you can solve this by implementing the `iTextSharp.text.html.simpleparser.IImageProvider` interface. If you do this and you are using the 5.x series you should hopefully get a bunch of warnings because, as was said above, `HTMLWorker` is officially obsoleted. The `GetImage` method of this interface will be called for every image in your document. Below is a very simple implementation that takes a single parameter for the constructor that specifies what the new location should be. Ideally you should add some error handling (this is a good candidate for a `Try`/`Catch` because your `Catch` could substitute an explicit "image not found" image) and if you have a mixture of absolute and relative images you should check for that, too.
    Public Class RelativeRootImageProvider
        Implements iTextSharp.text.html.simpleparser.IImageProvider

        ' Folder that relative image paths are resolved against
        Public Property BasePath As String

        Public Sub New(basePath As String)
            Me.BasePath = basePath
        End Sub

        Public Function GetImage(src As String,
                                 attrs As IDictionary(Of String, String),
                                 chain As iTextSharp.text.html.simpleparser.ChainedProperties,
                                 doc As iTextSharp.text.IDocListener) As iTextSharp.text.Image Implements iTextSharp.text.html.simpleparser.IImageProvider.GetImage
            ' This should also check to see if src is absolute and maybe try getting it first before the below.
            ' The below could also have a File.Exists() check, too.
            Dim newSrc = System.IO.Path.Combine(BasePath, src)
            Return iTextSharp.text.Image.GetInstance(newSrc)
        End Function
    End Class
To use the image provider, create a providers collection and add an instance of it under the `HTMLWorker.IMG_PROVIDER` key:
    ' Pick a folder
    Dim RelativeImageRootPath = Environment.GetFolderPath(Environment.SpecialFolder.Desktop)
    ' Collection of providers
    Dim providers As New System.Collections.Generic.Dictionary(Of String, Object)()
    ' Add our image provider pointed to our specific folder
    providers.Add(HTMLWorker.IMG_PROVIDER, New RelativeRootImageProvider(RelativeImageRootPath))
And then pass the providers as the third parameter of the `ParseToList` method:

    Dim htmlarraylist = HTMLWorker.ParseToList(New StringReader(htmlText), Nothing, providers)
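The absolute-versus-relative check and the `File.Exists()` check suggested for `GetImage` could look roughly like the sketch below. `ResolveImageSource` is a hypothetical helper, not part of iTextSharp. It uses only `System.Uri` and `System.IO`, and it has not been tested against every kind of path (root-relative paths such as `/img/a.png`, for example, are treated as rooted local paths).

```vb
Imports System
Imports System.IO

Module ImagePathSketch
    ' Hypothetical helper: returns the URL or file path to load,
    ' or Nothing when a local file cannot be found.
    Function ResolveImageSource(basePath As String, src As String) As String
        Dim parsed As Uri = Nothing
        Dim candidate As String

        If Uri.TryCreate(src, UriKind.Absolute, parsed) Then
            ' Web URLs (http, https, ...) are passed through untouched.
            If Not parsed.IsFile Then Return src
            ' Absolute file references are converted to a plain local path.
            candidate = parsed.LocalPath
        ElseIf Path.IsPathRooted(src) Then
            ' Already rooted, so the base folder does not apply.
            candidate = src
        Else
            ' Relative: resolve against the folder we were told to use.
            candidate = Path.Combine(basePath, src)
        End If

        Return If(File.Exists(candidate), candidate, Nothing)
    End Function

    Sub Main()
        Dim baseFolder = Environment.GetFolderPath(Environment.SpecialFolder.Desktop)

        ' A web URL comes back unchanged.
        Console.WriteLine(ResolveImageSource(baseFolder, "http://example.com/logo.png"))

        ' A relative name is looked up under baseFolder; Nothing means "not found".
        Dim local = ResolveImageSource(baseFolder, "logo.png")
        Console.WriteLine(If(local, "(not found, use a placeholder image)"))
    End Sub
End Module
```

Inside `GetImage` you would pass the result to `iTextSharp.text.Image.GetInstance`, and fall back to your own placeholder image when the helper returns `Nothing`.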