Based on this question: How can I get HTML source code from TWebBrowser
If I run this code with an HTML page that uses a Unicode code page, the result is gibberish because TStringStream is not Unicode in D7. The page might be UTF-8 encoded or encoded in some other (ANSI) code page.
How can I detect if a TStream/IPersistStreamInit is Unicode/UTF8/Ansi?
How do I always return the correct result as a WideString from this function?
function GetWebBrowserHTML(const WebBrowser: TWebBrowser): WideString;
If I replace TStringStream with TMemoryStream and save the TMemoryStream to a file, it's all good. The file can be any of Unicode/UTF8/Ansi, but I always want to return the stream content as a WideString:
function GetWebBrowserHTML(const WebBrowser: TWebBrowser): WideString;
var
  // LStream: TStringStream;
  LStream: TMemoryStream;
  Stream: IStream;
  LPersistStreamInit: IPersistStreamInit;
begin
  if not Assigned(WebBrowser.Document) then exit;
  // LStream := TStringStream.Create('');
  LStream := TMemoryStream.Create;
  try
    LPersistStreamInit := WebBrowser.Document as IPersistStreamInit;
    Stream := TStreamAdapter.Create(LStream, soReference);
    LPersistStreamInit.Save(Stream, true);
    // result := LStream.DataString;
    LStream.SaveToFile('c:\test\test.txt'); // test only - file is ok
    Result := ??? // WideString
  finally
    LStream.Free();
  end;
end;
EDIT: I found this article - How to load and save documents in TWebBrowser in a Delphi-like way
Which does exactly what I need, but it works correctly only with Delphi Unicode compilers (D2009+). Read the Conclusion section:
There is obviously a lot more we could do. A couple of things immediately spring to mind. We retro-fit some of the Unicode functionality and support for non-ANSI encodings to the pre-Unicode compiler code. The present code when compiled with anything earlier than Delphi 2009 will not save document content to strings correctly if the document character set is not ANSI.
The magic is obviously in the TEncoding class (TEncoding.GetBufferEncoding), but D7 does not have TEncoding. Any ideas?
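For reference, here is a minimal sketch of what a D7 stand-in for TEncoding.GetBufferEncoding might look like: it sniffs a byte-order mark at the start of the saved bytes and converts with the Win32 MultiByteToWideChar call. The helper name MemoryStreamToWideString, its DefaultCodePage parameter and the CodePageUtf8 constant are hypothetical and not part of the original code. It assumes the saved stream starts with a BOM when the document is UTF-16 LE or UTF-8, which may not always hold. Without a BOM it simply falls back to the code page passed in, and UTF-16 BE is not handled.

```delphi
uses
  Windows, Classes;

const
  CodePageUtf8 = 65001; // Win32 code page identifier for UTF-8

// Hypothetical helper: converts the raw bytes written by
// IPersistStreamInit.Save into a WideString.
// DefaultCodePage is used only when no BOM is found; it is a guess.
function MemoryStreamToWideString(Stream: TMemoryStream;
  DefaultCodePage: UINT): WideString;
var
  P: PAnsiChar;
  Size, Len: Integer;
  CodePage: UINT;
begin
  Result := '';
  P := PAnsiChar(Stream.Memory);
  Size := Integer(Stream.Size);
  if Size = 0 then Exit;

  // UTF-16 LE BOM (FF FE): the payload is already wide characters.
  if (Size >= 2) and (P[0] = #$FF) and (P[1] = #$FE) then
  begin
    Len := (Size - 2) div SizeOf(WideChar);
    SetLength(Result, Len);
    if Len > 0 then
      Move(P[2], PWideChar(Result)^, Len * SizeOf(WideChar));
    Exit;
  end;

  CodePage := DefaultCodePage;
  // UTF-8 BOM (EF BB BF): skip it and decode the rest as UTF-8.
  if (Size >= 3) and (P[0] = #$EF) and (P[1] = #$BB) and (P[2] = #$BF) then
  begin
    CodePage := CodePageUtf8;
    Inc(P, 3);
    Dec(Size, 3);
    if Size = 0 then Exit;
  end;

  // First call returns the required length in WideChars,
  // second call performs the conversion.
  Len := MultiByteToWideChar(CodePage, 0, P, Size, nil, 0);
  SetLength(Result, Len);
  if Len > 0 then
    MultiByteToWideChar(CodePage, 0, P, Size, PWideChar(Result), Len);
end;
```

In GetWebBrowserHTML the placeholder line could then become something like Result := MemoryStreamToWideString(LStream, CP_ACP); where CP_ACP (the system ANSI code page) is only a fallback guess. For pages saved without a BOM, the real code page would have to come from somewhere else, for example the document's declared charset.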