I'm writing a command line tool that uses WKWebView to capture screenshots of webpages. To do this, I have to ensure that the page is fully loaded, including all client-side redirects, before capturing the screenshot.
In general, this happens automatically, and I just have to wait for webView(_:didFinish:) to be called. However, sometimes a redirect happens after webView(_:didFinish:) has already been called for the original URL (e.g. all Google search result links).
To handle this, I check for new requests after loading is complete via webView(_:decidePolicyFor:) and repeatedly call webView.load on the new requests until no more are generated.
The problem is that websites of course make lots of requests after a page has fully loaded that aren't redirects, e.g. sending tracking data. So I end up calling webView.load on those requests, and instead of screenshotting e.g. a blog post, I end up screenshotting the URL that an embedded tracking script sent data to, which visually is just a blank page.
Is there any way I can distinguish client-side redirect requests from these background AJAX requests? Or, failing that, is there some other way to follow redirects without calling webView.load on each new request?
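One possible way to tell the two apart, offered as a sketch under assumptions rather than a confirmed fix: webView(_:decidePolicyFor:) is invoked for frame navigations, not for XMLHttpRequest or fetch calls, so the stray requests may be navigations of embedded iframes. If that holds here, checking navigationAction.targetFrame?.isMainFrame would filter them out. The helper name isCandidateRedirect and its parameters are hypothetical and not part of the code in this post.

```swift
import Foundation
#if canImport(WebKit)
import WebKit
#endif

/// Hypothetical helper: decides whether a navigation action should be treated
/// as a candidate page-level redirect.
/// `targetsMainFrame` is nil when the action has no target frame.
func isCandidateRedirect(targetsMainFrame: Bool?, url: URL?, loadedURL: URL?) -> Bool {
    // Subframe navigations (e.g. a hidden tracking iframe) never replace the page.
    guard targetsMainFrame == true, let url = url else { return false }
    // Same URL as the page that already finished loading: nothing new to wait for.
    return url != loadedURL
}

#if canImport(WebKit)
/// Convenience overload that reads the values from a real WKNavigationAction.
@MainActor
func isCandidateRedirect(_ action: WKNavigationAction, loadedURL: URL?) -> Bool {
    isCandidateRedirect(
        targetsMainFrame: action.targetFrame?.isMainFrame,
        url: action.request.url,
        loadedURL: loadedURL
    )
}
#endif
```

In the delegate, redirectURL would then only be assigned when isCandidateRedirect(navigationAction, loadedURL: loadedURL) returns true. Whether this covers the Google search result case is an assumption that would need testing.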
Here is a functioning simplified version of my code:
import WebKit

@MainActor
class WebContainer: NSObject {
    private lazy var webView: WKWebView = {
        let webView: WKWebView = WKWebView()
        webView.navigationDelegate = self
        return webView
    }()

    private var redirectURL: URL? // set every time decidePolicyFor navigationAction is called
    private var loadedURL: URL? // set when didFinish is called
    private var navigationFailed = false // set when didFail or didFailProvisionalNavigation is called
    private var continuation: UnsafeContinuation<Void, Error>?

    // Returns Void so that the continuation type matches UnsafeContinuation<Void, Error>
    private func load(request: URLRequest) async throws {
        try await withUnsafeThrowingContinuation { continuation in // required in the absence of an event loop, as this is a command line tool
            self.continuation = continuation
            webView.load(request)
        }
    }
    private func takeScreenshot() async throws -> Data? {
        // get data from webView and do stuff
        return nil // placeholder so the simplified listing compiles
    }

    private func checkRedirect() async -> Bool {
        try? await Task.sleep(for: .seconds(0.1)) // small delay to wait for new requests
        return redirectURL != loadedURL // true if redirectURL has been set to something new
    }

    func generateData(type: DataType, request: URLRequest) async throws -> Data? {
        // Load website
        try await load(request: request)

        // Check for redirects after loading completed
        while await checkRedirect(), let url = redirectURL {
            let redirectRequest: URLRequest = URLRequest(url: url)
            try await load(request: redirectRequest)
        }

        // Process & return data
        return try await takeScreenshot()
    }
}
extension WebContainer: WKNavigationDelegate {
    func webView(_ webView: WKWebView, decidePolicyFor navigationAction: WKNavigationAction) async -> WKNavigationActionPolicy {
        redirectURL = navigationAction.request.url // if a new request is initiated after webView(_:didFinish:) is called, this will set redirectURL to the new URL
        return WKNavigationActionPolicy.allow
    }

    func webView(_ webView: WKWebView, didFinish navigation: WKNavigation!) {
        loadedURL = webView.url
        continuation?.resume(returning: ())
        continuation = nil // a continuation must be resumed exactly once
    }

    func webView(_ webView: WKWebView, didFail navigation: WKNavigation!, withError error: Error) {
        navigationFailed = true
        continuation?.resume(throwing: error)
        continuation = nil
    }

    func webView(_ webView: WKWebView, didFailProvisionalNavigation navigation: WKNavigation!, withError error: Error) {
        navigationFailed = true
        continuation?.resume(throwing: error)
        continuation = nil
    }
}
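As for following redirects without calling webView.load again: because the policy handler returns .allow, the web view should carry out an allowed navigation on its own, so re-issuing the request may not be needed at all. A possible alternative to the checkRedirect loop is to wait until navigation activity has been quiet for a short window. The following is a minimal, untested sketch of that idea; the NavigationActivity class, its method names, and the timing values are hypothetical.

```swift
import Foundation

/// Hypothetical helper: remembers when navigation activity was last seen.
@MainActor
final class NavigationActivity {
    private var lastEvent = Date()

    /// Call from the navigation delegate callbacks whenever something happens.
    func touch() {
        lastEvent = Date()
    }

    /// Suspends until no activity has been recorded for `quiet` seconds,
    /// or until `timeout` seconds have elapsed.
    /// Returns true if the quiet window was reached, false on timeout.
    func waitUntilQuiet(for quiet: TimeInterval, timeout: TimeInterval) async -> Bool {
        let deadline = Date().addingTimeInterval(timeout)
        while Date() < deadline {
            if Date().timeIntervalSince(lastEvent) >= quiet {
                return true
            }
            try? await Task.sleep(nanoseconds: 20_000_000) // poll every 20 ms
        }
        return false
    }
}
```

In the delegate, touch() could be called from webView(_:decidePolicyFor:) for main-frame actions, from webView(_:didStartProvisionalNavigation:), and from webView(_:didFinish:). generateData would then await waitUntilQuiet after the first load and also check webView.isLoading before taking the screenshot, since a slow redirect could otherwise outlast the quiet window.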