I am loading some Wikipedia pages from a webarchive (created in desktop Safari) in a UIWebView. This allows the pages to be available offline.

However, for some reason the images aren't loading when offline. It appears that they are being loaded from the website.

Everything worked fine in the past, and I've noticed that the problem only affects new webarchives created after Wikipedia updated its mobile website format.

It's strange because the images load when offline if I open the webarchive on my computer, but not in iOS.

Any idea what's going on here?

I'm using the following code to load the webarchive:

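// Name of the bundled webarchive file to display.
NSString *fileName = [[NSString alloc] initWithFormat:@"%@", appDelegate.urlName];

// Full path to the file inside the app bundle's resources.
NSString *htmlPath = [[[NSBundle mainBundle] resourcePath] stringByAppendingPathComponent:fileName];

// File URL expressed relative to the directory that contains the webarchive.
NSURL *url = [NSURL URLWithString:[htmlPath lastPathComponent] relativeToURL:[NSURL fileURLWithPath:[htmlPath stringByDeletingLastPathComponent] isDirectory:YES]];

// Load the webarchive into the web view.
[self.myWebView loadRequest:[NSURLRequest requestWithURL:url]];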

Update: I also found out that loading a webarchive made from the mobile Wikipedia site in Mobile Safari causes a crash on iOS 7.

Here is a link to a new webarchive that is causing problems and one from the old version of Wikipedia that works fine. I've changed the file extension to "plist" so they can easily be edited. Change back to "webarchive" to test.

(NEW) https://dl.dropboxusercontent.com/u/20616325/Badger%20%28NEW%29.plist

(OLD) https://dl.dropboxusercontent.com/u/20616325/Badger%20%28OLD%29.plist

Jonah
  • I've explored the answer given here with no results: http://stackoverflow.com/questions/12647267/uiwebview-on-ios-6-does-not-display-images-with-relative-urls-in-webarchives/12647269?noredirect=1#comment46013784_12647269 – Jonah Mar 17 '15 at 15:55
  • It sounds like either the URLs to the images inside the web archive are still remote URLs, or there's some JavaScript that you can't see that is attempting to make remote URL calls. That answer you linked to tells you how to decode the webarchive. Look inside it for URLs and check if they're local, or if they have some JavaScript magic. – damian Mar 19 '15 at 14:45
  • @damian is probably right. If you could share a webarchive as a sample I guess we could come up with a solution. – Alladinian Mar 19 '15 at 14:55
  • I'm not sure that I can share the webarchive directly. However here's one that's giving me problems. https://en.m.wikipedia.org/wiki/American_badger – Jonah Mar 19 '15 at 16:00
  • I've explored the webarchive as a plist and played around a bit, but I'm not exactly sure what I'm looking for. Any help would be great!! – Jonah Mar 19 '15 at 16:03
  • I added links to the webarchives above. – Jonah Mar 19 '15 at 16:26
  • Wikipedia images are stored at upload.wikimedia.org. Are you sure the webarchive saves media from external domains rather than only from the current domain? – Ali Sheikhpour Mar 21 '15 at 00:39
  • Oh, you were right! No images! Well, I don't know what to say. In my little test I had images and they were inside the archive. Try adding the page as regular HTML, either with the images in a separate folder, or write some converter to put them as – ilnar_al Mar 22 '15 at 04:25
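
One way to follow damian's suggestion is to decode the webarchive as a property list on the desktop and list the URL recorded for each saved resource. The following is a minimal Python sketch, not app code. It assumes the usual Safari layout (a plist with a `WebMainResource` dictionary and a `WebSubresources` array, each entry carrying a `WebResourceURL`); the function name `list_resource_urls` is made up for illustration.

```python
import plistlib


def list_resource_urls(archive_bytes):
    """Return (main_url, sub_urls) for the raw bytes of a .webarchive file.

    Assumes the common Safari layout: a top-level dictionary with
    'WebMainResource' and an optional 'WebSubresources' list.
    """
    archive = plistlib.loads(archive_bytes)  # handles binary and XML plists
    main_url = archive.get("WebMainResource", {}).get("WebResourceURL")
    sub_urls = [
        resource.get("WebResourceURL")
        for resource in archive.get("WebSubresources", [])
    ]
    return main_url, sub_urls


# Usage sketch (the file name is a placeholder):
#   with open("Badger.webarchive", "rb") as f:
#       main_url, sub_urls = list_resource_urls(f.read())
#   print(main_url)
#   print("\n".join(sub_urls))
```

If an image URL that the page references does not appear among the subresources, the archive has no local copy of it and the web view would have to fetch it from the network.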

2 Answers

Even if you add the page as a "complete web page", with the images stored separately and relative paths to them, it won't load them, because they add weird code to the img tag, e.g.

<img alt=".." src="relative_path(//upload.. in the relapse)" srcset="that_causes_problems" data-file-width="" data-file-height="" />

srcset="//upload.wikimedia.org/wikipedia/commons/thumb/8/82/Taxidea_taxus_%28Point_Reyes%2C_2007%29.jpg/330px-Taxidea_taxus_%28Point_Reyes%2C_2007%29.jpg 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/8/82/Taxidea_taxus_%28Point_Reyes%2C_2007%29.jpg/440px-Taxidea_taxus_%28Point_Reyes%2C_2007%29.jpg 2x" data-file-width="2124" data-file-height="1416"

I added a complete web page to the Xcode project with the right relative paths and loaded it into the web view: NO IMAGES. But when I got rid of this srcset=".." and the rest, it loaded fine.

ilnar_al
  • I'm not quite sure I'm following your answer. I'm not able to find "srcset" anywhere in the webarchive. – Jonah Mar 22 '15 at 20:20
  • Just open your webarchive (NEW) as text, e.g. with vim, then search for 'srcset': found. But in OLD there is no 'srcset'. BTW, you mentioned it worked fine with the OLD webarchive. – ilnar_al Mar 22 '15 at 20:25
  • Whenever I edit the webarchive it seems to break something and Xcode can't read it. – Jonah Mar 24 '15 at 16:52
  • I found out that it works if I open the webarchive in a hex editor and change every occurrence of "srcset=" to "srcse1=". I had to use the number 1 to make sure it didn't change the length of the word and mess up the rest of the file. – Jonah Mar 24 '15 at 19:44
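
The hex-editor workaround described in the last comment can also be scripted. This is a minimal Python sketch of the same idea, a byte-for-byte replacement that keeps the file length unchanged. The function names and file paths are hypothetical, and it assumes the HTML is stored as plain bytes in the archive (as in a binary webarchive saved by Safari), so that `srcset=` appears literally in the file.

```python
def neutralize_srcset(data: bytes) -> bytes:
    """Rename every 'srcset=' to 'srcse1=' without changing the data length."""
    old, new = b"srcset=", b"srcse1="
    # Same length on purpose: byte offsets elsewhere in the file stay valid.
    assert len(old) == len(new)
    return data.replace(old, new)


def patch_webarchive(src_path: str, dst_path: str) -> int:
    """Write a patched copy of src_path to dst_path; return the number of replacements."""
    with open(src_path, "rb") as f:
        original = f.read()
    patched = neutralize_srcset(original)
    with open(dst_path, "wb") as f:
        f.write(patched)
    return original.count(b"srcset=")


# Usage sketch (paths are placeholders):
#   patch_webarchive("Badger (NEW).webarchive", "Badger (patched).webarchive")
```

Because the browser no longer recognizes the renamed attribute, it should fall back to the plain `src` of each image.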

Ilnar is correct. Expanding on his answer, the srcset attribute is not supported in iOS 7 (srcset support). This is most likely what is causing the crash you are seeing.

Srcset is used to provide multiple image links for different device sizes, all in one image tag. There is JavaScript at the beginning that finds the right src for the image; it includes the check `return 'srcset' in new Image();`

The NEW webarchive uses this attribute to provide links to 3 images. The OLD webarchive simply uses the img tag to point to a single URL.

Srcset should be supported in iOS 8, but it looks like Wikipedia is using resolution descriptors of 1.5x and 2x.

`srcset="//upload.wikimedia.org/wikipedia/commons/thumb/8/82/Taxidea_taxus_%28Point_Reyes%2C_2007%29.jpg/270px-Taxidea_taxus_%28Point_Reyes%2C_2007%29.jpg 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/8/82/Taxidea_taxus_%28Point_Reyes%2C_2007%29.jpg/360px-Taxidea_taxus_%28Point_Reyes%2C_2007%29.jpg 2x"`

WebKit (the engine behind Safari) only supports whole numbers (1x, 2x, 3x), so this could be causing the failed load on iOS 8.

zimmryan
  • Can you recommend a way to edit the webarchives that will show the text? I was using TextWrangler but I'm having issues converting it back into a webarchive after I make the changes. Thanks! – Jonah Mar 24 '15 at 16:13
  • @Jonah I was using Sublime Text 2 to open the webarchive and search for srcset, but I have not found a way to save edits as a proper .webarchive. Maybe try Webarchive Extractor to get to a folder structure, then open the edited HTML in Safari and save it as a new webarchive. – zimmryan Mar 24 '15 at 16:49
  • Yeah, whenever I edit the webarchive it seems to break something and Xcode can't read it. – Jonah Mar 24 '15 at 16:52
  • @Jonah If you view the file in a hex editor you will see there is some sort of signature at the very end. When you edit the file something needs to change in this signature as well to make it readable. – zimmryan Mar 24 '15 at 18:38
  • I found out that it works if I open the webarchive in a hex editor and change every occurrence of "srcset=" to "srcse1=". I had to use the number 1 to make sure it didn't change the length of the word and mess up the rest of the file. – Jonah Mar 24 '15 at 19:43
  • Yes, I am able to get it working now. I found that Hex Fiend worked well and didn't break the file. Regarding the bounty, I'm not exactly sure what to do. You begin your answer by saying that Ilnar is correct and his answer got me in the right direction. Your answer also helped a lot, though removing the srcset tags as you recommend ends up breaking the file. – Jonah Mar 24 '15 at 19:50
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/73695/discussion-between-zimmryan-and-jonah). – zimmryan Mar 24 '15 at 20:06
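
A possible reason text-editor edits break the file: a binary property list ends with a trailer and offset table that depend on exact byte positions, so any change in length invalidates them. An alternative to same-length hex edits is to decode the plist, change the HTML, and serialize it again so those offsets are rebuilt. Below is a minimal Python sketch under several assumptions: the page HTML lives in `WebMainResource` → `WebResourceData`, the attribute is always written as `srcset="..."` with double quotes, and the name `strip_srcset` is invented for illustration. Whether UIWebView accepts the re-serialized archive has not been verified here.

```python
import plistlib
import re

# Matches a double-quoted srcset attribute plus the whitespace before it.
SRCSET_ATTR = re.compile(rb'\s+srcset="[^"]*"')


def strip_srcset(archive_bytes: bytes) -> bytes:
    """Return new webarchive bytes with srcset attributes removed from the main HTML."""
    archive = plistlib.loads(archive_bytes)
    main = archive["WebMainResource"]
    main["WebResourceData"] = SRCSET_ATTR.sub(b"", main["WebResourceData"])
    # Re-serializing rebuilds the binary plist's offset table and trailer.
    return plistlib.dumps(archive, fmt=plistlib.FMT_BINARY)


# Usage sketch (paths are placeholders):
#   with open("Badger (NEW).webarchive", "rb") as f:
#       cleaned = strip_srcset(f.read())
#   with open("Badger (cleaned).webarchive", "wb") as f:
#       f.write(cleaned)
```

Unlike the hex-editor approach, this removes the attribute entirely instead of renaming it, at the cost of rewriting the whole file.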