My website is a static HTML photo gallery with more than 100,000 HTML pages and no database, just plain HTML (an old website, not an adult site).

I used to create the sitemap and find broken links with the Xenu application, and I skipped images (*.jpg) when creating the sitemap. Now, some 4 years later, it is no longer working, or I have forgotten what I added in the exclude options.

Xenu Screenshot

I used the formats below to exclude images from the generated list and from checking:

*.jpg*
*jpg*
*.jpeg*
*jpeg*

But it's not working.

I really need to skip images, because with 100k HTML pages there are 100k HQ photos and 100k thumbnails :( All of this makes the sitemap bigger, and I don't want my direct photo links indexed by search engines.

Please help. Is there any way to skip "image/jpeg"?

Thank you very much.

James Skemp
PeaceYo

  • No One Knows ? :/ – PeaceYo Dec 23 '17 at 05:12
  • I've never seen a regular expression work with this program. The only option you have is to exclude files starting with a particular string. For example: if you wanted to exclude all files in the images directory, you would add `http://domain.tld/images/` to the "Do not check any URLs beginning with this" section. – Daerik Dec 23 '17 at 05:30
  • Thanks for the reply. But I am sure I used Xenu for the sitemap before and never used a folder path or any URLs to filter. I am confused now. I just checked my old sitemap and it does not have image links; it has more than 100k links, and this can't be generated by the other free apps out there. Anyway, I am looking for a PHP script to do the job. Thank you very much. – PeaceYo Dec 24 '17 at 13:49

1 Answer

I just had a similar requirement, and it turns out that Xenu has a "wildcard edition" that works with regex. It is available on their website.
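
If the wildcard edition is not an option, the filtering could also be done after the crawl. Below is a minimal Python sketch, assuming the crawled URLs are available as a plain list (how to export that list from Xenu is not covered here). The names `EXCLUDE_PATTERNS`, `is_excluded` and `build_sitemap` are hypothetical, and the matching uses Python's shell-style wildcards, which may not behave the same way as Xenu's exclude options.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit
from xml.sax.saxutils import escape

# Hypothetical exclude patterns, modelled on the ones in the question.
# They are compared against the lower-cased URL path, so keep them lower-case.
EXCLUDE_PATTERNS = ["*.jpg", "*.jpeg"]


def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """Return True if the URL's path matches any exclude pattern."""
    # Only the path is matched, so a query string such as ?v=2 is ignored.
    path = urlsplit(url).path.lower()
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def build_sitemap(urls, patterns=EXCLUDE_PATTERNS):
    """Build sitemap XML from the URLs that survive the exclude filter."""
    kept = [u for u in urls if not is_excluded(u, patterns)]
    entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in kept)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</urlset>\n"
    )


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    crawled = [
        "http://domain.tld/gallery/page1.html",
        "http://domain.tld/images/photo1.JPG",
        "http://domain.tld/thumbs/photo1.jpeg",
    ]
    print(build_sitemap(crawled))
```

Note that the sitemap protocol limits a single sitemap file to 50,000 URLs, so a site with more than 100,000 pages would need the output split across several files referenced from a sitemap index.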

RichardB