0

Massive Novice, so any help greatly appreciated.

I have a mysql database with about 30,000 entries, each with a link to a thumbnails in .jpg format.

I was trying to figure out if there was anyway to go through the mysql database and verify that each link is still valid - aka if the .jpg file is where it is meant to be?

So each entry has an id, name, thumb (which is an url).

Thanks!

3 Answers3

0

For links, do a query to get all the links, something like:

SELECT DISTINCT url FROM your_stuff;

You didn't mention the exact nature of your database, but something like that should work.

Then loop over it and for each one, use something like cURL to do a HEAD request - this is better than a usual GET request because it's identical, except the server shouldn't return the actual file, so you don't have to download every image just to see if they're there. Just do the HEAD request, and confirm that the server answered with a status 200.

This question goes into a little more about the HEAD request in cURL.

Community
  • 1
  • 1
Mike
  • 2,132
  • 3
  • 20
  • 33
0

Considering this line in your question: "each with a link to a thumbnails in .jpg format. "

You may also try a regexp : ending with .jpg

SELECT DISTINCT url_column as regex_u, id 
    FROM your_table
    where url_column regexp '\(.jpg)$';

even with like : contains jpg

    SELECT DISTINCT url_column as like_u, id 
    FROM your_table
    where url_column like '%.jpg%';

Another with instr:

    SELECT DISTINCT url_column as instr_u , id
    FROM your_table
    where instr(url_column, '.jpg') > 0;

If you want to match a whole ulr

    SELECT DISTINCT url_column as url_u, id 
    FROM your_table
    where url_column regexp '^(https?://|www\\.)[\.A-Za-z0-9/_\-]+\\.(jpg)$'
    ;

Another using Right:

    SELECT DISTINCT url_column as right_u, id 
    FROM your_table
    where Right(url_column,4) = '.jpg';

Please check the explain plan for the most efficient solution. LIKE seems to take the longest.

bonCodigo
  • 14,268
  • 1
  • 48
  • 91
  • @Bruf please give this a try and comment. It's great if you could show a bit of sample url though to make sure `regexp` fits well ;) – bonCodigo Jan 13 '13 at 21:14
  • sample url would look like http://griffiths.askaboutireland.ie/gv4/pdf-pages/OS_LETTERS_LONDONDERRY/thumbs2/OS_LETTERS_LONDONDERRY_pdf-035.tif.jpg – Bruff O'Reilly Jan 13 '13 at 21:33
  • @BruffO'Reilly IC now what you meant... well you have tif.jpg.. so that means it's ending with `.jpg`, then you have many choices to choose through. If it needs to contain `griffiths` we can add that to the start. Please try this out and let me know. – bonCodigo Jan 13 '13 at 21:35
0

SQL Fiddle

SELECT * from table_name where SUBSTRING(url, -3, 3) = 'jpg';
Raghvendra Parashar
  • 3,883
  • 1
  • 23
  • 36