
I have to make a page which checks a link and blocks it if it's a phishing link.

What's the best/most efficient way to check URLs against a blacklist?

I could store the list in MySQL or in a text file (perhaps as JSON in the text file, iterating through an array of blocked links).

I was thinking that with MySQL the page would have to query the database every time a link is checked, and this may be too resource-intensive, but maybe I'm wrong about that.

Neither approach would be difficult for me to implement; I'm just curious which would be the most efficient.

J Del
  • What's with the question downvote? Everyone's all fascist about questions these days. – S. Imp Jan 10 '17 at 03:35
  • *There are either too many possible answers, or good answers would be too long for this format. Please add details to narrow the answer set or to isolate an issue that can be answered in a few paragraphs.* – shmosel Jan 10 '17 at 03:46
  • Possible duplicate of [Why use MySQL over flatfiles?](http://stackoverflow.com/questions/2667850/why-use-mysql-over-flatfiles) – Mike Jan 10 '17 at 03:47
  • *"I used MySQL it would have to check the database every time"* - Using a text file you would have to load the whole file into memory every time too. – Mike Jan 10 '17 at 03:50
  • Define "phishing link" in very specific technical terms. Is it in Google's malware list? Is it hidden behind one or more URL shorteners or rewriters? Are you prepared to manually curate this list? This is a very, very hard problem to solve effectively. The most efficient way is to warn people that links are beyond your control and to encourage them to use safe browsers like Chrome with alerts for those sorts of sites. – tadman Jan 10 '17 at 06:53

1 Answer

If you are talking about the incoming URL (the page the visitor came from), be aware that it can be spoofed or hidden. It is available through the $_SERVER superglobal, e.g. $_SERVER['HTTP_REFERER'].

I won't do your work for you, as that is against the guidelines, although once you reply with more information I may become inspired.
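As a rough illustration of reading that header defensively, here is a minimal sketch. The function name `refererOrNull` is hypothetical, and the only assumption is that a missing or malformed header should be treated as "no referrer" rather than trusted:

```php
<?php
// Hypothetical helper: returns the referring URL, or null when the header
// is absent or does not look like a URL. The value is client-supplied,
// so it must never be treated as trustworthy.
function refererOrNull(array $server): ?string
{
    $referer = $server['HTTP_REFERER'] ?? null;
    if (!is_string($referer) || filter_var($referer, FILTER_VALIDATE_URL) === false) {
        return null;
    }
    return $referer;
}
```

In a page you would call it as `refererOrNull($_SERVER)` and handle the `null` case explicitly.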

To look through a MySQL database for a domain, these topics might be of use for you:

MYSQL Fulltext search and LIKE

MySQL match() against() - order by relevance and column?

https://dev.mysql.com/doc/refman/5.7/en/pattern-matching.html
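If the blacklist holds exact hostnames, an indexed equality lookup may be enough, and cheaper than LIKE or full-text matching. The sketch below is only one possible shape: the table `blocked_hosts`, its column `host`, and the function `isHostBlocked` are hypothetical names, and the demo uses an in-memory SQLite database through PDO purely so that it runs on its own. For MySQL you would swap in a `mysql:` DSN with your own credentials.

```php
<?php
// Assumed (hypothetical) schema:
//   CREATE TABLE blocked_hosts (host VARCHAR(255) NOT NULL PRIMARY KEY);
// The primary key gives an index, so the lookup does not scan the table.

function isHostBlocked(PDO $pdo, string $host): bool
{
    // Prepared statement: the host is bound as a parameter, never concatenated.
    $stmt = $pdo->prepare('SELECT 1 FROM blocked_hosts WHERE host = ? LIMIT 1');
    $stmt->execute([strtolower($host)]);
    // fetchColumn() returns false when no row matched.
    return $stmt->fetchColumn() !== false;
}

// Stand-in database for the demo only (assumption: pdo_sqlite is available).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE blocked_hosts (host VARCHAR(255) NOT NULL PRIMARY KEY)');
$pdo->exec("INSERT INTO blocked_hosts (host) VALUES ('phish.example')");

var_dump(isHostBlocked($pdo, 'phish.example')); // bool(true)
var_dump(isHostBlocked($pdo, 'example.org'));   // bool(false)
```

One indexed query per checked link is usually cheap, so the worry about hitting the database on every check is probably smaller than it seems.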

Let me know which one helped you the most, and if you need further instruction, I can suggest a viable way to do this.

P.S.: Consider prioritizing performance over precision; I do agree that your question was too vague and that a complete reply would have to be very thorough. You could also keep the blacklist of URLs in a PHP array and iterate over it to search for a match.
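For the array approach, iterating works, but keying the array by hostname lets `isset()` do a hash lookup instead of a scan. A minimal sketch, assuming the blacklist is a JSON list of lowercase hostnames (the sample hosts and the name `isBlocked` are made up for illustration):

```php
<?php
// Hypothetical blacklist; in practice this string might come from
// file_get_contents() on a JSON file.
$json = '["phish.example", "bad-login.example"]';

// array_flip() turns the list into host => position, so membership
// becomes a key lookup rather than a loop over every entry.
$blocked = array_flip(json_decode($json, true));

function isBlocked(array $blocked, string $host): bool
{
    return isset($blocked[strtolower($host)]);
}

var_dump(isBlocked($blocked, 'Phish.Example')); // bool(true)
var_dump(isBlocked($blocked, 'example.org'));   // bool(false)
```

The file still has to be read and decoded on each request unless you cache the result, which is the cost the comments on the question point out.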

Also, these are more advanced and involve regexp, but will probably interest you:

How to validate domain name in PHP?

https://corpocrat.com/2009/02/28/php-how-to-get-domain-hostname-from-url/
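Before reaching for a regular expression, PHP's built-in `parse_url()` may already be enough to pull the hostname out of a link. The sketch below is an assumption about what "normalizing" should mean here (lowercasing and dropping a trailing dot), and `hostFromUrl` is a hypothetical name:

```php
<?php
// Hypothetical helper: extracts a normalized hostname from a URL,
// or returns null when parse_url() cannot find one.
function hostFromUrl(string $url): ?string
{
    $host = parse_url(trim($url), PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // malformed URL, or no scheme so no host was recognized
    }
    // Hostnames are case-insensitive; a trailing dot is the same host.
    return strtolower(rtrim($host, '.'));
}

var_dump(hostFromUrl('https://Login.Example.com/a?b=1')); // string(17) "login.example.com"
var_dump(hostFromUrl('example.com/path'));                // NULL
```

Note that a link without a scheme, such as `example.com/path`, is parsed as a path and yields no host, so you may want to prepend `http://` to such input before checking it.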

Regards.

netpeers