Here is the idea.
I have master articles, say from BBC News. Each master article is originally published by BBC News, but it may be republished by many other sites across the web.
Approach 1:
Since Google doesn't provide an API for this, I wrote a program in Python with mechanize to fetch links from Google search results. However, this approach is not advisable because my IP may get blocked, and I don't want to take that risk.
How I did it:
I combined the article's title and author into a single boolean query so that only pages matching the master article are returned. The results are quite good, but I don't want to go with this one.
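To make the query construction concrete, here is a minimal sketch of how the title and author could be combined. The function name `build_exact_query` and the sample title and author are made up for illustration, and whether the search engine honors an explicit `AND` between quoted phrases is an assumption (many engines already treat space-separated terms as an implicit AND).

```python
def build_exact_query(title, author):
    """Combine title and author into one boolean query of quoted phrases.

    Hypothetical helper for illustration only; it builds a string and
    does not contact any search engine.
    """
    def quote(text):
        # Drop inner double quotes and collapse whitespace so the
        # quoted phrase stays well-formed.
        cleaned = " ".join(text.replace('"', " ").split())
        return f'"{cleaned}"'

    # Assumption: the engine accepts AND between two exact phrases.
    return f"{quote(title)} AND {quote(author)}"


# Placeholder values, not a real article.
query = build_exact_query("Example headline about the economy", "Jane Doe")
print(query)
```

Quoting both parts as exact phrases is what keeps the results close to the master article; dropping the quotes would likely return loosely related pages as well.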
Approach 2:
I tried Google Custom Search, querying with keywords from the master article and restricting the search to a limited set of sites instead of the whole web. But the results are not good. I need only the links to pages on other sites that reuse the article.
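For reference, the keyword query in this approach could be built roughly as follows. This is only a sketch under assumptions: keywords are picked by simple word frequency (not necessarily how I picked them), the stopword list is a tiny placeholder, the site list is made up, and the restriction is expressed with the `site:` search operator rather than through the Custom Search engine configuration.

```python
import re
from collections import Counter

# Tiny placeholder stopword list; a real one would be much longer.
STOPWORDS = {"the", "and", "for", "that", "with", "from", "this", "was", "are", "has"}


def top_keywords(text, limit=5):
    """Return the most frequent non-stopword terms in the article text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(limit)]


def site_restricted_query(text, sites, limit=5):
    """Build a keyword query limited to the given sites (hypothetical helper)."""
    keywords = " ".join(top_keywords(text, limit))
    # Assumption: the engine understands site: filters joined with OR.
    site_filter = " OR ".join(f"site:{site}" for site in sites)
    return f"{keywords} ({site_filter})"


# Placeholder article text and site list.
article = "Budget budget budget cuts cuts announced"
print(site_restricted_query(article, ["example.com", "example.org"], limit=2))
```

Frequency-based keywords are generic, which may be part of why this approach matches many unrelated pages on the same topic.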
Can anyone suggest a better approach? Are there any libraries available for this purpose that I can make use of?