It sounds like you are trying to build a metadata scraper in JavaScript.
Before going further, take the CORS policy into account: when a script requests data from a URL on another origin, the browser blocks it from reading the response unless the target server explicitly allows it.
Reference URLs:
- https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS
- https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS/Errors
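As a rough illustration of when CORS comes into play, the sketch below compares the origin (scheme, host, and port) of the page running the script with the origin of the URL being requested. The function name isCrossOrigin and the example.com URLs are made up for this illustration. It only tells you whether the browser will treat a request as cross-origin, not whether the target server permits it.

```javascript
// Returns true when targetUrl has a different origin than pageUrl.
// Both names are hypothetical; URL is the standard WHATWG URL class.
function isCrossOrigin(pageUrl, targetUrl) {
  var page = new URL(pageUrl);
  // Relative targets are resolved against the page URL.
  var target = new URL(targetUrl, pageUrl);
  return page.origin !== target.origin;
}

console.log(isCrossOrigin("https://example.com/tool.html", "https://example.com/about")); // false
console.log(isCrossOrigin("https://example.com/tool.html", "https://jsfiddle.net/"));     // true
```

If the check returns true, the request only succeeds from the browser when the target site sends the appropriate CORS headers.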
JSFiddle: http://jsfiddle.net/pgrmL73h/
The fiddle demonstrates how to fetch the meta tags from a given URL. For demo purposes I used https://jsfiddle.net/ as the target URL; change it to suit your needs.
I followed the steps below to retrieve the META tags from the website.
- To retrieve the page source of a website, you first need to request its URL. jQuery's AJAX method can do this.
  Reference URL: https://api.jquery.com/jquery.ajax/
- jQuery's $.parseHTML method converts the returned HTML string into DOM elements.
  Reference URL: https://api.jquery.com/jquery.parsehtml/
- Once the AJAX request has retrieved the page source, each DOM element is checked, the META nodes we need are filtered out, and their data is stored in a "txt" variable.
  E.g.: tags such as keywords and description are retrieved.
- Once the AJAX request completes, the contents of the "txt" variable are displayed inside a paragraph tag.
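The filtering step above can also be sketched without jQuery. This is only an illustration, not the code from the fiddle: it swaps $.parseHTML for the browser's built-in DOMParser, and the function name collectMetaTags is hypothetical.

```javascript
// Collects name/value pairs from <meta name="..."> elements.
// `root` is anything exposing querySelectorAll, e.g. a Document built by DOMParser.
function collectMetaTags(root) {
  const result = {};
  for (const el of root.querySelectorAll("meta[name]")) {
    const name = el.getAttribute("name");
    // Prefer "content", fall back to "value", then to an empty string.
    result[name] = el.getAttribute("content") || el.getAttribute("value") || "";
  }
  return result;
}

// Browser-only usage (DOMParser is not available in plain Node.js):
// const doc = new DOMParser().parseFromString(htmlString, "text/html");
// console.log(collectMetaTags(doc));
```

Returning a plain object keeps the extraction separate from the display step, so the result can be logged, rendered, or serialized as needed. If two meta tags share a name, the later one overwrites the earlier one in this sketch.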
JS Code:
function myFunction() {
    var txt = "";
    // Clear any previous output.
    document.getElementById("demo").innerHTML = txt;
    // Sample URL used here; you can make it dynamic as needed.
    // AJAX is used only to request the URL and get the page source, much like cURL or file_get_contents (https://www.php.net/manual/en/function.file-get-contents.php) in PHP.
    $.ajax({
        url: "https://jsfiddle.net/",
        error: function() {
            txt = "Unable to retrieve webpage source HTML";
        },
        success: function(response) {
            // The response arrives as an HTML string.
            // $.parseHTML converts it into an array of DOM nodes. Reference: https://api.jquery.com/jquery.parsehtml/
            var nodes = $.parseHTML(response);
            $.each(nodes, function(i, el) {
                var name = $(el).attr("name");
                // Keep only META nodes that have a name attribute.
                if (el.nodeName.toLowerCase() === "meta" && name != null) {
                    // Prefer "content", fall back to "value", then to an empty string.
                    var value = $(el).attr("content") || $(el).attr("value") || "";
                    txt += name + "=" + value + "<br>";
                    console.log(name, "=", value, el);
                }
            });
        },
        complete: function() {
            // Runs after success or error, so "txt" holds either the tags or the error message.
            document.getElementById("demo").innerHTML = txt;
        }
    });
}
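
One thing to watch for in the code above: the meta values come from a remote page and are written to innerHTML as-is, so any markup inside them would be interpreted by the browser. Below is a minimal sketch of escaping the values first. The helper names escapeHtml and formatMetaLine are hypothetical and not part of jQuery or the fiddle.

```javascript
// Escapes the characters that have special meaning in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Builds the same "name=value<br>" line that is appended to "txt", but escaped.
function formatMetaLine(name, value) {
  return escapeHtml(name) + "=" + escapeHtml(value) + "<br>";
}

console.log(formatMetaLine("description", "Tips & <b>tricks</b>"));
// description=Tips &amp; &lt;b&gt;tricks&lt;/b&gt;<br>
```

Inside the success callback, the line that appends to "txt" could call formatMetaLine(name, value) instead of concatenating the raw strings.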