I don't have an answer or code snippet for you, but you should research "screen scraping" (also called "web scraping") to capture the data, then use "regular expressions" to count characters, strip tags, and so on. Combining the two should get you to your end goal. Good luck.
Here is a start taken from www.jacobward.co.uk. This will allow you to capture a web page in a variable.
<?php
// Defining the basic cURL function
function curl($url) {
    $ch = curl_init(); // Initialising cURL
    curl_setopt($ch, CURLOPT_URL, $url); // Setting cURL's URL option with the $url variable passed into the function
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // Return the webpage data as a string instead of printing it
    $data = curl_exec($ch); // Executing the cURL request; $data holds the page, or false on failure
    curl_close($ch); // Closing cURL
    return $data; // Returning the data from the function
}
$scraped_website = curl("http://www.example.com"); // Scrape http://www.example.com into the $scraped_website variable
if ($scraped_website === false) {
    die("The request failed."); // Stop here rather than process an empty result
}
?>
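Once the page is held in a variable, stripping the tags and counting characters could look roughly like the sketch below. This is not from the linked site: the helper name `html_to_text` and the `$sample` markup are made up for illustration, and the sample string stands in for `$scraped_website` so the snippet runs without a network connection.

```php
<?php
// Hypothetical helper (not part of the original snippet): reduce HTML to plain text.
function html_to_text($html) {
    // Drop <script> and <style> blocks entirely, including their contents.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', ' ', $html);
    // Remove the remaining tags, then decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $text));
}

// Assumed sample markup standing in for $scraped_website.
$sample = '<html><head><title>Demo</title><style>p{color:red}</style></head>'
        . '<body><p>Hello &amp; welcome</p><script>var x = 1;</script></body></html>';

$text = html_to_text($sample);
echo $text, "\n";                 // the visible text with tags removed
echo strlen($text), " bytes\n";   // length of that text in bytes
```

Note that `strlen` counts bytes, so for pages with multibyte characters `mb_strlen($text, 'UTF-8')` is the better choice if the mbstring extension is available. For anything beyond simple tag stripping, a real HTML parser such as PHP's `DOMDocument` tends to be more reliable than regular expressions.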
See also: Web/Screen Scraping (Wikipedia) and Regular Expressions (Webcheatsheet).
I have tried it before, found it extremely complicated, and failed. Good luck.