Here is the situation. I am using Node.js and cheerio. My console log output is fine except for some duplicate entries caused by the site structure...
My script:
var request = require('request'),
    cheerio = require('cheerio'),
    chart = [];

request('http://www.website-X.com', function (err, resp, body) {
  if (!err && resp.statusCode == 200) {
    var $ = cheerio.load(body);
    $('tr', '#chart_body').each(function () {
      var rank = $(this).text().trim().replace(/\s\s+/g, ';');
      chart.push(rank);
    });
    console.log(chart);
  }
});
The site structure (simplified):
<table id="chart_body">
  <tr><!-- 1 Info I need --></tr>
  <tr><!-- 2 Info I need --></tr>
  <table>
    <tbody>
      <tr> Duplicate info as 1 </tr>
    </tbody>
  </table>
  <tr><!-- 3 Info I need --></tr>
  <tr><!-- 4 Info I need --></tr>
  <tr><!-- 5 Info I need --></tr>
  <tr><!-- 6 Info I need --></tr>
</table>
My console log return:
'1;Wolfenstein;330,703;330,703;1',
'Wolfenstein',
'2;Wolfenstein;188,200;188,200;1',
'Wolfenstein',
'3;Minecraft;126,041;215,109;2',
'Minecraft','
My console log output is fine except for the duplicate entries. That's because, in the site structure, a tr matched by the selector has another tr nested within it. I can't get rid of the 'tr tr' matches. The tr's also don't have unique classes that I could use to narrow the selection.
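One possible workaround, sketched here without touching the selector, is to filter the collected strings afterwards. It assumes that every wanted row yields several ';'-separated fields while a nested duplicate row yields only one, as in the console output above. The helper name isFullRow is hypothetical, and the sample array is copied from that output rather than scraped.

```javascript
// Sample entries copied from the console output in the question.
const chart = [
  '1;Wolfenstein;330,703;330,703;1',
  'Wolfenstein',
  '2;Wolfenstein;188,200;188,200;1',
  'Wolfenstein',
  '3;Minecraft;126,041;215,109;2',
  'Minecraft'
];

// Hypothetical helper: a full chart row has more than one ';'-separated
// field, while a nested duplicate row collapses to a single field.
function isFullRow(entry) {
  return entry.split(';').length > 1;
}

// Keep only the full rows and drop the nested duplicates.
const rows = chart.filter(isFullRow);
console.log(rows);
```

Restricting the selection to the table's direct child rows, for example with cheerio's .children('tr') on the '#chart_body' element, may be a cleaner option, though whether it matches anything depends on whether the real markup wraps the rows in a tbody.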
Please help. Thanks!!! -Aldo
Oh and lastly... the pesky single quote at the beginning and end of every entry. I can't get rid of it.
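On the single quotes: when console.log is given an array, Node.js prints an inspected view of it, and strings inside that view are shown in quotes. The quotes are not part of the stored strings. A minimal sketch, using two sample entries from the output above and a hypothetical variable name plain:

```javascript
// Two sample entries taken from the console output in the question.
const chart = [
  '1;Wolfenstein;330,703;330,703;1',
  '3;Minecraft;126,041;215,109;2'
];

// Logging the array prints an inspected view, so each string appears quoted.
console.log(chart);

// Joining the entries into one string and logging that prints the raw text,
// one entry per line, with no surrounding quotes.
const plain = chart.join('\n');
console.log(plain);
```

Logging each entry separately, for instance with chart.forEach, should have the same effect.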