I want to scrape the data under top gainers (%) from the link below, but it raises UnicodeDecodeError: 'utf8' codec can't decode byte 0x93 in position 211: invalid start byte
import requests
from lxml import html
page_indo = requests.get('http://www.sinarmassekuritas.co.id/id/index.asp')
indo = html.fromstring(page_indo.content)
indo = indo.xpath('//tr/td/text()')
I did not find anything unusual on line 211 when I viewed the source of the page. How can I avoid this error and get the data from the top gainers (%) table?
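One possible way around the error is sketched below. It assumes the page is really encoded as Windows-1252 (cp1252), where byte 0x93 is a left curly quote; the error message suggests this but nothing here confirms it. The helper name decode_page and the sample bytes are hypothetical stand-ins for page_indo.content.

```python
def decode_page(raw: bytes) -> str:
    """Decode page bytes as UTF-8, falling back to Windows-1252 (an assumed encoding)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: the page is really Windows-1252, where 0x93 is a left curly quote.
        # errors="replace" covers the few byte values that cp1252 leaves undefined.
        return raw.decode("cp1252", errors="replace")


# Hypothetical stand-in for page_indo.content; 0x93 and 0x94 are curly quotes in cp1252.
sample = b"<td>\x93Top Gainers\x94</td>"
text = decode_page(sample)
print(text)
```

If this holds for the real page, the decoded string could probably be passed to html.fromstring in place of page_indo.content.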
Updated
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<script type="text/javascript">
<!--
function MM_reloadPage(init) { //reloads the window if Nav4 resized
if (init==true) with (navigator) {if ((appName=="Netscape")&&(parseInt(appVersion)==4)) {
document.MM_pgW=innerWidth; document.MM_pgH=innerHeight; onresize=MM_reloadPage; }}
else if (innerWidth!=document.MM_pgW || innerHeight!=document.MM_pgH) location.reload();
}
MM_reloadPage(true);
I am not sure what the 211 refers to. Triplee said it is the 211th character from the beginning of the offending line.
If it is counted from <!DOCTYPE html, then the character is the i in (... reloads the window ...); if it is counted from <script type="text/javascript">, then it is the _ in document.MM_.
I do not see how either of these two characters would cause the error.
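For reference, the position in a Python UnicodeDecodeError is a byte offset into the byte string that was being decoded, not a character count within one source line, so counting characters in the browser's view of the page may not land on the right spot. The sketch below uses made-up bytes (not the real page) to show how the exception's start and object attributes can be used to look at the bytes around the failure.

```python
# Hypothetical stand-in for the downloaded bytes: 211 ASCII bytes, then 0x93.
raw = b"x" * 211 + b"\x93quoted\x94 text"

try:
    raw.decode("utf-8")
except UnicodeDecodeError as err:
    # err.start is a byte offset into err.object (the whole input), not a line number.
    offset = err.start
    bad_byte = err.object[err.start:err.start + 1]
    print("offset:", offset)
    print("bad byte:", bad_byte)
    # Show a window of raw bytes around the failure to see what surrounds it.
    print("context:", err.object[max(0, err.start - 20):err.start + 20])
```

Applied to the real response, the same slice on page_indo.content around offset 211 should show which bytes are involved, assuming the whole response body is what was being decoded.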