I am trying to build a web service that crawls data from a site. The problem is that the data I need is in an ASP.NET GridView with paging. So I need to read the HTML, do a postback to the page so that it returns the next page of the GridView, and then parse the new HTML (the response) to get the data I need.
I have tried many ways to solve this problem, without success. Could you tell me where or what I am doing wrong?
Code:
    [WebMethod]
    public string eNabavki2()
    {
        WebClient client = new WebClient();
        client.Encoding = Encoding.UTF8;
        string htmlCode = client.DownloadString("https://site.com/Default.aspx");
        string vsk = getBetween(htmlCode, "id=\"__VIEWSTATEKEY\" value=\"", "\" />");

        WebRequest request = WebRequest.Create("https://site.com/Default.aspx");
        request.ContentType = "application/x-www-form-urlencoded";
        request.Method = "POST";
        var webRequest = (HttpWebRequest)request;
        webRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0"; //Googlebot/2.1 (+http://www.googlebot.com/bot.html)

        //set form data
        string postData = string.Format("__EVENTTARGET={0}" +
            "&__EVENTARGUMENT={1}" +
            "&__LASTFOCUS={2}" +
            "&__VIEWSTATEKEY={3}" +
            "&__VIEWSTATE={4}" +
            "&__SCROLLPOSITIONX={5}" +
            "&__SCROLLPOSITIONY={6}" +
            "&ctl00$ctl00$cphGlobal$cphPublicAccess$publicCFTenders$dgPublicCallForTender$ctl13$ddlPageSelector={7}",
            System.Web.HttpUtility.UrlEncode("ctl00$ctl00$cphGlobal$cphPublicAccess$publicCFTenders$dgPublicCallForTender$ctl13$ddlPageSelector"),
            /*1*/string.Empty,
            /*2*/string.Empty,
            /*3*/string.Empty,//vsk
            /*4*/string.Empty,
            /*5*/"0",
            /*6*/"383",
            /*7*/"2");
        byte[] byteArray = Encoding.UTF8.GetBytes(postData);

        //send the form data to the request stream
        request.ContentLength = byteArray.Length;
        Stream dataStream = request.GetRequestStream();
        dataStream.Write(byteArray, 0, byteArray.Length);
        dataStream.Close();

        var response = request.GetResponse();
        // Get the stream containing content returned by the server.
        dataStream = response.GetResponseStream();
        StreamReader reader = new StreamReader(dataStream);
        string responseFromServer = reader.ReadToEnd();

        // Clean up the streams.
        reader.Close();
        dataStream.Close();
        response.Close();
        return responseFromServer;
    }
A few notes: in the postData string I included everything I could find that the page sends. I used Fiddler for this, and it showed me all of these arguments (26 in total). The one I really need is the page selector (to change its value).
I also noticed that there is a __VIEWSTATEKEY field in the HTML, which has a different value every time. You can see that I first tried to read that value from the HTML (the vsk string), but that did not change anything.
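Since the hidden fields change on every request, one thing that might be worth trying is to copy all of them from the first response instead of sending empty strings. The following is only a minimal sketch of that idea, not a confirmed fix. The names PostbackSketch, HiddenFields and ToFormBody are hypothetical, and the regular expression assumes the attribute order (type, name, id, value) that ASP.NET usually emits for hidden inputs.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original code.
public static class PostbackSketch
{
    // Collects the name/value pairs of <input type="hidden"> elements.
    // Assumption: attributes appear as type, name, (id), value.
    public static Dictionary<string, string> HiddenFields(string html)
    {
        var fields = new Dictionary<string, string>();
        var pattern = new Regex(
            "<input\\s+type=\"hidden\"\\s+name=\"([^\"]*)\"[^>]*?\\svalue=\"([^\"]*)\"",
            RegexOptions.IgnoreCase);

        foreach (Match m in pattern.Matches(html))
        {
            // Attribute values are HTML-encoded in the page source.
            string name = WebUtility.HtmlDecode(m.Groups[1].Value);
            string value = WebUtility.HtmlDecode(m.Groups[2].Value);
            fields[name] = value;
        }
        return fields;
    }

    // Builds an application/x-www-form-urlencoded body,
    // URL-encoding both the field names and the values.
    public static string ToFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            WebUtility.UrlEncode(f.Key) + "=" + WebUtility.UrlEncode(f.Value)));
    }
}
```

Under these assumptions, the body would come from HiddenFields(htmlCode), with __EVENTTARGET and the ddlPageSelector entry set on the dictionary before calling ToFormBody. Whether the server additionally expects the cookies from the first GET (for example through HttpWebRequest.CookieContainer) is something this sketch does not cover.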
I am sorry, but I am not familiar with POST requests. I need this for a university project, so I would appreciate any help solving this.
Edit:
Here is a screenshot of the headers that Fiddler shows me: