I want to access a web server using httpwebrequest
and fetch thousands of records from a given range of pages. Each hit to a webpage fetches 15 records, and there are almost 8 to 10000 pages on the webserver. That means a total of 120000 hits to the server! If done trivially with a single process, the task can be very time consuming. Hence, multiple threading is the immediate solution that comes to mind.
Currently, I have created a worker class for searching purpose, that worker class will spawn 5 subworkers (threads) to search in a given range. But, due to my novice abilities in threading, I am unable to make it work, as I am having trouble synchronizing and making them all work together. I know about delegates, actions, events in .NET but making them to work with threads is getting confusing..This is the code that I am using:
public void Start()
{
this.totalRangePerThread = ((this.endRange - this.startRange) / this.subWorkerThreads.Length);
for (int i = 0; i < this.subWorkerThreads.Length; ++i)
{
//theThreads[counter] = new Thread(new ThreadStart(MethodName));
this.subWorkerThreads[i] = new Thread(() => searchItem(this.startRange, this.totalRangePerThread));
//this.subWorkerThreads[i].Start();
this.startRange = this.startRange + this.totalRangePerThread;
}
for (int threadIndex = 0; threadIndex < this.subWorkerThreads.Length; ++threadIndex)
this.subWorkerThreads[threadIndex].Start();
}
The searchItem method:
public void searchItem(int start, int pagesToSearchPerThread)
{
for (int count = 0; count < pagesToSearchPerThread; ++count)
{
//searching routine here
}
}
The problem exists between the shared variables of the threads, can anyone guide me how to make it a threadsafe procedure?