I've been reading a lot about threading but can't work out a solution to my problem. I have a set of files that need to be processed. The hostname and file path for each file are stored in two arrays, matched by index.
Now I want to set up several threads to process the files. The number of threads to create is based on three factors:
A) The thread count can never exceed the number of unique hostnames.
B) Files with the same hostname MUST be processed sequentially, i.e. we cannot process host1_file1 and host1_file2 at the same time. (Data integrity would be put at risk, and this is beyond my control.)
C) The user may throttle the number of threads available for processing. The number of threads is still limited by condition A above. The reason is simply that with a large number of hosts, say 50, we might not want 50 threads processing at the same time.
In the example below there are six unique hostnames, so a maximum of 6 threads can be created.
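To make conditions A and C concrete, this is roughly how I'd expect the thread count to be worked out. It's only a sketch: `ThreadLimit`, `Compute` and `userThrottle` are names I made up for illustration, and I'm assuming the throttle arrives as a plain integer.

```csharp
using System;
using System.Linq;

static class ThreadLimit
{
    // Hypothetical helper: the worker count is capped by the number of
    // unique hostnames (condition A) and by the user's throttle (condition C).
    // Never returns less than 1.
    public static int Compute(string[] hostnames, int userThrottle)
    {
        int uniqueHosts = hostnames.Distinct().Count();
        return Math.Max(1, Math.Min(uniqueHosts, userThrottle));
    }

    static void Main()
    {
        string[] hostname = { "host1", "host1", "host1", "host2", "host2",
                              "host3", "host4", "host4", "host5", "host6" };

        Console.WriteLine(Compute(hostname, 4));  // prints 4 (throttle is the tighter limit)
        Console.WriteLine(Compute(hostname, 50)); // prints 6 (unique hosts is the tighter limit)
    }
}
```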
The optimal processing routine is shown below.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class file_prep_obj
{
public string[] file_paths;
public string[] hostname;
public Dictionary<string, int> my_dictionary;
public void get_files()
{
hostname = new string[]{ "host1", "host1", "host1", "host2", "host2", "host3", "host4","host4","host5","host6" };
file_paths=new string[]{"C:\\host1_file1","C:\\host1_file2","C:\\host1_file3","C:\\host2_file1","C:\\host2_file2",
"C:\\host3_file1","C:\\host4_file1","C:\\host4_file2","C:\\host5_file1","C:\\host6_file1"};
//The dictionary provides a count on the number of files that need to be processed for a particular host.
my_dictionary = hostname.GroupBy(x => x)
.ToDictionary(g => g.Key,
g => g.Count());
}
}
//This class contains a list of file_paths associated with the same host.
//The group_file_host_name will be the same for a host.
class host_file_thread
{
public string[] group_file_paths;
public string[] group_file_host_name;
public void process_file(string file_path_in)
{
var time_delay_random=new Random();
Console.WriteLine("Started processing File: " + file_path_in);
Task.Delay(time_delay_random.Next(3000)+1000).Wait(); //Simulate work by blocking for 1 to 4 seconds
Console.WriteLine("Completed processing File: " + file_path_in);
}
}
class Program
{
static void Main(string[] args)
{
file_prep_obj my_files=new file_prep_obj();
my_files.get_files();
//Create our host objects... my_files.my_dictionary.Count represents the max number of threads
host_file_thread[] host_thread=new host_file_thread[my_files.my_dictionary.Count];
int key_pair_count=0;
int file_path_position=0;
foreach (KeyValuePair<string, int> pair in my_files.my_dictionary)
{
host_thread[key_pair_count] = new host_file_thread(); //Each element of an array of custom objects must be instantiated before use
host_thread[key_pair_count].group_file_paths=new string[pair.Value]; //Initialise the group_file_paths
host_thread[key_pair_count].group_file_host_name=new string[pair.Value]; //Initialise the group_file_host_name
for(int j=0;j<pair.Value;j++)
{
host_thread[key_pair_count].group_file_host_name[j]=pair.Key.ToString(); //Group the hosts
host_thread[key_pair_count].group_file_paths[j]=my_files.file_paths[file_path_position]; //Group the file_paths
file_path_position++;
}
key_pair_count++;
}//Close foreach (KeyValuePair<string, int> pair in my_files.my_dictionary)
//TODO PROCESS FILES USING host_thread objects.
}//Close static void Main(string[] args)
}//Close Class Program
I guess what I'm after is a guide on how to code the threaded processing routines that are in accordance with the specs above.
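For reference, here is a minimal sketch of the shape I have in mind, in case it helps to critique it. It assumes `Parallel.ForEach` with `ParallelOptions.MaxDegreeOfParallelism` is an acceptable way to cap the workers; that setting is only an upper bound, so fewer threads may actually run. `HostQueueSketch`, `Run`, `userThrottle` and `processFile` are hypothetical names, and the sleep in `Main` stands in for the real processing. Each host's files form one work item, so they run sequentially (condition B), while different hosts may run side by side.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class HostQueueSketch
{
    // Hypothetical entry point. Returns the files in the order they completed.
    public static List<string> Run(string[] hostnames, string[] filePaths,
                                   int userThrottle, Action<string> processFile)
    {
        if (hostnames.Length != filePaths.Length)
            throw new ArgumentException("hostnames and filePaths must have the same length");

        // Build one ordered list of file paths per host.
        // GroupBy keeps elements within a group in their original order.
        List<List<string>> groups = hostnames
            .Zip(filePaths, (host, path) => new { host, path })
            .GroupBy(x => x.host, x => x.path)
            .Select(g => g.ToList())
            .ToList();

        // Conditions A and C: no more workers than hosts, and no more than the throttle.
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = Math.Max(1, Math.Min(groups.Count, userThrottle))
        };

        var completed = new ConcurrentQueue<string>();

        // Each work item is a whole host, so a host is only ever handled by one worker.
        Parallel.ForEach(groups, options, files =>
        {
            // Condition B: files of the same host are processed one after another.
            foreach (string file in files)
            {
                processFile(file);
                completed.Enqueue(file);
            }
        });

        return completed.ToList();
    }

    static void Main()
    {
        string[] hostname = { "host1", "host1", "host1", "host2", "host2",
                              "host3", "host4", "host4", "host5", "host6" };
        string[] file_paths = { "C:\\host1_file1", "C:\\host1_file2", "C:\\host1_file3",
                                "C:\\host2_file1", "C:\\host2_file2", "C:\\host3_file1",
                                "C:\\host4_file1", "C:\\host4_file2", "C:\\host5_file1",
                                "C:\\host6_file1" };

        // Throttle to 3 workers; the sleep is a placeholder for real work.
        Run(hostname, file_paths, 3, file =>
        {
            Console.WriteLine("Started processing File: " + file);
            Thread.Sleep(200);
            Console.WriteLine("Completed processing File: " + file);
        });
    }
}
```

The order in which hosts are picked up is up to the scheduler, so the console output will differ between runs. The only ordering this sketch relies on is the order of files within each host.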