
I have two DataTables: TableNumber, which contains mobile numbers, and TableCode, which contains all possible mobile codes, each 6 digits long. I want to build a list of only those numbers whose first 6 digits appear in TableCode; any number whose first 6 digits are not in TableCode should be discarded. I have tried this with foreach, .Contains() and IndexOf(), but all of them are slow: there are more than 100,000 numbers, and looping through every item to compare it against the other table takes too long. I use two nested foreach loops, which I suspect is the wrong approach, because with 30,000 entries in TableCode that comes to about 3 billion comparisons, and it takes 5 minutes to produce a result. My code looks like this:

foreach (string codetable in TableCode)
{
    foreach (string grouptable in TableNumber)
    {
        if (grouptable.IndexOf(codetable) != -1)
        {
            // work here
        }
    }
}

Here I have copied the number rows of the tables into lists that contain only the numbers, so the loops above search lists. Comparing the DataTables directly is similarly slow.

Tim Schmelter
Hossein Amini
  • _"have tried this with foreach, .Contains(), IndexOf() but all are slow... i use 2 nested foreach loop. i'm doing something stupid i think with 2 foreach because that will be 3 billion searches ...."_ Simply show what you've tried. It's quite difficult to understand the problem. – Tim Schmelter Mar 05 '13 at 08:49
  • I'm struggling to see what you're trying to achieve; perhaps re-explain the question? – Derek Mar 05 '13 at 08:51
  • _" i want to create a list to have only numbers which its 6 first digits are from TableCode"_ So why don't you simply take all numbers from the `DataTable` TableCode only? – Tim Schmelter Mar 05 '13 at 08:51
  • See, datatable1 has 70,000 records of U.S. mobile number codes, each 6 digits long. I have another DataTable with recipients' numbers, but some of them are false numbers, so I want to filter this DataTable: a record is a valid number if its code is in the 6-digit table. The code is like this: `foreach (string codetable in TableCode) { foreach (string grouptable in TableNumber) if (grouptable.IndexOf(codetable) != -1) { //work here } }` – Hossein Amini Mar 05 '13 at 09:07
  • You could start by filtering out all numbers in the "recipients numbers" datatable that are != 6 digits to prevent processing them if this validation was not done. Might speed up a little – jordanhill123 Mar 05 '13 at 09:16
  • The first table contains the valid numbers. Should we take only the first 6 digits of the number in both tables? – Tim Schmelter Mar 05 '13 at 09:17
  • TableCode contains all 6-digit combinations that are valid mobile codes. For example, it contains 858999, which is a valid first 6 digits of a mobile number. I have another DataTable, GroupDataTable, which contains all the numbers, each 10 digits long: normally the 6-digit code at the beginning followed by the remaining 4 digits. I want to make sure that the code of each 10-digit number is in TableCode, which confirms the number is valid. For example, if a number starts with 888889 instead of 858999, it is a wrong number. – Hossein Amini Mar 05 '13 at 09:24
  • In fact I want to take all the 10-digit numbers whose first 6 digits are in TableCode, but this search takes too long: as I said, nearly 5 minutes for 30,000 numbers to be compared using foreach. – Hossein Amini Mar 05 '13 at 09:28
  • Now you see why it's important to show some sample data and a desired result. We cannot predict. In this case you just have to create two DataTables and add 10 rows of sample data. Then we can copy and paste the code, see the types and the rules, and everybody can test the answers. – Tim Schmelter Mar 05 '13 at 09:31
  • The code for this part is big. I just want to know what the fastest way of searching with nested loops is, or whether it is a good idea at all to search two lists for a match with foreach loops. Forget about my scenario: is this a fast way of searching for an item from one list inside another list? For example, stringA from ListA is searched for in ListB to find a match. I am sorry to bother all of you; thanks, everyone, for your replies. – Hossein Amini Mar 05 '13 at 09:47

2 Answers


Perhaps convert the DataTables to IEnumerable, as described in "Convert DataTable to IEnumerable<T>".

Then you could use yield return and handle the processing on separate threads, or use LINQ for the filtering.

You could also sort the tables and break them into smaller chunks, then spawn more threads to process the chunks in parallel.
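As a rough sketch of the sorting and parallel-processing ideas, assuming both columns have already been copied into string arrays: the arrays `codes` and `numbers` and their sample values are hypothetical, and the snippet uses C# 9 top-level statements for brevity. Sorting the codes once lets each lookup be a binary search, and PLINQ's `AsParallel` spreads the filter across cores.

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for the two tables.
string[] codes = { "858999", "212555", "415300" };
string[] numbers = { "8589991234", "8888891234", "2125550000", "41530" };

// Sort once so that every lookup is a binary search instead of a full scan.
Array.Sort(codes, StringComparer.Ordinal);

// PLINQ runs the filter on several threads; AsOrdered keeps the input order.
// The sorted array is only read here, never modified, so sharing it is safe.
string[] valid = numbers
    .AsParallel()
    .AsOrdered()
    .Where(n => n.Length >= 6 &&
                Array.BinarySearch(codes, n.Substring(0, 6), StringComparer.Ordinal) >= 0)
    .ToArray();

Console.WriteLine(string.Join(", ", valid)); // prints "8589991234, 2125550000"
```

Whether the parallel version actually beats a plain sequential filter at around 100,000 rows would need measuring; the cheaper lookup is likely to matter more than the extra threads.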

jordanhill123

So TableNumber is the "positive" table that you want to use to filter the TableCode DataTable.

So the model is similar to this:

var TableNumber = new DataTable();
var TableCode = new DataTable();
TableNumber.Columns.Add("MobileNumbers", typeof(string));
TableCode.Columns.Add("MobileCode", typeof(string));

Then you can use a HashSet<string> with all valid numbers and Enumerable.Join to link the rows in the second table with the valid numbers:

var numbersFirst6digits = TableNumber.AsEnumerable()
    .Select(r => new string(r.Field<string>("MobileNumbers").Where(Char.IsDigit).Take(6).ToArray()));
var dictionary = new HashSet<string>(numbersFirst6digits);

var validCodeRows = from row in TableCode.AsEnumerable()
                    join num in dictionary
                    on row.Field<string>("MobileCode") equals num
                    select row;
// if you need a new DataTable (CopyToDataTable throws an InvalidOperationException when there are no rows):
DataTable tblValidCodes = validCodeRows.CopyToDataTable();

If the values in the first table contain only digits, so that non-digit characters do not have to be stripped before taking the first 6 characters, you can replace this line:

.Select(r => new string(r.Field<string>("MobileNumbers").Where(Char.IsDigit).Take(6).ToArray()));

with

.Select(r => { var mNum = r.Field<string>("MobileNumbers"); return mNum.Length < 6 ? mNum : mNum.Substring(0, 6); });
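For the direction described in the question's comments (TableCode holds the valid 6-digit codes, TableNumber holds the numbers to check), the same HashSet idea can be sketched the other way around. This is an illustration under that assumption, not part of the code above: the sample rows are made up, the column names follow the model above, and it uses C# 9 top-level statements. Building the set costs one pass over the codes, and each number then needs a single average O(1) lookup, so the nested loop's roughly 3 billion comparisons shrink to about one lookup per number.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Same model as above, filled with hypothetical sample rows.
var TableNumber = new DataTable();
var TableCode = new DataTable();
TableNumber.Columns.Add("MobileNumbers", typeof(string));
TableCode.Columns.Add("MobileCode", typeof(string));
TableCode.Rows.Add("858999");
TableCode.Rows.Add("212555");
TableNumber.Rows.Add("8589991234");
TableNumber.Rows.Add("8888891234");
TableNumber.Rows.Add("2125550000");

// Build the set of valid codes once.
var validCodes = new HashSet<string>(
    TableCode.AsEnumerable().Select(r => r.Field<string>("MobileCode")));

// Single pass over the numbers: keep those whose first 6 characters are a known code.
List<string> validNumbers = TableNumber.AsEnumerable()
    .Select(r => r.Field<string>("MobileNumbers"))
    .Where(n => n != null && n.Length >= 6 && validCodes.Contains(n.Substring(0, 6)))
    .ToList();

Console.WriteLine(string.Join(", ", validNumbers)); // prints "8589991234, 2125550000"
```

Unlike `IndexOf`, which matches the code anywhere in the number, `Substring(0, 6)` compares only the leading 6 characters, which is what the question asks for.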
Tim Schmelter
  • Can you please look at the edited question? I have done these, but they all take a long time and I don't know why. Maybe there are too many records; should I use several threads to speed this up? – Hossein Amini Mar 06 '13 at 11:32
  • @HosseinAmini: Rolled back to the last version of your question since you have [asked another question](http://stackoverflow.com/questions/15246490/fast-way-of-extraction-of-a-list-based-on-another-list) for your new requirement. – Tim Schmelter Mar 06 '13 at 13:19