check if values are in datatable

Question

I have an array or string:

private static string[] dataNames = new string[] {"value1", "value2".... };

I have table in my SQL database with a column of varchar type. I want to check which values from the array of string exists in that column.

I tried this:

public static void testProducts() {
            string query = "select * from my table"
            var dataTable = from row in dt.AsEnumerable()
                            where String.Equals(row.Field<string>("columnName"), dataNames[0], StringComparison.OrdinalIgnoreCase)
                            select new { 
                                Name = row.Field<string> ("columnName")
                            };
            foreach(var oneName in dataTable){
                Console.WriteLine(oneName.Name);
            }
        }

that code is not the actual code, I am just trying to show you the important part

That code as you see check according to dataNames[index]

It works fine, but I have to run that code 56 times because the array has 56 elements and in each time I change the index

is there a faster way please?

the Comparison is case insensitive

SELECT COUNT(*) FROM MyTable; When execute It. if (countRec>0) then you have records in table — realnumber3012, Aug 11 '14 at 06:38
@realnumber3012 I am not asking about the number of items into the table. i want to check if the values in the array exit in the table. the table has more than 200 rows I can see them — Agnieszka Polec, Aug 11 '14 at 06:41

score 2 · Answer 1 · edited May 23 '17 at 12:05

2

Passing a list of values is surprisingly difficult. Passing a table-valued parameter requires creating a T-SQL data type on the server. You can pass an XML document containing the parameters and decode that using SQL Server's convoluted XML syntax.

Below is a relatively simple alternative that works for up to a thousand values. The goal is to to build an in query:

select col1 from YourTable where col1 in ('val1', 'val2', ...)

In C#, you should probably use parameters:

select col1 from YourTable where col1 in (@par1, @par2, ...)

Which you can pass like:

var com = yourConnection.CreateCommand();
com.CommandText = @"select col1 from YourTable where col1 in (";
for (var i=0; i< dataNames.Length; i++)
{
    var parName = string.Format("par{0}", i+1);
    com.Parameters.AddWithValue(parName, dataNames[i]);
    com.CommandText += parName;
    if (i+1 != dataNames.Length)
        com.CommandText += ", ";
}
com.CommandText += ");";
var existingValues = new List<string>();
using (var reader = com.ExecuteReader())
{
    while (read.Read())
        existingValues.Add(read["col1"]);
}

Given the complexity of this solution I'd go for Max' or Tim's answer. You could consider this answer if the table is very large and you can't copy it into memory.

edited May 23 '17 at 12:05

Community

1
1

answered Aug 11 '14 at 06:42

Andomar

232,371
49
380
404

why did you use `count`? please – Agnieszka Polec Aug 11 '14 at 06:45
I need to know which values from the array exist in the table, not the count of them. got me please? – Agnieszka Polec Aug 11 '14 at 06:46
I am trying your update, but can I put `select distanct col1...` ? and in your query it is not case sensitive isn't it? – Agnieszka Polec Aug 11 '14 at 06:58
The solution is hard. I am going to try the other answers, +1 thanks for the efforts – Agnieszka Polec Aug 11 '14 at 07:02
The default SQL collation is not case sensitive. You can put `distinct` if you expect values to occur more than once. – Andomar Aug 11 '14 at 07:02

score 2 · Answer 2 · answered Aug 11 '14 at 06:46

2

try to use contains should return all value that you need

var data = from row in dt.AsEnumerable()
                   where dataNames.Contains(row.Field<string>("columnName"))
                   select new
                   {
                       Name = row.Field<string>("columnName")
                   };

answered Aug 11 '14 at 06:46

Max

156
9

1

are you sure that the `Contains` is case insensitive? – Agnieszka Polec Aug 11 '14 at 06:47
+1 Welcome to SO. [This question](http://stackoverflow.com/questions/3360772/linq-contains-case-insensitive) seems to explain how you can make `Contains` case insensitive – Andomar Aug 11 '14 at 06:58
you can try make the datanames ToUpper() Or ToLower() then use the contains dataNames.Contains(row.Field("columnName").ToLower()) – Max Aug 11 '14 at 06:58
1

@Max: don't use `ToUpper`/`ToLower`, that will create many temporary strings in memory, so it's inefficient. It can also be incorrect. Read: http://stackoverflow.com/questions/234591/upper-vs-lower-case Instead use `StringComparsion`/`StringComparer`. – Tim Schmelter Aug 11 '14 at 07:12

Tim Schmelter · Accepted Answer · 2014-08-11T07:09:02.660

2

First, you should not filter records in memory but in the datatabase.

But if you already have a DataTable and you need to find rows where one of it's fields is in your string[], you can use Linq-To-DataTable.

For example Enumerable.Contains:

var matchingRows = dt.AsEnumerable()
    .Where(row => dataNames.Contains(row.Field<string>("columnName"), StringComparer.OrdinalIgnoreCase));

foreach(DataRow row in matchingRows)
    Console.WriteLine(row.Field<string>("columnName"));

Here is a more efficient (but less readable) approach using Enumerable.Join:

var matchingRows = dt.AsEnumerable().Join(dataNames,
    row => row.Field<string>("columnName"),
    name => name,
    (row, name) => row,
    StringComparer.OrdinalIgnoreCase);

edited Aug 11 '14 at 07:09

answered Aug 11 '14 at 06:56

Tim Schmelter

450,073
74
686
939

I'm confused: first you say you should filter in the database, then you provide code to filter in memory? Providing the exact same answer as Max too. – Andomar Aug 11 '14 at 06:57
@Andomar: you have already provided the approach that demonstrates how to do it in the database. So why should i repeat it? I've just tried to answer OP's original question. **Edit** I haven't seens his answer. Then i can delete this. (Although OP explicitly asked for a case insensitive comparison) – Tim Schmelter Aug 11 '14 at 06:59
My answer certainly looks overly complex. An array is not an enumerable though. And `IList.Contains` does not appear to take a `StringComparer` parameter... ? – Andomar Aug 11 '14 at 07:01
@Andomar: It is [`Enumerable.Contains`](http://msdn.microsoft.com/en-us/library/bb339118(v=vs.100).ASPX) (i have tested this approach). – Tim Schmelter Aug 11 '14 at 07:02
+1 Ooh, I'm confusing `enum` and `IEnumerable`, thanks – Andomar Aug 11 '14 at 07:03
I tried your code before your update the quetsion and it works good, but you gave me the matching values, could you give me the not matching values? +1 I will accept ur answer – Agnieszka Polec Aug 11 '14 at 07:12

score 0 · Answer 4 · answered Aug 11 '14 at 07:17

Sorry I don't have a lot of relevant code here, but I did a similar thing quite some time ago, so I will try to explain.

Essentially I had a long list of item IDs that I needed to return to the client, which then told the server which ones it wanted loaded at any particular time. The original query passed the values as a comma separated set of strings (they were actually GUIDs). Problem was that once the number of entries hit 100, there was a noticeable lag to the user, once it got to 1000 possible entries, the query took a minute and a half, and when we went to 10,000, lets just say you could boil the kettle and drink your tea/coffee before it came back.

The answer was to stick the values to check directly into a temporary table, where one row of the table represented one value to check against. The temporary table was keyed against the user who performed the search, so this meant other users searches wouldn't become corrupted with each other, and when the user logged out, then we knew which values in the search table could be removed.

Depending on where this data comes from will depend on the best way for you to load the reference table. But once it is there, then your new query will look something like:-

SELECT Count(t.*), rt.dataName
FROM table t
RIGHT JOIN referenceTable rt ON tr.dataName = t.columnName
WHERE rt.userRef = @UserIdValue
GROUP BY tr.dataName

The RIGHT JOIN here should give you a value for each of your reference table values, including 0 if the value did not appear in your table. If you don't care which one don't appear, then changing it to an INNER JOIN will eliminate the zeros.

The WHERE clause is to ensure that your search only returns the unique items that you are looking for at the moment - the design should consider that concurrent access will someday occur here (even if it doesn't at the moment), so writing something in to protect it is advisable.

check if values are in datatable

4 Answers4