
I have a question about which data structure is best for a particular situation.

We have one string, "AAAAAAAAAAA", and we want to know whether this string is contained in a database column or not.

For example, the database table below has two columns.

ID       Name
1        A
2        B
3        C
.....
49581    AAAAAAAAAAA

If it matches, return true; if not, return false.

I know I can use a List<string>, but I don't think it's the best way to search.

I want to know which data structure is the best way to search in this case.

John Saunders
clear.choi
  • Are you searching in memory with c#, or in sql? – gunr2171 Aug 08 '14 at 20:22
  • You are performing premature optimization. You are making your program run faster before you have it running correctly. "If it doesn’t work, it doesn’t matter how fast it doesn’t work." - Mich Ravera – John Saunders Aug 08 '14 at 20:23
  • @gunr2171, I first get the lists from SQL and then want to work with those lists in C#. – clear.choi Aug 08 '14 at 20:26
  • Do you need to know the ID/IDs of the given string, or only whether it exists (true/false)? – Sphinxxx Aug 08 '14 at 20:28
  • @clear.choi, it would be better to do the searching in sql. Why pull every row in the table over just to search for a single row? (also, use @ replies to respond to someone) – gunr2171 Aug 08 '14 at 20:29
  • @Sphinxxx, I only need to get true or false at this moment. – clear.choi Aug 08 '14 at 20:29
  • Similar question, almost a duplicate, same answer: "[Should I be concerned about .NET dictionary speed?](http://stackoverflow.com/a/1903245/76337)". – John Saunders Aug 08 '14 at 20:29
  • @gunr2171, because there is a lot of data to compare, like 2000 strings, and that would mean connecting to the database 2000 times (I/O), so I am thinking of using a data structure. It could also be a LIKE search, and I believe that is more of a problem. – clear.choi Aug 08 '14 at 20:31
  • The time cost to get 2000 rows into your c# application will likely far outweigh the cost of anything you do with those 2000 rows in c#. You could do a linear search of 2000 rows and it would still be a tiny amount of time compared to your data acquisition cost. – hatchet - done with SOverflow Aug 08 '14 at 20:35

2 Answers


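A HashSet<string> would be faster to search than a List<string> if you only need to know whether the string exists: Contains on a hash set is an O(1) lookup on average, while Contains on a list scans the elements one by one.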

HashSet<T> Class
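
A minimal sketch of that idea, assuming the values of the Name column have already been read into memory. The `NameLookup` class is a made-up helper for illustration, and the choice of `StringComparer.Ordinal` (exact, case-sensitive matching) is an assumption about what "match" should mean here:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of any library.
public static class NameLookup
{
    // Build the set once from the column values. Duplicates are collapsed,
    // and every later Contains call is a hash lookup instead of a list scan.
    public static HashSet<string> Build(IEnumerable<string> names)
    {
        // Ordinal comparison is an assumption; use StringComparer.OrdinalIgnoreCase
        // if the match should ignore case.
        return new HashSet<string>(names, StringComparer.Ordinal);
    }
}
```

With the set built once, each of the roughly 2000 checks mentioned in the comments becomes `names.Contains("AAAAAAAAAAA")`, which returns true or false without another database round trip.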

..or if you feel adventurous, creating a "ternary search tree" or a "trie" may be an option:

http://www.drdobbs.com/database/ternary-search-trees/184410528
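
For comparison, here is a rough sketch of a plain trie (not the ternary search tree from the article above). The `Trie` class and its method names are illustrative only. Its main attraction over a hash set is that it can also answer prefix queries, which may matter if the search turns into something like the LIKE search mentioned in the comments:

```csharp
using System.Collections.Generic;

// Illustrative trie: each node maps a character to a child node.
public sealed class Trie
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public bool IsWord; // true if a stored string ends at this node
    }

    private readonly Node _root = new Node();

    public void Add(string word)
    {
        var node = _root;
        foreach (char c in word)
        {
            if (!node.Children.TryGetValue(c, out Node next))
            {
                next = new Node();
                node.Children[c] = next;
            }
            node = next;
        }
        node.IsWord = true;
    }

    // Exact match: the whole string must have been added.
    public bool Contains(string word)
    {
        Node node = Find(word);
        return node != null && node.IsWord;
    }

    // Prefix match: true if any stored string starts with the given prefix.
    public bool StartsWith(string prefix)
    {
        return Find(prefix) != null;
    }

    // Walks the trie along s; returns null if the path does not exist.
    private Node Find(string s)
    {
        var node = _root;
        foreach (char c in s)
        {
            if (!node.Children.TryGetValue(c, out node))
                return null;
        }
        return node;
    }
}
```

For a pure exists/does-not-exist check on a few thousand strings, the hash set is simpler and likely fast enough; a structure like this only earns its keep when prefix lookups are needed.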

Sphinxxx

Similar to the other answer, but note that with a hash table you can also store, for each distinct string in the column, the row number(s) that contain that string as the value kept at that string's position in the table. So hashing is not limited to determining whether the string exists in your column.
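
One way this could look in C#, as a sketch rather than a definitive implementation: a `Dictionary<string, List<int>>` keyed by name, with the matching IDs as the value. The `NameIndex` class and the use of `KeyValuePair<int, string>` to represent an (ID, Name) row are assumptions made for the example:

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of any library.
public static class NameIndex
{
    // Maps each distinct name to the IDs of all rows that contain it.
    // Each row is passed as (Key = ID, Value = Name).
    public static Dictionary<string, List<int>> Build(
        IEnumerable<KeyValuePair<int, string>> rows)
    {
        var index = new Dictionary<string, List<int>>();
        foreach (var row in rows)
        {
            if (!index.TryGetValue(row.Value, out List<int> ids))
            {
                ids = new List<int>();
                index[row.Value] = ids;
            }
            ids.Add(row.Key);
        }
        return index;
    }
}
```

`index.ContainsKey("AAAAAAAAAAA")` then answers the true/false question, and `index.TryGetValue("AAAAAAAAAAA", out var ids)` additionally yields the matching IDs when they are needed.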

user2566092