
I am using VS2010 Express and am new to programming.

I am extracting betting odds from different sites and recording them.

However, the sites use different names for the same team, and the name is the only way to match a team across sites.

For example, Man United, Man Utd, Manchester United and Manu are all the same team, but each name is used on a different site.

I believe this is not a rare problem and there should be some standard ways or object types to solve it.

If there are, please tell me.

At this stage, I have decided to use a list as the database:

List<teamdata> teamTable = new List<teamdata>();

public class teamdata
{
private long teamId;
private List<string> teamName; // Names like Man United, Man Utd... are added
...
}

For every name, I need to search through the table (with some fast search algorithm) until a team ID can be assigned.

I know this is the worst implementation. Please tell me the correct direction.

Isolet Chan

2 Answers


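You could simplify your design by replacing the List and the teamdata class with a Dictionary<long, HashSet<string>>, where the key is the team ID and the value is the set of alternate names. Finding a team then comes down to looking for the set that contains the queried name:

Dictionary<long, HashSet<string>> teams = new Dictionary<long, HashSet<string>>();
// ... fill data
string queryName = "Man Utd";

// Requires "using System.Linq;". KeyValuePair is a struct, so FirstOrDefault()
// on the pairs never returns null; projecting the key to long? makes
// "not found" come back as null instead.
long? foundID = teams.Where(p => p.Value.Contains(queryName))
                     .Select(p => (long?)p.Key)
                     .FirstOrDefault();
if (foundID.HasValue)
{
    // foundID.Value is the ID of the matching team
}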
Axarydax
  • This is better than using list! What is the complexity of the search? – Isolet Chan Mar 28 '13 at 15:38
  • This answer http://stackoverflow.com/questions/9812020/what-is-the-lookup-time-complexity-of-hashsettiequalitycomparert says that complexity of HashSet lookup is O(1). As the code will look in all HashSets, it will grow with number of teams (n), so I'd say O(n), which is pretty good. – Axarydax Mar 28 '13 at 19:35
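
The O(n) scan discussed in the comments above could probably be avoided by inverting the mapping, so that each alias points straight at its team ID and a lookup is a single hash probe. The following is only a minimal sketch of that idea, not part of either answer: the TeamAliasIndex class and its method names are hypothetical, and the case-insensitive comparison and trimming are assumptions about how the site names should be normalised.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: maps every known alias directly to its team ID,
// so one hash lookup replaces the scan over all teams.
public class TeamAliasIndex
{
    // Case-insensitive comparer is an assumption; drop it if case matters.
    private readonly Dictionary<string, long> idByName =
        new Dictionary<string, long>(StringComparer.OrdinalIgnoreCase);

    // Registers an alias for a team; surrounding whitespace is trimmed.
    public void AddAlias(long teamId, string name)
    {
        idByName[name.Trim()] = teamId;
    }

    // Returns true and sets teamId when the name is a known alias.
    public bool TryGetTeamId(string name, out long teamId)
    {
        return idByName.TryGetValue(name.Trim(), out teamId);
    }
}

public static class Program
{
    public static void Main()
    {
        var index = new TeamAliasIndex();
        foreach (string alias in new[] { "Man United", "Man Utd", "Manchester United", "Manu" })
        {
            index.AddAlias(1, alias);
        }

        long id;
        if (index.TryGetTeamId("man utd ", out id))
        {
            Console.WriteLine("Team ID: " + id);
        }
        else
        {
            Console.WriteLine("Unknown name; add it as a new alias.");
        }
    }
}
```

A name that is not in the index still has to be assigned to a team by hand (or by some fuzzy matching step) once; after that it can be added as another alias and later lookups stay constant time on average.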

This kind of problem can be resolved with fuzzy logic (http://en.wikipedia.org/wiki/Fuzzy_logic). If you use SQL Server, the function SOUNDEX might help you.

schglurps