As far as I understand your problem, you have to store an arbitrary list of names per author and match them efficiently.
I assume you have solved the problem of parsing the names, removing non-essential / optional parts like 'Dr', and preserving particles like 'von' and 'de'. Your normalized name must be a sequence of strings in fixed case (lower case is OK, though I'd stick with upper case or title case).
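In case it helps, here is a minimal sketch of the kind of normalization I have in mind. The class name `NameNormalizer`, the title list, and the whitespace-only splitting are my own assumptions for illustration; real name parsing is considerably messier. It needs Java 9+ for `Set.of`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class NameNormalizer {
    // Hypothetical list of optional titles to drop; a real list would be longer.
    private static final Set<String> TITLES = Set.of("DR", "PROF", "MR", "MRS", "MS");

    /** Splits a raw name on whitespace, drops titles, and upper-cases the rest. */
    static List<String> normalize(String rawName) {
        List<String> parts = new ArrayList<>();
        for (String token : rawName.trim().split("\\s+")) {
            String upper = token.toUpperCase(Locale.ROOT);
            // Compare without a trailing period so "Dr." and "Dr" both match.
            String bare = upper.endsWith(".") ? upper.substring(0, upper.length() - 1) : upper;
            if (bare.isEmpty() || TITLES.contains(bare)) {
                continue;
            }
            parts.add(upper); // particles such as "VON" and "DE" are kept
        }
        return parts;
    }
}
```

With this sketch, `NameNormalizer.normalize("Dr. John von Neumann")` gives the parts `JOHN`, `VON`, `NEUMANN`.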
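Now, a List<String> or String[] might seem to work as a key to a HashMap containing the other details. I'm afraid neither works well. Both are mutable, so a key can change after it has been inserted. Worse, a String[] inherits the identity-based hashCode() and equals() from Object, so two arrays with the same contents count as different keys. (A List<String> does hash and compare by content, but it is still mutable.)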
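To make the problem with these two key types concrete, here is a small sketch. The class and method names are made up for the demonstration, and it needs Java 9+ for `List.of`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class KeyPitfalls {
    /** Looks up an array key using a second array with the same contents. */
    static boolean arrayKeyFound() {
        Map<String[], String> byArray = new HashMap<>();
        byArray.put(new String[] {"John", "Doe"}, "details");
        // Arrays use Object's identity-based hashCode()/equals(), so this misses.
        return byArray.containsKey(new String[] {"John", "Doe"});
    }

    /** Looks up a list key after the key object itself has been mutated. */
    static boolean mutatedListKeyFound() {
        Map<List<String>, String> byList = new HashMap<>();
        List<String> key = new ArrayList<>(List.of("John", "Doe"));
        byList.put(key, "details");
        key.add("Jr"); // the hash changes, but the entry was stored under the old one
        return byList.containsKey(key);
    }
}
```

Both methods return false: the array lookup never matches, and the list entry is lost as soon as the key is modified.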
So I'd come up with something like this:
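    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.Map;

    final class AuthorName {
        private final String[] parts;

        public AuthorName(String... nameParts) {
            assert nameParts.length > 0;
            // Defensive copy, so the caller cannot change the key later.
            parts = nameParts.clone();
        }

        @Override
        public int hashCode() {
            // hashCode() that only depends on the name parts (and their order)
            return Arrays.hashCode(parts);
        }

        @Override
        public boolean equals(Object other) {
            // HashMap needs equals() to agree with hashCode()
            return other instanceof AuthorName
                    && Arrays.equals(parts, ((AuthorName) other).parts);
        }
    }

    Map<AuthorName, ...> authors = new HashMap<AuthorName, ...>();
    authors.put(new AuthorName("John", "Doe"), ...);
    assert authors.get(new AuthorName("John", "Doe")) != null;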
This does not address many possible problems, such as 'Joe Random User', 'Joe R User', and 'J. R. User' being the same person. That should be handled at a different level.
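As a rough idea of what that other level could look like, here is a sketch of a looser comparison that treats an initial as compatible with a full name part. The class `LooseNameMatch` and its rules are purely my assumption, not an established scheme, and it only flags candidates for a closer look.

```java
final class LooseNameMatch {
    /** True if the two parts are equal, or one is an initial of the other. */
    static boolean partsCompatible(String a, String b) {
        String x = stripDot(a);
        String y = stripDot(b);
        if (x.isEmpty() || y.isEmpty()) {
            return false;
        }
        if (x.equalsIgnoreCase(y)) {
            return true;
        }
        boolean oneIsInitial = x.length() == 1 || y.length() == 1;
        return oneIsInitial
                && Character.toUpperCase(x.charAt(0)) == Character.toUpperCase(y.charAt(0));
    }

    /** Candidate match: same number of parts, and each pair is compatible. */
    static boolean couldBeSamePerson(String[] first, String[] second) {
        if (first.length != second.length) {
            return false;
        }
        for (int i = 0; i < first.length; i++) {
            if (!partsCompatible(first[i], second[i])) {
                return false;
            }
        }
        return true;
    }

    private static String stripDot(String part) {
        return part.endsWith(".") ? part.substring(0, part.length() - 1) : part;
    }
}
```

Note that such a relation is not transitive ('J' is compatible with both 'Joe' and 'Jane', which are not compatible with each other), so it cannot serve as equals() for a HashMap key. That is why I would keep it separate from the exact-match key.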
If you stated your case in more detail, with an example or two, answers could be better.
You might also be interested in the way libraries normalize author names. People use elaborate schemes to match names.