I have about 500 million 128-bit integers, adding about 100M per year. Nothing is ever deleted. The numbers arrive uniformly distributed, both across the value range and over time.
Basically, all I need is an add operation that also returns whether the number already exists in the DB. Also, I do not want this system to use too much RAM, so just storing everything in memory is not what I'm looking for.
So far we've been using several MyISAM tables in MySQL, with two BIGINTs as the primary key. This gives us OK performance, but I suspect it's not the right tool for the job. We had performance problems before splitting the data across tables, and we've had corruption after power outages. Also, a DB gives us many features we don't need.
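For context, the add-and-check currently looks roughly like the sketch below: the 128-bit value is split into two 64-bit halves, and `INSERT IGNORE` plus the affected-row count tells us whether the number was already there. The table/column names (`ids`, `hi`, `lo`) and connection details are placeholders, not our real schema.

```python
import mysql.connector

conn = mysql.connector.connect(user="...", password="...", database="...")

def add(n: int) -> bool:
    """Insert a 128-bit integer; return True if it was already present."""
    hi, lo = n >> 64, n & 0xFFFFFFFFFFFFFFFF  # split into two unsigned 64-bit halves
    cur = conn.cursor()
    # INSERT IGNORE silently skips the row on a duplicate key, so a
    # rowcount of 0 means the value already existed.
    cur.execute("INSERT IGNORE INTO ids (hi, lo) VALUES (%s, %s)", (hi, lo))
    conn.commit()  # no-op under MyISAM, but keeps the sketch engine-agnostic
    existed = cur.rowcount == 0
    cur.close()
    return existed
```

(The columns are `BIGINT UNSIGNED` so the halves fit without sign juggling.)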
I'm using Python on Linux, but I'm open to suggestions.
UPDATE: Marcelo's comment mentioned Bloom filters, which seem really promising to me. Since I'm working with hashes, I've already given up on complete accuracy, so this might be a great accuracy/performance trade-off.
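To get a feel for the memory cost, here's a minimal hand-rolled sketch (class name, sizing, and hashing scheme are my own illustration, not from any particular library). It sizes the bit array from a target capacity and false-positive rate, and `add` returns whether the value was *probably* seen before:

```python
import hashlib
import math

class BloomFilter:
    def __init__(self, n: int, p: float):
        # Standard sizing: m = -n*ln(p)/(ln 2)^2 bits, k = (m/n)*ln 2 hashes.
        self.m = int(-n * math.log(p) / (math.log(2) ** 2))
        self.k = max(1, round((self.m / n) * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, n: int):
        # Derive k bit positions from a SHA-256 of the 128-bit value,
        # using Kirsch-Mitzenmacher double hashing: h1 + i*h2.
        digest = hashlib.sha256(n.to_bytes(16, "big")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return ((h1 + i * h2) % self.m for i in range(self.k))

    def add(self, n: int) -> bool:
        """Set the bits for n; return True if n was (probably) already present."""
        existed = True
        for pos in self._positions(n):
            byte, bit = divmod(pos, 8)
            if not (self.bits[byte] >> bit) & 1:
                existed = False
                self.bits[byte] |= 1 << bit
        return existed

# Small demo capacity; sized for ~1 billion values at p = 0.01 the
# bit array would be about 1.2 GB, which fits the RAM budget here.
bf = BloomFilter(n=1_000_000, p=0.01)
print(bf.add(2**127 + 17))  # False: first time seen
print(bf.add(2**127 + 17))  # True: probably seen before
```

The catch, of course, is that `add` can return a false positive (at rate p) but never a false negative, which matches what I said above about having given up on complete accuracy.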