
I've been writing a "sort and aggregate" approach to make this ~100MB data set efficiently searchable, but the code is getting a bit long.

The objects are simple classes like

class Item {
    public int type = 1;
    public int damage = 4;
}

And what I've done is essentially make a class where I can say items.type(1).damage(4).getItem();
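For illustration, a fluent query of this shape could be sketched as follows. This is only a guess at the idea, not the class described above: `ItemQuery` and the two-argument `Item` constructor are hypothetical names added for the example, and each call simply narrows a `Predicate` that `getItem()` applies in a linear scan.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Same fields as the Item class in the question; the constructor is an assumption.
class Item {
    public int type = 1;
    public int damage = 4;

    Item(int type, int damage) {
        this.type = type;
        this.damage = damage;
    }
}

// Hypothetical fluent query: each call narrows the filter, getItem() runs it.
class ItemQuery {
    private final List<Item> source;
    private final Predicate<Item> filter;

    ItemQuery(List<Item> source) {
        this(source, it -> true); // start with a filter that accepts everything
    }

    private ItemQuery(List<Item> source, Predicate<Item> filter) {
        this.source = source;
        this.filter = filter;
    }

    ItemQuery type(int type) {
        return new ItemQuery(source, filter.and(it -> it.type == type));
    }

    ItemQuery damage(int damage) {
        return new ItemQuery(source, filter.and(it -> it.damage == damage));
    }

    // Linear scan over the source list; returns the first match, if any.
    Optional<Item> getItem() {
        return source.stream().filter(filter).findFirst();
    }
}
```

With this sketch, `new ItemQuery(items).type(1).damage(4).getItem()` returns an `Optional<Item>` rather than a bare reference, so a missing match does not have to be signalled with `null`.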

Does Java have any classes for turning objects into something searchable in a similar way?

I've been looking at Java Collections and Entities; what I've written so far is one giant ArrayList<Item> and several HashMaps of Item objects.

Aage Torleif
  • The immediate question that comes to mind: for a 100 million record data set, why don't you want to use a real, proven database? Feels like you're reinventing the wheel. There are plenty of free, open source, fairly easy to use systems. PostgreSQL happens to be my favorite at the moment, but you don't even have to use a relational DB nowadays. – jpmc26 Feb 10 '15 at 23:27
  • [Best data structure for dictionary implementation](http://stackoverflow.com/questions/10017808/best-data-structure-for-dictionary-implementation) – Ascalonian Feb 10 '15 at 23:28
  • Wild guess, is this Minecraft-related? – user253751 Feb 10 '15 at 23:29
  • @immibis Hah :P No, I'm just experimenting with game design using basic Java and Graphics2D. – Aage Torleif Feb 10 '15 at 23:35
  • That should say MB. Because there's sprite data in the items, I guess the size without the sprites is much less, but I can't really say how much. – Aage Torleif Feb 10 '15 at 23:36
  • If you're really in need of something embedded that doesn't require a separate service running, there's also options like HSQL or SQLite. You'll definitely need to do some benchmarking to make sure they can handle that kind of load, though, and I would be very surprised if you didn't have to do some very heavy tuning to get it working reasonably. Even so, I'd certainly rather use a fairly well tested system like those than roll my own. – jpmc26 Feb 10 '15 at 23:37
  • Oh, 100 MB. Still, how many rows are you talking? – jpmc26 Feb 10 '15 at 23:38
  • It's only several hundred, easily fewer than a thousand rows. – Aage Torleif Feb 10 '15 at 23:39
  • That's probably more relevant than the data size; it's really not that many. Question: how is the performance using a simple linear search? This is something you should check before trying to optimize. If that's no good, what about optimizing the search for one parameter? That could potentially be done with some kind of bucket structure, which could go in a hash map (mapping parameter to bucket list). You'd just do a linear search on the bucket itself. – jpmc26 Feb 10 '15 at 23:42
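
The bucket idea from the last comment might look roughly like the sketch below. It assumes `type` is the parameter worth indexing; `TypeIndex` and the `Item` constructor are hypothetical names, not anything from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Same fields as the Item class in the question; the constructor is an assumption.
class Item {
    public int type = 1;
    public int damage = 4;

    Item(int type, int damage) {
        this.type = type;
        this.damage = damage;
    }
}

// Hypothetical index: one bucket (list) of items per distinct type value.
class TypeIndex {
    private final Map<Integer, List<Item>> buckets = new HashMap<>();

    TypeIndex(List<Item> items) {
        for (Item it : items) {
            // Create the bucket on first use, then append the item to it.
            buckets.computeIfAbsent(it.type, k -> new ArrayList<>()).add(it);
        }
    }

    // Hash lookup picks the bucket; a linear search inside it checks damage.
    // Returns null when nothing matches.
    Item find(int type, int damage) {
        for (Item it : buckets.getOrDefault(type, Collections.emptyList())) {
            if (it.damage == damage) {
                return it;
            }
        }
        return null;
    }
}
```

The index is built once and is not updated when the source list changes, so this only suits data that is loaded up front and then queried.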

1 Answer


Java 8 streams might be a good fit:

Item item = items.stream()
    .filter((it) -> it.type == 1)
    .filter((it) -> it.damage == 4)
    .findFirst().orElse(null);

P.S. In your case .parallel() might help with a speed gain (divide and conquer).
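
As a rough, self-contained sketch of the two variants (the `ItemSearch` helper and the `Item` constructor are hypothetical names introduced only for this example), the sequential and parallel searches could be compared like this. Whether the parallel version is actually faster for a list of a few hundred items is something to measure rather than assume.

```java
import java.util.List;
import java.util.Optional;

// Same fields as the Item class in the question; the constructor is an assumption.
class Item {
    public int type = 1;
    public int damage = 4;

    Item(int type, int damage) {
        this.type = type;
        this.damage = damage;
    }
}

// Hypothetical helper holding both search variants.
class ItemSearch {
    // Sequential scan: findFirst() returns the first match in list order.
    static Optional<Item> findSequential(List<Item> items, int type, int damage) {
        return items.stream()
            .filter(it -> it.type == type && it.damage == damage)
            .findFirst();
    }

    // Parallel scan: findAny() may return any matching element,
    // which lets the stream stop as soon as one worker finds a match.
    static Optional<Item> findParallel(List<Item> items, int type, int damage) {
        return items.parallelStream()
            .filter(it -> it.type == type && it.damage == damage)
            .findAny();
    }
}
```

Both methods return an `Optional<Item>`; calling `.orElse(null)` on the result gives the same behaviour as the snippet above.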

Joop Eggen