
I need to read in a number that is not unique, together with a date that is also not unique for that number.

The program is extremely computationally intensive and performs poorly in my IDE, so I need to choose the right data structure.

At the moment I have created an index, and I read the number into one HashMap and the date into another HashMap, both keyed by that index. I then match them up when I need them. However, reading the data in takes two functions, each with its own while loop.

public HashMap<String,String> getEventDates() throws Exception {
    String csvFile = "C:\\Users\\test.csv";

    CSVReader reader = new CSVReader(new FileReader(csvFile), ';');
    String [] line;
    HashMap<String, String> eventMap = new HashMap<String, String>();

    while ((line = reader.readNext()) != null) {            
        eventMap.put(line[15], line[13]);
    }

    reader.close();
    return eventMap;
}

public HashMap<String,String> getNumberToEventDates() throws Exception {
    String csvFile = "C:\\Users\\test.csv";

    CSVReader reader = new CSVReader(new FileReader(csvFile), ';');
    String [] line;
    HashMap<String, String> isinMap = new HashMap<String, String>();

    while ((line = reader.readNext()) != null) {            
        isinMap.put(line[15], line[4]);
    }

    reader.close();
    return isinMap;
}

Which data structure should I use to get better performance? How can I merge these two methods?

I appreciate your answers!

UPDATE

Oh, I am so sorry.

In fact, line[15] is just an index that I created myself; it changes with every iteration of the while loop.

How can I merge these two functions?

user2051347

3 Answers


I think you should not use two functions, because reading the file twice is slow. Instead, modify the function like this:

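// SimpleEntry is the nested class java.util.AbstractMap.SimpleEntry:
// import java.util.AbstractMap.SimpleEntry;
public HashMap<String, SimpleEntry<String,String>> getEventDatesAndNumber() throws Exception 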
{
    String csvFile = "C:\\Users\\test.csv";

    CSVReader reader = new CSVReader(new FileReader(csvFile), ';');
    String [] line;
    HashMap<String, SimpleEntry<String,String>> eventMap = new HashMap<String, SimpleEntry<String,String>>();

    while ((line = reader.readNext()) != null) 
    {            
        eventMap.put(line[15], new SimpleEntry<String , String>(line[13],line[4]));
    }

    reader.close();
    return eventMap;
}

EDIT: Tim B's idea is also not bad. You would have a MapKey class and change the signature of the above method to

public HashMap<String, MapKey> getEventDatesAndNumber() throws Exception 

and then make the necessary changes.
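For reading the values back out, here is a minimal sketch of how the combined map could be used. It assumes the column positions from the question (15 for the index, 13 for the date, 4 for the number); the class name SimpleEntryLookup, the helper row and the sample values are made up for illustration, and the rows are built in memory instead of being read with CSVReader.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.HashMap;
import java.util.Map;

class SimpleEntryLookup {

    // Builds the combined map from already-parsed rows.
    // Columns 15 (index), 13 (date) and 4 (number) follow the question.
    static Map<String, SimpleEntry<String, String>> toMap(String[][] rows) {
        Map<String, SimpleEntry<String, String>> result = new HashMap<>();
        for (String[] row : rows) {
            result.put(row[15], new SimpleEntry<>(row[13], row[4]));
        }
        return result;
    }

    // Hypothetical helper: a 16-column row with only the used columns filled.
    static String[] row(String index, String date, String number) {
        String[] cells = new String[16];
        cells[15] = index;
        cells[13] = date;
        cells[4] = number;
        return cells;
    }

    public static void main(String[] args) {
        Map<String, SimpleEntry<String, String>> map = toMap(new String[][] {
            row("1", "2014-01-09", "A100"),
            row("2", "2014-01-09", "B200"),
        });
        SimpleEntry<String, String> entry = map.get("1");
        System.out.println(entry.getKey());   // the date (first constructor argument)
        System.out.println(entry.getValue()); // the number (second constructor argument)
    }
}
```

Note that getKey() and getValue() say nothing about which one is the date and which one is the number, which is the readability argument for a small dedicated class.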

Deepak Bhatia
  • I have never heard of SimpleEntry, but I think it's exactly what he is looking for. – kai Jan 09 '14 at 09:47

If I understand correctly, your unique index is the combination of the number and the date, and you then want to look up a value that is mapped from that?

The way to handle this is to create a MapKey object which contains the number and date:

class MapKey {
   final int number;
   final Date date;

   MapKey(int number, Date date) {
      this.number = number;
      this.date = date;
   }

   // Use your IDE to generate equals and hashCode. This is important!
}

Then just have a single Map<MapKey, Data> and you can do fast lookups by just doing

map.get(new MapKey(number, date));

This will be even faster if you already have the MapKey object rather than recreating it all the time, but it's not a big deal if you do need to create it.
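To show why equals and hashCode matter here, the following is a sketch of a filled-in key class. The equals and hashCode bodies are one common hand-written form rather than what an IDE would necessarily generate, the defensive copy of the Date is my own addition, and MapKeyDemo with its sample values is purely illustrative.

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class MapKey {
    final int number;
    final Date date;

    MapKey(int number, Date date) {
        this.number = number;
        // Defensive copy: java.util.Date is mutable, and a key must not change
        // after it has been put into a HashMap.
        this.date = new Date(date.getTime());
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof MapKey)) return false;
        MapKey that = (MapKey) other;
        return number == that.number && date.equals(that.date);
    }

    @Override
    public int hashCode() {
        // Must use the same fields as equals so that equal keys share a bucket.
        return Objects.hash(number, date);
    }
}

class MapKeyDemo {
    public static void main(String[] args) {
        Map<MapKey, String> map = new HashMap<>();
        map.put(new MapKey(42, new Date(0L)), "some value");
        // A second, equal key finds the entry because equals and hashCode agree.
        System.out.println(map.get(new MapKey(42, new Date(0L))));
    }
}
```

Without the two overrides, the second new MapKey(...) would be a different key as far as the HashMap is concerned, and the lookup would return null.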

Actually, looking again, it seems you are mapping from one value to two values, so it would be the other way around:

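class Data {
   // Kept as Strings because the CSV reader returns Strings; parse them if you need int/Date.
   final String date;
   final String number;

   Data(String date, String number) {
      this.date = date;
      this.number = number;
   }
}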

Map<String, Data> map = new HashMap<>();

Then just have one method and change the while loop to read:

while ((line = reader.readNext()) != null) {            
    map.put(line[15], new Data(line[13], line[4]));
}
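Put together, a single-pass version might look like the sketch below. It is written under a few assumptions: Data holds both values as Strings, the file is semicolon-separated with the columns from the question, and a plain split(";") stands in for CSVReader so that the example has no external dependency (it does not handle quoted fields). SinglePassLoader and the sample row are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class Data {
    final String date;
    final String number;

    Data(String date, String number) {
        this.date = date;
        this.number = number;
    }
}

class SinglePassLoader {

    // Reads every line exactly once and stores both columns under the same index.
    static Map<String, Data> load(Reader source) throws IOException {
        Map<String, Data> map = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String text;
            while ((text = reader.readLine()) != null) {
                // Limit -1 keeps trailing empty cells so the column count stays stable.
                String[] line = text.split(";", -1);
                map.put(line[15], new Data(line[13], line[4]));
            }
        }
        return map;
    }

    public static void main(String[] args) throws IOException {
        // One made-up row with 16 columns; only columns 4, 13 and 15 are filled.
        String[] cells = new String[16];
        Arrays.fill(cells, "");
        cells[4] = "A100";
        cells[13] = "2014-01-09";
        cells[15] = "1";
        String csv = String.join(";", cells) + "\n";

        Map<String, Data> map = load(new StringReader(csv));
        Data data = map.get("1");
        System.out.println(data.number + " on " + data.date);
    }
}
```

The file is now opened and parsed once instead of twice, and a single lookup returns both the date and the number.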
Tim B
  • Thanks a lot for your answer! I would appreciate it if you could have a look at my update! – user2051347 Jan 09 '14 at 09:46
  • Why create a class to do what a built-in type can already do for you? (e.g. Tuple<>) – Christophe De Troyer Jan 09 '14 at 10:04
  • Mostly for type safety and code readability. It also allows for potential enhancements in future if more methods are added to the MapKey object and allows the object to be passed around between methods type-safely. – Tim B Jan 09 '14 at 10:06
  • After all, why use a class for anything when you could just use Object[]... :) – Tim B Jan 09 '14 at 10:08
  • @TimB You mentioned `This will be even faster if you already have the MapKey object rather than recreating it all the time,` but don't you think that is wrong? If the same object is used for every entry, the values will be overwritten. – Deepak Bhatia Jan 09 '14 at 10:13
  • For different values you clearly need different map key objects (that's partly why the fields are final, and also because they are being used as keys in a HashMap, where they must not change). I meant that if you already have the map key for that set of values, don't recreate it. I.e. if you query on the same key multiple times, or if you have another structure somewhere storing keys, then it is worth reusing the key from there; if you don't, it is no big deal to create a new key. – Tim B Jan 09 '14 at 10:23

I'm going to start by assuming that your CSV data is structured in some sane format like this.

NUM_HEADER,DATE_HEADER
NUM_VALUE,DATE_VALUE
NUM_VALUE,DATE_VALUE

Assuming the above is true, you should basically be transforming rows in the CSV file into Java objects, usually with attributes that match one-to-one with the values in the CSV file.

So your code would look something like this.

public Events getEvents() throws Exception {
    String csvFile = "C:\\Users\\test.csv";

    CSVReader reader = new CSVReader(new FileReader(csvFile), ';');
    String [] line;
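    Events events = new Events();

    while ((line = reader.readNext()) != null) {
        // Event and Events use int fields, so the String cells are parsed here;
        // this assumes all three columns hold plain integers.
        events.put(Integer.parseInt(line[15]),
                   new Event(Integer.parseInt(line[13]), Integer.parseInt(line[4])));
    }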

    reader.close();
    return events;
}

Then you also need a new value class to hold both of these variables together.

class Event {
    private int num;
    private int date;

    public Event(int date, int num) {
        this.date = date;
        this.num = num;
    }

    // Use Your IDE To generate equals and hash code. This is important! Because we're going to put this value class into a Java collection
}

Java collections - overriding equals and hashCode

And finally a class to hold the value classes in a nice service provider.

class Events {
     private final Map<Integer, Event> map = new HashMap<Integer, Event>();

     public void put(int uniqueId, Event event) {
         map.put(uniqueId, event);
     }

     //Now you can offer any kind of domain specific services to the consumer of the Events class that you want.
}

I like this kind of structure because it's very easy on the client code. A lot of complexity and bookkeeping is locked up inside the Event and Events classes. You can also put validation there, and offer a lot of convenience methods.

The last step, wrapping everything up in an "events" object, is only necessary for some use cases. If the only functions you want are the exact ones offered by the Map interface, then I wouldn't wrap it. If you need other functionality specific to your domain, then I would wrap it. But generally I tend to lean towards OO classes whenever possible. From a client perspective it's a lot clearer what's going on if you're handling an Events class, as opposed to a Map<foo,bar> class. It's just more semantically meaningful, and that can make a big difference in helping the client understand what's going on.
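As an illustration of what such a domain-specific service could look like, here is a small sketch. The onDate query, the getters and the sample values are hypothetical and not part of the classes above; the int fields are kept only to match the Event class shown earlier.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Event {
    private final int num;
    private final int date;

    Event(int date, int num) {
        this.date = date;
        this.num = num;
    }

    int getNum() { return num; }
    int getDate() { return date; }
}

class Events {
    private final Map<Integer, Event> map = new HashMap<>();

    void put(int uniqueId, Event event) {
        map.put(uniqueId, event);
    }

    Event get(int uniqueId) {
        return map.get(uniqueId);
    }

    // Hypothetical domain-specific query: all events that share one date.
    List<Event> onDate(int date) {
        List<Event> result = new ArrayList<>();
        for (Event event : map.values()) {
            if (event.getDate() == date) {
                result.add(event);
            }
        }
        return result;
    }
}

class EventsDemo {
    public static void main(String[] args) {
        Events events = new Events();
        events.put(1, new Event(20140109, 100));
        events.put(2, new Event(20140109, 200));
        events.put(3, new Event(20140110, 100));
        // The client asks a domain question instead of iterating over a raw Map.
        System.out.println(events.onDate(20140109).size());
    }
}
```

The client code never sees the HashMap, so the internal structure (for example a second index keyed by date) could change later without touching callers.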

Jazzepi