0

I am trying to make an adjacencylist. The problem is that my implementation cost to much in memory...

I read a file of words, add them to my graph and add the neighbors of the graph

static public void createEdges(){
    for(String string : words){  //words is the list of all the ~4 000 words that have been read
        ArrayList<String> wordlist = new ArrayList<String>();
       // ........................................................
       //(irrelevant part where I create a new String "word" from "string")
       // .........................................................
                    if (Contains(word) != null){  //Contains checks if the new String "word" is a part of the file I read. 
                        wordlist.add(word);
                    }
        list.put(string, wordlist); //list is of type: Hashtable<String, ArrayList<String>>; and represents the wordgraph where "string" is the node and "wordlist" is the neighbors of the node
    }}

One way I think would save memory is to not create a new ArrayList evere loop, since it will loop ~4 000 times... But then I can't think of any good way to store my adjacent nodes. Is use of this datastructure doomed for my purpose or are there any smarter ways to implement this?

Dominique Fortin
  • 2,212
  • 15
  • 20
  • This code is kind of confusing. There is no context for the variable "list", the object "Contains", and the variable "word". Without more context, it will be very difficult to help. Also, how do you know this will cost too much memory? What are you trying to achieve? – aglassman Oct 25 '18 at 21:42
  • @aglassman Yes I see that now, thanks. Contains is a method that simply checks if the file I read contains the new String "word". List i the adjacency list, so the variable "string", in list, is the node and the variable "wordlist" is the ArrayList of neighbors. I am making a BFS to find my way from one word to another. I want to implement a datastructure that support this search as fast as possible but nor storing to much data, which I feel like this does since it creates a large number of ArrayLists... – Erik Svensson Oct 25 '18 at 21:59
  • Interesting problem, but I don't get the part where you're putting the list into a map. Since you're creating a new ArrayList for each for loop iteration, won't every entry in the Hashtable be one item long? – aglassman Oct 25 '18 at 22:03

1 Answers1

0

I think if this is coded correctly, your approach shouldn't use up too much extra memory. You could probably save memory by making sure you initialize the ArrayList with how many elements you are going to add. This may or may not work for your algorithm.

You may have to create a new ArrayList after determining all the values, then copy your results into it. This will reduce the excess capacity the list will have allocated. When an array list reaches it's internal capacity, it has to allocate more memory.

This will only save you a few bytes per ArrayList, so will only be worth it if you have LOTS of entries in you map.

ArrayList vs LinkedList from memory allocation perspective

Another aspect to consider is how much space the String object is taking up. It seems you are referencing a file, and a list of words. If the file can stay in memory as a large string (max size is ~2GB) then all .substring calls will just be "windows" into that file. If you are creating new String objects, it is going to be more memory intensive.

Why does appending "" to a String save memory?

aglassman
  • 2,643
  • 1
  • 17
  • 30
  • Do you mean something like: `ArrayList wordlist2 = new ArrayList(wordlist.size()); for(String tmp: wordlist){ wordlist2.add(tmp); }` ? – Erik Svensson Oct 25 '18 at 22:30
  • Yes, this may save you a few bytes. – aglassman Oct 25 '18 at 22:34
  • I found a code that added every read word in a Hashtable wordlist2;. When adding neighbors, instead of adding it like: `list.put(string, wordlist);` it was added like: `list.get(string).add(wordlist2.get(word));`. This runs slower but saves more memory. Any idea why? – Erik Svensson Oct 25 '18 at 23:54
  • This makes more sense. You're first getting the list from the map, then adding the string from the second list. That way you aren't creating a new list every time, and you also are referencing the same string object in both lists. – aglassman Oct 26 '18 at 00:45