
I need some advice on the following multi-threading scenario. I have an XML file which contains configuration data for a web application. When a user accesses the website, data is read from the XML file, based on the URL the user requested, to find the properties attached to that request. To improve the performance of reading from the XML file, I use Ehcache: I cache each request together with the relevant configuration from the XML.

The problem I am facing is this: if somebody starts updating the XML file, I need to stop reads from the cache until the write is complete, and once the update is done I want to clear the cache so that it is rebuilt. I am struggling with how to implement the multi-threading for this second part. The file can be updated in two ways: a user can edit the XML file directly with Notepad++ or some other tool, or the update can be made through the same web application.

KItis
  • How often is this XML file updated? – Dale May 23 '16 at 03:55
  • Maybe once or twice per day. It's not very often. – KItis May 23 '16 at 03:56
  • So I guess the real question is do you really need to build functionality for this or is this a 1% use case where you probably won't use this functionality enough to make it worth it. If you won't be using it that much, edit the XML and restart your web application at the end of the day or during a maintenance window. – Dale May 23 '16 at 03:58
  • Yes, I agree with you. But assuming it is a valid scenario, how would I address it? – KItis May 23 '16 at 04:00
  • You could use a JMX interface in order to reset EHCache after an update. This person on this thread had the same issue. http://stackoverflow.com/questions/10912087/how-to-clear-ehcache-without-server-restart – Dale May 23 '16 at 04:05

1 Answer


You can refresh the cache using a pool of worker threads that read from a shared BlockingQueue. When the user finalizes their edit, you send a message to the queue with information on the key-value pairs that need to be updated.

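import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class RefreshRequest<K> {
    public final K key;
    public RefreshRequest(K key) { this.key = key; }
}

// Fields of the class that owns the cache
public final BlockingQueue<RefreshRequest<?>> requestQueue = new ArrayBlockingQueue<>(200);
public final int concurrency = 4;
public final ExecutorService executor = Executors.newFixedThreadPool(concurrency);

// Run once at startup: each worker blocks on the queue until a request arrives
for (int i = 0; i < concurrency; i++) {
    executor.execute(new Runnable() {
        @Override
        public void run() {
            try {
                while (true) {
                    RefreshRequest<?> request = requestQueue.take();
                    // refresh key
                }
            } catch (InterruptedException e) {
                // restore the interrupt flag and exit; maybe log the exception as well
                Thread.currentThread().interrupt();
            }
        }
    });
}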

The workers consume requests to refresh cache keys; the code that finalizes a change to the XML file needs to put new requests on the queue. To terminate the workers, call executor.shutdownNow(), which interrupts them so that the blocked take() throws an InterruptedException and the while(true) loop exits.
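To make the producer side and the shutdown concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the class name CacheRefresher, the requestRefresh and shutdown methods, the loader function that recomputes a value from the XML, and the ConcurrentHashMap that stands in for the real Ehcache instance. It also queues bare keys rather than RefreshRequest objects to stay short.

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical helper; none of these names come from Ehcache.
public class CacheRefresher<K, V> {
    private final BlockingQueue<K> requestQueue = new ArrayBlockingQueue<>(200);
    // Stand-in for the real cache (assumption).
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    // Assumed to re-read the value for a key from the XML; must not return null.
    private final Function<K, V> loader;
    private final ExecutorService executor;

    public CacheRefresher(int concurrency, Function<K, V> loader) {
        this.loader = loader;
        this.executor = Executors.newFixedThreadPool(concurrency);
        for (int i = 0; i < concurrency; i++) {
            executor.execute(this::workerLoop);
        }
    }

    private void workerLoop() {
        try {
            while (true) {
                K key = requestQueue.take();        // blocks until a request arrives
                cache.put(key, loader.apply(key));  // refresh the key
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();     // shutdownNow() ends the loop here
        }
    }

    /** Called by the code that finalizes an XML edit; false if the queue stayed full. */
    public boolean requestRefresh(K key) throws InterruptedException {
        return requestQueue.offer(key, 1, TimeUnit.SECONDS);
    }

    public V get(K key) {
        return cache.get(key);
    }

    /** Interrupts the workers and waits up to five seconds for them to exit. */
    public boolean shutdown() throws InterruptedException {
        executor.shutdownNow();
        return executor.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

Using offer with a timeout instead of put keeps the request thread that saves the XML from blocking indefinitely if the workers fall behind; whether dropping a refresh is acceptable depends on the application.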


As for how to stop reading from the cache when somebody starts writing to the XML, you can do this with an "optimistic" read. Assign a version number to each XML file, and increment this version when you write to the file. When you start a read, store the file's version in a local variable. When the read finishes, compare the local version to the file's current version: if they match, return the value that was read; if they don't match, repeat the read, updating the local variable to the now-current file version. If need be, you can also have an "invalid" version that marks a write in progress (e.g. "valid" versions start at 0 and are incremented on each write, while the "invalid" version is -1). If the reader sees that the file is "invalid", it pauses for e.g. 5 seconds and then tries again. So one algorithm might be

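public Object read(K key) throws InterruptedException {
    while (true) {
        int version = versionCache.get(key);
        if (version == -1) {
            Thread.sleep(5000); // a write is in progress; wait, then retry
        } else {
            Object returnVal = cache.get(key);
            // return only if no write started while we were reading
            if (version == versionCache.get(key)) {
                return returnVal;
            }
        }
    }
}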
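The writer side is not shown above, so here is one possible sketch of it together with the reader. It rests on several assumptions: a single version counter guards the whole cache (which fits keys derived from user requests rather than from the XML), only one write is in progress at a time, and the names VersionedCache, beginWrite, endWrite and loader are made up for the example. A ConcurrentHashMap again stands in for Ehcache. Each entry is tagged with the version it was loaded under, so a value computed while a write was starting is never served after the write completes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch; one version counter guards the whole cache.
public class VersionedCache<K, V> {
    private static final int INVALID = -1;

    private static final class Entry<V> {
        final int version;
        final V value;
        Entry(int version, V value) { this.version = version; this.value = value; }
    }

    private final AtomicInteger version = new AtomicInteger(0);
    private int completedWrites = 0; // guarded by the synchronized writer methods
    private final Map<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // assumed: matches the key against the XML
    private final long retryMillis;

    public VersionedCache(Function<K, V> loader, long retryMillis) {
        this.loader = loader;
        this.retryMillis = retryMillis;
    }

    /** Call before the XML is modified: readers start waiting. */
    public synchronized void beginWrite() {
        version.set(INVALID);
    }

    /** Call after the XML is saved: clears the cache and publishes a new version. */
    public synchronized void endWrite() {
        cache.clear();
        version.set(++completedWrites);
    }

    public V read(K key) throws InterruptedException {
        while (true) {
            int seen = version.get();
            if (seen == INVALID) {
                Thread.sleep(retryMillis); // write in progress; retry later
                continue;
            }
            Entry<V> entry = cache.get(key);
            if (entry == null || entry.version != seen) {
                // Missing or loaded under an older version: rebuild from the XML.
                entry = new Entry<>(seen, loader.apply(key));
                cache.put(key, entry);
            }
            if (seen == version.get()) {
                return entry.value; // no write started during the read
            }
        }
    }
}
```

This only covers updates that go through the application, since something has to call beginWrite and endWrite. Edits made directly in an external editor would need a separate detection mechanism.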
Zim-Zam O'Pootertoot
  • This is the kind of solution I am looking for. But the key is not known to me: it is generated from the user requests coming to the web application, and the XML file does not contain it. From the user request I find the matching entries in the XML using a field that is written as a regular expression. Those matching elements are put into the cache with the user request as the key. So I have no way to tell which key-value pair was updated – KItis May 23 '16 at 04:25
  • Another concern is that I need to stop users from reading from the cache when somebody starts writing to the XML, because the cache may hold data that is no longer correct. – KItis May 23 '16 at 04:30
  • @KItis How are you storing/accessing/whatever the XML in ehcache, e.g. is the XML string associated with a filename key or something along those lines? – Zim-Zam O'Pootertoot May 23 '16 at 04:31
  • @KItis See the edit at the bottom of my answer for a way to stop reading from the cache while the xml is being updated - the solution isn't to interrupt the reader as this can be very complicated, but instead to have the reader validate that no changes were made to the file while the read was occurring. – Zim-Zam O'Pootertoot May 23 '16 at 04:35
  • It's a great solution for stopping reads from the cache while the document is being updated. As for your second question: there is a field in the XML which contains a regular expression. When a user request comes to the web application, I perform regular expression matching for all the entries in the XML against the user URL to find the matching entries. So the key is the user request, and the value is the matching entries from the XML. I hope this clears up your question about how I am accessing the entries in the cache. – KItis May 24 '16 at 02:02
  • @KItis In that case I would enqueue one `RefreshRequest` per modified XML element, or per XML element if you don't have an easy way of tracking which elements were changed between versions. – Zim-Zam O'Pootertoot May 24 '16 at 02:15