Broad discussion question. Are there already any libraries that allow me to store the execution state of my Java application?

E.g. I have an application which processes files, and the application may be forced to shut down suddenly at some point. I want to store which files have been processed, which have not, and what stage the processing had reached for the files still in progress.

Are there already libraries which abstract this functionality, or would I have to implement it from scratch?

Neeraj
  • If it's a command-line application, [Spring Batch](http://static.springsource.org/spring-batch/) may help you, in a certain way – yegor256 Feb 24 '12 at 07:52
  • Check this out: [persisting state of JVM](http://stackoverflow.com/questions/424341/are-there-any-java-vms-which-can-save-their-state-to-a-file-and-then-reload-that) – NiranjanBhat Feb 24 '12 at 08:06
  • If the kind of recovery you are looking at is transactional, then storing the progress data in a DB might be a good option. Otherwise, I am not sure there is any way of doing this without losing data in the event of a crash/shutdown. After all, how do you make a process (at the end of the day, the JVM is a process) do the right thing when you issue, say, a "kill -9"? – Pavan Feb 24 '12 at 08:26
  • @PavanSudarshan If you are worried about a kill -9, then even this approach will not help :) – Neeraj Feb 24 '12 at 08:34
  • If that is not the case, then just have an internal state that gets persisted in a JVM shutdown hook. This should work just fine, right? – Pavan Feb 24 '12 at 09:16
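
The shutdown-hook idea from the last comment could be sketched roughly as follows. This is only a minimal sketch: the class name `ProgressTracker` and the file name `progress.txt` are hypothetical, and the hook runs on a normal exit or a SIGTERM but not on a `kill -9`, which is exactly the limitation discussed above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical helper: keeps the list of finished files in memory and
// writes it out when the JVM shuts down in an orderly way.
class ProgressTracker {
    private final List<String> filesDone = new CopyOnWriteArrayList<>();
    private final Path stateFile;

    ProgressTracker(Path stateFile) {
        this.stateFile = stateFile;
    }

    void markDone(String fileName) {
        filesDone.add(fileName);
    }

    // Writes one finished file name per line, replacing any previous content.
    void persist() throws IOException {
        Files.write(stateFile, filesDone, StandardCharsets.UTF_8);
    }

    // Registers a hook that runs on normal exit or SIGTERM, but not on kill -9.
    void installShutdownHook() {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                persist();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }));
    }

    public static void main(String[] args) {
        ProgressTracker tracker = new ProgressTracker(Paths.get("progress.txt"));
        tracker.installShutdownHook();
        tracker.markDone("a.csv");
        tracker.markDone("b.csv");
        // The hook writes progress.txt as the JVM exits.
    }
}
```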

5 Answers

It seems like what you are looking for is serialization, which can be performed with the Java Serialization API.

You can write even less code if you decide to use known libraries such as Apache Commons Lang and its SerializationUtils class, which itself is built on top of the Java Serialization API.

Using the latter, serializing/deserializing your application state into a file is done in a few lines.

The only thing you have to do is create a class holding your application state, let's call it... ApplicationState :-) It could look like this:

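import java.io.Serializable;
import java.util.List;

// Must implement Serializable, otherwise SerializationUtils.serialize() will not accept it.
class ApplicationState implements Serializable {

 private static final long serialVersionUID = 1L;

 enum ProcessState {
  READ_DONE,
  PROCESSING_STARTED,
  PROCESSING_ENDED,
  ANOTHER_STATE;
 }

 // The List implementation used must itself be serializable (e.g. ArrayList).
 private List<String> filesDone, filesToDo;
 private String currentlyProcessingFile;
 private ProcessState currentProcessState;
}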

With such a structure, and using SerializationUtils, serializing and deserializing are done the following way:

// Commons Lang 3.x package; in 2.x the class is org.apache.commons.lang.SerializationUtils
import org.apache.commons.lang3.SerializationUtils;
import java.io.FileInputStream;
import java.io.FileOutputStream;

try {
      ApplicationState state = new ApplicationState();
      ...
      // File to serialize the object to
      String fileName = "applicationState.ser";

      // Serialize the state; try-with-resources closes the stream even if writing fails
      try (FileOutputStream fos = new FileOutputStream(fileName)) {
        SerializationUtils.serialize(state, fos);
      }

      // Deserialize and cast back to ApplicationState (not String)
      try (FileInputStream fis = new FileInputStream(fileName)) {
        ApplicationState restored = (ApplicationState) SerializationUtils.deserialize(fis);
        System.out.println(restored);
      }
    } catch (Exception e) {
      e.printStackTrace();
    }
Jalayn
  • on sudden shutdown/crash, you wouldn't have time to serialize (or know when to serialize). – Nishant Feb 24 '12 at 07:59
  • @Nishant I don't see a difference with your solution? What if the connection to the DB is severed? I think both our solutions need something like a separate thread for persisting the state of the application whenever something changes, for example by making ApplicationState observable. What do you think? – Jalayn Feb 24 '12 at 08:13
  • ah yeah, my bad, you'd be serializing after each module of processing is over. And probably, a List of files that have been processed will be serialized all along. don't mind my doubt above. – Nishant Feb 24 '12 at 08:33

It sounds like the Java Preferences API might be a good option for you. This can store user/system settings with minimal effort on your part and you can update/retrieve at any time. https://docs.oracle.com/javase/8/docs/technotes/guides/preferences/index.html
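
As a rough illustration, recording a per-file stage with `java.util.prefs.Preferences` might look like the sketch below. The class name `ProgressPrefs`, the node path, and the `NOT_STARTED` default are all hypothetical. Keep in mind that preference keys are limited to 80 characters (`Preferences.MAX_KEY_LENGTH`), so this probably suits a small amount of state better than a long list of full file paths.

```java
import java.util.prefs.BackingStoreException;
import java.util.prefs.Preferences;

// Hypothetical wrapper; the node path and key names are made up for illustration.
class ProgressPrefs {
    private final Preferences node;

    ProgressPrefs(String nodePath) {
        this.node = Preferences.userRoot().node(nodePath);
    }

    // Records the stage reached for one file (keys may be at most 80 characters).
    void saveStage(String fileName, String stage) {
        node.put(fileName, stage);
    }

    // Returns the stored stage, or "NOT_STARTED" if the file was never recorded.
    String loadStage(String fileName) {
        return node.get(fileName, "NOT_STARTED");
    }

    // Pushes pending changes to the persistent backing store.
    void flush() throws BackingStoreException {
        node.flush();
    }

    public static void main(String[] args) throws BackingStoreException {
        ProgressPrefs prefs = new ProgressPrefs("com/example/fileprocessor");
        prefs.saveStage("report.csv", "PROCESSING_STARTED");
        prefs.flush();
        System.out.println(prefs.loadStage("report.csv"));
    }
}
```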

Moffee
Paul Jowett

It's pretty simple to make from scratch. You could follow this:

  1. Have a DB (or just a file) that stores the processing progress. Something like:

     Id|fileName|status|metadata
    
  2. As soon as you start processing a file, make an entry in this table and mark its status as PROCESSING. Then you can store intermediate states, and finally, when you're done, set the status to DONE.

    This way, on restart, you would know which files were processed, which files were in progress when the process shut down or crashed, and (obviously) where to start.
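
The file-based variant of these steps could be sketched as an append-only journal. This is a minimal sketch under some assumptions: the class name `ProgressJournal` is hypothetical, only the `fileName|status` columns are kept (the `Id` and `metadata` columns are left out for brevity), and file names are assumed not to contain line breaks.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical append-only journal: one "fileName|status" line per state change.
class ProgressJournal {
    private final Path journal;

    ProgressJournal(Path journal) {
        this.journal = journal;
    }

    // Appends a status change; SYNC asks for the write to reach the device before returning.
    void record(String fileName, String status) throws IOException {
        String line = fileName + "|" + status + System.lineSeparator();
        Files.write(journal, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND, StandardOpenOption.SYNC);
    }

    // Replays the journal; the last status seen for each file wins.
    Map<String, String> load() throws IOException {
        Map<String, String> latest = new LinkedHashMap<>();
        if (!Files.exists(journal)) {
            return latest;
        }
        for (String line : Files.readAllLines(journal, StandardCharsets.UTF_8)) {
            int sep = line.lastIndexOf('|');
            if (sep > 0) { // skip blank lines and lines without a separator
                latest.put(line.substring(0, sep), line.substring(sep + 1));
            }
        }
        return latest;
    }
}
```

On restart, any file whose latest status in `load()` is still PROCESSING was interrupted, and any file missing from the map has not been started.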

In large enterprise environments where applications are loosely coupled (and there is no guarantee that an application will be available or won't crash), we use a message queue to do much the same thing and keep the architecture reliable.

Nishant

Apache Commons Configuration API: http://commons.apache.org/proper/commons-configuration/userguide/howto_filebased.html#File-based_Configurations

Sridhar Sarnobat
  • I think you understood the question incorrectly. Please read the whole description again. – Neeraj Aug 24 '13 at 02:42
  • Ooops, I was looking for some persistent configuration storage and found my answer here and forgot that this question is asking something different. Thanks. – Sridhar Sarnobat Aug 24 '13 at 04:40

There are almost too many ways to mention. I would choose the option you believe is simplest.

You can use:

  • a file to record what is done (and what is to be done)
  • a persistent queue on JMS (which supports multiple processes, even on different machines)
  • an embedded or remote database.

An approach I rave about is using memory mapped files. A nice feature is that information is not lost if the application dies or is killed (provided the OS doesn't crash) which means you don't have to flush it, nor worry about losing data if you don't.

This works because the data is partly managed by the OS which means it uses little heap (even for TB of data) and the OS deals with loading and flushing to disk making it much faster (and making sizes much larger than your main memory practical).

BTW: This approach works even with a kill -9 as the OS flushes the data to disk. To test this I use Unsafe.getByte(0) which crashes the application with a SEG fault immediately after making a change (as in the next machine code instruction) and it still writes the change to disk.

This won't work if you pull the power, but you have to be really quick. You can use memory mapped files to force the data to disk before continuing, but I don't know how you can test this really works. ;)
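
With only the JDK, the memory-mapped approach could be sketched like this. It is a minimal sketch, not how the library below works internally: the class name `MappedCheckpoint` and the single-counter layout are hypothetical. `MappedByteBuffer.force()` is the standard call for pushing changes to the storage device before continuing.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical checkpoint: a single long counter kept in a memory-mapped file.
class MappedCheckpoint {
    private final MappedByteBuffer buffer;

    MappedCheckpoint(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping stays valid after the channel is closed.
            buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, Long.BYTES);
        }
    }

    // Number of files processed so far; 0 for a freshly created file.
    long processedCount() {
        return buffer.getLong(0);
    }

    // Updates the counter in the mapped region; the OS writes it back to disk later.
    void setProcessedCount(long count) {
        buffer.putLong(0, count);
    }

    // Forces the change to the storage device, for protection against power loss.
    void force() {
        buffer.force();
    }
}
```

A restarted process would construct `MappedCheckpoint` on the same file and read `processedCount()` to find out where to resume.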


I have a library which could make memory-mapped files easier to use:

https://github.com/peter-lawrey/Java-Chronicle

It's not a long read, and you can use it as an example.

Peter Lawrey
  • This is great!! Thanks a lot, Peter. Worth a read, and I would certainly try it. – Neeraj Feb 24 '12 at 09:37
  • It's worth a read, even if you don't use it, i.e. I don't want to be seen as pushing my own library. I just think it's a cool/interesting approach that more people should know about. – Peter Lawrey Feb 24 '12 at 09:42