
I have the following use case.

  • A process serializes certain objects to a file using a BufferedOutputStream.
  • After writing each object, the process invokes flush().
  • If the process crashes while writing an object, I want to recover the file up to the last object that was written successfully.
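The writing side described above might look roughly like the following sketch. It is only an illustration under assumptions: the class name `FlushingWriter` and the method `writeAll` are hypothetical, the records are plain strings, and the ObjectOutputStream is opened directly on the file instead of going through a separate BufferedOutputStream.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not taken from the original post.
class FlushingWriter {

    /** Writes each record as its own object and flushes after every one. */
    static void writeAll(Path file, List<? extends Serializable> records) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(Files.newOutputStream(file))) {
            for (Serializable record : records) {
                oos.writeObject(record);
                // Push the bytes of this object out of the stream's internal buffer
                // so that a crash while writing the next object does not lose it.
                oos.flush();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("objects", ".ser");
        writeAll(file, Arrays.asList("first", "second", "third"));
        System.out.println("wrote " + Files.size(file) + " bytes to " + file);
    }
}
```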

How can I deserialize such a file? How will Java behave while deserializing it?

  • Will it successfully deserialize up to the last object that was written completely before the crash?
  • What is the behavior when reading the last, partially written object, and how can I detect that case?

Update1 -

  • I have tried to simulate a process crash by manually killing the process while objects are being written. I have tried this around 10-15 times. Each time I was able to deserialize the file, and the file did not contain any partial object.

I am not sure my test is exhaustive enough and therefore need further advice.

Update2 - Adam pointed out a way to simulate this by truncating the file at random positions. This is the behavior I observed over about 100 iterations:

  • From the truncated file (which should be equivalent to the state of the file when a process crashes), Java can read up to the last complete object successfully.
  • Upon reaching the last, partially written object, Java does not throw a StreamCorruptedException or any other kind of IOException. It simply throws an EOFException, indicating end of file, and the partial object is ignored.
MoveFast
  • Yes, have you tried it? Does it do what you expect might happen? BTW ObjectOutputStream is already buffered so adding a second buffer might not do anything. – Peter Lawrey Sep 21 '12 at 09:20
  • Updated the post. I have tried it, but I am not sure my test was exhaustive. – MoveFast Sep 21 '12 at 09:26
  • I think the problem with your tests is that you are buffering the output and then writing. If you were to use a small buffer (or a large object) and kill the process just after calling flush(), you should see that the serialized file becomes corrupt. Otherwise you are probably buffering the entire object and then flushing it, which makes error detection very hard. – David Sep 21 '12 at 09:33

4 Answers


Objects are deserialized one at a time: each one either deserializes completely or fails before the next one is read. An object that was written completely is not affected when a later object failed to be written or fails to deserialize.
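A minimal sketch of a reader built on that behavior is shown below. The class `PrefixReader`, its `Result` type, and the method `readCompleteObjects` are hypothetical names introduced here for illustration. The sketch assumes that a truncated stream surfaces as an EOFException or a StreamCorruptedException, keeps everything read before that point, and records which exception stopped the loop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.StreamCorruptedException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the JDK.
class PrefixReader {

    /** Objects read before the stream stopped, plus the exception that stopped it. */
    static final class Result {
        final List<Object> objects = new ArrayList<>();
        IOException stoppedBy; // EOFException or StreamCorruptedException
    }

    static Result readCompleteObjects(InputStream in) throws IOException, ClassNotFoundException {
        Result result = new Result();
        try (ObjectInputStream ois = new ObjectInputStream(in)) {
            while (true) {
                // readObject() returns a whole object or throws; it never returns half of one.
                result.objects.add(ois.readObject());
            }
        } catch (EOFException | StreamCorruptedException e) {
            // The stream ended or became unreadable; keep what was read so far.
            result.stoppedBy = e;
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject("first");
            oos.writeObject("second");
        }
        byte[] full = bytes.toByteArray();
        // Cut off the last few bytes to imitate a crash in the middle of the second object.
        byte[] truncated = Arrays.copyOf(full, full.length - 3);
        Result r = readCompleteObjects(new ByteArrayInputStream(truncated));
        System.out.println(r.objects + " stopped by " + r.stoppedBy.getClass().getSimpleName());
    }
}
```

Keeping the stopping exception in the result lets the caller decide what to do next, although the exception type alone does not say why the writer stopped.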

Peter Lawrey
  • What will the behavior be while reading the last, partially written object? Which exception will I get? – MoveFast Sep 21 '12 at 09:47
  • You will either get EOFException if the file has ended or StreamCorruptedException like this http://stackoverflow.com/questions/2393179/streamcorruptedexception-invalid-type-code-ac Both are IOExceptions. – Peter Lawrey Sep 21 '12 at 09:49
  • How can I differentiate between the two cases for `StreamCorruptedException`: whether it is due to a process crash or to some other reason? – MoveFast Sep 21 '12 at 09:53
  • What two cases? I can only see one. – Peter Lawrey Sep 21 '12 at 09:56
  • Case 1 - The process crashed, and therefore I get a `StreamCorruptedException` while reading the last object. Case 2 - The same exception occurs for some other reason, e.g. there was an error while writing the object, or the class version used for reading differs from the one used for writing. – MoveFast Sep 21 '12 at 09:59
  • Why do you need to know the difference? You know this at the time of writing the data, and you would have to store it as well. – Peter Lawrey Sep 21 '12 at 10:01
  • I need to know the difference because my action is going to be different in each case. Case 1 - I will read whatever objects I have got and resume work. Case 2 - I will discard the serialized file and create the objects from scratch. – MoveFast Sep 21 '12 at 10:04
  • In that case the writer should delete the file if it fails to be written for some other reason. The reader has no way of knowing why the stream stopped. – Peter Lawrey Sep 21 '12 at 10:09
  • Can I detect whether the object I am reading is the last object in the file? Then I could distinguish between case 1 and case 2. – MoveFast Sep 21 '12 at 10:09

I suspect you are misusing Java serialization: it is not intended to be a reliable and recoverable means of permanent storage. Use a database for that. If you must, you can use a database to store the serialized form of Java objects, but that would be pretty inefficient.

ddyer
  • Serialization isn't a bad option for temporary persistence (e.g. message queues or a disk-based cache). – Sami Korhonen Sep 21 '12 at 09:42
  • True, but then you wouldn't be worried about the behavior and reliability of recovering partially serialized (ephemeral) data in case of a system crash. The OP's description sounds like he's concerned with protecting valuable data. – ddyer Sep 21 '12 at 09:53

Yeah, testing such a scenario manually (by killing the process) may be difficult. I would suggest writing a test case where you:

  1. Serialize a set of objects and write them to a file.
  2. Open the file and truncate it at a random position.
  3. Try to load and deserialize it (and see what happens).
  4. Repeat steps 1 to 3 with several other truncation positions.

This way you are sure that you are loading a broken file and that your code handles it properly.
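The steps above could be sketched as follows. This is an illustration under assumptions, not a definitive test: the class `TruncationCheck` and its methods are hypothetical names, the data lives in a byte array instead of a file, the records are short strings, and because the stream is small the sketch tries every truncation position rather than sampling positions at random.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.StreamCorruptedException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical test helper for illustration.
class TruncationCheck {

    /** Counts the objects that can be read from the first len bytes of data. */
    static int countReadable(byte[] data, int len) throws IOException, ClassNotFoundException {
        int count = 0;
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data, 0, len))) {
            while (true) {
                ois.readObject();
                count++;
            }
        } catch (EOFException | StreamCorruptedException e) {
            // Reached the truncation point; count holds the complete objects.
        }
        return count;
    }

    /** Returns how many truncation positions gave an unexpected object count. */
    static int mismatches(int objectCount) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        List<Integer> ends = new ArrayList<>(); // stream offset just after each object
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            for (int i = 0; i < objectCount; i++) {
                oos.writeObject("record-" + i);
                oos.flush();
                ends.add(bytes.size());
            }
        }
        byte[] data = bytes.toByteArray();
        int bad = 0;
        for (int len = 0; len <= data.length; len++) {
            // An object is expected to be recoverable only if all of its bytes survived.
            int expected = 0;
            for (int end : ends) {
                if (end <= len) {
                    expected++;
                }
            }
            if (countReadable(data, len) != expected) {
                bad++;
            }
        }
        return bad;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("unexpected results: " + mismatches(5));
    }
}
```

Recording the offset after each flush() gives the object boundaries, so the check covers truncation both inside an object and exactly between two objects.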

Adam Dyga
  • The test I actually need is one that reads a partial object, so I should be reading around object boundaries rather than at random positions. – MoveFast Sep 21 '12 at 09:42
  • I didn't write anything about reading at a random position. I wrote about truncating (breaking) the file at a random position. If you truncate the file at a random position and repeat the test many times, it's unlikely that the file will be truncated at an object boundary every time. – Adam Dyga Sep 21 '12 at 09:58

Have you tried appending to an ObjectOutputStream? You can find the solution HERE; look for the post that explains how to create an ObjectOutputStream that appends.

israelC