
I have an application whose data is represented by POJOs (plain old Java objects).

While running, the application manipulates and remembers data as desired.

Now I want to implement a save/load feature.


I am NOT asking about basic file I/O.

I am NOT asking whether ObjectOutputStream exists.


The options I have found so far are:

1) JSON/XML/YAML libraries such as Gson, Jackson

2) Roll your own binary file format marking everything as Serializable with a Serialization Proxy pattern.

Option 1 is unsuitable because my data model can contain cyclic references. Serializing it with Gson resulted in a stack overflow.

Option 2 is unsuitable because the files should be cross-platform and independent of the JVM; they should work on both desktop and Android Java.

A properties file is also obviously unsuitable due to the complexity of the model.


Please do not attack my use case; my data model is perfectly well designed. The example may not be.

I will now give example code of the kind of structure that needs to be saved.

class Application {

    //This College is my top level object. It could correspond to an individual save file.
    College college = new College();

    //I would love to be able to just throw this guy into a file.
    SomeLibrary.writeToFile(college);
    //And read another back.
    College college2 = SomeLibrary.readFromFile(anotherCollege);
}

class College {
    //The trees are implemented recursively, so this is actually just the root of each tree.
    Tree<Course> artCourseTree;
    Tree<Course> engineeringCourseTree;
    Tree<Course> businessCourseTree;

    List<Student> maleStudents;
    List<Student> femaleStudents;
}

class Course {
    //Each course only has 2 students in this example. Ignore.
    Student student1;
    Student student2;

    List<Exam> examsInCourse;
    LocalDate courseStartDate;
    Period duration;
}

class Student {
    String name;
    List<Exam> listOfExamsTaken;
}

class Exam {
    Student studentTakingIt;
    LocalDate dateTaken;
    BigDecimal score;
}

As you can see, Exams are intended to be the atomic object in this model at the bottom of the hierarchy. However, not only are they referenced by both Students and Courses, but they also refer back up to a Student and contain nonprimitives such as LocalDate and BigDecimal. The model is given meaning by referencing different subsets of Exams in different Courses and Students.

I need to save the relationships, the arrangement of these things, an arbitrary number of these things, as well as the data they hold.

What hope do I have of saving and loading such a model?

What options are there to implement a save/load feature on such a model, with such constraints?


Is it really industry standard for every Java program to roll its own binary file format and build a monstrous apparatus to serialize and deserialize everything? Is it that or JSON? What am I missing here? Do I have to just snapshot the VM somehow? Why is there no standard practice for this?

650aa6a2
  • The "hard" way might be to store the "data" with a generated "key" (it doesn't matter what, so long as it's valid for the file), then store the relationships using those keys as a separate section...Have you looked at [JAXP](https://docs.oracle.com/javase/tutorial/jaxp/index.html)? – MadProgrammer Apr 12 '19 at 04:16
  • I didn't look at JAXP in detail because I understand it to be an XML utility, which I assumed would be subject to the same limitations as JSON regarding cyclic references. Is that wrong? – 650aa6a2 Apr 12 '19 at 04:25
  • I can’t be sure for your implementation, but it’s annotation based, so you might get more control over how the parsing gets done – MadProgrammer Apr 12 '19 at 04:26
  • I, too, was formulating some solution where I reduced references to some unique object identifier--like toString() or something--and stored the objects and their relationships separately. I will hold out for a simpler solution. – 650aa6a2 Apr 12 '19 at 04:26
  • Having “mucked” around with ms office, they tend to do something similar – MadProgrammer Apr 12 '19 at 04:28
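
The keyed approach discussed in these comments could look roughly like the sketch below. It is only an illustration under stated assumptions: `Student` and `Exam` are trimmed-down copies of the question's classes with constructors added, `KeyedFormat` and its `S|`, `E|`, `T|` line layout are made up for this example, and names are assumed not to contain `|` or line breaks. Objects are written once under a generated key, and the back-references are written as key pairs in a separate section, so no cycle is ever followed while saving.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Trimmed-down versions of the question's classes (constructors added for brevity).
class Student {
    String name;
    List<Exam> listOfExamsTaken = new ArrayList<>();
    Student(String name) { this.name = name; }
}

class Exam {
    Student studentTakingIt;
    LocalDate dateTaken;
    BigDecimal score;
    Exam(Student studentTakingIt, LocalDate dateTaken, BigDecimal score) {
        this.studentTakingIt = studentTakingIt;
        this.dateTaken = dateTaken;
        this.score = score;
    }
}

// Hypothetical helper, not a library API. Layout of the text it produces:
//   S|studentKey|name        one line per student
//   E|examKey|date|score     one line per exam
//   T|examKey|studentKey     relationship: exam was taken by student
class KeyedFormat {
    static String save(List<Student> students) {
        StringBuilder studentSection = new StringBuilder();
        StringBuilder examSection = new StringBuilder();
        StringBuilder linkSection = new StringBuilder();
        int examKey = 0;
        for (int studentKey = 0; studentKey < students.size(); studentKey++) {
            Student s = students.get(studentKey);
            studentSection.append("S|").append(studentKey).append('|').append(s.name).append('\n');
            for (Exam e : s.listOfExamsTaken) {
                examSection.append("E|").append(examKey).append('|')
                        .append(e.dateTaken).append('|').append(e.score).append('\n');
                linkSection.append("T|").append(examKey).append('|').append(studentKey).append('\n');
                examKey++;
            }
        }
        return studentSection.append(examSection).append(linkSection).toString();
    }

    static List<Student> load(String text) {
        Map<Integer, Student> students = new LinkedHashMap<>();
        Map<Integer, Exam> exams = new HashMap<>();
        for (String line : text.split("\n")) {
            String[] f = line.split("\\|", -1);
            switch (f[0]) {
                case "S":
                    students.put(Integer.parseInt(f[1]), new Student(f[2]));
                    break;
                case "E":
                    // The student link is filled in later, from the T section.
                    exams.put(Integer.parseInt(f[1]),
                            new Exam(null, LocalDate.parse(f[2]), new BigDecimal(f[3])));
                    break;
                case "T": {
                    Exam exam = exams.get(Integer.parseInt(f[1]));
                    Student student = students.get(Integer.parseInt(f[2]));
                    exam.studentTakingIt = student;
                    student.listOfExamsTaken.add(exam);
                    break;
                }
                default:
                    break; // ignore blank or unknown lines
            }
        }
        return new ArrayList<>(students.values());
    }
}
```

A `Course` could refer to its exams by exam key in the same way, in one more relationship section. The cost is that every new class or relationship needs its own save and load code.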

1 Answer


Circular references are a common use case. Jackson can handle them with the @JsonManagedReference and @JsonBackReference annotations, which are applied as a pair to the two sides of a parent/child relationship. Check out this SO answer for more details. Another option is to implement a custom serializer and resolve the circular references yourself. Here is an example of that.
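
To make the custom-serializer option concrete, here is a minimal hand-rolled sketch using only the JDK. It is not Jackson's implementation and not a complete JSON writer: `CycleSafeWriter` and the `id`/`ref` field names are invented for this example, string escaping is omitted, and `Student` and `Exam` are reduced copies of the question's classes. The idea is that an `IdentityHashMap` remembers every object already written, so a second visit emits a short reference instead of recursing forever.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Reduced copies of the question's classes.
class Student {
    String name;
    List<Exam> listOfExamsTaken = new ArrayList<>();
}

class Exam {
    Student studentTakingIt;
    BigDecimal score;
}

// Hypothetical writer, for illustration only. It does not escape strings.
class CycleSafeWriter {
    // Identity-based on purpose: two equal-looking objects must still get distinct ids.
    private final Map<Object, Integer> ids = new IdentityHashMap<>();

    String write(Student s) {
        if (s == null) return "null";
        Integer seen = ids.get(s);
        if (seen != null) return "{\"ref\":" + seen + "}"; // already written: emit a reference
        int id = ids.size() + 1;
        ids.put(s, id); // register before descending, so cycles terminate
        StringBuilder sb = new StringBuilder();
        sb.append("{\"id\":").append(id)
          .append(",\"name\":\"").append(s.name).append("\",\"exams\":[");
        for (int i = 0; i < s.listOfExamsTaken.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(write(s.listOfExamsTaken.get(i)));
        }
        return sb.append("]}").toString();
    }

    String write(Exam e) {
        if (e == null) return "null";
        Integer seen = ids.get(e);
        if (seen != null) return "{\"ref\":" + seen + "}";
        int id = ids.size() + 1;
        ids.put(e, id);
        return "{\"id\":" + id + ",\"score\":" + e.score
                + ",\"student\":" + write(e.studentTakingIt) + "}";
    }
}
```

A student with one exam scored 91.5 would come out as `{"id":1,"name":"Ada","exams":[{"id":2,"score":91.5,"student":{"ref":1}}]}`. A matching reader would keep a map from id to object and resolve each `ref` against it.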

However, do consider the following aspects before going ahead with using files as a database:

  • You will have to manage concurrent writes yourself. If they are not handled correctly, the data can be corrupted or lost, because plain files are not ACID compliant by nature.
  • The solution does not scale well: as the file grows, the time to serialize and deserialize it grows with it.
  • You won't be able to query the data stored in the file easily. You will always have to deserialize it first and then query the POJOs.
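
If a plain save file is still the goal, one part of the first point can be reduced with a common technique: write the new contents to a temporary file and then move it over the old one. The sketch below assumes `java.nio.file` is available (on Android that means a recent API level), and `SafeSave` is a made-up helper name. It only protects against a half-written file after a crash; it does not coordinate several writers.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of any library.
class SafeSave {
    static void writeAtomically(Path target, String contents) {
        Path absolute = target.toAbsolutePath();
        try {
            // Create the temp file next to the target so the move stays on one file system.
            Path temp = Files.createTempFile(absolute.getParent(), "save-", ".tmp");
            Files.write(temp, contents.getBytes(StandardCharsets.UTF_8));
            try {
                // Whether an existing target is replaced here is implementation specific
                // according to the Files.move documentation; typical desktop file systems replace it.
                Files.move(temp, absolute, StandardCopyOption.ATOMIC_MOVE);
            } catch (AtomicMoveNotSupportedException e) {
                // Fallback for file systems without atomic rename; this path is not crash-safe.
                Files.move(temp, absolute, StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this in place, a reader sees either the old save or the new one, which is the main guarantee a single-user save/load feature usually needs.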

I highly recommend checking out SQLite, which is a small, fast, self-contained, high-reliability, full-featured SQL database engine.

Yogesh Badke
  • I see now that Jackson does not have the same limitation as Gson, and can handle circular references. You also included another option involving a custom serializer. For those reasons I will mark this as the answer after some testing. – 650aa6a2 Apr 12 '19 at 05:03
  • Notes on using files as a database are appreciated, though I don't know where that intention was signaled. – 650aa6a2 Apr 12 '19 at 05:04
  • So am I correct to conclude that in general, everyone either uses a custom serialization or goes with a json/xml library? – 650aa6a2 Apr 12 '19 at 05:05
  • Yes, for circular references the above are the solutions. Regarding the comment on files as a DB: the state is data, and dumping it to a file and reading it back is kind of doing a DB thing, no? :) – Yogesh Badke Apr 12 '19 at 08:19