
I recently discovered GridFS which I'd like to use for file storage with metadata. I just wondered if it's possible to use a MongoRepository to query GridFS? If yes, can someone give me an example?

I'd also take a solution using Hibernate, if there is one.

The reason is: my metadata contains a lot of different fields, and it would be much easier to query a repository than to write a new Query(Criteria.where(...)) for each scenario. Hopefully I could then also take the resulting Java object and expose it via a REST API without the file itself.

EDIT: I'm using

  • Spring 4 Beta
  • Spring Data Mongo 1.3.1
  • Hibernate 4.3 Beta
Benjamin M

2 Answers


There is a way to solve this:

@Document(collection="fs.files")
public class MyGridFsFile {

    @Id
    private ObjectId id;
    public ObjectId getId() { return id; }

    private String filename;
    public String getFilename() { return filename; }

    private long length;
    public long getLength() { return length; }

    ...

}

You can write a normal Spring Data MongoDB repository for this class, which at least lets you query the fs.files collection through a repository. However, you cannot access the file contents this way.

To get the file contents themselves, you have (at least) two options:

  1. Use GridFSDBFile file = gridFsOperations.findOne(Query.query(Criteria.where("_id").is(id))); InputStream is = file.getInputStream();

  2. Have a look at the source code of GridFSDBFile. There you can see how it internally queries the fs.chunks collection and fills the InputStream.

(Option 2 is really low level. Option 1 is a lot easier, and its code is maintained by the MongoDB Java driver developers, so Option 1 would be my choice.)
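
To make Option 2 a bit less abstract, here is a minimal in-memory sketch of the idea, not the driver's actual code: a file is cut into fixed-size chunks that each carry a sequence number n, and reading means sorting the chunks by n and streaming them back to back. The names ChunkDemo, Chunk, split, open and readAll, as well as the tiny chunk size, are hypothetical and chosen only for illustration. Real GridFS keeps the chunks as documents in fs.chunks.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class ChunkDemo {

    /** Hypothetical stand-in for one fs.chunks entry: a sequence number plus a slice of the data. */
    static final class Chunk {
        final int n;
        final byte[] data;
        Chunk(int n, byte[] data) { this.n = n; this.data = data; }
    }

    /** Cuts the content into chunks of at most chunkSize bytes, numbered from 0. */
    static List<Chunk> split(byte[] content, int chunkSize) {
        List<Chunk> chunks = new ArrayList<>();
        int n = 0;
        for (int offset = 0; offset < content.length; offset += chunkSize) {
            int end = Math.min(offset + chunkSize, content.length);
            chunks.add(new Chunk(n++, Arrays.copyOfRange(content, offset, end)));
        }
        return chunks;
    }

    /** Sorts the chunks by n and exposes them as one continuous InputStream. */
    static InputStream open(List<Chunk> chunks) {
        List<Chunk> sorted = new ArrayList<>(chunks);
        sorted.sort(Comparator.comparingInt((Chunk c) -> c.n));
        List<InputStream> parts = new ArrayList<>();
        for (Chunk c : sorted) {
            parts.add(new ByteArrayInputStream(c.data));
        }
        return new SequenceInputStream(Collections.enumeration(parts));
    }

    /** Drains a stream into a byte array. */
    static byte[] readAll(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] original = "hello gridfs chunks".getBytes(StandardCharsets.UTF_8);
        List<Chunk> chunks = split(original, 8);   // 19 bytes -> 3 chunks
        Collections.reverse(chunks);               // simulate chunks arriving out of order
        byte[] restored = readAll(open(chunks));
        System.out.println(new String(restored, StandardCharsets.UTF_8)); // prints "hello gridfs chunks"
    }
}
```

The sort by n is the important part: the chunks only yield the original file if they are streamed in sequence order.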


Updating GridFS entries:

  • GridFS is not designed to update file content!
  • Updating only the metadata field can be useful, though. The other fields are more or less static.

You should be able to simply use your custom MyGridFsFileRepo's update method. I suggest creating a setter only for the metadata field.


Different metadata for different files:

I solved this using an abstract MyGridFsFile class with generic metadata, i.e.:

@Document(collection="fs.files")
public abstract class AbstractMyGridFsFile<M extends AbstractMetadata> {

    ...

    private M metadata;
    public M getMetadata() { return metadata; }
    void setMetadata(M metadata) { this.metadata = metadata; }

}

Each implementation has its own AbstractMetadata implementation associated with it. How does this work? AbstractMetadata always has a field called type, and through it I can find the right AbstractMyGridFsFile implementation. For the same reason I also have a generic abstract repository.

Btw: In the meantime I have switched from a Spring Data repository to plain access via MongoTemplate here, like:

protected List<A> findAll(Collection<ObjectId> ids) {
    List<A> files = mongoTemplate.find(Query.query(Criteria
            .where("_id").in(ids)
            .and("metadata.type").is(type) // this is hardcoded for each repo impl
    ), typeClass); // this is the corresponding impl of AbstractMyGridFsFile
    return files;
}
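
The type and typeClass members used above are not shown in this answer. As a rough illustration of how such a generic abstract repository might be laid out, here is a self-contained in-memory sketch in which a plain list stands in for MongoTemplate and the fs.files collection. Everything in it is hypothetical (TypedRepoDemo, StoredFile, ImageMetadata, PdfMetadata, ImageRepo, the "image" and "pdf" type values), and for brevity findAll returns the typed metadata instead of the file objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class TypedRepoDemo {

    /** Base metadata: every implementation carries a type discriminator. */
    static abstract class AbstractMetadata {
        final String type;
        AbstractMetadata(String type) { this.type = type; }
    }

    static final class ImageMetadata extends AbstractMetadata {
        final int width;
        ImageMetadata(int width) { super("image"); this.width = width; }
    }

    static final class PdfMetadata extends AbstractMetadata {
        final int pages;
        PdfMetadata(int pages) { super("pdf"); this.pages = pages; }
    }

    /** Hypothetical stand-in for one fs.files document. */
    static final class StoredFile {
        final String id;
        final AbstractMetadata metadata;
        StoredFile(String id, AbstractMetadata metadata) { this.id = id; this.metadata = metadata; }
    }

    /** Generic base repository: each subclass hardcodes its type and metadata class. */
    static abstract class AbstractFileRepo<M extends AbstractMetadata> {
        private final List<StoredFile> collection; // stands in for fs.files
        private final String type;
        private final Class<M> metadataClass;

        AbstractFileRepo(List<StoredFile> collection, String type, Class<M> metadataClass) {
            this.collection = collection;
            this.type = type;
            this.metadataClass = metadataClass;
        }

        /** Same filter as the query above: id is in ids AND metadata.type equals this repo's type. */
        List<M> findAll(Collection<String> ids) {
            List<M> result = new ArrayList<>();
            for (StoredFile file : collection) {
                if (ids.contains(file.id) && type.equals(file.metadata.type)) {
                    result.add(metadataClass.cast(file.metadata));
                }
            }
            return result;
        }
    }

    /** Concrete repository: only supplies the hardcoded type and class. */
    static final class ImageRepo extends AbstractFileRepo<ImageMetadata> {
        ImageRepo(List<StoredFile> collection) { super(collection, "image", ImageMetadata.class); }
    }

    public static void main(String[] args) {
        List<StoredFile> fsFiles = Arrays.asList(
                new StoredFile("a", new ImageMetadata(640)),
                new StoredFile("b", new PdfMetadata(12)),
                new StoredFile("c", new ImageMetadata(800)));
        List<ImageMetadata> found = new ImageRepo(fsFiles).findAll(Arrays.asList("a", "b"));
        System.out.println(found.size() + " image(s), width " + found.get(0).width); // prints "1 image(s), width 640"
    }
}
```

Because the type check lives in the base class, a concrete repository never returns files that belong to another metadata type, even when their ids are in the requested set.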

Hope this helps. I can write more, if you need more information about this. Just tell me.

Benjamin M
  • Thanks for your help. But do you have the file as part of your MyGridFsFile class and how do you save it through repositories? – Sami Aug 05 '14 at 14:19
  • My repository has a custom method, which uses `gridFsOperations.store(...)` to save new files. The InputStream itself is not part of `MyGridFsFile`; I retrieve it via `myRepo.getInputStreamForFile(MyGridFsFile file)`. This method then calls `gridFsOperations.findOne(/* via file.getId() */).getInputStream()`. ... Of course you could inject the InputStream-retrieving mechanism into your `MyGridFsFile`, but then you'd have some code logic inside this POJO, which isn't nice, but it would work. – Benjamin M Aug 05 '14 at 14:28
  • What I am still not getting is how `file.getId()` will match the Id of the file in `fs.files`. Do you store your `MyGridFSFile` object in a normal Mongo document and the file in GridFS? If yes, then how are they linked? – Sami Aug 06 '14 at 14:17
  • **PART 1:** Okay, let's start at the beginning! GridFS is basically just two MongoDB collections: `fs.files` and `fs.chunks`. `fs.files` stores things like `id`, `filename`, `md5`, etc. and `fs.chunks` stores the file contents. So when you use GridFS to store a file, it will simply create an entry in `fs.files` and (depending on file size) a few entries in `fs.chunks`. GridFS is not a separate data store; it is just those two standard Mongo collections. So it's no problem to save a file using `gridFsTemplate` and afterwards run normal queries on `fs.files`. – Benjamin M Aug 06 '14 at 14:36
  • **PART 2:** Example: Save an image using `gridFsOperations.store(inputStream, filename, contentType, metadata);`. And afterwards query `fs.files` like: `mongoTemplate.find(new Query(), MyGridFsFile.class)`. It will return a list of all files stored in GridFS (it looks at the `@Document` annotation of `MyGridFsFile` to find the right collection to query). Now you can call `getId()` on the returned `MyGridFsFile`. And then you can do `GridFSDBFile file = gridFsOperations.findOne(Query.query(Criteria.where("_id").is(id)))` and call `file.getInputStream()` to retrieve the actual file contents. – Benjamin M Aug 06 '14 at 14:42
  • **PART 3:** The Mongo Java Driver: Have a look at the source code of the `com.mongodb.gridfs.GridFS` class. There you can see how files get persisted. It uses `_bucketName+".files"` and `_bucketName+".chunks"`, where `_bucketName` by default equals `fs`. And if you look at the `com.mongodb.gridfs.GridFSDBFile` source code, you can see how it writes the stored chunks out to a file or stream (`writeTo` methods), and how it streams the chunks in the right order to generate an `InputStream` (`getInputStream` method). It's quite low level what happens in there `;)` – Benjamin M Aug 06 '14 at 14:54
  • @BenjaminM I'm using GridFS to store media files. I have followed your guidelines but am finding it quite hard, as I'm quite new to Spring Data. Any chance you could help me out some time? – C_B Jun 18 '16 at 17:09

You can create a GridFS object with the database from your MongoTemplate, and then interact with that:

MongoTemplate mongoTemplate = new MongoTemplate(new Mongo(), "GetTheTemplateFromSomewhere");
GridFS gridFS = new GridFS(mongoTemplate.getDb());

The GridFS object lets you create, delete, and find files, among other operations.

Trisha