
Earlier, I asked a question https://stackoverflow.com/questions/35581090/can-i-use-resumable-upload-for-gae-blobstore-api about resumable uploading with the Blobstore API. I concluded that resumable uploading cannot be implemented with the Blobstore API, so now I am trying Google Cloud Storage with the Java client library. So far I can upload my video file to a bucket and serve the video. My servlet looks like the one in the Google example:

      private final GcsService gcsService =
          GcsServiceFactory.createGcsService(RetryParams.getDefaultInstance());

      // 2 MB buffer used when copying the request body into the channel.
      private static final int BUFFER_SIZE = 2 * 1024 * 1024;

      @Override
      public void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        GcsOutputChannel outputChannel =
            gcsService.createOrReplace(getFileName(req), GcsFileOptions.getDefaultInstance());
        copy(req.getInputStream(), Channels.newOutputStream(outputChannel));
      }

      private GcsFilename getFileName(HttpServletRequest req) {
        String[] splits = req.getRequestURI().split("/", 4);
        if (!splits[0].equals("") || !splits[1].equals("gcs")) {
          throw new IllegalArgumentException("The URL is not formed as expected. " +
              "Expecting /gcs/<bucket>/<object>");
        }
        return new GcsFilename(splits[2], splits[3]);
      }

      private void copy(InputStream input, OutputStream output) throws IOException {
        try {
          byte[] buffer = new byte[BUFFER_SIZE];
          int bytesRead = input.read(buffer);
          while (bytesRead != -1) {
            output.write(buffer, 0, bytesRead);
            bytesRead = input.read(buffer);
          }
        } finally {
          input.close();
          output.close();
        }
      }

Now I need to implement

  • resumable upload (because of poor internet connections on mobile devices)
  • chunked upload (because of the 32 MB limit on a single request)

I realized that the server side of a resumable upload has to be organized manually: my backend should be able to report the range of uploaded chunks and allow the client to continue writing to the OutputChannel.

The documentation for the GcsOutputChannel says:

This class is serializable, this allows for writing part of a file, serializing the GcsOutputChannel, deserializing it, and continuing to write to the same file. The time for which a serialized instance is valid is limited and determined by the Google Cloud Storage service

I don't have much experience, so this may be a naive question: can somebody tell me how to serialize my GcsOutputChannel? I don't understand where I can keep the serialized object.

By the way, does anyone know for how long the Google Cloud Storage service keeps that serialized object valid?


1 Answer


You can serialize your GcsOutputChannel using any Java serialization means (typically with ObjectOutputStream). If you run on App Engine, you probably want to save the serialized bytes in the Datastore (as a Datastore Blob). See this link for how to convert the serialized object to and from a byte array.
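A minimal sketch of that round trip, using only standard `java.io` serialization. The `SerializationUtil` class and its `toBytes`/`fromBytes` names are mine, not part of the GCS client library; they work for any `Serializable` object, which per the docs includes `GcsOutputChannel`:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationUtil {

  // Flatten any Serializable (e.g. a GcsOutputChannel) into bytes
  // suitable for storing in a Datastore Blob property.
  public static byte[] toBytes(Serializable obj) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(obj);
    }
    return bytes.toByteArray();
  }

  // Restore the object in a later request so writing can continue.
  @SuppressWarnings("unchecked")
  public static <T extends Serializable> T fromBytes(byte[] data)
      throws IOException, ClassNotFoundException {
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(data))) {
      return (T) in.readObject();
    }
  }
}
```

On App Engine you would wrap the `byte[]` in `com.google.appengine.api.datastore.Blob`, put it on an entity keyed by something like your upload-session id, fetch it back in the next request, deserialize, write the next chunk, and close the channel only after the final chunk.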

ozarov
  • Yes, you understood correctly, I run on GAE. I thought about using the Blobstore API, or even persisting the serialized object to the same place, Cloud Storage. **But is that not excessive?** I mean, does the Google API not have a special mechanism for storing a serialized GcsOutputChannel? It is written in the docs: _"The time for which a serialized instance is valid is limited and determined by the Google Cloud Storage service"_. Anyway, thank you very much, you've convinced me that saving the serialized object in storage is also a possible solution – Yelizaveta Tilinina Mar 07 '16 at 10:12
  • No, there is not a specialized way to store GcsOutputChannel however there is a way to make it **storable**, so you can store it (basically the handle to the GCS resumable write and any content that could not be written [as GCS requires a minimum chunk of 256 KB except last]) any place you want. As said, a common place for that would be the Datastore. I forgot to mention that such handle is valid for one week (enforced by the service). – ozarov Mar 07 '16 at 17:29
  • Ok, thank you. One last small question about that. Do I understand correctly that the serialized object will take more space (because of the serialization algorithm) than the number of bytes written by me? – Yelizaveta Tilinina Mar 09 '16 at 14:46
  • I am not sure what you mean by "number of bytes written by me"? It is not going to hold all the bytes that were written to the GcsOutputChannel but rather only the ones that could not be flushed (reminder: less than 256K). The overhead of the serialization is negligible in this case. – ozarov Mar 10 '16 at 00:31
  • Oh, sorry. I think I'm starting to understand. You mean that I have to save only the GcsOutputChannel object, which holds a link to my data, not the data itself. Apparently, I had not properly understood what the channel represents. – Yelizaveta Tilinina Mar 10 '16 at 14:21
  • You write your data to the GcsOutputChannel, and when you are done writing *all* your data for the blob you close the channel (and then you are done with it). If you don't have all the data in one request (or can't have it in one request, per your 32 MB limit) and would like to do something like "write chunk[s] in request1", "write more chunks in request2", ... "write last chunk in requestN", then you can serialize the GcsOutputChannel in request X and continue using it after you deserialize it in request X + 1. In that case you only close the channel after the last write of the last request. – ozarov Mar 10 '16 at 17:15
  • Yes, that part I understand. What I don't understand is where the data written in request1...requestX is stored before the GcsOutputChannel is serialized, if you say that the serialized GcsOutputChannel will fit in the Datastore – Yelizaveta Tilinina Mar 11 '16 at 10:52
  • When you are using GcsOutputChannel, any writes to it are actually written to Google Cloud Storage (in blocks of 256K). If you decide that you can't write any more and would like to continue in some other request, then you can serialize the GcsOutputChannel (which, as said, holds a handle to the blob you are writing to) and re-use it later (up to a week). Only when you close that channel will the file be finalized and visible for read operations. I am unfamiliar with your application and was assuming that you want to write the data from a client in smaller chunks across several requests. – ozarov Mar 12 '16 at 19:09
  • Ok, thank you very much, I think I understand – Yelizaveta Tilinina Mar 14 '16 at 11:16