
I have a Java app on OpenShift (Tomcat 7) and I need plenty of cheap storage (terabytes). OpenShift storage would obviously be too expensive, so I was thinking of Amazon S3.

  1. What would be the optimal way to get access to plenty of storage while keeping the app on OpenShift?

  2. Is it possible to somehow connect PostgreSQL running on OpenShift to Amazon S3, so that PostgreSQL would run on OpenShift but save everything on Amazon S3? Basically I am looking for whatever is cheaper to use, and that's why I am not sure about setting up PostgreSQL on AWS directly instead of having it on OpenShift.

Basically, the main issue is getting plenty of storage while having an app on OpenShift (or other cheap hosting for a Java/Tomcat project). Which DB, technology, or service is used does not matter, as long as it is free or cheap.

Nikita Vlasenko

1 Answer


> What would be the optimal way to get access to plenty of storage while keeping the app on OpenShift?

AWS S3 is the best option for you as far as price goes, and also from a durability and reliability point of view.
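As a rough sketch of that approach, the Java backend would push user files straight to S3 instead of through the database. The helper below only builds a per-user object key; the `users/<id>/<file>` key scheme and the bucket name in the comment are assumptions for illustration, and the actual upload call (AWS SDK for Java v1) is shown only as a comment:

```java
public class S3KeyScheme {
    // Builds a deterministic S3 object key per user. The "users/<id>/<file>"
    // prefix layout is a hypothetical convention, not an S3 requirement.
    public static String objectKey(String userId, String fileName) {
        // S3 keys can contain almost any character, but keeping them
        // URL-safe simplifies logging and presigned URLs later.
        String safeName = fileName.replaceAll("[^A-Za-z0-9._-]", "_");
        return "users/" + userId + "/" + safeName;
    }

    // The upload itself would use the AWS SDK, e.g. (v1, not compiled here;
    // "my-bucket" is a placeholder):
    //   AmazonS3 s3 = AmazonS3ClientBuilder.standard().build();
    //   s3.putObject("my-bucket", objectKey(userId, fileName), file);
}
```

With credentials provided via environment variables or an instance profile, the SDK picks them up through its default credential chain, so the servlet code never handles keys directly.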

> Is it possible to somehow connect PostgreSQL running on OpenShift to Amazon S3, so that PostgreSQL would run on OpenShift but save everything on Amazon S3? Basically I am looking for whatever is cheaper to use, and that's why I am not sure about setting up PostgreSQL on AWS directly instead of having it on OpenShift.

Yes, it is definitely possible to use AWS S3 alongside PostgreSQL running on OpenShift while saving file data to S3. Follow the steps in this blog post to configure an AWS S3 bucket store in OpenShift:

https://blog.openshift.com/how-to-configure-an-aws-s3-bucket-store-for-openshift/

From your comments, AWS DynamoDB would be a very good choice, and for huge data storage use S3; even AWS best practices suggest using S3 for large objects. Check this link: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForItems.html#GuidelinesForItems.StoringInS3
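The pattern that guideline describes amounts to storing only a *pointer* to the S3 object inside the database item, never the binary itself. A minimal sketch, where the attribute names are illustrative rather than an official schema:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MediaItem {
    // Builds a DynamoDB-style item holding metadata plus the S3 key of the
    // large object. Attribute names ("userId", "mediaId", "s3Key") are
    // hypothetical; the point is that only the key crosses into the DB.
    public static Map<String, String> toItem(String userId, String mediaId, String s3Key) {
        Map<String, String> item = new LinkedHashMap<>();
        item.put("userId", userId);   // partition key
        item.put("mediaId", mediaId); // sort key
        item.put("s3Key", s3Key);     // pointer to the binary in S3, not the binary itself
        return item;
    }
}
```

On retrieval the app reads the item, then fetches (or presigns a URL for) the object at `s3Key`, so the database stays small while S3 carries the terabytes.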

But looking at your use case, cost is going to be the most important factor you need to watch, or it will shoot up; using S3 is the best option for keeping the cost under control.

Piyush Patil
  • I believe it's **highly** unlikely that Postgres will be happy with its data directory being on S3. Same reasons as http://stackoverflow.com/questions/33295619/mysql-data-directory-on-s3. – ceejayoz Jul 10 '16 at 22:25
  • Not to mention latency would be horrendous. – ceejayoz Jul 10 '16 at 22:27
  • I already read it, but it is unclear: would just adding the environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`) to the existing project and restarting it be enough for the PostgreSQL running on OpenShift to use Amazon S3? – Nikita Vlasenko Jul 10 '16 at 22:31
  • So, maybe using `MongoDB` separately for images and videos is better? Would it integrate better with `S3`? If yes, is it possible to connect `MongoDB` on OpenShift to `S3`? – Nikita Vlasenko Jul 10 '16 at 22:39
  • I suggest testing it out in a dev environment to see if the files get copied correctly. If all is good in dev, you can add that to production. – Piyush Patil Jul 10 '16 at 22:39
  • @ceejayoz Please read the question: the user is talking about connecting AWS S3 to an app running in the OpenShift cloud using a Postgres DB; your link is about an error in a MySQL DB, so there is no connection between the two questions. – Piyush Patil Jul 10 '16 at 22:43
  • The DB is irrelevant; you can use MongoDB or Postgres, it doesn't matter. AWS S3 integrates well with OpenShift, and many of our OpenShift clients prefer AWS S3 as it is quite fast, durable, and also cheap. – Piyush Patil Jul 10 '16 at 22:46
  • S3 integrates well with OpenShift; it does not, however, integrate well with PostgreSQL or any other database. Not as anything other than a place to store backup files, anyway. – Mark B Jul 10 '16 at 22:50
  • Well, the user is going to run the DB in OpenShift and store files in S3, so I guess my answer is correct. – Piyush Patil Jul 10 '16 at 22:52
  • Also, @MarkB and ceejayoz, instead of down-voting my answer :) suggest the best cheap way to get huge storage with good durability and fast processing for the user's use case. Also, OpenShift is very expensive, FYI. – Piyush Patil Jul 10 '16 at 22:58
  • @error2007s I downvoted because you are giving bad information. You can't run a database using S3 as the "disk". – Mark B Jul 10 '16 at 22:59
  • What about shifting to Firebase? Does it integrate better with S3? – Nikita Vlasenko Jul 10 '16 at 23:02
  • I mean the DB running on Firebase, storing its data on S3. Or is it not possible at all, and the only solution is to use AWS's Postgres? – Nikita Vlasenko Jul 10 '16 at 23:03
  • It's not possible for a running database to store the "active" data on S3. That's just not going to work. S3 is great for backups though. – Mark B Jul 10 '16 at 23:06
  • OK, so what storage would you propose to store TBs of video/photo data that users would generate using the app? What in your opinion is optimal, cheaper? – Nikita Vlasenko Jul 10 '16 at 23:07
  • What is the DB going to be used for? – Piyush Patil Jul 10 '16 at 23:08
  • For storing structured data, but it would have plenty of videos and photos too. An app will sign up users, and then users will generate plenty of data that would all need to be stored for years, so obviously the amount of data needs to be huge. – Nikita Vlasenko Jul 10 '16 at 23:10
  • Basically, the iOS app has a backend in Spring. The app needs plenty of storage. – Nikita Vlasenko Jul 10 '16 at 23:11
  • And how frequently do you think the data will be accessed or retrieved? – Piyush Patil Jul 10 '16 at 23:13
  • On a scale of 1-10, maybe 6. There could be several thousand users (at the beginning) who will each store around 50 GB of data and access whatever data they generated, whenever they want. Not that they would use it every day, but maybe once a week for sure. If you take into account that there are thousands of them, it may turn out that every day someone accesses something. – Nikita Vlasenko Jul 10 '16 at 23:16
  • OK, so AWS DynamoDB would be a very good choice, and for huge data storage use S3; even AWS best practices suggest using S3 for huge data storage. Check this link http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForItems.html#GuidelinesForItems.StoringInS3 But looking at your use case, cost is going to be the most important factor to keep in check, or it will shoot up, so using S3 is the best option for keeping the cost under control. – Piyush Patil Jul 10 '16 at 23:18
  • Also, I edited my answer; if you think it is correct, please mark it correct. – Piyush Patil Jul 10 '16 at 23:21
  • But then how do I make PostgreSQL store data on S3, if no DB on OpenShift can be integrated with S3? I am confused. – Nikita Vlasenko Jul 10 '16 at 23:29
  • S3 is not suitable for a database's primary storage. It's fine for backups, but it won't work as primary storage for MySQL, Postgres, MongoDB, etc. That said, if the 50GB of data is files like movies or something, S3 is suitable for that, because you shouldn't be storing the raw file binaries in a DB anyways. – ceejayoz Jul 10 '16 at 23:31
  • OK, got it. Thank you! – Nikita Vlasenko Jul 10 '16 at 23:35
  • Also, check this question: you can use AWS Data Pipeline to pipe data from AWS Postgres RDS to S3, if you decide on AWS to host your app: http://stackoverflow.com/questions/26781758/how-to-pipe-data-from-aws-postgres-rds-to-s3-then-redshift – Piyush Patil Jul 10 '16 at 23:37
  • AWS is too expensive. – Nikita Vlasenko Jul 10 '16 at 23:40
  • So, I will not use their postgre or hosting for sure. – Nikita Vlasenko Jul 10 '16 at 23:40
  • Giving users 50GB each won't be cheap *anywhere*. – ceejayoz Jul 11 '16 at 00:35
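The pattern the commenters converge on is: binaries live in S3, and the relational database (Postgres on OpenShift, in the question's setup) keeps only the S3 key and metadata. A minimal JDBC sketch of the database side, assuming a hypothetical `user_media` table:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class MediaMetadataDao {
    // Table and column names are assumptions for illustration, not a
    // schema from the question.
    static final String INSERT_SQL =
        "INSERT INTO user_media (user_id, s3_key, content_type, size_bytes) VALUES (?, ?, ?, ?)";

    // Records only the S3 pointer plus metadata after a successful upload;
    // the binary itself never touches the database.
    public static void recordUpload(Connection conn, long userId, String s3Key,
                                    String contentType, long sizeBytes) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setLong(1, userId);
            ps.setString(2, s3Key);
            ps.setString(3, contentType);
            ps.setLong(4, sizeBytes);
            ps.executeUpdate();
        }
    }
}
```

Serving a file then becomes: look up `s3_key` in Postgres, and redirect the client to S3 (ideally via a presigned URL), which keeps both the database and the app's bandwidth bill small.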