3

I have two questions:

  1. Since storage bucket names are globally unique, how do I keep the bucket name exactly the same in the development and production environments? Or what are the best practices for dev and prod environments in a data-based environment?

  2. How do I copy data from one project to another? I tried searching, but I could not find an efficient way to copy between two projects.

PS: Storage Transfer allows copying between two buckets within the same project, not across projects. I was not able to find a bucket from a different project even with the search option. I searched using gs://another-project-bucket

Darpan Patel
  • What do you mean copy data from one project to another? Do you mean copy data from one GCS bucket to another? – Kolban Dec 18 '19 at 19:59
  • You might want to split this into multiple questions. Stack Overflow posts are usually expected to contain one question each. – Kolban Dec 18 '19 at 20:00
  • Personally, I created two projects, myapp-prod and myapp-test. In every file or command, I pass the project name as a variable. I think this is the easiest way if you are just starting out. Which GCP products are you using? – ThisIsMyName Dec 18 '19 at 20:39
  • @Kolban Resources in GCP are organized into projects. To separate environments (dev and prod), separate projects are created. Both projects have Cloud Storage buckets, but they cannot communicate directly without additional permissions being granted, because the project is the boundary. Buckets, however, are a global resource, so their names must be globally unique, beyond the boundary of the project. – Darpan Patel Dec 19 '19 at 20:38
  • @ThisIsMyName Thanks for the recommendation. The major functional components (apart from networking) used in the projects are Cloud Storage, BigQuery, Pub/Sub, Cloud Functions, Dataprep, and Cloud Scheduler. – Darpan Patel Dec 19 '19 at 20:45

4 Answers

4

First question

Since storage bucket names are globally unique, how do I keep the bucket name exactly the same in the development and production environments? Or what are the best practices for dev and prod environments in a data-based environment?

You are correct. As far as Google Cloud Storage buckets are concerned, bucket names reside in a single Cloud Storage namespace. As per the documentation, this means that:

Every bucket name must be unique. Bucket names are publicly visible. If you try to create a bucket with a name that already belongs to an existing bucket, Cloud Storage responds with an error message. However, once you delete a bucket, you or another user can reuse its name for a new bucket.

As for best practices for development and production environments, I would say that the so-called "separation of concerns" is the best option here: one project dedicated to development and a separate project dedicated to production. You can still run both environments, dev and prod, within a single project, although this option is not ideal in some cases.
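
Because the two projects cannot share one bucket name, one possible convention (an assumption for illustration, not something the documentation prescribes) is to derive each bucket name from the project ID plus a shared suffix. The sketch below uses the hypothetical project IDs myapp-dev and myapp-prod and a simplified check of the bucket naming rules; it does not guarantee that the resulting name is still free.

```python
import re

# Simplified check of the Cloud Storage naming rules for names without dots:
# 3-63 characters, lowercase letters, digits, dashes and underscores,
# starting and ending with a letter or digit.
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{1,61}[a-z0-9]$")


def bucket_name(project_id: str, purpose: str) -> str:
    """Return '<project_id>-<purpose>' (a hypothetical naming convention)."""
    name = f"{project_id}-{purpose}".lower()
    if not _BUCKET_RE.match(name):
        raise ValueError(f"invalid bucket name: {name!r}")
    return name


# The same logical bucket gets a distinct name in each environment.
for project in ("myapp-dev", "myapp-prod"):
    print(bucket_name(project, "raw-data"))
```

Code that needs the bucket then only has to know the project it runs in and the suffix, so the same code works in both environments.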


Second question

How do I copy data from one project to another? I tried searching, but I could not find an efficient way to copy between two projects.

The answer can vary for this question:

  1. You can copy GCS bucket objects across projects using the gsutil cp command, the REST APIs, or the GCS client libraries (Java, Node.js, Python). More info can be found here.
  2. You can also achieve this using the Storage Transfer Service to move data from one Cloud Storage bucket to another, so that it is available to different groups of users or applications. Check the link for more information.

An example using gsutil cp would be as follows:

gsutil cp gs://[SOURCE_BUCKET_NAME]/[SOURCE_OBJECT_NAME] gs://[DESTINATION_BUCKET_NAME]/[NAME_OF_COPY]

Where:

[SOURCE_BUCKET_NAME] is the name of the bucket containing the object you want to copy. For example, my-bucket.

[SOURCE_OBJECT_NAME] is the name of the object you want to copy. For example, pets/dog.png.

[DESTINATION_BUCKET_NAME] is the name of the bucket where you want to copy your object. For example, another-bucket.

[NAME_OF_COPY] is the name you want to give the copy of your object. For example, shiba.png.
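
To make the placeholders concrete, here is a minimal sketch that fills in the template with the example values listed above and prints the resulting command. The helper name gsutil_cp_command is hypothetical; it only assembles the argument list and does not contact Cloud Storage.

```python
import shlex


def gsutil_cp_command(src_bucket: str, src_object: str,
                      dst_bucket: str, dst_object: str) -> list[str]:
    """Build the argument list for a bucket-to-bucket 'gsutil cp'."""
    source = f"gs://{src_bucket}/{src_object}"
    destination = f"gs://{dst_bucket}/{dst_object}"
    return ["gsutil", "cp", source, destination]


cmd = gsutil_cp_command("my-bucket", "pets/dog.png", "another-bucket", "shiba.png")
# Print the command as it would be typed in a shell.
print(shlex.join(cmd))
```

To actually run the copy, the list could be passed to subprocess.run(cmd, check=True), assuming gsutil is installed and authenticated with access to both buckets.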


IMPORTANT: Make sure that you have the correct permissions set to perform this type of operation (read access on the source bucket and write access on the destination bucket).

You can also check How can I move data directly from one Google Cloud Storage project to another?.

sllopis
  • The Cloud Storage Data Transfer service only allows transfers between buckets in the same project, not across projects. However, gsutil cp works, and I have used that method. – Darpan Patel Dec 30 '19 at 17:09
  • @DarpanPatel With the Cloud Storage Data Transfer service you can move data across different projects; you just have to grant the correct permissions. [Here](https://stackoverflow.com/a/56982037/8791788) is a post that shows how to do that. – Nibrass H Dec 31 '19 at 14:16
1
  1. As a best practice I'd recommend using different buckets for production and development, to avoid potentially having untested dev code impact production data.

  2. Copying is efficient (metadata-only, no data copying) if the source and destination objects have the same location and storage class.

Mike Schwartz
0

How do I copy data from one project to another? I tried searching, but I could not find an efficient way to copy between two projects.

  1. Create two projects:

    gcloud projects create env-proj
    gcloud projects create env-proj2
    
  2. Set project property to source project:

    gcloud config set project env-proj
    
  3. Create a file in source project:

    nano file
    cat file
    # This is a file 
    
  4. Create a bucket in source project:

    gsutil mb gs://testbucket-env
    
  5. Copy the file to the bucket created:

    gsutil cp file gs://testbucket-env
    
  6. Set project property to destination project:

    gcloud config set project env-proj2
    
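  7. Move the file out of the source bucket while the destination project is active (note that this command places the file in the local working directory, not in a bucket):

    gsutil mv gs://testbucket-env/file file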
    
  8. Testing:

    cat file
    # This is a file 
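
The steps above end with the file on local disk. As a sketch of a direct bucket-to-bucket alternative (not part of the original answer), the helper below assembles a gsutil rsync invocation. The destination bucket name testbucket-env2 and the function name are hypothetical, and the account running the command is assumed to have access to both buckets.

```python
import shlex


def gsutil_rsync_command(src_bucket: str, dst_bucket: str) -> list[str]:
    """Build a recursive, parallel 'gsutil rsync' between two buckets."""
    # -m runs operations in parallel; -r recurses into "subdirectories".
    return ["gsutil", "-m", "rsync", "-r",
            f"gs://{src_bucket}", f"gs://{dst_bucket}"]


sync_cmd = gsutil_rsync_command("testbucket-env", "testbucket-env2")
print(shlex.join(sync_cmd))
```

Because both buckets are addressed by their gs:// URLs, the active gcloud project does not need to be switched between the two projects for this to work.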
    
marian.vladoi
  • Thanks, I will definitely try this one, as it is very comprehensive. – Darpan Patel Dec 19 '19 at 20:41
  • Your process only copies the file from a bucket to the Cloud Shell environment, not to another bucket. However, as I have access to both projects, I can address both buckets directly with gs:// and copy from one bucket to the other with just the cp command. – Darpan Patel Dec 20 '19 at 20:32
0

Create an environment variable in each environment, such as env=prod and env=dev. Then prefix your folders with the variable's value, so that in dev everything is named devfiles, devdata, etc., and in prod prodfiles, proddata, etc. Do the same for your database tables, and then use the env variable in the code whenever you refer to them, e.g. f'{env}files' for the files directory, f'{env}users' for your users table, and so on.

Everything will then use the proper file, directory, table, etc. for the environment it is running in, without any code changes. Always!

And that, my friend, saves a ton of headaches.
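
A minimal sketch of this idea is shown below. The helper name resource_name and the fallback to "dev" when the variable is unset are assumptions added for illustration; the variable name env follows the answer.

```python
import os
from typing import Optional


def resource_name(base: str, env: Optional[str] = None) -> str:
    """Prefix a folder or table name with the current environment."""
    # Fall back to the 'env' environment variable, then to 'dev' (assumed default).
    env = env or os.environ.get("env", "dev")
    return f"{env}{base}"


# Passing the environment explicitly, as a test might:
print(resource_name("files", "prod"))  # prodfiles
print(resource_name("users", "dev"))   # devusers
```

In application code the second argument would normally be omitted, so the prefix comes from the environment the process is running in.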

Rick Il Grande