Whether you are migrating from on-premises to the cloud or between cloud providers, one part of the process is almost always the most tedious: transferring data from one storage repository to another.
Depending on the volume of data, this process can take several hours or even several days. The traditional approach involves deploying servers, establishing connectivity, and writing scripts. However, for users who want to transfer data to Google Cloud Storage from a server or from a cloud repository such as Amazon S3 or Azure Blob, there is a better and more enjoyable way to do this.
In this post we’ll take a close look at how you can use Google Cloud Storage Transfer Service to move data between any server (whether on premises or with another cloud provider) and Google Cloud. This can be a big help to users who have decided to migrate to Google Cloud or who plan to use Cloud Volumes ONTAP on Google Cloud.
What Is the Google Cloud Storage Transfer Service?
The Google Cloud Storage Transfer Service is a built-in feature of Google Cloud Storage, GCP’s object storage service. It lets you create and run transfer operations into Cloud Storage in a fully managed, serverless way, without the need for any custom code or infrastructure. It supports both one-time and scheduled recurring transfers.
This powerful service makes the migration to Google Cloud Storage much faster and decreases the operational overhead. It is also flexible, supporting multiple sources of data: Amazon S3, Azure Blob, an HTTP/HTTPS location, or another Google Cloud Storage bucket. For users who are in AWS now but are switching to a deployment in Google Cloud, the Cloud Storage Transfer Service is an optimal tool to use.
How to Transfer Data to Google Cloud Storage Using the Cloud Storage Transfer Service
To get started with the Google Cloud Storage Transfer Service and initiate the transfer of data, we need a couple of things in place. In addition to a Linux server with our files stored on it, we need to make those files available via HTTP/HTTPS so that the Storage Transfer Service can reach them.
Make the Source Files Available via HTTP/HTTPS
The first step is to make your source files available over the internet.
1. On your Linux server, make sure you have installed curl, OpenSSL, and Python 3.
2. Create a file named generate.sh in the directory where your data is located. The file will be used to generate an index.tsv that contains a list of files to be transferred. Your generate.sh file should contain the following:
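#!/bin/bash
# Build a URL list: header line, then URL, size in bytes, and base64 MD5 per file.
TSV="TsvHttpData-1.0"
IP=$(curl -s http://ipecho.net/plain)
for i in $(ls -I generate.sh -I index.tsv); do
  filesize=$(stat -c%s "$i")
  MD5Hash=$(openssl md5 -binary "$i" | openssl enc -base64)
  TSV+="\nhttp://$IP/$i\t$filesize\t$MD5Hash"
done
echo -e "$TSV" > index.tsv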
3. The script can be customized if needed. When ready, run it with /bin/bash generate.sh to generate a file named index.tsv.
4. Start a web server in the directory where your files are located by running sudo python3 -m http.server 80 (binding to port 80 requires root privileges, hence sudo).
Generate the tsv index file and start a webserver in the current directory.
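Before pointing the transfer service at the index, it can be worth sanity-checking the file. The following is a minimal sketch, not part of the original procedure: it assumes the three-column layout that generate.sh is meant to produce (a TsvHttpData-1.0 header, then tab-separated URL, size in bytes, and base64 MD5 on each line). The function name validate_index is hypothetical.

```shell
#!/bin/bash
# validate_index: hypothetical helper that checks the layout of a URL list file.
# Prints one message per problem and returns non-zero if any problem is found.
validate_index() {
  awk -F'\t' '
    NR == 1 {
      if ($0 != "TsvHttpData-1.0") { print "line 1: bad header"; bad = 1 }
      next
    }
    NF != 3              { print "line " NR ": expected 3 tab-separated fields"; bad = 1; next }
    $1 !~ /^https?:\/\// { print "line " NR ": first field is not an HTTP(S) URL"; bad = 1 }
    $2 !~ /^[0-9]+$/     { print "line " NR ": size is not an integer"; bad = 1 }
    # A 16-byte MD5 digest always encodes to 24 base64 characters.
    length($3) != 24     { print "line " NR ": MD5 is not 24 base64 characters"; bad = 1 }
    END { if (NR == 0) bad = 1; exit bad + 0 }
  ' "$1"
}
```

Running validate_index index.tsv && echo "index looks OK" in the data directory gives a quick pass/fail signal before the transfer job is created.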
Create a New Google Storage Transfer Operation
Now we need to transfer the data to Google Cloud Storage.
1. Start by navigating to the Storage dashboard in the Google Cloud Platform Console. In the left-hand panel, select the option “Transfer,” and initiate the process by clicking the “Create transfer” button.
The first Transfer screen in Google Cloud Platform Storage.
2. Now select the source of our data. In this case, we will choose “List of object URLs.”
Enter the URL of the TSV file. In our example, that is http://18.104.22.168/index.tsv. Click the “Continue” button to move forward.
The initial step of creating a transfer is to define the data source.
3. With the source defined and saved, we need to select the Google Cloud Storage bucket to use as a destination and then set the schedule of the transfer operation. Note that the schedule can be configured as either a one-time or a recurring operation. Keep the default setting of “Run now” and then click “Save” to finalize the transfer job creation and initiate the operation.
Selecting the destination and schedule for the transfer.
4. Since the transfer job was created and configured with a schedule to run once, the transfer operation will start immediately. The screen will show all the transfer operations and their current status. If all goes well, when the transfer is finished, you will see the status change to “Completed.”
Here you can see the progress status of the transfer operation.
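The console steps above can also be approximated from the command line. The sketch below is an assumption-laden illustration rather than the procedure used in this post: it assumes the gcloud CLI is installed and authenticated, and that gcloud transfer jobs create accepts a URL list as its source. The bucket name and the helper build_transfer_cmd are hypothetical. The helper only prints the command so that it can be reviewed before anything is run.

```shell
#!/bin/bash
# Placeholder values: the index URL is the one from the example above,
# and the bucket name is hypothetical.
INDEX_URL="http://18.104.22.168/index.tsv"
DEST_BUCKET="gs://my-destination-bucket"

# build_transfer_cmd: hypothetical helper that prints the gcloud command
# instead of running it.
# $1 = URL of the TSV index, $2 = destination bucket,
# $3 = optional repeat interval (for example 1d); omit for a one-time run.
build_transfer_cmd() {
  local cmd=(gcloud transfer jobs create "$1" "$2")
  if [ -n "${3:-}" ]; then
    cmd+=("--schedule-repeats-every=$3")
  fi
  printf '%s\n' "${cmd[*]}"
}

build_transfer_cmd "$INDEX_URL" "$DEST_BUCKET"       # one-time transfer
build_transfer_cmd "$INDEX_URL" "$DEST_BUCKET" 1d    # recurring, once a day
```

Printing rather than executing keeps the sketch safe to run anywhere. Once the printed command looks right, it can be pasted into a shell where gcloud is configured.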
Verifying the Transfer to Google Cloud Storage
Now that the transfer has completed, you can head to the Bucket Browser in the left-hand panel menu and see the contents of the bucket. All the files listed in index.tsv on your Linux server will now be visible in your GCP cloud storage bucket.
Under the bucket details we can view and verify the transferred files.
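Beyond eyeballing the bucket, one quick check is to compare the number of objects and total bytes in the bucket against what index.tsv listed. The following sketch only summarizes the index locally. The helper name summarize_index is hypothetical, and it assumes the size is the second tab-separated column, as in the index generated earlier.

```shell
#!/bin/bash
# summarize_index: hypothetical helper that prints "<object count> <total bytes>"
# for a URL list file, skipping the header line.
summarize_index() {
  awk -F'\t' '
    NR > 1 && NF >= 2 { n++; bytes += $2 }
    END { printf "%d %.0f\n", n, bytes }
  ' "$1"
}
```

The two numbers can then be compared by hand with the totals reported for the destination, for example by the console or by a command such as gsutil du -s on the bucket (assuming gsutil is installed and the bucket holds only the transferred objects).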
The ability to use the Google Cloud Storage Transfer Service to automate data transfer to a Google Cloud Storage bucket from any server, whether on-premises or in another public cloud, or from storage services such as Azure Blob and Amazon S3, is a very useful and time-saving feature. As demonstrated, it is out-of-the-box functionality that does not require custom infrastructure or code, which significantly reduces operational overhead and accelerates your Google Cloud migration.
For more on how to use Google Cloud, check out our articles on how to use the gsutil tool, how to switch between Google Cloud Storage classes, and the different storage offerings on Google Cloud.
Since you are learning about Google’s storage service offerings and built-in capabilities, you might be interested in exploring additional premium services from NetApp that can be an integral part of deploying in Google Cloud. To move data from any repository to the cloud, there is the Cloud Sync data migration service, and for a complete data management platform there is Cloud Volumes ONTAP. The same storage management capabilities that have been widely used in Cloud Volumes ONTAP for AWS and Azure, such as storage efficiency, data protection, and cloning, are now available for Google Cloud.
Want to try it out yourself? Sign up here to get a 30-day preview of Cloud Volumes ONTAP for Google Cloud.