Testing reveals a single cloud volume multi GCE instance upper limit of roughly 240,000 IOPS. These data points were generated based on the results of multiple Vdbench runs from fifteen GCE instances. The results are indicative of the performance a customer could expect to achieve against a single volume from applications run in a scale out (multiple compute node) configuration.
The sequential tests were done using the tools and configuration as procedures as above. In this case the maximum amount of throughput achievable against a single volume is ~3.4GiBps for both 100% reads and mixed 50% read/write workloads.
Testing reveals that the Cloud Volumes Service performs at minimal latency on Google Cloud Platform. Running in the us-central1 region, NFSv3 workloads drove ~168,000 IOPS before reaching the 2ms latency mark.
The table on the right pulls results generated during the testing for NFSv3 workload on Cloud Volumes Service for Google Cloud and information on Cloud Filestore listed by Google Cloud. Cloud Filestore enables customers to run smaller workloads they never thought possible in the cloud at a very low latency. Now, with the close partnership of NetApp and Google Cloud customers can also run their extremely demanding workloads at a low latency with Cloud Volumes Service for Google Cloud Platform!
Testing shows that when run against 15 n1-highcpu-16 GCE instances, a single cloud volume has an upper limit of roughly 306,000 IOPS in us-central1.
The same method was used to run sequential tests where the throughput reached ~3.1GiBps. However, the maximum amount of throughput that’s possible to generate against a single project is 3.3GiBps.
Windows applications can take advantage of excellent network latency across the board in Google Cloud. Testing shows that the us-central1 region achieved ~122,000 IOPS under 2ms latency whereas ~150,000 IOPS just pass the 2ms point.
The graphics to the right show performance of a synthetic EDA workload in NetApp Cloud Volumes Service for Google Cloud. Using 36 SLES15 n1 standard-8 GCE instances, the workload achieved:
Layout wise, the test generated 5.52 million files spread across 552K directories. The complete workload is a mixture of concurrently running frontend (verification phase) and backend workloads (tapeout phase) which represents the typical behavior of a mixture of EDA type applications.
The frontend workload represents frontend processing and as such is metadata intensive— think file stat and access calls—by majority metadata; this phase also includes a mixture of both sequential and random read and write operations. Though the metadata operations are effectively without size, the read and write operations range between sub 1K and 16K with the majority of reads between 4K and 16K and most of the writes 4K or less.
The backend workload, on the other hand, represents I/O patterns typical for the tapeout phase of chip design. It is this phase that produces the final output files from files already present on disk. Unlike the frontend phase, this workload is entirely comprised of sequential read and write operations, and a mixture of 32K and 64K OP size.
Graphically speaking, most of the throughput shown in the graph comes from the sequential backend workload and the I/O from the small random frontend phases–both of which happened in parallel.
The recently released nconnect mount option and NFSv3 was used in addition to the default mount options. Expect official NetApp support for the nconnect mount option and nfsv3 early 2020 and support for the same with nfsv4 early 2020.
For load testing MySQL in Cloud Volumes Service for Google Cloud, we selected an industry standard OLTP benchmarking tool and continued increasing user count until throughput reached flatline. By design, OLTP workload generators heavily stress the compute and concurrency limitations of the database engine–stressing the storage is not the objective. That said, the tool used, rather than the storage, was the limiting factor in the graphs.
The metrics in the graph, which maps out benchmark data for MySQL, are taken from nfsiostat on the database server, and, as such represent the perspective of the NFS client. We observed maximum throughput of 500MiB/s.
For this test, the following configuration was used: