Cloud File Sharing

NFS Storage Automation with AWS Lambda and Cloud Volumes ONTAP

Serverless cloud platforms such as AWS Lambda eliminate the need to manually deploy and scale the base infrastructure used by applications and services in the cloud. Instead of thinking about servers and Docker containers, you simply write the code that should be executed and deploy it, leaving it to AWS to manage high availability and to respond appropriately to changes in user demand. It is for these reasons that the deployment of serverless microservices in cloud file sharing architectures is growing rapidly.

AWS Lambda functions will most often still require access to persistent data stores such as databases, object storage, and shared file systems. Amazon’s recent move to allow access to Amazon EFS file systems from AWS Lambda addresses the importance of shared files to various enterprise and data-focused applications. There are a number of use cases, however, where Amazon EFS may not provide an ideal solution for deploying NFS storage, and where Cloud Volumes ONTAP provides a compelling alternative. 

In this article we will look at the benefits of deploying NFS storage in AWS using Cloud Volumes ONTAP, and provide a practical example of accessing this storage from AWS Lambda functions.

Go directly to Worked Example: NFS Storage Access from AWS Lambda for the how-to section.

NFS Storage Integration with AWS Lambda

AWS Lambda allows development teams to productionize new code within minutes, while keeping the actual business requirements front-and-center. An AWS Lambda function is, as the name implies, simply an individual function that is invoked by AWS whenever the function’s triggering criteria are met. These functions can be written in any one of a number of languages, such as Java, C#, JavaScript, Python, Ruby, or Go. Different AWS Lambda functions can also be connected together and integrated with other AWS services to build complex workflows using AWS Step Functions.
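
As a minimal sketch, a Python Lambda function is just a handler that receives the triggering event and a context object describing the execution environment; the event shape here is purely illustrative:

import json

def lambda_handler(event, context):
    # AWS calls this handler whenever the function's trigger fires; the
    # event parameter carries the trigger-specific payload.
    print(json.dumps(event))
    return {
        "statusCode": 200,
        "body": "Hello from AWS Lambda"
    }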

There are a number of options available when it comes to data storage for serverless deployments, and in the following sections we will examine the pros and cons of some of these approaches.

Amazon S3

Amazon S3 is an outstanding solution for durable and cost-effective capacity storage in the cloud. It is also possible to trigger AWS Lambda functions when a new file is uploaded to Amazon S3, thereby initiating a data pipeline. In order to access files within Amazon S3, you must use the standard AWS APIs, usually via the AWS SDK, from within your function code. The IAM role associated with each function will govern the level of access granted to Amazon S3 buckets.
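
For illustration, the following sketch reads an object from Amazon S3 inside a Lambda handler using the boto3 SDK; the bucket and key names are hypothetical placeholders:

import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    # Access to the bucket is governed by the function's IAM role; the
    # bucket and key below are placeholders for illustration only.
    response = s3.get_object(Bucket='example-bucket', Key='reports/example.txt')
    body = response['Body'].read().decode('utf-8')
    return {
        "statusCode": 200,
        "body": body
    }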

As Amazon S3 provides object storage, it is most often not an ideal solution for file storage. For example, a write to an Amazon S3 object is an all-or-nothing operation; you can’t simply perform a partial update. Also, environments that require high performance access to data will encounter problems if Amazon S3 is the primary data store, as object storage is not designed for this type of data access.

Amazon EFS

Amazon EFS is a managed solution for deploying NFS storage within AWS, allowing you to very quickly deploy scalable file systems that natively integrate with AWS Lambda. Amazon EFS file systems that must be accessed by an AWS Lambda function must be registered in advance within the function’s configuration, along with a mount point. AWS will then ensure that the NFS file share is correctly mounted on any host that will execute the function. On execution, the function itself simply accesses the mount as a normal directory.
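
As a rough sketch, assuming the file system has been registered against the function with a mount point of /mnt/efs, the handler then works with the share like any local directory:

def lambda_handler(event, context):
    # AWS mounts the registered Amazon EFS file system before the handler
    # runs; /mnt/efs is an assumed mount point configured on the function.
    path = '/mnt/efs/test-file.txt'
    with open(path, 'w') as output_file:
        output_file.write("Hello from AWS Lambda!\n")
    with open(path, 'r') as input_file:
        return {
            "statusCode": 200,
            "body": input_file.read()
        }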

Although it can be very easy to get started with Amazon EFS, certain issues can present themselves after any period of real usage. Due to the system of burst credits used to determine Amazon EFS performance, sustained high throughput activity can lead to a degradation in file access performance until credits have been replenished. This can be countered either by allocating more storage space, which gives access to a larger number of credits, or by moving to provisioned throughput mode. Both options mean incurring higher operational costs.
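
For reference, the switch to provisioned mode can be scripted with boto3, as in the sketch below; the file system ID and throughput figure are hypothetical placeholders:

import boto3

efs = boto3.client('efs')

# Move the file system to provisioned throughput mode; fs-12345678 and the
# 128 MiB/s figure are placeholder values for illustration.
efs.update_file_system(
    FileSystemId='fs-12345678',
    ThroughputMode='provisioned',
    ProvisionedThroughputInMibps=128.0,
)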

It should also be noted that Amazon EFS supports only NFS v4, which may cause issues if the files being used by AWS Lambda must also be accessed by SMB/CIFS or NFS v3 clients.

Cloud Volumes ONTAP

Using Cloud Volumes ONTAP, you can fully leverage NetApp’s NAS and SAN technology in any of the major hyperscaler environments, namely AWS, Azure, and Google Cloud Platform. NetApp has spent decades building NFS storage solutions that are recognized across the industry for their performance and scalability. Cloud Volumes ONTAP is an enterprise data management solution that provides consistency across multi-cloud and on-premises environments.

Cloud Volumes ONTAP builds upon the native compute and storage resources of the cloud environment in which it is situated. So, in AWS, Amazon EC2 and Amazon EBS are used to deploy NetApp storage services, with the additional option of using Amazon S3 as transparent capacity storage in a tiered storage architecture. For performance storage, Amazon EBS disks can also be combined to form RAID groups.

On the NAS side, Cloud Volumes ONTAP supports both NFS v3 and v4, as well as SMB/CIFS. All shared file systems deployed using Cloud Volumes ONTAP benefit from:

  • High Performance: Caching technology helps bring data closer to the client that needs to use the files.
  • Storage Efficiency: Reduce your cloud data storage costs and operational expenditure through transparent data compression, deduplication, and thin provisioning.
  • High Availability: Cloud Volumes ONTAP can be deployed in an AWS high availability configuration that provides zero downtime and zero data loss across Availability Zones.
  • NetApp Snapshot™: Instantly create point-in-time copies of your files, either for backup or for data versioning.
  • FlexClone®: Create temporary writable copies of data from your snapshots, and dramatically simplify data-oriented testing with FlexClone data cloning.
  • Regional Replication: Data can be efficiently replicated via SnapMirror® between Cloud Volumes ONTAP instances, including across AWS regions.

Worked Example: NFS Storage Access from AWS Lambda

In this section, we will provide a complete practical example of accessing NFS files hosted in Cloud Volumes ONTAP from AWS Lambda.

Deploy Cloud Volumes ONTAP

1. The first step in the process involves accessing NetApp Cloud Manager, which acts as the central web-based platform from which Cloud Volumes ONTAP deployments can be created and managed. You can do this by signing up for a NetApp Cloud Central account and starting a free 30-day trial of Cloud Volumes ONTAP.

2. After accessing Cloud Manager, we can now proceed to create a new instance of Cloud Volumes ONTAP. The wizard-style interface makes it very easy to configure the instance to our particular specifications, deploy an initial volume, and then share out this volume over NFS.

Add your first working environment

3. Once initialization is complete, the Working Environments dashboard will update to show that our new Cloud Volumes ONTAP installation is ready for use. We can use Cloud Manager to inspect the details of this deployment, manage volumes and file shares, and perform the majority of day-to-day administrative operations.

Working Environments Dashboard

4. In order for AWS Lambda to make a successful NFS connection to Cloud Volumes ONTAP, we must make an advanced configuration change that is only possible through an SSH connection to the Cloud Volumes ONTAP instance itself. We can find the IP address to connect to by inspecting the details of our instance in Cloud Manager.

Test - EC2 Instances

5. The configuration option that must be changed relates to allowing access to NFS shares over non-reserved client ports, i.e. from port numbers 1024 or higher. This is required as our AWS Lambda function will not be executed as root, and therefore will not be able to initiate a connection from a reserved port.

The following shows the commands necessary to perform this change, as per the NetApp documentation:

nfstest::> vserver show
                               Admin      Operational Root
Vserver     Type    Subtype    State      State       Volume     Aggregate
----------- ------- ---------- ---------- ----------- ---------- ----------
nfstest     admin   -          -          -           -          -
nfstest-01  node    -          -          -           -          -
svm_nfstest data    default    running    running     svm_       aggr1
                                                      nfstest_
                                                      root
3 entries were displayed.

nfstest::> vserver nfs modify -vserver svm_nfstest -mount-rootonly disabled

nfstest::>
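
If you prefer to script this step, the same change can be applied over SSH from Python using a library such as paramiko. This is a sketch only; the management address, credentials, and SVM name are placeholders to be replaced with the values shown in Cloud Manager:

import paramiko

# Placeholder management address and credentials for the Cloud Volumes ONTAP
# instance; substitute the details shown in Cloud Manager.
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect('203.0.113.10', username='admin', password='********')

# Allow NFS mounts from non-reserved (>1023) client ports on the data SVM.
stdin, stdout, stderr = ssh.exec_command(
    'vserver nfs modify -vserver svm_nfstest -mount-rootonly disabled'
)
print(stdout.read().decode())
ssh.close()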

6. We’re now ready to connect to our NFS share. We can again use Cloud Manager to find the mount point for the share, within the Volumes section of the Cloud Volumes ONTAP instance.

Volumes - nfs_data

7. Hovering over the volume and selecting Mount Command will provide us the connection details we need to use from AWS Lambda.

Mount Volume nfs_data

Create AWS Lambda Function

The next step is to create a new AWS Lambda function using the AWS Console. As this function will need access to resources within our VPC, i.e. Cloud Volumes ONTAP, we will need to grant the IAM role used to execute the function the AWSLambdaVPCAccessExecutionRole permissions. After the function has been created, we must also provide our network configuration details under the VPC section of the AWS Lambda function’s definition page.

nfs-access
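
The same VPC settings can also be applied programmatically with boto3, as a rough sketch; the subnet and security group IDs below are placeholders for the ones used by the Cloud Volumes ONTAP deployment:

import boto3

lambda_client = boto3.client('lambda')

# Attach the VPC networking details to the existing function; the IDs below
# are hypothetical placeholders.
lambda_client.update_function_configuration(
    FunctionName='nfs-access',
    VpcConfig={
        'SubnetIds': ['subnet-aaaa1111', 'subnet-bbbb2222'],
        'SecurityGroupIds': ['sg-cccc3333'],
    },
)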

We can now move on to actually implementing our function. For this example, we will use Python; however, the same technique will translate to most other languages.

1. Using an NFS client library called libnfs, we will directly mount our NFS share from within the function code, and then proceed to perform file reads and writes.

The following shows the complete AWS Lambda function definition:

import json
import libnfs

def lambda_handler(event, context):
    # Initialize a connection to our NFS share
    nfs_share = libnfs.NFS('nfs://172.31.9.23/nfs_data')

    # Open a file for writing
    output_file = nfs_share.open('/test-file.txt', mode='w+')

    # Write some data into the file
    output_file.write("Hello from AWS Lambda!!\n")
    output_file.close()

    # Try opening the same file and reading it back
    input_file = nfs_share.open('/test-file.txt', mode='r')
    file_contents = input_file.read()
    input_file.close()

    # Return the contents of the file
    return {
        "statusCode": 200,
        "body": file_contents
    }

2. To successfully deploy the function, we will need to package up our external dependencies along with the code, including the native libnfs.so.13 shared library used by the libnfs Python package. This can be achieved by creating a zip archive containing all relevant files in a pre-defined directory structure, as described in the AWS documentation:

$ ls
lambda_function.py

$ mkdir package

$ pip install --target ./package libnfs
Collecting libnfs
Using cached libnfs-1.0.post4.tar.gz (48 kB)
Installing collected packages: libnfs
     Running setup.py install for libnfs ... done
     Successfully installed libnfs-1.0.post4

$ mkdir package/lib

$ cp /usr/lib64/libnfs.so.13.0.0 package/lib/libnfs.so.13

$ cd package

$ zip -r9 ../function.zip .
adding: libnfs/ (stored 0%)
adding: libnfs/_libnfs.cpython-38-x86_64-linux-gnu.so (deflated 73%)
adding: libnfs/__init__.py (deflated 74%)
adding: libnfs/libnfs.py (deflated 86%)
adding: libnfs/__pycache__/ (stored 0%)
adding: libnfs/__pycache__/libnfs.cpython-38.pyc (deflated 73%)
adding: libnfs/__pycache__/__init__.cpython-38.pyc (deflated 56%)
adding: lib/ (stored 0%)
adding: lib/libnfs.so.13 (deflated 67%)

$ cd ..

$ zip -g function.zip lambda_function.py
adding: lambda_function.py (deflated 35%)

$ aws lambda update-function-code --function-name nfs-access --zip-file fileb://function.zip
4iYqp+d2Bzvrfv2Pqxr17gJByEyGYufyP72IuTNzvRw=   298335         arn:aws:lambda:XXXXXXX:XXXXXXXXXX:function:nfs-access nfs-access     lambda_function.lambda_handler 2020-07-02T09:32:28.193+0000     Successful     128     f0ba0358-2f5d-4cb4-9e5a-333f2580a4f8   arn:aws:iam::XXXXXXXXXX:role/NetAppCVOLambda python3.8       Active 3       $LATEST
TRACINGCONFIG  PassThrough
VPCCONFIG       vpc-2fdfb248
SECURITYGROUPIDS       sg-0a065573
SUBNETIDS       subnet-fddc8ab4
SUBNETIDS       subnet-e92ec9b0
SUBNETIDS       subnet-ac2908cb

Running the AWS Lambda Function

We can now try executing the AWS Lambda function from within the AWS Console. As we can see from the screenshot below, the function is able to successfully connect to the NFS share hosted on Cloud Volumes ONTAP and read and write files as required. We can also mount this share normally from a Linux host to verify that the files have in fact been created as we would expect.

Running the AWS Lambda Function
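
The function can also be invoked outside of the console, for example with a short boto3 script; this sketch assumes the function name used earlier and an empty test payload:

import json
import boto3

lambda_client = boto3.client('lambda')

# Invoke the function synchronously and print the JSON response it returns.
response = lambda_client.invoke(FunctionName='nfs-access', Payload=b'{}')
print(json.loads(response['Payload'].read()))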

Conclusion

AWS Lambda functions are able to integrate with various forms of AWS file storage; however, as we have demonstrated in this article, they can also be used to access NFS shares hosted on Cloud Volumes ONTAP. This is achieved with minimal custom configuration and, thanks to the widespread availability of NFS client library implementations, should be compatible with most of the languages supported by AWS Lambda.
