Amazon Simple Storage Service (Amazon S3) is generally used as highly durable and scalable data storage for images, videos, logs, big data and other static storage files. In addition to its popularity as a static storage service, some users want to use S3 as a file system. Doing so is complex but possible, with the help of S3FS or other third-party tools.
In this article, we will show you how to mount Amazon S3 buckets as file storage and discuss the advantages and drawbacks of doing so. It is important to note that AWS does not recommend using S3 as a block-level file system.
Advantages of Mounting S3 as a File System
Mounting an Amazon S3 bucket as a file system means that you can use all your existing tools and applications to interact with the S3 bucket and perform read/write operations on files and folders.
Any application interacting with the mounted drive doesn’t have to worry about transfer protocols, security mechanisms, or S3-specific API calls. In some cases, mounting S3 as a drive on an application server can make creating a distributed file store extremely easy.
For example, when creating a photo upload application, you can have it store data on a fixed path in a file system and, when deploying, mount an S3 bucket on that fixed path. This way, the application writes all files to the bucket without you having to worry about S3 integration at the application level.
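A minimal sketch of this idea in shell. The UPLOAD_DIR variable and the store_upload helper are hypothetical names chosen for illustration, not part of any real application. The point is that the code only ever sees a directory path, so the same code can write to a local folder during development and to a mounted bucket in production.

```shell
# Hypothetical setting: the one path the application writes to.
# In production this would be a directory inside the mounted S3 bucket.
UPLOAD_DIR="${UPLOAD_DIR:-$HOME/s3-drive/uploads}"

# store_upload FILE
# Copy an uploaded file into UPLOAD_DIR and print the path of the copy.
store_upload() {
    src="$1"
    mkdir -p "$UPLOAD_DIR" || return 1
    cp "$src" "$UPLOAD_DIR/" || return 1
    printf '%s\n' "$UPLOAD_DIR/$(basename "$src")"
}

# Example (assumed file name): store_upload /tmp/photo.jpg
```

Switching storage backends then means changing UPLOAD_DIR (or the mount behind it), not the application code.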
Another major advantage is that legacy applications can scale in the cloud without source code changes: the application only needs to be configured to use the local path where the S3 bucket is mounted. This technique is also very helpful when you want to collect logs from various servers in a central location for archiving.
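The log-collection case might look like the sketch below. The archive_logs helper and the example paths are assumptions made for illustration. Each server copies its logs into a subfolder named after its own hostname, so servers sharing the same mounted bucket do not overwrite each other's files.

```shell
# archive_logs SRC_DIR DEST_ROOT
# Copy every *.log file from SRC_DIR into DEST_ROOT/<hostname>/.
archive_logs() {
    src_dir="$1"
    dest_dir="$2/$(uname -n)"     # uname -n prints this machine's hostname
    mkdir -p "$dest_dir" || return 1
    for f in "$src_dir"/*.log; do
        [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
        cp "$f" "$dest_dir/" || return 1
    done
}

# Example (assumed paths): archive_logs /var/log/myapp "$HOME/s3-drive/logs"
```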
Scripting Options for Mounting a File System to S3
There are a few options for mounting S3 as a local drive.
S3FS-FUSE: This is a free, open-source FUSE plugin and an easy-to-use utility that supports major Linux distributions and macOS. S3FS also takes care of caching files locally to improve performance. This plugin simply shows the S3 bucket as a drive on your system.
ObjectiveFS: ObjectiveFS is a commercial FUSE plugin which supports S3 and Google Cloud Storage backends. It claims to offer a full POSIX-compliant file system interface, which means that appends don’t need to rewrite entire files. It also promises performance comparable to a local drive.
RioFS: RioFS is a lightweight utility written in C. It is comparable to S3FS but has a few limitations: RioFS doesn’t support appending to files, doesn’t offer a fully POSIX-compliant file system interface, and can’t rename folders.
How to Mount an S3 Bucket as a Drive with S3FS
Mounting an S3 bucket using S3FS is a simple process: by following the steps below, you should be able to start experimenting with using S3 as a drive on your computer immediately.
Step 1: Installation
The first step is to get S3FS installed on your machine. Please note that S3FS only supports Linux-based systems and macOS.
- Installation steps for macOS:
The easiest way to set up S3FS-FUSE on a Mac is to install it via Homebrew. To install Homebrew, and then S3FS with it:
1. ruby -e "$(curl -fsSL https://raw.github.com/Homebrew/homebrew/go/install)"
2. brew install s3fs
- Installation steps for Ubuntu:
On Ubuntu 16.04, S3FS can be installed with apt-get:
sudo apt-get install s3fs
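Before moving on, it can help to confirm that the install worked. The sketch below assumes nothing beyond a POSIX shell; have_cmd is a hypothetical helper name. It checks whether the s3fs binary is on the PATH and, if so, asks it for its version.

```shell
# have_cmd NAME: succeed if NAME can be found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd s3fs; then
    s3fs --version
else
    echo "s3fs not found on PATH; install it first" >&2
fi
```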
Step 2: Configuration
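1. Once S3FS is installed, set up the credentials as shown below. S3FS rejects a credentials file that other users can read, so restrict its permissions to the owner:
echo ACCESS_KEY:SECRET_KEY > ~/.passwd-s3fs
chmod 600 ~/.passwd-s3fs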
2. Now we’re ready to mount the S3 bucket. Create the folder where the bucket will be mounted, then mount it:
mkdir -p ~/s3-drive
s3fs <bucketname> ~/s3-drive
You might notice a little delay when firing the above command: that’s because S3FS tries to reach S3 internally for authentication purposes. If you don’t see any errors, your S3 bucket should be mounted on the ~/s3-drive folder.
To verify that the bucket mounted successfully, type “mount” in the terminal and check the last entry, which should list s3fs mounted on the ~/s3-drive folder.
3. Once the bucket is mounted on the s3-drive folder, you can interact with it the same way as you would with any local folder.
The mounted drive behaves as a bidirectional sync between macOS and S3: for example, a folder named “test folder” created on macOS appears instantly on S3.
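For repeatable setups, the configuration steps above can be wrapped in two small helpers. This is a sketch under stated assumptions: write_s3fs_credentials and is_mounted_at are hypothetical names, and ACCESS_KEY and SECRET_KEY are placeholders for your own keys. The first helper creates the credentials file with owner-only permissions; the second reads the output of the mount command and reports whether anything is mounted on a given folder.

```shell
# write_s3fs_credentials ACCESS_KEY SECRET_KEY [FILE]
# Write "ACCESS_KEY:SECRET_KEY" to FILE (default: ~/.passwd-s3fs),
# readable and writable by the owner only.
write_s3fs_credentials() {
    cred_file="${3:-$HOME/.passwd-s3fs}"
    (
        umask 077                     # files created here get mode 600
        printf '%s:%s\n' "$1" "$2" > "$cred_file"
    ) || return 1
    chmod 600 "$cred_file"            # also tightens a pre-existing file
}

# is_mounted_at DIR
# Read mount-table text (the output of `mount`) on stdin and succeed
# if some entry is mounted on DIR.
is_mounted_at() {
    grep -F -q -- " on $1 "
}

# Usage (placeholders, not real keys):
#   write_s3fs_credentials ACCESS_KEY SECRET_KEY
#   mkdir -p ~/s3-drive && s3fs <bucketname> ~/s3-drive
#   mount | is_mounted_at "$HOME/s3-drive" && echo "bucket is mounted"
```

Checking for the mount point in a script is a useful guard: if the mount failed, an application writing to ~/s3-drive would silently fill the local disk instead of the bucket.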
Considerations When Using S3 as a File System
Now that we’ve looked at the advantages of using S3 as a mounted drive, we should consider some of its drawbacks before adopting this approach.
- When you are using Amazon S3 as a file system, you might observe a network delay when performing IO-centric operations such as creating or moving folders or files. The performance depends on your network speed as well as your distance from the S3 storage region.
- Because Amazon S3 is designed for atomic operations, files cannot be modified in place; they have to be completely replaced with the modified files. This doesn’t impact your application as long as it’s only creating or deleting files; however, frequent modifications to a file mean replacing the file on S3 repeatedly, which results in multiple PUT requests and, ultimately, higher costs.
- As files are transferred via HTTPS, there is a noticeable delay the first time your application accesses the mounted S3 bucket. Subsequent access times can be reduced with local caching.
- A single PUT request can upload an object of at most 5 GB; larger objects, up to 5 TB, must be uploaded in parts. When considering costs, remember that Amazon S3 charges you for performing IO operations. The storage itself might cost less than, for example, General Purpose SSD or Provisioned IOPS storage, but the cost for IO could be higher.
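To get a feel for how repeated modifications add up, here is a rough worked example. It rests on two loudly stated assumptions: every append re-uploads the whole file (the worst case; a real client may avoid part of this), and the per-request price is an illustrative placeholder, not a current AWS rate. The function names are hypothetical.

```shell
# total_upload_kb APPENDS CHUNK_KB
# Assumption: each append re-uploads the entire file. After the i-th append
# the file holds i chunks, so the total uploaded is
# CHUNK_KB * (1 + 2 + ... + APPENDS) = CHUNK_KB * APPENDS * (APPENDS + 1) / 2.
total_upload_kb() {
    appends="$1"
    chunk_kb="$2"
    echo $(( chunk_kb * appends * (appends + 1) / 2 ))
}

# request_cost REQUESTS PRICE_PER_1000
# Print the charge, in dollars, for REQUESTS requests billed at
# PRICE_PER_1000 dollars per thousand requests.
request_cost() {
    awk -v n="$1" -v p="$2" 'BEGIN { printf "%.2f\n", n / 1000 * p }'
}

# Placeholder price for illustration only; check the S3 pricing page.
PUT_PRICE_PER_1000=0.005

total_upload_kb 1000 1                      # 1,000 appends of 1 KB -> 500500
request_cost 2000000 "$PUT_PRICE_PER_1000"  # 2 million PUTs -> 10.00
```

Under these assumptions, building a 1 MB log file through 1,000 small appends uploads roughly 500 MB in total, which is why append-heavy workloads are a poor fit for a mounted bucket.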
From the steps outlined above you can see that it’s simple to mount S3 as a drive on your server, laptop, or containers. The technique can be very useful in creating distributed file systems with minimal effort, and offers a very good solution for media content-oriented applications.
This mechanism can prove very helpful when scaling up legacy apps, since those apps run without any modification to their codebases. Having a shared file system across a set of servers can also be beneficial when you want to store resources such as config files and logs in a central location. But since you are billed based on the number of GET, PUT, and LIST operations you perform on S3, a mounted S3 file system can have a significant impact on costs if you perform such operations frequently.
AWS does not recommend this approach because of the object size limits, increased costs, and decreased IO performance. Still, if you have a use case that requires very high durability and a distributed file system, using S3 as a file system might be a good option for you.