We get asked a lot of questions, which is great. Below you can find the most frequently asked questions, covering the Global File Cache (GFC) solution itself, technical aspects, support and services-related queries, and how the solution is licensed.
Global File Cache is a software-based solution that extends your Cloud Volumes storage footprint to your distributed and branch offices. Global File Cache helps enterprises centralize their unstructured data, consolidate branch office storage and infrastructure, eliminate branch office backups, increase productivity, and enable global collaboration with distributed file-locking, all while guaranteeing a high performance experience for their end users.
GFC allows centralized data to be accessed in real time by users in any location, while maintaining optimal performance. The solution leverages intelligent file caching technology to cache the active data sets, keeping the data most relevant to a given location local to its users. Additionally, the software leverages compression, streaming, and delta differencing to move the data (increments) between the cloud and the enterprise locations.
The solution works in any industry or business that uses unstructured data or file shares. Some of the largest Global 2000 customers, across industries such as Architecture, Construction, Engineering, Distribution, Financial Services, Manufacturing, and Media leverage this solution. From traditional Microsoft Office applications to the most massive and complex AutoCAD, Civil3D, Revit BIM models, and image libraries, GFC allows the cloud to serve as the single version of truth for the enterprise.
The software integrates easily into your Cloud Volumes environment, attaching to the shares you've defined and extending visibility and access to those shares out to your users. The SMB file shares are mounted by the GFC core instance in your cloud of choice, and then seamlessly extended to all the remote sites in your organization that run a small-footprint GFC edge software instance.
By centralizing and consolidating your distributed unstructured data (file servers, NAS, or other storage devices) into a central cloud location, you manage your backups (and can ensure security, compliance, etc.) in one central place. GFC provides a software-based cache at each location, which does not host or store any authoritative data; it only caches often-used files for that office. If you were to lose your cache instances, all your data remains centrally stored and protected.
No, each GFC instance only contains a cache of data relevant to often-used files for that location - yet still under centralized control - which eliminates the risk of data loss in branch office locations.
The software is rapidly deployed via a Microsoft Windows Server (2012 R2, 2016, or 2019) virtual machine image (VHD or OVA). In many cases, the installation of both the GFC core and edge can be done in 30 minutes or less. NetApp is known for world-class support and Professional Services, which can be engaged to guide customers deploying GFC if needed, especially in large, complex global deployments.
Global File Cache software deploys on commodity hardware running Windows Server 2012 R2, Windows Server 2016, or Windows Server 2019, using either a physical or virtual server instance.
The Global File Cache core instance is deployed in the cloud environment of your choice, where the authoritative data resides in Cloud Volumes file shares (Cloud Volumes ONTAP, Cloud Volumes Service, Azure NetApp Files, or Amazon FSx for NetApp ONTAP). Additionally, each branch office requires a GFC edge instance containing the intelligent file cache, which can be deployed on a virtual or physical Windows Server instance.
Note: the initial core instance is typically deployed as a License Management Server, which activates your site-based subscription with the GFC subscription service. More about the License Management Service (LMS) can be found in the Pricing & Licensing FAQ or the Global File Cache user guide available in our download portal.
The minimum recommended configuration for any GFC instance, either GFC core or edge, consists of 4 (v)CPU cores and 16GB of RAM, along with NTFS volume capacity for the D:\ volume hosting the intelligent file cache (edge only) of at least 250GB of reserved disk space. The software is deployed using a standalone executable, readily available .OVA or .VHD templates, or scripting/automation through PowerShell Desired State Configuration and Azure Automation.
The software creates a virtual file share (\\Edge\FASTData) at each location which looks and feels like a traditional file share. This file share presents centrally provisioned file shares in real time, while data is centrally stored on one or more file shares in Cloud Volumes. This share can also be accessed as a DFS link target within a domain-based or standalone DFS root.
Each GFC edge instance contains a unique intelligent file cache, hosted on the local cache volume (D:\). This cache resides on an NTFS file system and caches not only the active data sets, files, and folders, but also metadata, NTFS permissions, etc. Once a user opens a file through the GFC virtual file share, e.g. \\Edge\FASTData\DC\FS1\Share\Folder\Doc1.docx, the file is centrally locked (in Cloud Volumes) on the user's behalf, then either a) presented from cache or b) streamed from the cloud using a powerful compression protocol over the WAN and stored in the edge's intelligent file cache, maintaining the original folder structures, metadata, and NTFS permissions. Any subsequent edits are delta differenced back to the authoritative source in the datacenter, e.g. \\FS1\Share\Folder\Doc1.docx.
As each cache is unique, and requirements may vary per site based upon the local user population, sizing the cache correctly is important. You may have 500TB of central storage, but a small site may only need 500GB of cache space to facilitate the active data set for its users. For larger sites, with up to 250 users, best practices dictate that you provision a larger cache volume, approximated as 2x the regularly active dataset for that location; i.e. if the active dataset is 1TB for a large office of designers, you would provision 2TB of storage for that local cache volume.
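As a rough illustration, the 2x rule of thumb can be expressed as a small sizing helper. This is only a sketch: the 2x multiplier and the 250GB floor come from the guidance in this FAQ, and the function name is hypothetical, not part of any GFC tooling.

```python
def recommended_cache_gb(active_dataset_gb, multiplier=2.0, floor_gb=250):
    """Estimate the cache volume to provision for a GFC edge site.

    Rule of thumb: roughly 2x the regularly active dataset for the
    location, never below the 250GB minimum reserved disk space from
    the recommended minimum configuration.
    """
    return max(active_dataset_gb * multiplier, floor_gb)

# A large office of designers with a 1TB (1024GB) active dataset:
print(recommended_cache_gb(1024))   # 2048.0 GB, i.e. ~2TB
# A small office with a 100GB active dataset still gets the 250GB floor:
print(recommended_cache_gb(100))    # 250 GB
```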
Many customers are challenged in understanding their actual data footprint. To identify the active data set for your environment, you can leverage tools such as TreeSize Pro. Typically, clients provision <1TB of storage in a smaller office (<50 users), while others may reserve up to 5TB of cache in the largest offices of heavy users with large file formats.
GFC is deployed as a Windows Server virtual machine, hence it adheres to the limitations of NTFS as the underlying storage subsystem and file system; this means you can have cache volumes as large as 16TB using the standard NTFS 4KB cluster size, and cache sizes up to 256TB using the maximum 64KB cluster size.
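Those NTFS limits follow directly from its 32-bit cluster addressing: the maximum volume size is (number of addressable clusters) x (cluster size). A quick back-of-the-envelope check:

```python
# NTFS addresses volumes with 32-bit cluster numbers, so the maximum
# volume size is approximately 2^32 clusters times the cluster size.
CLUSTERS = 2**32  # 32-bit cluster addressing (strictly 2^32 - 1 usable)

def max_ntfs_volume_tb(cluster_size_kb):
    """Approximate maximum NTFS volume size in TB for a given cluster size."""
    return CLUSTERS * cluster_size_kb * 1024 / 1024**4  # bytes -> TB

print(max_ntfs_volume_tb(4))   # 16.0  -> 16TB cache volume at the 4KB default
print(max_ntfs_volume_tb(64))  # 256.0 -> 256TB at the maximum 64KB cluster size
```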
Unlike other solutions, GFC software does not need to constantly replicate data or metadata between locations; each cache is unique and contains only the data relevant to its local user population. By its nature, it stays current based upon the data accesses of that location's population. However, if a workflow would benefit from 'pre-seeding' active projects or folders in specific offices before a user requests them, you can leverage GFC's pre-population feature. Pre-population allows you to stage to cache, and/or update, all the files for a specific project that have been modified in the last X days, further enabling higher-performance collaboration on a global scale.
As data gets cached on your GFC edge instance, either by accessing files on demand or by using pre-population to pre-stage certain shares, folders, or files, the cache volume gradually grows in storage utilization. Once the cache volume hits 80% of the provisioned/reserved cache size, a purging job is scheduled to execute at 9PM local time (or a time of your choosing) that day. The purging process clears out the cache using algorithms such as Least Recently Used (LRU), categorizing data into age buckets, i.e. older than 2 years, 1 year, 6 months, etc., to clean older, de-prioritized data out of the cache. This dynamic process frees up space for new active file sets to be cached, without requiring user or administrative intervention.
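The purge pass can be sketched as follows: evict from the oldest age bucket first, least-recently-used files first within each bucket, until utilization drops. This is a hypothetical illustration only; the thresholds, bucket boundaries, and names are assumptions, not the product's exact internals.

```python
import time

DAY = 86400
# Age buckets, oldest first: >2 years, >1 year, >6 months, then everything.
BUCKETS = [2 * 365 * DAY, 365 * DAY, 182 * DAY, 0]

def purge(files, capacity, high_water=0.80, low_water=0.60, now=None):
    """Evict cached files once utilization crosses the high-water mark.

    files: dict of path -> (size_bytes, last_access_epoch); mutated in place.
    Returns the list of evicted paths, oldest buckets first.
    """
    now = now or time.time()
    used = sum(size for size, _ in files.values())
    if used <= high_water * capacity:
        return []                      # below 80%: no purge scheduled
    evicted = []
    for age in BUCKETS:
        # Within a bucket, evict least recently used first.
        candidates = sorted(
            (p for p, (_, ts) in files.items() if now - ts >= age),
            key=lambda p: files[p][1])
        for path in candidates:
            if used <= low_water * capacity:
                return evicted         # freed enough space; stop early
            used -= files.pop(path)[0]
            evicted.append(path)
    return evicted

# A 100-unit cache at 110% of a hypothetical capacity: only the file
# older than 2 years needs to go to fall back below the low-water mark.
now = 1_000_000_000
cache = {"old.dwg": (50, now - 3 * 365 * DAY),
         "mid.rvt": (30, now - 400 * DAY),
         "new.docx": (30, now - 10)}
print(purge(cache, capacity=100, now=now))  # ['old.dwg']
```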
There are multiple ways to approach this. We would typically recommend bringing up a GFC instance in parallel to your existing file server, then moving that location's data to Cloud Volumes using either online tools (i.e. robocopy data directly to Cloud Volumes via the GFC virtual file share) or offline data migration tools (from your cloud provider). Your NetApp team can discuss the right approach for you, depending on the size of your dataset and the infrastructure available to facilitate the data migration to the cloud. In some cases, it is beneficial to pre-seed the cache with the active, raw data set, which allows you to pre-provision the cache with the most recent project files while moving all data to Cloud Volumes.
When a user opens a file from the GFC edge instance, the edge communicates with the core which obtains a lock on the file stored on Cloud Volumes. Once the core locks the file on the user’s behalf, a “lease” for that file is established between the core and the requesting edge, and subsequently a commensurate lock is granted to the end user on the edge volume. Locking information is kept centrally on the authoritative file server, so no locking information is replicated to other branch offices, which allows massive scaling to a virtually unlimited amount of offices. The locking mechanism is fully centralized, which means that locks will be maintained consistently on the Cloud Volumes platform. This also means that the central Cloud Volumes shares can be directly accessed (by applications, VDI farms, etc.) in parallel with accesses occurring through the GFC fabric.
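Conceptually, the central lock table behaves like the sketch below: one authority next to the data grants leases to edges, so no lock state ever needs to be replicated between branches. Class and method names here are hypothetical; GFC's actual lease protocol is internal to the product.

```python
class Core:
    """Minimal sketch of a centralized lock authority for file leases."""

    def __init__(self):
        self.locks = {}                 # path -> edge currently holding the lease

    def request_lease(self, path, edge):
        holder = self.locks.get(path)
        if holder is None or holder == edge:
            self.locks[path] = edge     # lock taken centrally on the user's behalf
            return True
        return False                    # file is in use at another location

    def release(self, path, edge):
        if self.locks.get(path) == edge:
            del self.locks[path]

core = Core()
doc = r"\\FS1\Share\Folder\Doc1.docx"
assert core.request_lease(doc, "edge-london")       # first opener gets the lease
assert not core.request_lease(doc, "edge-tokyo")    # second site sees it locked
core.release(doc, "edge-london")
assert core.request_lease(doc, "edge-tokyo")        # lease now available again
```

Because the only lock state lives beside the authoritative copy, direct accesses to the Cloud Volumes share and accesses through any number of edges all contend on the same table, which is what allows the scheme to scale without per-site synchronization.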
Global File Cache leverages SSL encryption communication between the Core and the Edge, and can be further secured utilizing existing MPLS or VPN connectivity within the customer environment. Data is secured using Microsoft ACLs for NTFS permissions and access controls applied to the Cloud Volumes shares, which are seamlessly extended to the distributed locations. Additionally, files within the local intelligent file cache can be encrypted using technologies such as BitLocker.
The software uses a wide variety of technologies to keep the active data set close to the users in each of your locations by leveraging the intelligent file cache, which caches data either on-demand or through pre-population. Additionally, the software overlays the SMB protocol within a Windows environment by providing a local virtual file share in each location. This directly interacts with the NTFS volume of the intelligent file cache and leverages mechanisms such as compression, streaming, and delta differencing to efficiently move data between the location and the central instance of data in the cloud.
With a GFC solution, active files remain persistently cached in the edge instance at each location, which eliminates most of the movement of data over the WAN, thus saving bandwidth. In addition, with the GFC technology, only the changes to a given cached file are sent back to the authoritative copy in the cloud, so that even changes to very large files can be moved efficiently. GFC's delta differencing mechanism means that only incremental updates are occurring, drastically limiting the total amount of data going over the WAN - and even those de minimis changes are compressed/streamed by GFC's protocol. This provides maximum efficiency in locations with network challenges, allowing the users optimal performance.
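A simplified illustration of block-level delta differencing: hash fixed-size blocks of the cached copy and ship only the blocks whose content changed. This is a generic sketch of the technique, not GFC's proprietary wire protocol, and the 4KB block size is an arbitrary choice for the example.

```python
import hashlib

BLOCK = 4096  # illustrative block size, not a GFC parameter

def block_hashes(data, block=BLOCK):
    return [hashlib.sha256(data[i:i + block]).digest()
            for i in range(0, len(data), block)]

def delta(old, new, block=BLOCK):
    """Return {block_index: changed_bytes}: the only data to ship over the WAN."""
    old_h = block_hashes(old, block)
    changes = {}
    for i in range(0, len(new), block):
        idx = i // block
        chunk = new[i:i + block]
        if idx >= len(old_h) or hashlib.sha256(chunk).digest() != old_h[idx]:
            changes[idx] = chunk
    return changes

# Editing one block of a 32KB file means only 4KB crosses the WAN
# (before compression), not the whole file.
old = b"A" * 8 * BLOCK
new = old[:3 * BLOCK] + b"B" * BLOCK + old[4 * BLOCK:]
print(sorted(delta(old, new)))  # [3]
```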
If the core instance is unreachable, users can still access and work on their locally cached data; they are not stuck waiting for recovery, whether that is the network or a server rollback. If the edge instance is unavailable, users can still go directly over the network to Cloud Volumes and access the data there. Note that this failover (and subsequent failback) can be automated via DFS-N for a seamless experience.
Since GFC sits transparently on top of the Microsoft operating system and integrates with the complete AD infrastructure, all existing ACLs, permissions, AD-based authentication, and environmental controls are honored at all locations. This means there are no additional security settings to administer, creating an easy-to-manage, scalable workload.
GFC provides administrators custom event logs for all system messages and file operations. This can be further streamlined with the available SCOM Management Pack and Dashboard, which gives administrators single-pane-of-glass insight and reporting on the overall health of the GFC environment, servers, services, cache utilization, etc.
Previous Versions are managed at the Cloud Volumes layer on the authoritative copy of the data. However, GFC gives users at the distributed locations the ability to transparently navigate through the virtual file share and restore a previous version as needed.
GFC can provide a seamless transition of unstructured data to the cloud, in line with your cloud strategy. GFC supports your cloud of choice (across AWS, Azure, or Google Cloud) including multi-cloud scenarios, with the flexibility of supporting all members of the Cloud Volumes family. All of the benefits of consolidation, centralization, scale, and flexibility with no change in workflow, applications, or user experience.
GFC is licensed as an annual subscription, with the pricing increment being the number of locations which are granted access to the centralized Cloud Volumes. There is no charge for GFC core instances, giving customers the flexibility to deploy as many GFC core instances in as many clouds as would be required for the optimal configuration. Likewise, there is no pricing based upon the number of users, storage capacities in the cloud, etc.
The cost is based specifically upon the number of locations accessing the Cloud Volumes shares; hence your subscription will only increase if the total number of locations/edge instances increases. This is independent of the cloud infrastructure, number of users, amount of storage, etc. This allows for flexibility in a changing organization as business needs evolve.
A GFC deployment includes a License Management Server (LMS) which registers your site-based subscription with the GFC subscription service. This LMS instance keeps track of the licensed edges in your enterprise and regularly updates your subscription details, i.e. contract end date and number of licensed sites, with the GFC subscription service.
GFC licensing is an annual subscription, which by most accounting standards would be recognized as OPEX. We do offer multi-year subscriptions, as well, which are normally treated the same way.
GFC is unique in that it's not a 'replication/synchronization' (R/S) architecture, but is instead based upon a truly single authoritative instance of data. Consider a building with your users spread over 25 floors: would you have a file server on each floor and try to replicate/synchronize the data across all of them? Or would you have everyone (regardless of the floor they are on) use a single, secure file server in the basement? That is the GFC approach. Further, GFC is all-software, without any requirement for dedicated hardware at each location. GFC uniquely scales across limitless distributed office locations and user counts, and supports all major hyperscalers simultaneously.
Typically, more than 80% of the total data sets are considered inactive...we just never know which ones they are, hence data needs to be seamlessly available. The GFC solution allows for consolidation of data and integration into the cloud of your choice to reduce complexity, maximize data protection, reduce overall storage total cost of ownership (TCO) by 60-80%, and provision resources more rapidly.
With your data centralized into a single authoritative set in your cloud of choice, GFC allows you to collapse extensive and expensive storage, along with backups, to see immediate savings. Not only can you eliminate branch office backups, but also BCDR planning/testing, audit/compliance work, etc., reducing the overall server and storage footprint across all of your distributed locations.
No, GFC does not control or manage your data; that's the domain of your data storage/management platform, Cloud Volumes, based upon the industry-leading ONTAP operating system. ONTAP gives you complete control of where your data resides, on which resources, and how it's backed up, archived, and secured to meet your organization's specific needs for RTOs/RPOs, SLAs, BCDR strategies, etc.
Not quite. Remember, GFC is unique in its single instance of authoritative data architecture. With GFC's global file locking mechanism, there is no need for distributed locks to be kept in sync via metadata synching. This allows GFC to maintain strict data consistency and integrity regardless of which user is accessing what information. When any application/user accesses the central authoritative copy of the file, directly or through a GFC edge, a file lock is placed immediately on the central authoritative file. Any other app/user requesting access to edit the file will receive notification that the file is locked and in use, just as if they were in the same location using the same file server. Since GFC doesn't replicate locking information across all sites, it doesn't rely on maintaining lock synchronization databases, which would otherwise introduce the opportunity for data inconsistency and loss.