Many workloads, such as genome analysis, training of machine learning models, High Performance Computing (HPC), and analytics applications depend on multiple compute instances accessing the same set of data. For these workloads, clusters of compute instances are commonly connected to a high-performance shared file system. Amazon FSx for Lustre makes it easy and cost-effective to launch and run the world’s most popular high-performance shared file system. And today we’re announcing new HDD storage options for FSx for Lustre that reduce storage costs by up to 80% for throughput-intensive workloads that don’t require the sub-millisecond latencies of SSD storage.
Customers can achieve up to tens of gigabytes of throughput per second while lowering their storage costs for workloads where throughput is the dominant performance attribute. Video rendering and financial simulations are two examples of these throughput-intensive workloads.
This announcement includes two new HDD-based storage options which are optimized for reading and writing sequential file data. One offers 12 MB/sec of baseline throughput per TiB of storage and the other offers 40 MB/sec of baseline throughput per TiB of storage, and both allow you to burst to six times those throughput levels. To increase performance for frequently accessed files, you can also provision an SSD cache that is automatically sized to 20% of your HDD file system storage capacity. On file systems that are provisioned with an SSD cache, files read from the cache are served with sub-millisecond latencies.
The new FSx file systems are comprised of multiple HDD-based storage servers and a single SSD-based metadata server. The SSD storage on the metadata servers ensures that all metadata operations, which represent the majority of file system operations, are delivered with sub-millisecond latencies.
HDD performance increases with storage capacity making it easy to scale out your storage solution without encountering file system bottlenecks. Here’s a summary of the performance specifications for both the new HDD storage options and the existing SSD storage options.
Traditionally, operating and scaling high performance file systems was costly and time consuming. Now with just a few clicks anyone can use FSx for Lustre for any compute workload. Launching the HDD-based file system is easy. Simply open the management console and click the Create file system button.
Chose FSx for Lustre and click Next.
FSx for Lustre offers two deployment types – Persistent and Scratch. HDD storage is available on persistent mode which is designed for longer-term storage and workloads. On persistent file systems, data is replicated and file servers are replaced if they fail whereas the scratch type are ideal for temporary storage and shorter-term processing of data. On scratch file systems, data is not replicated and does not persist if a file server fails. You can can find more detail on the difference between the two deployment options in this blog article.
Once you choose HDD as the Storage Type, you can select 12 or 40 MB/s per TiB for the Throughput per unit of storage. You can also add the SSD cache to accelerate file access by choosing “Read-only SSD cache” as Drive Cache Type.
You can also create a file system by CLI.
--storage-capacity <capacity> --storage-type HDD
--subnet-ids subnet-<your vpc subnet id>85b2c0ce --lustre-configuration
DeploymentType=PERSISTENT_1,PerUnitStorageThroughput=<12 or 40>,DriveCacheType=<NONE or READ>
For PerUnitStorageThroughput=12, acceptable values of storage capacity are multiples of 6000.
For PerUnitStorageThroughput=40, acceptable values of storage capacity are multiples of 1800.
Originally posted on AWS News Blog
Author: Harunobu Kameda