
S3 object prefix randomization now unnecessary (#613)

Thanos Baskous 2 years ago
1 changed file with 1 addition and 3 deletions

@@ -717,9 +717,7 @@ S3
 - **Multi-part uploads:** For large objects you want to take advantage of the multi-part uploading capabilities (starting with minimum chunk sizes of 5 MB).
 - **Large downloads:** You can also download chunks of a single large object in parallel by exploiting the HTTP GET range-header capability.
 - 🔸**List pagination:** Listing contents happens at 1000 responses per request, so for buckets with many millions of objects, listings will take time.
-- ❗**Key prefixes:** In addition, latency on operations is [highly dependent on prefix similarities among key names]( If you have need for high volumes of operations, it is essential to consider naming schemes with more randomness early in the key name (first 6 or 8 characters) in order to avoid “hot spots”.
-- We list this as a major gotcha since it’s often painful to do large-scale renames.
-- 🔸Note that sadly, the advice about random key names goes against having a consistent layout with common prefixes to manage data lifecycles in an automated way.
+- ❗**Key prefixes:** Previously, randomness in the beginning of key names was necessary in order to avoid hot spots, but that is [no longer necessary]( as of July 2018.
 - For data outside AWS, [**Direct Connect**]( and [**S3 Transfer Acceleration**]( can help. For S3 Transfer Acceleration, you [pay]( about the equivalent of 1–2 months of storage for the transfer in either direction for using nearer endpoints.
 - **Command-line applications:** There are a few ways to use S3 from the command line:
 - Originally, [**s3cmd**]( was the best tool for the job. It’s still used heavily by many.
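To make the parallel-download bullet above concrete, here is a minimal sketch of how a client divides one large object into ranged GETs. The function name, and the 8 MB default chunk size are illustrative choices, not part of the guide or any AWS API; only the `Range: bytes=start-end` header format (with inclusive ends) is standard HTTP.

```python
# Sketch: compute HTTP Range header values for fetching one large S3 object
# in parallel chunks. byte_ranges and the 8 MB default are illustrative.
def byte_ranges(total_size: int, chunk_size: int = 8 * 1024 * 1024):
    """Return Range header values covering an object of total_size bytes."""
    ranges = []
    for start in range(0, total_size, chunk_size):
        end = min(start + chunk_size, total_size) - 1  # Range ends are inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges
```

Each value can be sent as the `Range` header of a separate GET request, and the chunks reassembled in order. Multi-part *uploads* split the object the same way, with the 5 MB minimum part size mentioned above.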
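The list-pagination bullet can also be quantified with a small sketch (pure Python, no AWS calls; the 1000-key page size mirrors the ListObjectsV2 limit, while the function name is hypothetical):

```python
import math

S3_MAX_KEYS = 1000  # ListObjectsV2 returns at most 1000 keys per request

def list_request_count(object_count: int) -> int:
    """Paginated list requests needed to enumerate a bucket of object_count keys."""
    return max(1, math.ceil(object_count / S3_MAX_KEYS))
```

A bucket with 5 million objects thus needs 5000 sequential list requests (each carrying a continuation token from the previous response), which is why listings of very large buckets take time.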