Table of Contents
- S3 Consistency
- S3 Storage Classes
- Lifecycle Management
- S3 Performance and Optimization
- S3 Object Lock Modes:
- Operation and Administration
- Sample Python Code
Amazon Simple Storage Service (Amazon S3) is an object storage service. The key characteristics of S3 are:
- It's an object-based storage system. A file is treated as an object.
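- Files are stored in buckets. We can think of a bucket as a top-level folder, although S3 itself has a flat namespace and "folders" inside a bucket are simulated with key prefixes.
- It provides virtually unlimited storage: there is no cap on the number of objects, although a single object can be at most 5 TB.
- It supports object versioning.
- MFA Delete can be enabled, so that permanently deleting an object version requires multi-factor authentication.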
An object has the following attributes:
- Key: This is the name of the object (file).
- Value: This is the data.
- Version Id: The version ID of the object.
- Access Control Lists: These are used to manage bucket-level or object-level permissions.
S3 Consistency Model
According to an AWS announcement, S3 is now strongly consistent for both new and existing objects:
Effective immediately, all S3 GET, PUT, and LIST operations, as well as operations that change object tags, ACLs, or metadata, are now strongly consistent. What you write is what you will read, and the results of a LIST will be an accurate reflection of what’s in the bucket. This applies to all existing and new S3 objects, works in all regions, and is available to you at no extra charge!
S3 Storage Classes
S3 provides multiple tiers of storage:
- S3 Standard
- S3 Standard-IA (Infrequent Access)
- S3 One Zone-IA
- S3 Intelligent-Tiering
- S3 Glacier
- S3 Glacier Deep Archive
The table below compares different storage tiers:
Lifecycle Management
We can move objects to different storage tiers based on their age. For example, we can configure a lifecycle rule on a bucket so that objects older than 30 days are moved from S3 Standard to S3 Standard-IA.
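The 30-day example can be sketched as a lifecycle configuration. This is a minimal sketch, not a definitive setup: the helper names `build_lifecycle_config` and `apply_lifecycle_config` and the rule ID are made up for illustration, and the empty prefix assumes the rule should cover every object in the bucket.

```python
def build_lifecycle_config(days=30, storage_class="STANDARD_IA", prefix=""):
    """Return a lifecycle configuration that transitions objects after `days`."""
    return {
        "Rules": [
            {
                # Hypothetical rule name, chosen only for readability.
                "ID": f"move-to-{storage_class.lower()}-after-{days}-days",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix = every object
                "Transitions": [
                    {"Days": days, "StorageClass": storage_class},
                ],
            }
        ]
    }


def apply_lifecycle_config(bucket_name, config):
    """Attach the configuration to a bucket (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("s3")
    client.put_bucket_lifecycle_configuration(
        Bucket=bucket_name, LifecycleConfiguration=config
    )


config = build_lifecycle_config()
print(config["Rules"][0]["Transitions"])
```

Calling `apply_lifecycle_config("mybucket", config)` would send the rule to S3. Building the dictionary separately keeps the rule easy to inspect before it is applied.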
S3 Performance and Optimization
There are two main techniques to optimize S3 performance:
- Multipart Uploads
- Byte-Range Fetches
The idea is to transfer smaller chunks of data in parallel: multipart upload does this for uploads, and byte-range fetches do it for downloads. Note that a single PUT request can upload at most 5 GB. To upload a file larger than 5 GB, we must use multipart upload. In fact, multipart upload is recommended for objects larger than 100 MB.
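Both techniques can be sketched with boto3. This is a rough sketch under assumptions: the function names, the 16 MB part size, and the worker counts are illustrative choices, and only the 100 MB threshold comes from the recommendation above. The range-splitting helper is plain Python, while the other two functions need boto3 and AWS credentials.

```python
def split_byte_ranges(total_size, chunk_size):
    """Return HTTP Range header values that together cover total_size bytes."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    ranges = []
    for start in range(0, total_size, chunk_size):
        end = min(start + chunk_size, total_size) - 1  # Range ends are inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges


def fetch_in_ranges(bucket_name, key, total_size, chunk_size, workers=4):
    """Download an object as parallel byte-range GETs and reassemble it."""
    import boto3
    from concurrent.futures import ThreadPoolExecutor

    client = boto3.client("s3")

    def fetch(byte_range):
        response = client.get_object(Bucket=bucket_name, Key=key, Range=byte_range)
        return response["Body"].read()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so the parts join correctly.
        return b"".join(pool.map(fetch, split_byte_ranges(total_size, chunk_size)))


def upload_with_multipart(path, bucket_name, key):
    """Upload a file, switching to multipart once it exceeds 100 MB."""
    import boto3
    from boto3.s3.transfer import TransferConfig

    transfer_config = TransferConfig(
        multipart_threshold=100 * 1024 * 1024,  # threshold from the text
        multipart_chunksize=16 * 1024 * 1024,   # assumed part size
        max_concurrency=8,                      # assumed number of threads
    )
    boto3.client("s3").upload_file(path, bucket_name, key, Config=transfer_config)


print(split_byte_ranges(10, 4))  # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```

The total size passed to `fetch_in_ranges` could come from the object's `ContentLength`, for example via a `head_object` call.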
Another way to optimize S3 performance is to distribute content to Amazon edge locations with CloudFront. In this way, users do not connect to S3 buckets directly to retrieve data; instead, they connect to the endpoints of edge locations. Note that CloudFront supports both reads and writes, which means users can also make changes to S3 buckets through edge locations.
There are three types of encryption:
- Encryption In Transit
- Server Side Encryption (Encryption At Rest)
  - S3 Managed Keys - SSE-S3
  - AWS Key Management Service Managed Keys - SSE-KMS
  - Server Side Encryption With Customer Provided Keys - SSE-C
- Client Side Encryption
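The three server-side options map onto extra arguments of an upload request. This is a minimal sketch, assuming boto3's `put_object` parameters: the helper name `encryption_args` is made up for illustration, and the KMS key alias in the example is hypothetical.

```python
def encryption_args(mode, kms_key_id=None, customer_key=None):
    """Return the extra put_object keyword arguments for a server-side mode."""
    if mode == "SSE-S3":
        # S3 creates and manages the keys itself.
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id:
            args["SSEKMSKeyId"] = kms_key_id  # otherwise the default KMS key is used
        return args
    if mode == "SSE-C":
        if customer_key is None:
            raise ValueError("SSE-C needs a customer-provided 256-bit key")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown encryption mode: {mode}")


# "alias/my-key" is a hypothetical key alias used only for this example.
print(encryption_args("SSE-KMS", kms_key_id="alias/my-key"))
```

The result can be unpacked into an upload call, for example `bucket.put_object(Key=key, Body=f, **encryption_args("SSE-S3"))`. With SSE-C, the same key has to be supplied again when reading the object back.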
S3 Object Lock Modes:
- Governance Mode
- Compliance Mode
- Retention Periods
- Legal Holds
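In short, governance mode lets users with special permissions override the lock, while compliance mode cannot be overridden by any user until the retention period ends. A legal hold has no expiry date and stays in place until it is removed. The sketch below shows how these could be set with boto3, assuming the bucket was created with Object Lock enabled. The helper names `build_retention` and `lock_object` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def build_retention(mode, days, now=None):
    """Return a Retention setting that protects an object for `days` days."""
    if mode not in ("GOVERNANCE", "COMPLIANCE"):
        raise ValueError("mode must be GOVERNANCE or COMPLIANCE")
    now = now or datetime.now(timezone.utc)
    return {"Mode": mode, "RetainUntilDate": now + timedelta(days=days)}


def lock_object(bucket_name, key, retention, legal_hold=False):
    """Apply a retention period, and optionally a legal hold, to one object."""
    import boto3  # needs AWS credentials and an Object Lock enabled bucket

    client = boto3.client("s3")
    client.put_object_retention(Bucket=bucket_name, Key=key, Retention=retention)
    if legal_hold:
        client.put_object_legal_hold(
            Bucket=bucket_name, Key=key, LegalHold={"Status": "ON"}
        )


fixed_now = datetime(2021, 1, 1, tzinfo=timezone.utc)
print(build_retention("GOVERNANCE", 30, now=fixed_now))
```

The `now` parameter exists only to make the retention date reproducible in examples; in normal use it can be left out.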
Operation and Administration
There are three different ways to share S3 buckets across accounts:
- Using Bucket Policies & IAM. Programmatic Access Only
- Using Bucket ACLs & IAM. Programmatic Access Only
- Cross-account IAM roles. Programmatic and Console access.
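The first option, a bucket policy, can be sketched as follows. This is an illustration under assumptions rather than a recommended policy: the helper names are made up, the account ID `111122223333` is a placeholder, and the read-only actions were picked only to keep the example small.

```python
import json


def build_cross_account_policy(bucket_name, account_id):
    """Return a bucket policy that lets another AWS account read the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCrossAccountRead",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",    # needed for ListBucket
                    f"arn:aws:s3:::{bucket_name}/*",  # needed for GetObject
                ],
            }
        ],
    }


def apply_bucket_policy(bucket_name, policy):
    """Attach the policy to the bucket (needs boto3 and AWS credentials)."""
    import boto3

    boto3.client("s3").put_bucket_policy(
        Bucket=bucket_name, Policy=json.dumps(policy)
    )


# Placeholder bucket name and account ID, for illustration only.
policy = build_cross_account_policy("mybucket", "111122223333")
print(json.dumps(policy, indent=2))
```

The bucket policy only grants access on the bucket side. Users in the other account still need IAM permissions in their own account that allow the same actions.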
Sample Python Code
"""
In this example, we will first upload myFile.txt to the S3 bucket mybucket
and then download it from S3. We assume myFile.txt uses UTF-8 encoding.
"""
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket("mybucket")

# Define the key of the object
key = "subFolderName/myFile.txt"

# Put an object
with open("myFile.txt", "rb") as f:
    bucket.put_object(Key=key, Body=f)

# Read an object
data = bucket.Object(key).get()
contentOfFile = data.get("Body").read().decode("utf-8")