Table of Contents
- Related Readings
- Primary Key
- Partition Key and Sort Key
- Recommendations for partition keys
- Secondary Index
- Global Secondary Index
- Local Secondary Index
- Limitation of secondary index
- Consistency Model
- Other Topics (not covered in details)
- Choosing the Right DynamoDB Partition Key
- Best Practices for Designing and Using Partition Keys Effectively
- Using Global Secondary Indexes in DynamoDB
- Local Secondary Indexes
Description copied from AWS website:
Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability.
DynamoDB automatically spreads the data and traffic for your tables over a sufficient number of servers to handle your throughput and storage requirements, while maintaining consistent and fast performance. All of your data is stored on solid-state disks (SSDs) and is automatically replicated across multiple Availability Zones in an AWS Region, providing built-in high availability and data durability. You can use global tables to keep DynamoDB tables in sync across AWS Regions.
A partition is the physical storage unit where DynamoDB data is stored. The partition in which an item is stored is determined by the partition key of that item.
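The number of partitions needed by a table depends on the size of the table and the provisioned read/write capacity units. The size of a partition is 10 GB, so it follows that
partitions by size = ceil(table size in GB / 10)
The number of partitions required by read/write capacity is determined by
partitions by throughput = ceil(RCU / 3000 + WCU / 1000)
(This formula seems to be derived from the maximum traffic a partition can support, 3,000 read capacity units and 1,000 write capacity units, as documented in Best Practices for Designing and Using Partition Keys Effectively.)
The total number of partitions needed by a table is the max of the two
total partitions = max(partitions by size, partitions by throughput)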
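The partition arithmetic can be sketched in a few lines of Python. This is only an estimate under stated assumptions: 10 GB per partition, and per-partition limits of 3,000 read capacity units and 1,000 write capacity units. The function name `estimate_partitions` is made up for illustration, and DynamoDB does not expose the real partition count.

```python
import math

# Assumed per-partition limits (see the best-practices guide referenced above).
PARTITION_SIZE_GB = 10
MAX_RCU_PER_PARTITION = 3000
MAX_WCU_PER_PARTITION = 1000


def estimate_partitions(table_size_gb, rcu, wcu):
    """Estimate the partition count as the max of the size and throughput needs."""
    by_size = math.ceil(table_size_gb / PARTITION_SIZE_GB)
    by_throughput = math.ceil(
        rcu / MAX_RCU_PER_PARTITION + wcu / MAX_WCU_PER_PARTITION
    )
    # A table always has at least one partition.
    return max(by_size, by_throughput, 1)


# A 25 GB table with 5,000 RCU and 2,000 WCU:
# by size -> ceil(2.5) = 3, by throughput -> ceil(1.67 + 2) = 4
print(estimate_partitions(25, 5000, 2000))
```

For this example the throughput requirement dominates, so the estimate is 4 partitions.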
The primary key is the only required attribute for items in a table. It is specified when the table is created. There are two types of primary keys:
- Partition Key: a simple primary key in which the partition key alone uniquely identifies each item; it is sometimes called a Hash Key.
- Partition Key + Sort Key (Composite Primary Key)
The sort key is also called the Range Key. It defines the storage order of items within a partition.
When using the Query API, we can get the items for a given partition key, and the returned items are sorted by the sort key. Here is the documentation:
For items with a given partition key value, DynamoDB stores these items close together, in sorted order by sort key value. In a Query operation, DynamoDB retrieves the items in sorted order, and then processes the items using KeyConditionExpression and any FilterExpression that might be present. Only then are the Query results sent back to the client.
A Query operation always returns a result set. If no matching items are found, the result set is empty.
Query results are always sorted by the sort key value. If the data type of the sort key is Number, the results are returned in numeric order. Otherwise, the results are returned in order of UTF-8 bytes. By default, the sort order is ascending. To reverse the order, set the ScanIndexForward parameter to false.
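As a rough illustration of the Query behaviour quoted above, the sketch below builds the request parameters for the low-level `query` call. The helper name `build_query_params`, the table name `Orders`, and the attribute name `customer_id` are hypothetical, and the partition key is assumed to be a string.

```python
def build_query_params(table_name, pk_name, pk_value, newest_first=False):
    """Build parameters for a DynamoDB Query on a single partition key value."""
    return {
        "TableName": table_name,
        # Match every item that shares this partition key value.
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": pk_name},
        # Assumes the partition key attribute is of type String ("S").
        "ExpressionAttributeValues": {":pk": {"S": pk_value}},
        # True (the default) returns ascending sort key order; False reverses it.
        "ScanIndexForward": not newest_first,
    }


params = build_query_params("Orders", "customer_id", "c-42", newest_first=True)
print(params["ScanIndexForward"])
```

With boto3 installed and credentials configured, these parameters could be passed as `boto3.client("dynamodb").query(**params)`.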
Recommendations for partition keys
- Use high-cardinality attributes
- Use composite attributes
- Cache the popular items
- Add random numbers or digits from a predetermined range for write-heavy use cases
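The last recommendation (often called write sharding) can be sketched as follows. The `#` separator, the shard count of 10, and both function names are assumptions made for this example only.

```python
import random


def sharded_partition_key(base_key, shard_count, rng=random):
    """Append a random suffix from a fixed range to spread writes across partitions."""
    suffix = rng.randrange(shard_count)  # integer in [0, shard_count)
    return f"{base_key}#{suffix}"


def all_shard_keys(base_key, shard_count):
    """List every possible sharded key; a reader must query all of them and merge."""
    return [f"{base_key}#{i}" for i in range(shard_count)]


write_key = sharded_partition_key("2022-01-15", 10)
read_keys = all_shard_keys("2022-01-15", 10)
print(write_key in read_keys)
```

The trade-off is that writes spread over `shard_count` partition key values, while a read for one logical key has to issue one Query per suffix and combine the results.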
There are two types of secondary index in DynamoDB and they are very similar.
- Global Secondary Index
- Local Secondary Index
A Global Secondary Index (GSI) behaves like a copy of a subset of the base table. When we create a GSI, we specify a partition key and, optionally, a sort key for the index. However, the partition key + sort key combination in a GSI doesn't need to be unique.
There is a concept called Attribute Projections. A projection is basically the set of attributes that are copied from a table into a secondary index. The partition key and sort key of the table are always projected into the index.
When the base table is changed, DynamoDB automatically propagates the changes to the GSI. The synchronization between the base table and the GSI uses an eventually consistent model.
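To make the key schema and projection concrete, here is a minimal sketch that builds one entry for the `GlobalSecondaryIndexes` parameter of `CreateTable`. The helper `gsi_definition` and the index and attribute names are hypothetical.

```python
def gsi_definition(index_name, partition_key, sort_key=None, include_attributes=None):
    """Build one GlobalSecondaryIndexes entry for a CreateTable request."""
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    if sort_key is not None:
        # The sort key is optional for a GSI.
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})

    if include_attributes:
        # Copy only the listed non-key attributes into the index.
        projection = {
            "ProjectionType": "INCLUDE",
            "NonKeyAttributes": list(include_attributes),
        }
    else:
        # Table and index keys are always projected.
        projection = {"ProjectionType": "KEYS_ONLY"}

    return {"IndexName": index_name, "KeySchema": key_schema, "Projection": projection}


gsi = gsi_definition("by_status", "status", sort_key="created_at",
                     include_attributes=["total"])
print(gsi["Projection"]["ProjectionType"])
```

For a table in provisioned mode, each index entry would also need its own `ProvisionedThroughput` setting, which this sketch leaves out.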
A Local Secondary Index (LSI) is a similar concept. The difference between a GSI and an LSI is that an LSI uses the same partition key as the base table, combined with a different sort key. The table and LSI data for each item collection are stored in a single partition (because they all have the same partition key). That's why it's called a local secondary index.
Limitation of secondary index
- there is a limit to the number of indexes and attributes per index
- indexes take up storage space
DynamoDB supports three different consistency models
- Eventually consistent reads (the default)
- Strongly consistent reads
- ACID transactions
- DynamoDB transactions provide developers atomicity, consistency, isolation, and durability (ACID) across one or more tables within a single AWS account and region.
Strongly consistent reads are more "expensive" than eventually consistent reads in two ways:
- A strongly consistent read may have higher latency than an eventually consistent read.
- In provisioned mode, capacity is billed in read/write capacity units: one read capacity unit covers one strongly consistent read per second, or two eventually consistent reads per second, for an item up to 4 KB in size.
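The cost difference can be worked through with a small calculation. This sketch assumes that 4 KB means 4 × 1024 bytes and that a transactional read costs twice as much as a strongly consistent read. The function name `read_capacity_units` is made up for illustration.

```python
import math

READ_UNIT_BYTES = 4 * 1024  # assumed size of one read unit (4 KB)


def read_capacity_units(item_size_bytes, mode="eventual"):
    """Return the read capacity units consumed by reading one item."""
    # Item size is rounded up to the next 4 KB boundary.
    blocks = max(1, math.ceil(item_size_bytes / READ_UNIT_BYTES))
    if mode == "strong":
        return float(blocks)
    if mode == "eventual":
        return blocks / 2
    if mode == "transactional":
        return float(blocks * 2)
    raise ValueError(f"unknown read mode: {mode}")


for mode in ("eventual", "strong", "transactional"):
    print(mode, read_capacity_units(6000, mode))
```

A 6,000-byte item spans two 4 KB blocks, so it costs 1 unit for an eventually consistent read, 2 for a strongly consistent read, and 4 for a transactional read.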
- Global Tables
- Globally distributed applications
- Based on DynamoDB streams
- Multi-region redundancy for DR or HA
- No application rewrites
- Replication latency under one second
- DynamoDB Stream
- Advanced techniques
- DynamoDB Accelerator (DAX)
- On-Demand Capacity
- Sparse Indexes
- Auto Scaling
- Uses the target tracking method to keep utilization close to the target.
- Currently does not scale down if the table's consumption drops to zero.
- Workaround 1: Send requests to the table until it scales down automatically.
- Workaround 2: Manually reduce the maximum capacity to match the minimum capacity.
- Also supports Global Secondary Indexes -- think of them like a copy of the table.