AWS - Introduction to DynamoDB


#AWS  #introduction  #DynamoDB 



Description copied from AWS website:

Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability.

DynamoDB automatically spreads the data and traffic for your tables over a sufficient number of servers to handle your throughput and storage requirements, while maintaining consistent and fast performance. All of your data is stored on solid-state disks (SSDs) and is automatically replicated across multiple Availability Zones in an AWS Region, providing built-in high availability and data durability. You can use global tables to keep DynamoDB tables in sync across AWS Regions.


A partition is a physical space where DynamoDB data is stored. The partition in which an item is stored is determined by the partition key of that item.
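To make the key-to-partition mapping concrete, here is a minimal sketch. DynamoDB feeds the partition key into an internal hash function that is not public, so the MD5 hash, the modulo step, and the name `partition_for` below are assumptions made purely for illustration.

```python
import hashlib


def partition_for(partition_key: str, num_partitions: int) -> int:
    """Map a partition key to a partition index.

    Illustrative only: MD5 plus modulo stands in for DynamoDB's
    internal (undocumented) hash function.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_partitions


if __name__ == "__main__":
    for key in ["user#1", "user#2", "user#1"]:
        # The same key always maps to the same partition.
        print(key, "->", partition_for(key, 4))
```

The point of the sketch is only that the mapping is deterministic: every item with the same partition key ends up in the same partition.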


The number of partitions needed by a table depends on the size of the table and its read/write capacity units. The maximum size of a partition is 10 GB, so it follows that

$$ \textrm{partitionBySize} = \textrm{total size} / 10 \textrm{GB} $$

The number of partitions required by read/write capacity is determined by

$$ \textrm{partitionByCapacity} = (\textrm{total RCU} / 3000) + ( \textrm{total WCU} / 1000) $$

(This formula seems to be derived from the maximum throughput a single partition can support, 3,000 RCU and 1,000 WCU, as documented in Best Practices for Designing and Using Partition Keys Effectively.)

The total number of partitions needed by a table is the maximum of the two

$$ \textrm{numOfPartition} = \max(\textrm{partitionBySize}, \textrm{partitionByCapacity}) $$
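The three formulas can be combined into a small calculator. This is a rough sketch rather than an official sizing tool: the function name `estimate_partitions` is made up here, and rounding the result up to a whole partition is an assumption that the formulas above do not state.

```python
import math

PARTITION_SIZE_GB = 10        # maximum size of one partition
MAX_RCU_PER_PARTITION = 3000  # read capacity one partition can serve
MAX_WCU_PER_PARTITION = 1000  # write capacity one partition can serve


def estimate_partitions(total_size_gb: float, total_rcu: float, total_wcu: float) -> int:
    """Estimate the number of partitions for a table."""
    partition_by_size = total_size_gb / PARTITION_SIZE_GB
    partition_by_capacity = (
        total_rcu / MAX_RCU_PER_PARTITION + total_wcu / MAX_WCU_PER_PARTITION
    )
    # Assumption: round up, since a table cannot have a fractional partition.
    return math.ceil(max(partition_by_size, partition_by_capacity))


if __name__ == "__main__":
    # 25 GB -> 2.5 by size; 6000 RCU + 2000 WCU -> 2 + 2 = 4 by capacity.
    print(estimate_partitions(25, 6000, 2000))  # -> 4
```

In this example capacity, not size, drives the partition count. A large but rarely accessed table would show the opposite.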

Primary Key

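The primary key is the only required attribute for items in a table. It is specified when the table is created. There are two types of primary keys: a simple primary key, which consists of a partition key only, and a composite primary key, which consists of a partition key and a sort key.

The sort key is also called the range key. It defines the order in which items are stored within a partition.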


When using the Query API, we can get the items for a given partition key, and the returned items are sorted by the sort key. The documentation describes this as follows:

For items with a given partition key value, DynamoDB stores these items close together, in sorted order by sort key value. In a Query operation, DynamoDB retrieves the items in sorted order, and then processes the items using KeyConditionExpression and any FilterExpression that might be present. Only then are the Query results sent back to the client.

A Query operation always returns a result set. If no matching items are found, the result set is empty.

Query results are always sorted by the sort key value. If the data type of the sort key is Number, the results are returned in numeric order. Otherwise, the results are returned in order of UTF-8 bytes. By default, the sort order is ascending. To reverse the order, set the ScanIndexForward parameter to false.
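The ordering rules quoted above can be mimicked in plain Python. This is only an in-memory model of the behaviour, not a call to DynamoDB: the attribute names `pk` and `sk` and the helper `query` are hypothetical, and the `scan_index_forward` argument simply mirrors the `ScanIndexForward` parameter.

```python
def sort_value(value):
    # Numbers compare numerically; strings compare by their UTF-8 bytes.
    return value if isinstance(value, (int, float)) else value.encode("utf-8")


def query(items, partition_key, scan_index_forward=True):
    """Return the items for one partition key, ordered by sort key."""
    matches = [item for item in items if item["pk"] == partition_key]
    return sorted(
        matches,
        key=lambda item: sort_value(item["sk"]),
        reverse=not scan_index_forward,
    )


items = [
    {"pk": "user#1", "sk": 30},
    {"pk": "user#2", "sk": 5},
    {"pk": "user#1", "sk": 4},
    {"pk": "user#1", "sk": 100},
]

if __name__ == "__main__":
    print([i["sk"] for i in query(items, "user#1")])         # -> [4, 30, 100]
    print([i["sk"] for i in query(items, "user#1", False)])  # -> [100, 30, 4]
    print(query(items, "user#3"))                            # -> []
```

Note that a numeric sort key gives `4, 30, 100`. If the same values were stored as strings, the byte order would be `"100", "30", "4"`.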

Recommendations for partition keys

Secondary Index

There are two types of secondary indexes in DynamoDB, and they are very similar.

A Global Secondary Index (GSI) behaves like a copy of a subset of the base table. When we create a GSI, we need to specify a partition key and, optionally, a sort key for the index. Unlike the primary key of the base table, the partition key + sort key combination in a GSI doesn't need to be unique.

There is a concept called Attribute Projections. A projection is the set of attributes that are copied from a table into a secondary index. The partition key and sort key of the table are always projected into the index.

When the base table is changed, DynamoDB automatically propagates the changes to the GSI. The synchronization between the base table and the GSI uses an eventually consistent model.
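A projection can be pictured as a filter applied to each item before it is copied into the index. The sketch below is a simplified model: the function `project_item` and the sample order attributes are invented for illustration. The index's own key attributes are kept too, since an index entry cannot exist without them.

```python
def project_item(item, table_keys, index_keys, projected_attributes):
    """Return the part of an item that would be copied into an index."""
    # Table keys and index keys are always kept; other attributes
    # are copied only if they were explicitly projected.
    keep = set(table_keys) | set(index_keys) | set(projected_attributes)
    return {name: value for name, value in item.items() if name in keep}


order = {
    "order_id": "o-1",           # table partition key (hypothetical)
    "created_at": "2024-01-05",  # table sort key (hypothetical)
    "customer_id": "c-9",        # index partition key (hypothetical)
    "status": "shipped",
    "notes": "fragile",
}

if __name__ == "__main__":
    projected = project_item(
        order,
        table_keys=("order_id", "created_at"),
        index_keys=("customer_id",),
        projected_attributes=("status",),
    )
    print(sorted(projected))  # "notes" is not copied into the index
```

Projecting fewer attributes keeps the index smaller, at the price of an extra read against the base table whenever a query needs an attribute that was left out.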

A Local Secondary Index (LSI) is a similar concept. The difference between a GSI and an LSI is that the partition key of an LSI is the same partition key used in the base table. The table data and the LSI data for each item collection are stored in a single partition (because they all have the same partition key). That's why it's called a local secondary index.
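The difference shows up directly in a table definition. The dictionary below follows the shape of a `CreateTable` request as it would be passed to an AWS SDK, but it is only built and inspected locally, never sent to AWS. The table name, index names, and attribute names are all hypothetical, and `partition_key` is a small helper written for this example.

```python
table_definition = {
    "TableName": "Orders",
    "AttributeDefinitions": [
        {"AttributeName": "customer_id", "AttributeType": "S"},
        {"AttributeName": "order_id", "AttributeType": "S"},
        {"AttributeName": "created_at", "AttributeType": "S"},
        {"AttributeName": "status", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "customer_id", "KeyType": "HASH"},  # partition key
        {"AttributeName": "order_id", "KeyType": "RANGE"},    # sort key
    ],
    "LocalSecondaryIndexes": [
        {
            "IndexName": "by_created_at",
            "KeySchema": [
                # Same partition key as the table, different sort key.
                {"AttributeName": "customer_id", "KeyType": "HASH"},
                {"AttributeName": "created_at", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "KEYS_ONLY"},
        }
    ],
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "by_status",
            "KeySchema": [
                # A GSI is free to use a different partition key.
                {"AttributeName": "status", "KeyType": "HASH"},
                {"AttributeName": "created_at", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    "BillingMode": "PAY_PER_REQUEST",
}


def partition_key(key_schema):
    """Return the name of the HASH (partition) key in a key schema."""
    return next(k["AttributeName"] for k in key_schema if k["KeyType"] == "HASH")


if __name__ == "__main__":
    print("table:", partition_key(table_definition["KeySchema"]))
    for kind in ("LocalSecondaryIndexes", "GlobalSecondaryIndexes"):
        for index in table_definition[kind]:
            print(index["IndexName"], "->", partition_key(index["KeySchema"]))
```

The local index reports the same partition key as the table (`customer_id`), while the global index reports its own (`status`).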

Limitation of secondary index

Consistency Model

DynamoDB supports three different consistency models

Strongly consistent reads are more "expensive" than eventually consistent reads in two ways

Other Topics
