AWS DynamoDB Transaction Test

If your application performs reads or writes at a higher rate than your table can support, DynamoDB begins to throttle those requests. The most important thing you can do to keep DynamoDB healthy is to make sure your provisioned throughput is always sufficient to meet your application needs. In order to correctly provision DynamoDB, and to keep your applications running smoothly, it is important to gather data for performance metrics like latency, request throughput, and throttling errors.

This test monitors the target server and reports the number of throttled requests, conditional check failed requests, transaction conflicts, and read and write throttled events. In addition, the test tracks request latency and the number of online index throttled events. These measures help administrators optimize resource usage and improve the application performance of the target AWS DynamoDB server.
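The measures described above appear to correspond to metrics that DynamoDB publishes to Amazon CloudWatch in the AWS/DynamoDB namespace. The following is a minimal sketch under that assumption, and does not describe how the eG agent works internally. The helper name build_metric_query, the table name orders, and the measure-to-metric mapping are illustrative assumptions. The helper only builds the keyword arguments accepted by the CloudWatch get_metric_statistics call in boto3, so it runs without AWS credentials.

```python
from datetime import datetime, timedelta, timezone

# Assumed mapping from this test's measures to CloudWatch metric names.
METRIC_NAMES = {
    "Throttled requests": "ThrottledRequests",
    "Conditional check failed request": "ConditionalCheckFailedRequests",
    "Transaction conflict": "TransactionConflict",
    "Read throttled events": "ReadThrottleEvents",
    "Write throttled events": "WriteThrottleEvents",
    "Successful request latency": "SuccessfulRequestLatency",
    "Online index throttle events": "OnlineIndexThrottleEvents",
    "Throttled put record": "ThrottledPutRecordCount",
}


def build_metric_query(measure, table_name, end_time, minutes=5,
                       extra_dimensions=None):
    """Build kwargs for CloudWatch get_metric_statistics (hypothetical helper)."""
    dimensions = [{"Name": "TableName", "Value": table_name}]
    # Some metrics need further dimensions, for example Operation.
    dimensions.extend(extra_dimensions or [])
    return {
        "Namespace": "AWS/DynamoDB",
        "MetricName": METRIC_NAMES[measure],
        "Dimensions": dimensions,
        "StartTime": end_time - timedelta(minutes=minutes),
        "EndTime": end_time,
        "Period": 60,  # one datapoint per minute
        "Statistics": ["Sum"],
    }


query = build_metric_query(
    "Read throttled events",
    "orders",
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("cloudwatch", region_name="us-east-1").get_metric_statistics(**query)
```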

Target of the test : An AWS DynamoDB server

Agent deploying the test : A remote agent

Outputs of the test : One set of results for the target AWS DynamoDB server being monitored.

Configurable parameters for the test
Parameter Description

Test Period

Specify how often the test should be executed.

Host

The IP address of the AWS DynamoDB server that is being monitored.

AWS Region

This test uses AWS SDK to interact with AWS DynamoDB and pull relevant metrics. To enable the test to connect to AWS, you need to configure the test with the name of the region to which all requests for metrics should be routed, by default. Specify the name of this AWS Region in this text box.

AWS Access Key ID, AWS Secret Access Key and Confirm Password

To monitor AWS DynamoDB, the eG agent has to be configured with the access key and secret key of a user with a valid AWS account. For this purpose, we recommend that you create a special user on the AWS cloud, obtain the access and secret keys of this user, and configure this test with these keys. The procedure for this has been detailed in the Obtaining an Access key and Secret key topic. Make sure you reconfirm the access and secret keys you provide here by retyping them in the corresponding Confirm Password text box.

Timeout Seconds

Specify the maximum duration (in seconds) for which the test will wait for a response from the server. The default is 120 seconds.

DD Row Count

By default, the detailed diagnosis of this test, if enabled, will report only the top-10 tables. This is why the DD Row Count parameter is set to 10 by default. If you want to include more or fewer tables in the detailed diagnosis, change the value of this parameter accordingly.

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.

Measurements made by the test

Measurement

Description

Measurement Unit

Interpretation

Maximum successful request latency

Indicates the maximum latency of successful requests to DynamoDB or Amazon DynamoDB Streams.

Seconds

Successful request latency can provide two different kinds of information:

  • The elapsed time for successful requests.

  • The number of successful requests.

Successful request latency reflects activity only within DynamoDB or Amazon DynamoDB Streams, and does not take into account network latency or client-side activity.

If the value of this measure rises above normal levels, investigate promptly, since high latency can significantly impact your application’s performance. It can be caused by network issues, or by requests taking too long because of your table design.

The detailed diagnosis of the Maximum successful request latency measure shows the Table name, Stream label, Operations, and Latency value.
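As a rough illustration of the two kinds of information mentioned above, the sketch below derives the maximum and minimum latency along with a request count from a list of datapoints. It assumes the datapoints follow the shape returned by CloudWatch for the SuccessfulRequestLatency metric, which reports milliseconds, and converts them to the seconds used by this measure. The sample values and the helper name latency_extremes_seconds are made up for illustration.

```python
def latency_extremes_seconds(datapoints):
    """Return (max, min) latency in seconds from CloudWatch-style datapoints."""
    if not datapoints:
        return None, None
    maximum_ms = max(point["Maximum"] for point in datapoints)
    minimum_ms = min(point["Minimum"] for point in datapoints)
    # Assumed source unit is milliseconds; the measure is reported in seconds.
    return maximum_ms / 1000.0, minimum_ms / 1000.0


# Illustrative datapoints, not real measurements.
samples = [
    {"Maximum": 12.5, "Minimum": 2.0, "SampleCount": 40.0},
    {"Maximum": 48.0, "Minimum": 1.5, "SampleCount": 55.0},
]

max_seconds, min_seconds = latency_extremes_seconds(samples)

# SampleCount carries the second kind of information: how many
# successful requests were observed in each period.
request_count = sum(point["SampleCount"] for point in samples)
```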

Minimum successful request latency

Indicates the minimum latency of successful requests to DynamoDB or Amazon DynamoDB Streams.

Seconds

Throttled requests

Indicates the number of requests to DynamoDB that exceed the provisioned throughput limits on a resource (such as a table or an index).

Number

A throttled request results in an HTTP 400 status code. It is essential to identify throttling early and take the necessary steps to prevent it. Otherwise, continuous throttling can cause serious problems for your application:

  • Application performance will decrease due to the high number of retrying requests.

  • Users will receive outdated data if only the write requests are throttled.

  • The application can start losing data if it fails to retry the throttled write requests.

Correlate the throttled request with read/write throttle events to understand which event is throttling the request.

The detailed diagnosis of this measure shows the Table name, Operations, and Throttled requests.
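Since an application can lose data if it does not retry throttled writes, one common mitigation is to retry with exponential backoff and jitter. The sketch below illustrates the idea in isolation. ThrottledError is a stand-in exception defined here, not a boto3 class, and call_with_backoff is a hypothetical helper. Note that the AWS SDKs already retry throttled requests on their own, so a wrapper like this is only relevant once those built-in retries are exhausted.

```python
import random
import time


class ThrottledError(Exception):
    """Stand-in for a throttling error (hypothetical, not a boto3 class)."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      sleep=time.sleep):
    """Call operation(), retrying on ThrottledError with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                # Give up so the caller can log or queue the write
                # instead of silently dropping it.
                raise
            # Full jitter: wait a random time up to base_delay * 2^attempt.
            sleep(random.uniform(0, base_delay * (2 ** attempt)))


# Simulated write that is throttled twice before it succeeds.
attempts = {"count": 0}


def flaky_write():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ThrottledError("simulated throttle")
    return "written"


result = call_with_backoff(flaky_write, sleep=lambda seconds: None)
```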

Maximum conditional check failed request

Indicates the maximum number of failed attempts to perform conditional writes.

Number

The PutItem, UpdateItem, and DeleteItem operations let you provide a logical condition that must evaluate to true before the operation can proceed. If this condition evaluates to false, Conditional check failed requests is incremented by one. Conditional check failed requests is also incremented by one for PartiQL Update and Delete statements where a logical condition is provided and that condition evaluates to false.

A failed conditional write results in an HTTP 400 error (Bad Request). An increase in the value of this measure indicates a problem with the requests being sent, for example conditions that no longer match the state of the items.

The detailed diagnosis of Maximum conditional check failed request shows the Table name, and Conditional check failed requests.
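To make the notion of a conditional write concrete, the sketch below builds the arguments for a PutItem call that should succeed only when the item does not already exist, and shows one way to recognize the resulting failure. The attribute name pk, the table name orders, and both helper names are assumptions for illustration. The error check assumes the response dictionary shape that botocore attaches to a ClientError.

```python
def build_conditional_put(table_name, item):
    """Build kwargs for a DynamoDB PutItem that only succeeds for new items."""
    return {
        "TableName": table_name,
        "Item": item,
        # The write proceeds only if this condition evaluates to true.
        # "pk" is an assumed partition key attribute name.
        "ConditionExpression": "attribute_not_exists(pk)",
    }


def is_conditional_check_failure(error_response):
    """Return True if a botocore-style error response reports a failed condition."""
    code = error_response.get("Error", {}).get("Code")
    return code == "ConditionalCheckFailedException"


put_args = build_conditional_put("orders", {"pk": {"S": "order#1"}})

# Shape of the response carried by a botocore ClientError (illustrative).
failed = {"Error": {"Code": "ConditionalCheckFailedException",
                    "Message": "The conditional request failed"}}
throttled = {"Error": {"Code": "ProvisionedThroughputExceededException"}}
```

With boto3, put_args could be passed as client.put_item(**put_args), and the check applied to the response attribute of the exception that is raised.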

Minimum conditional check failed request

Indicates the minimum number of failed attempts to perform conditional writes.

Number

Maximum transaction conflict

Indicates the maximum number of rejected item-level requests due to transactional conflicts between concurrent requests on the same items.

Number

A transactional conflict can occur during concurrent item-level requests on an item within a transaction. Transaction conflicts can occur in the following scenarios:

  • A PutItem, UpdateItem, or DeleteItem request for an item conflicts with an ongoing TransactWriteItems request that includes the same item.

  • An item within a TransactWriteItems request is part of another ongoing TransactWriteItems request.

  • An item within a TransactGetItems request is part of an ongoing TransactWriteItems, BatchWriteItem, PutItem, UpdateItem, or DeleteItem request.

Use the detailed diagnosis of Maximum transaction conflict to find out the Table name, and Transaction conflict details.
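The three scenarios listed above can be summarized as a small decision function. This is only a sketch that encodes those scenarios as stated. It assumes both operations touch the same item, and the function name is hypothetical.

```python
SINGLE_ITEM_WRITES = {"PutItem", "UpdateItem", "DeleteItem"}


def is_transaction_conflict(incoming, ongoing):
    """Return True if incoming conflicts with ongoing on a shared item.

    Encodes only the three scenarios described in this section.
    """
    if incoming in SINGLE_ITEM_WRITES:
        # Scenario 1: a single-item write meets an ongoing transaction.
        return ongoing == "TransactWriteItems"
    if incoming == "TransactWriteItems":
        # Scenario 2: two write transactions share an item.
        return ongoing == "TransactWriteItems"
    if incoming == "TransactGetItems":
        # Scenario 3: a transactional read meets any ongoing write.
        return ongoing in SINGLE_ITEM_WRITES | {"TransactWriteItems",
                                                "BatchWriteItem"}
    return False


checks = {
    ("PutItem", "TransactWriteItems"): is_transaction_conflict(
        "PutItem", "TransactWriteItems"),
    ("PutItem", "PutItem"): is_transaction_conflict("PutItem", "PutItem"),
    ("TransactGetItems", "BatchWriteItem"): is_transaction_conflict(
        "TransactGetItems", "BatchWriteItem"),
}
```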

Minimum transaction conflict

Indicates the minimum number of rejected item-level requests due to transactional conflicts between concurrent requests on the same items.

Number

Read throttled events

Indicates the number of requests to DynamoDB that exceed the provisioned read capacity units for a table or a global secondary index.

Number

Ideally, the value of the Read throttled events and Write throttled events measures should be zero.

In a DynamoDB table, items are stored across many partitions according to each item’s partition key. Each partition has a share of the table’s provisioned RCU (read capacity units) and WCU (write capacity units). When a request is made, it is routed to the correct partition for its data, and that partition’s capacity is used to determine if the request is allowed, or will be throttled (rejected).

If the value of the Read throttled events or Write throttled events measure is high, it indicates that the table’s consumed RCU or WCU is at or near the provisioned RCU or WCU. You can alleviate read and write throttles by gradually increasing the provisioned capacity.

Use the detailed diagnosis of Read throttled events and Write throttled events measures to find out the Table name, Global secondary index name, and Throttled events.
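One way to judge whether consumed capacity is at or near the provisioned capacity is to compute a utilization ratio. The sketch below assumes the consumed figure is a sum over a reporting period, as CloudWatch reports consumed capacity units, and converts it to a per-second rate before comparing. The 80% threshold and both function names are arbitrary assumptions. Because capacity is shared out per partition, a heavily used partition key can still be throttled even when this table-level ratio looks low.

```python
def capacity_utilization(consumed_sum, period_seconds, provisioned_units):
    """Return the fraction of provisioned capacity used during a period."""
    if provisioned_units <= 0 or period_seconds <= 0:
        raise ValueError("provisioned_units and period_seconds must be positive")
    # Provisioned capacity is expressed per second, so normalize the sum.
    consumed_per_second = consumed_sum / period_seconds
    return consumed_per_second / provisioned_units


def needs_more_capacity(utilization, threshold=0.8):
    """Return True when utilization reaches the (assumed) alert threshold."""
    return utilization >= threshold


# Illustrative figures: 5400 units consumed in one minute
# against 100 provisioned units per second.
utilization = capacity_utilization(consumed_sum=5400, period_seconds=60,
                                   provisioned_units=100)
should_scale = needs_more_capacity(utilization)
```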

Write throttled events

Indicates the number of requests to DynamoDB that exceed the provisioned write capacity units for a table or a global secondary index.

Number

Maximum online index throttle events

Indicates the maximum number of write throttle events that occur when adding a new global secondary index to a table.

Number

These events indicate that the index creation will take longer to complete, because incoming write activity is exceeding the provisioned write throughput of the index.

You can adjust the write capacity of the index using the UpdateTable operation, even while the index is still being built.

The WriteThrottleEvents metric for the index does not include any throttle events that occur during index creation.

The detailed diagnosis of Maximum online index throttle events shows the Table name, Global secondary index name, and Throttled events.
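As a sketch of the UpdateTable adjustment mentioned above, the helper below builds the request arguments for changing the provisioned throughput of a global secondary index. The helper name, table name, index name, and capacity values are illustrative assumptions. The argument layout follows the GlobalSecondaryIndexUpdates structure of the DynamoDB UpdateTable operation.

```python
def build_gsi_capacity_update(table_name, index_name, read_units, write_units):
    """Build UpdateTable kwargs that change a GSI's provisioned throughput."""
    return {
        "TableName": table_name,
        "GlobalSecondaryIndexUpdates": [
            {
                "Update": {
                    "IndexName": index_name,
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": read_units,
                        "WriteCapacityUnits": write_units,
                    },
                }
            }
        ],
    }


# Hypothetical table and index names with example capacity values.
update_args = build_gsi_capacity_update(
    "orders", "customer-index", read_units=50, write_units=200
)
```

With boto3, update_args could be passed as client.update_table(**update_args). Raise the capacity gradually and check whether the throttle events subside before increasing it further.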

Minimum online index throttle events

Indicates the minimum number of write throttle events that occur when adding a new global secondary index to a table.

Number

Maximum throttled put record

Indicates the maximum number of records that were throttled by the Kinesis data stream due to insufficient Kinesis Data Streams capacity.

Number

A low value is desired for this measure.

If the value of this measure is high, then it indicates that the records are being throttled due to insufficient capacity.

Use the detailed diagnosis of this measure to find out the Table name, Delegated operation, and Record count.

Minimum throttled put record

Indicates the minimum number of records that were throttled by the Kinesis data stream due to insufficient Kinesis Data Streams capacity.

Number