AWS MSK Cluster Info Test
In an AWS MSK cluster, one of the brokers serves as the controller, which is responsible for managing the states of partitions and replicas and for performing administrative tasks such as reassigning partitions.
This test monitors the partitions managed by the active controller and reports the partitions that are in the offline state, which can cause a lag in read/write operations.
Target of the test : AWS Managed Service Kafka
Agent deploying the test : A remote agent
Outputs of the test : One set of results for each cluster running in the target AWS Managed Service Kafka server.
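Where a quick sanity check outside the eG console is useful, the cluster-level attributes this test reports can also be seen through the AWS SDK. The snippet below is a minimal sketch, assuming the boto3 library, default credential resolution, and an illustrative region name; it is not the test's actual implementation.

```python
# Minimal sketch (not the eG agent's implementation): enumerate the MSK
# clusters in one region and read the cluster-level attributes that map to
# measures such as State, Number of broker nodes, and Enhanced monitoring type.
# Assumes boto3 and credentials from the default provider chain.
import boto3

def list_msk_clusters(region_name="us-east-1"):  # region name is illustrative
    kafka = boto3.client("kafka", region_name=region_name)
    clusters = []
    for page in kafka.get_paginator("list_clusters").paginate():
        for info in page["ClusterInfoList"]:
            clusters.append({
                "name": info["ClusterName"],
                "state": info["State"],                    # e.g. ACTIVE
                "broker_nodes": info["NumberOfBrokerNodes"],
                "monitoring": info["EnhancedMonitoring"],  # e.g. PER_BROKER
            })
    return clusters

if __name__ == "__main__":
    for cluster in list_msk_clusters():
        print(cluster)
```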
| Parameter | Description |
|---|---|
| Test Period | How often should the test be executed. |
| Host | The IP address of the AWS Managed Service Kafka broker that is being monitored. |
| Port | Specify the port number at which the specified Host listens. By default, this is NULL. |
| AWS Default Region | This test uses the AWS CLI to interact with AWS Managed Service Kafka and pull the relevant metrics. To enable the test to connect to AWS, you need to configure the test with the name of the region to which all requests for metrics should be routed by default. Specify the name of this AWS Default Region here (see the sketch after this table). |
| AWS Access Key ID, AWS Secret Access Key and Confirm Password | To monitor AWS Managed Service Kafka, the eG agent has to be configured with the access key and secret key of a user with a valid AWS account. For this purpose, we recommend that you create a special user on the AWS cloud, obtain the access and secret keys of this user, and configure this test with these keys. The procedure for this is detailed in the Obtaining an Access key and Secret key topic. Make sure you reconfirm the access and secret keys you provide here by retyping them in the corresponding Confirm Password text box. |
| Timeout Seconds | Specify the maximum duration (in seconds) for which the test will wait for a response from the server. The default is 10 seconds. |
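As a rough illustration of how the parameters above fit together, the sketch below wires an explicitly configured region, access key, secret key, and timeout into an AWS client. The credential strings are placeholders and the retry setting is an assumption added for the example; the eG agent handles this wiring internally.

```python
# Illustrative only: mapping the test parameters (AWS Default Region,
# AWS Access Key ID, AWS Secret Access Key, Timeout Seconds) onto a client.
# The literal values below are placeholders, not real credentials.
import boto3
from botocore.config import Config

session = boto3.Session(
    aws_access_key_id="AKIAEXAMPLE",            # AWS Access Key ID parameter
    aws_secret_access_key="EXAMPLESECRETKEY",   # AWS Secret Access Key parameter
    region_name="us-east-1",                    # AWS Default Region parameter
)

# Timeout Seconds parameter: cap how long each request may wait for a response.
timeouts = Config(connect_timeout=10, read_timeout=10, retries={"max_attempts": 2})

# This client can then be used for the ListClusters and CloudWatch calls
# shown in the other sketches on this page.
kafka = session.client("kafka", config=timeouts)
```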
| Measurement | Description | Measurement Unit | Interpretation |
|---|---|---|---|
| State | Indicates whether/not the cluster is active. | | The values reported by this measure and its numeric equivalents are mentioned in the table below: <br> Note: By default, this measure reports the Measure Values listed in the table above to indicate whether/not the broker cluster is active. |
| Number of broker nodes | Indicates the number of broker nodes in the cluster. | Number | |
| Enhanced monitoring type | Indicates whether/not the target broker is ready for monitoring. | | The values reported by this measure and its numeric equivalents are mentioned in the table below: <br> Note: By default, this measure reports the Measure Values listed in the table above to indicate whether/not the target broker is ready for monitoring. |
| Number of active controller | Indicates the number of active controllers in the cluster. | Number | Only one controller per cluster should be active at any given time. |
| Number of global partition | Indicates the number of partitions across all topics in the cluster. | Number | Global Partition Count is updated when the Controller Event Thread processes a Topic Change, Topic Deletion, or Partition Reassignment request, and is purged on controller failover. Since Global Partition Count does not include replicas, the sum of the per-broker Partition Count values can be higher than Global Partition Count if the replication factor for a topic is greater than 1. |
| Number of offline partition | Indicates the number of partitions that are offline in the cluster. | Number | An alert will be raised when this value is greater than 0 (see the sketch after this table). |
| Number of global topic | Indicates the number of topics across all brokers in the cluster. | Number | |
| Percentage of disk space used for data logs | Indicates the percentage of disk space used for data logs. | Percent | |
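Measures such as Number of active controller, Number of global partition, Number of offline partition, and Number of global topic correspond to cluster-level metrics that MSK publishes to Amazon CloudWatch under the AWS/Kafka namespace; Percentage of disk space used for data logs (KafkaDataLogsDiskUsed) is reported per broker and additionally requires a Broker ID dimension. The sketch below shows one way such values could be read; the cluster name, time window, and statistic are illustrative assumptions, not values mandated by the test.

```python
# Illustrative sketch: read MSK cluster-level metrics from CloudWatch.
# "demo-msk-cluster", the 5-minute window, and the Maximum statistic are
# assumptions for the example, not values mandated by the test.
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
end = datetime.now(timezone.utc)
start = end - timedelta(minutes=5)

for metric in ("ActiveControllerCount", "GlobalPartitionCount",
               "OfflinePartitionsCount", "GlobalTopicCount"):
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/Kafka",
        MetricName=metric,
        Dimensions=[{"Name": "Cluster Name", "Value": "demo-msk-cluster"}],
        StartTime=start,
        EndTime=end,
        Period=300,
        Statistics=["Maximum"],
    )
    points = resp["Datapoints"]
    # Report the most recent datapoint, if any was returned for the window.
    latest = max(points, key=lambda p: p["Timestamp"])["Maximum"] if points else None
    print(metric, latest)
```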